Journal of Computer Science and Technology ›› 2022, Vol. 37 ›› Issue (1): 29-49.doi: 10.1007/s11390-021-1663-7

Special Issue: Software Systems; Theory and Algorithms

• Special Section on Software Systems 2021 • Previous Articles     Next Articles

DeltaFuzz: Historical Version Information Guided Fuzz Testing

Jia-Ming Zhang1 (张家铭), Zhan-Qi Cui1,* (崔展齐), Senior Member, CCF, Member, IEEE, Xiang Chen2 (陈翔), Senior Member, CCF, Member, IEEE, Huan-Huan Wu1 (吴欢欢), Li-Wei Zheng1 (郑丽伟), Member, CCF, and Jian-Bin Liu1 (刘建宾)        

  1. 1Computer School, Beijing Information Science and Technology University, Beijing 100101, China
    2School of Information Science and Technology, Nantong University, Nantong 226019, China
  • Received:2021-06-01 Revised:2021-11-28 Accepted:2021-12-16 Online:2022-01-28 Published:2022-01-28
  • Contact: Zhan-Qi Cui E-mail:czq@bistu.edu.cn
  • About author:Zhan-Qi Cui received his B.E. degree in software engineering and his Ph.D. degree in computer software and theory, in 2005 and 2011, respectively, both from Nanjing University, Nanjing. He was a visiting Ph.D. student in University of Virginia, Virginia, from Sept. 2009 to Sept. 2010. He is currently an associate professor at Beijing Information Science and Technology University, Beijing. His research interests include software analysis and testing.
  • Supported by:
    This work was partially supported by the Leading-Edge Technology Program of Jiangsu Natural Science Foundation of China under Grant No.BK20202001, the National Natural Science Foundation of China under Grant No.61702041, and the Beijing Information Science and Technology University "Qin-Xin Talent'' Cultivation Project under Grant No.QXTCP C201906.

With the widespread use of agile software development methods, such as agile and scrum, software is iteratively updated more frequently. To ensure the quality of the software, regression testing is conducted before new versions are released. Moreover, to improve the efficiency of regression testing, testing efforts should be concentrated on the modified and impacted parts of a program. However, the costs of manually constructing new test cases for the modified and impacted parts are relatively expensive. Fuzz testing is an effective method for generating test data automatically, but it is usually devoted to achieving higher code coverage, which makes fuzz testing unsuitable for direct regression testing scenarios. For this reason, we propose a fuzz testing method based on the guidance of historical version information. First, the differences between the program being tested and the last version are analyzed, and the results of the analysis are used to locate change points. Second, change impact analysis is performed to find the corresponding impacted basic blocks. Finally, the fitness values of test cases are calculated according to the execution traces, and new test cases are generated iteratively by the genetic algorithm. Based on the proposed method, we implement a prototype tool DeltaFuzz and conduct experiments on six open-source projects. Compared with the fuzzing tool AFLGo, AFLFast and AFL, DeltaFuzz can reach the target faster, and the time taken by DeltaFuzz was reduced by 20.59%, 30.05% and 32.61%, respectively.

Key words: fuzz testing; regression testing; change impact analysis; fitness function;

[1] Masso J, Pino F J, Pardo C et al. Risk management in the software life cycle: A systematic literature review. Computer Standards & Interfaces, 2020, 71: Article No.103431. DOI: 10.1016/j.csi.2020.103431.
[2] Gu T, Ma X, Xu C et al. Synthesizing object transformation for dynamic software updating. In Proc. the 39th IEEE/ACM International Conference on Software Engineering Companion, May 2017, pp.336-338. DOI: 10.1109/ICSE-C.2017.96.
[3] Khatibsyarbini M, Isa M A, Jawawi D N A et al. Test case prioritization approaches in regression testing: A systematic literature review. Information and Software Technology, 2018, 93: 74-93. DOI: 10.1016/j.infsof.2017.08.014.
[4] Han J C, Zhou Z Q. Metamorphic fuzz testing of autonomous vehicles. In Proc. the 42nd IEEE/ACM International Conference on Software Engineering, June 27-July 19, 2020, pp.380-385. DOI: 10.1145/3387940.3392252.
[5] Dong Z, Böhme M, Cojocaru L et al. Time-travel testing of Android apps. In Proc. the 42nd IEEE/ACM International Conference on Software Engineering, June 27-July 19, 2020, pp.481-492. DOI: 10.1145/3377811.3380402.
[6] Qian J, Zhou D. Prioritizing test cases for memory leaks in Android applications. Journal of Computer Science and Technology, 2016, 31(5): 869-882. DOI: 10.1007/s11390-016-1670-2.
[7] Chen Y, Su T, Su Z. Deep differential testing of JVM implementations. In Proc. the 41st IEEE/ACM International Conference on Software Engineering, May 2019, pp.1257-1268. DOI: 10.1109/ICSE.2019.00127.
[8] Wüstholz V, Christakis M. Targeted greybox fuzzing with static lookahead analysis. In Proc. the 42nd IEEE/ACM International Conference on Software Engineering, June 27-July 19, 2020, pp.789-800. DOI: 10.1145/3377811.3380388.
[9] Böhme M, Manès V J M, Cha S K. Boosting fuzzer efficiency: An information theoretic perspective. In Proc. the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November 2020, pp.678-689. DOI: 10.1145/3368089.3409748.
[10] Song S, Song C, Jang Y et al. CrFuzz: Fuzzing multi-purpose programs through input validation. In Proc. the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November 2020, pp.690-700. DOI: 10.1145/3368089.3409769.
[11] Havrikov N. Efficient fuzz testing leveraging input, code, and ution. In Proc. the 39th IEEE/ACM International Conference on Software Engineering Companion, May 2017, pp.417-420. DOI: 10.1109/ICSE-C.2017.26.
[12] Klees G, Ruef A, Cooper B et al. Evaluating fuzz testing. In Proc. the 2018 ACM SIGSAC Conference on Computer and Communications Security, October 2018, pp.2123-2138. DOI: 10.1145/3243734.3243804.
[13] Zhang M Z, Gong Y Z, Wang Y W et al. Unit test data generation for C using rule-directed symbolic ution. Journal of Computer Science and Technology, 2019, 34(3): 670-689. DOI: 10.1007/s11390-019-1935-7.
[14] He J, Balunovic M, Ambroladze N et al. Learning to fuzz from symbolic ution with application to smart contracts. In Proc. the 2019 ACM SIGSAC Conference on Computer and Communications Security, November 2019, pp.531-548. DOI: 10.1145/3319535.3363230.
[15] Zhang Q, Wang J, Gulzar M A et al. BigFuzz: Efficient fuzz testing for data analytics using framework abstraction. In Proc. the 35th IEEE/ACM International Conference on Automated Software Engineering, September 2020, pp.722-733. DOI: 10.1145/3324884.3416641.
[16] Nguyen H L, Nassar N, Kehrer T et al. MoFuzz: A fuzzer suite for testing model-driven software engineering tools. In Proc. the 35th IEEE/ACM International Conference on Automated Software Engineering, September 2020, pp.1103-1115. DOI: 10.1145/3324884.3416668.
[17] Olsthoorn M, Van Deursen A, Panichella A. Generating highly-structured input data by combining search-based testing and grammar-based fuzzing. In Proc. the 35th IEEE/ACM International Conference on Automated Software Engineering, September 2020, pp.1224-1228. DOI: 10.1145/3324884.3418930.
[18] Nguyen T D, Pham L H, Sun J et al. sfuzz: An efficient adaptive fuzzer for solidity smart contracts. In Proc. the 42nd ACM/IEEE International Conference on Software Engineering, June 27-July 19, 2020, pp.778-788. DOI: 10.1145/3377811.3380334.
[19] Manès V J M, Kim S, Cha S K. Ankou: Guiding grey-box fuzzing towards combinatorial difference. In Proc. the 42nd ACM/IEEE International Conference on Software Engineering, June 27-July 19, 2020, pp.1024-1036. DOI: 10.1145/3377811.3380421.
[20] Böhme M, Pham V T, Nguyen M D et al. Directed greybox fuzzing. In Proc. the 2017 ACM SIGSAC Conference on Computer and Communications Security, October 30-November 3, 2017, pp.2329-2344. DOI: 10.1145/3133956.3134020.
[21] Gao X, Saha R K, Prasad M R et al. Fuzz testing based data augmentation to improve robustness of deep neural networks. In Proc. the 42nd IEEE/ACM International Conference on Software Engineering, June 27-July 19, 2020, pp.1147-1158. DOI: 10.1145/3377811.3380415.
[22] Wen C, Wang H, Li Y et al. MemLock: Memory usage guided fuzzing. In Proc. the 42nd ACM/IEEE International Conference on Software Engineering, June 27-July 19, 2020, pp.765-777. DOI: 10.1145/3377811.3380396.
[23] Babic D, Bucur S, Chen Y et al. FUDGE: Fuzz driver generation at scale. In Proc. the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, August 2019, pp.975-985. DOI: 10.1145/3338906.3340456.
[24] Chen H, Xue Y, Li Y et al. Hawkeye: Towards a desired directed grey-box fuzzer. In Proc. the 2018 ACM SIGSAC Conference on Computer and Communications Security, October 2018, pp.2095-2108. DOI: 10.1145/3243734.3243849.
[25] Medicherla R K, Komondoor R, Roychoudhury A. Fitness guided vulnerability detection with greybox fuzzing. In Proc. the 42nd IEEE/ACM International Conference on Software Engineering, June 27--July 19, 2020, pp.513-520. DOI: 10.1145/3387940.3391457.
[26] österlund S, Razavi K, Bos H et al. ParmeSan: Sanitizer-guided greybox fuzzing. In Proc. the 29th USENIX Security Symposium, August 2020, pp.2289-2306.
[27] Gao X, Mechtaev S, Roychoudhury A. Crash-avoiding program repair. In Proc. the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 2019, pp.8-18. DOI: 10.1145/3293882.3330558.
[28] Fioraldi A, Maier D, Eißfeldt H et al. AFL++: Combining incremental steps of fuzzing research. In Proc. the 14th USENIX Workshop on Offensive Technologies, August 2020.
[29] Grieco G, Ceresa M, Buiras P. QuickFuzz: An automatic random fuzzer for common file formats. ACM SIGPLAN Notices, 2016, 51(12): 13-20. DOI: 10.1145/2976002.2976017.
[30] Liang H, Zhang Y, Yu Y et al. Sequence coverage directed greybox fuzzing. In Proc. the 27th IEEE/ACM International Conference on Program Comprehension, May 2019, pp.249-259. DOI: 10.1109/ICPC.2019.00044.
[31] Zhang M, Liu J, Ma F et al. IntelliGen: Automatic driver synthesis for fuzz testing. In Proc. the 43rd IEEE/ACM International Conference on Software Engineering, May 2021, pp.318-327. DOI: 10.1109/ICSE-SEIP52600.2021.00041.
[32] You W, Liu X, Ma S et al. SLF: Fuzzing without valid seed inputs. In Proc. the 41st IEEE/ACM International Conference on Software Engineering, May 2019, pp.712-723. DOI: 10.1109/ICSE.2019.00080.
[33] Choi W, Sen K, Necul G et al. DetReduce: Minimizing Android GUI test suites for regression testing. In Proc. the 40th IEEE/ACM International Conference on Software Engineering, May 27-June 3, 2018, pp.445-455. DOI: 10.1145/3180155.3180173.
[34] Zhang L. Hybrid regression test selection. In Proc. the 40th IEEE/ACM International Conference on Software Engineering, May 27-June 3, 2018, pp.199-209. DOI: 10.1145/3180155.3180198.
[35] Nagy S, Hicks M. Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing. In Proc. the 2019 IEEE Symposium on Security and Privacy, May 2019, pp.787-802. DOI: 10.1109/SP.2019.00069.
[36] Liang J, Jiang Y, Wang M et al. DeepFuzzer: Accelerated deep greybox fuzzing. IEEE Transactions on Dependable and Secure Computing, 2019, 18(6): 2675-2688. DOI: 10.1109/TDSC.2019.2961339.
[37] Fioraldi A, D'Elia D C, Coppa E. WEIZZ: Automatic grey-box fuzzing for structured binary formats. In Proc. the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 2020, pp.1-13. DOI: 10.1145/3395363.3397372.
[38] Chen Y, Poskitt C M, Sun J et al. Learning-guided network fuzzing for testing cyber-physical system defences. In Proc. the 34th IEEE/ACM International Conference on Automated Software Engineering, November 2019, pp.962-973. DOI: 10.1109/ASE.2019.00093.
[39] Peng H, Shoshitaishvili Y, Payer M. T-Fuzz: Fuzzing by program transformation. In Proc. the 2018 IEEE Symposium on Security and Privacy, May 2018, pp.697-710. DOI: 10.1109/SP.2018.00056.
[40] Padhye R, Lemieux C, Sen K. JQF: Coverage-guided property-based testing in Java. In Proc. the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 2019, pp.398-401. DOI: 10.1145/3293882.3339002.
[41] Noller Y, Kersten R, Pasareanu C S. Badger: Complexity analysis with fuzzing and symbolic ution. In Proc. the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 2018, pp.322-332. DOI: 10.1145/3213846.3213868.
[42] Zhou C, Wang M, Liang J et al. Zeror: Speed up fuzzing with coverage-sensitive tracing and scheduling. In Proc. the 35th IEEE/ACM International Conference on Automated Software Engineering, September 2020, pp.858-870. DOI: 10.1145/3324884.3416572.
[43] Wang T, Wei T, Gu G et al. TaintScope: A checksum-aware directed fuzzing tool for automatic software vulnerability detection. In Proc. the 2010 IEEE Symposium on Security and Privacy, May 2010, pp.497-512. DOI: 10.1109/SP.2010.37.
[44] Lemieux C, Sen K. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proc. the 33rd ACM/IEEE International Conference on Automated Software Engineering, September 2018, pp.475-485. DOI: 10.1145/3238147.3238176.
[45] Situ L, Wang L, Li X, Guan L, Zhang W, Liu P. Energy distribution matters in greybox fuzzing. In Proc. the 41st IEEE/ACM International Conference on Software Engineering, May 2019, pp.270-271. DOI: 10.1109/ICSE-Companion.2019.00109.
[46] Böhme M, Pham V T, Roychoudhury A. Coverage-based greybox fuzzing as Markov chain. IEEE Transactions on Software Engineering, 2017, 45(5): 489-506. DOI: 10.1109/TSE.2017.2785841.
[47] Wüstholz V, Christakis M. Harvey: A greybox fuzzer for smart contracts. In Proc. the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November 2020, pp.1398-1409. DOI: 10.1145/3368089.3417064.
[48] Gan S, Zhang C, Qin X et al. CollAFL: Path sensitive fuzzing. In Proc. the 2018 IEEE Symposium on Security and Privacy, May 2018, pp.679-696. DOI: 10.1109/SP.2018.00040.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Pan Qijing;. A Routing Algorithm with Candidate Shortest Path[J]. , 1986, 1(3): 33 -52 .
[2] Huang Guoxiang; Liu Jian;. A Key-Lock Access Control[J]. , 1987, 2(3): 236 -243 .
[3] Chen Guoliang;. A Partitioning Selection Algorithm on Multiprocessors[J]. , 1988, 3(4): 241 -250 .
[4] Wang Haiying;. A Framework for Command Recovery in User Interface[J]. , 1990, 5(3): 296 -301 .
[5] Shen Yidong;. A Comparison of Closed World Assumptions[J]. , 1992, 7(3): 243 -246 .
[6] Shen Yidong;. Form alizing Incomplete Knowledge in Incomplete Databases[J]. , 1992, 7(4): 295 -304 .
[7] Jin Zhi;. The Structure and Semantics of an Object-Oriented Logic Programming Language: SCKE[J]. , 1995, 10(1): 74 -84 .
[8] Jiamg Xiong;. Some Undecidable Problems on Approximability of NP Optimization Problems[J]. , 1996, 11(2): 126 -132 .
[9] Qin Kaihuai;. Representing Quadric Surfaces Using NURBS Surfaces[J]. , 1997, 12(3): 210 -216 .
[10] Sun Jizhou; Richard L;. A Radiosity Solution for Curved Surface Environments[J]. , 1997, 12(5): 414 -424 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved