Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (2): 397-418.doi: 10.1007/s11390-020-9754-4

Special Issue: Software Systems

• Regular Paper • Previous Articles     Next Articles

SymPas: Symbolic Program Slicing

Ying-Zhou Zhang, Senior Member, CCF, Member, ACM, IEEE        

  1. College of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  • Received:2019-06-05 Revised:2020-01-22 Online:2021-03-05 Published:2021-04-01
  • About author:Ying-Zhou Zhang received his Ph.D. degree in computer science and engineering from University of Southeast, Nanjing, in 2005, and his M.S. degree in mathematics from University of Hohai, Nanjing, in 2002. He is currently working as a professor in the School of Computer Science and Technology, Nanjing University of Posts and Communications, Nanjing. He worked with the Hong Kong Polytechnic University, Hong Kong, in 2011, and as a visiting scholar with the University of Cambridge between 2012 and 2013. He has published more than 80 research papers. His research interests include formal methods, software analysis and program slicing, software reliability and security, service computing and functional programming. He is a senior member of CCF, and a member of ACM and IEEE.
  • Supported by:
    The work was supported by the National Natural Science Foundation of China under Grant Nos. 60973046 and 61300054, the Qing Lan Project of Jiangsu Province of China, and the 1311 Talent Program Funding of Nanjing University of Posts and Telecommunications.

Program slicing is a technique for simplifying programs by focusing on selected aspects of their behavior. Current mainstream static slicing methods operate on dependence graph PDG (program dependence graph) or SDG (system dependence graph), but these friendly graph representations may be a bit expensive for some users. In this paper we attempt to study a light-weight approach of static program slicing, called Symbolic Program Slicing (SymPas), which works as a dataflow analysis on LLVM (low-level virtual machine). In our SymPas approach, slices are stored in symbolic forms, not in procedures being re-analyzed (cf. procedure summaries). Instead of re-analyzing a procedure multiple times to find its slices for each callling context, we calculate a single symbolic slice which can be instantiated at call sites avoiding re-analysis; SymPas is implemented with LLVM to perform slicing on LLVM intermediate representation (IR). For comparison, we systematically adapt IFDS (interprocedural finite distributive subset) analysis and the SDG-based slicing method (SDGIFDS) to statically slice IR programs. Evaluated on open-source and benchmark programs, our backward SymPas shows a factor-of-6 reduction in time cost and a factor-of-4 reduction in space cost, compared with backward SDG-IFDS, thus being more efficient. In addition, the result shows that after studying slices from 66 programs, ranging up to 336 800 IR instructions in size, SymPas is highly size-scalable.

Key words: data flow analysis; instruction dependency table; low-level virtual machine (LLVM); procedure symbolic slice; program slicing;

[1] Weiser M. Program slicing. IEEE Transaction on Software Engineering, 1984, 16(5):498-509. DOI:10.1109/TSE.1984.5010248.
[2] Tip F. A survey of program slicing techniques. Journal of Programming Languages, 1995, 3(3):121-189.
[3] Gallagher K B, Lyle J R. Using program slicing in software maintenance. IEEE Transactions on Software Engineering, 1991, 17(8):751-761. DOI:10.1109/32.83912.
[4] Harman M, Hierons R M. An overview of program slicing. Software Focus, 2001, 2(3):85-92. DOI:10.1002/swf.41.
[5] Binkley D, Gallagher K B. Program slicing. Advances in Computers, 1996, 43:1-50. DOI:10.1016/S0065-2458(08)60641-5.
[6] Josep S. A vocabulary of program slicing-based techniques. ACM Computing Surveys, 2012, 44(3):Article No. 12. DOI:10.1145/2187671.2187674.
[7] Binkley D, Harman M. A survey of empirical results on program slicing. Advances in Computers, 2004, 62:105-178. DOI:10.1016/S0065-2458(03)62003-6.
[8] Binkley D, Danicic S, Gyimothy T, Harman M, Kiss A, Ouarbya L. Formalizing executable dynamic and forward slicing. In Proc. the 4th IEEE International Workshop on Source Code Analysis and Manipulation, September 2004, pp.43-52. DOI:10.1109/SCAM.2004.13.
[9] Horwiz S, Reps T, Binkley D. Interprocedural slicing using dependence graphs. ACM Transactions on Programming Languages and Systems, 1990, 12(1):26-60. DOI:10.1145/77606.77608.
[10] Reps T, Horwitz S, Sagiv M. Precise interprocedural dataflow analysis via graph reachability. In Proc. the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1995, pp.49-61. DOI:10.1145/199448.199462.
[11] Ottenstein K J, Ottenstein L M. The program dependence graph in a software development environment. ACM SIGPLAN Notices, 1984, 19(5):177-184. DOI:10.1145/390011.808263.
[12] Reps T, Sagiv M, Horwitz S. Interprocedural dataflow analysis via graph reachability. Technical Report, University of Copenhagen, 1994. https://research.cs.wisc.edu/wpis/papers/diku-tr94-14.pdf, Nov. 2020.
[13] Naeem A, Lhotak O, Rodriguez J. Practical extensions to the IFDS algorithm. In Proc. the 19th International Conference on Compiler Construction, March 2010, pp.124-144. DOI:10.1007/978-3-642-11970-58.
[14] Cytron R, Ferrante J, Rosen B K, Wegman M N, Zadeck F K. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Prog. Langs. and Systs., 1991, 13(4):451-490. DOI:10.1145/115372.115320.
[15] Zhao J, Zdancewic S, Nagarakatte S, Martin M. Formalizing the LLVM intermediate representation for verified program transformation. In Proc. the 39th Annual ACM SIGPLANSIGACT Symp. Principles of Programming Languages, January 2012, pp.427-440. DOI:10.1145/2103656.2103709.
[16] Canfora G, Cimitile A, De Lucia A. Conditioned program slicing. Information and Software Technology, 1998, 40(11/12):595-607. DOI:10.1016/S0950-5849(98)00086-X.
[17] Harman M, Binkley D, Danicic S. Amorphous program slicing. Journal of Systems and Software, 2003, 68(1):45-64. DOI:10.1016/S0164-1212(02)00135-8.
[18] Morrison D R. PATRICIA-Practical algorithm to retrieve information coded in alphanumeric. Journal of the ACM, 1968, 15(4):514-534. DOI:10.1145/321479.321481.
[19] Okasaki C, Gill A. Fast mergeable integer maps. In Proc. the 1998 ACM SIGPLAN Workshop on Machine Learning, September 1998, pp.77-86.
[20] Adams S. Efficient sets:A balancing act. Journal of Functional Programming, 1993, 3(4):553-562. DOI:10.1017/S0956796800000885.
[21] Sharir M, Pnueli A. Two approaches to interprocedural data flow analysis. In Program Flow Analysis:Theory and Applications, Muchnick S S, Jones N D (eds.), Prentice-Hall Inc., 1981, pp.189-233.
[22] Gustafsson J, Betts A, Ermedahl A, Lisper B. The mälardalen WCET benchmarks-Past, present and future. In Proc. the 10th Workshop Worst-Case Execution Time Analysis, July 2010, pp.137-147. DOI:10.4230/OASIcs.WCET.2010.136.
[23] Binkley D, Gold N, Harman M. An empirical study of static program slice size. ACM Transactions on Software Engineering and Methodology, 2007, 16(2):Article No. 8. DOI:10.1145/1217295.1217297.
[24] Hwang J C, Du M W, Chou C R. Finding program slices for recursive procedures. In Proc. the 12th Annual International Computer Software & Applications Conference, Oct. 1988, pp.220-227. DOI:10.1109/CMPSAC.1988.17176.
[25] Reps T. On the sequential nature of interprocedural program-analysis problems. Acta Informatica, 1996, 33(8):739-757. DOI:10.1007/s002360050068.
[26] Lakhotia A. Improved interprocedural slicing algorithm. Technical Report, University of Southwestern Louisian, 1992. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.28.3466&rep=rep1&type=pdf, Nov. 2020.
[27] Livadas P, Johnson T. An optimal algorithm for the construction of the system dependence graph. Information Sciences, 2000, 125(1/2/3/4):99-131. DOI:10.1016/S0020-0255(99)00030-4.
[28] Alomari H, Collard M, Maletic J, Alhindawi N, Meqdadi O. srcSlice:Very efficient and scalable forward static slicing. Journal of Software:Evolution and Process, 2014, 26(11):931-961. DOI:10.1002/smr.1651.
[29] Mastroeni I, Zanardini D. Abstract program slicing:An abstract interpretation-based approach to program slicing. ACM Trans. Comput. Logic, 2017, 18(1):Article No. 7. DOI:10.1145/3029052.
[30] Cousot P, Cousot R. Abstract interpretation:A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proc. the 4th ACM Symp. Principles of Programming Languages, January 1977, pp.238-252. DOI:10.1145/512950.512973.
[31] Zhang Y. A novel formal approach to program slicing. Science in China, Ser. F:Info. Sci., 2007, 50(5):657-670. DOI:10.1007/s11432-007-0061-2.
[32] Zhang Y. A precise monadic dynamic slicing method. Knowledge-Based Systems, 2017, 115:40-54. DOI:10.1016/j.knosys.2016.10.013.
[33] Liang S. Modular monadic semantics and compilation[Ph.D. Thesis]. University of Yale, 1998.
[34] Wadler P. Comprehending monads. In Proc. the 1990 ACM Conf. LISP and Functional Programming, May 1990, pp.61-78. DOI:10.1145/91556.91592.
[35] Liang S, Hudak P, Jones M. Monad transformers and modular interpreters. In Proc. the 22nd ACM SIG-PLANSIGACT Sym. Principles of Programming Languages, January 1995, pp.333-343. DOI:10.1145/199448.199528.
[36] Binkley D, Gold N, Harman M, Islam S, Krinke J, Yoo S. ORBS:Language-independent program slicing. In Proc. the 22nd Int. Symp. Foundations of Software Engineering, November 2014, pp.109-120. DOI:10.1145/2635868.2635893.
[37] Binkley D, Gold N, Islam S, Krinke J, Yoo S. A comparison of tree-and line-oriented observational slicing. Empir. Software Eng., 2019, 24(1):3077-3113. DOI:10.1007/s10664-018-9675-9.
[38] Beszédes A, Faragó C, Szabó Z, Csirik J, Gyimóthy T. Union slices for program maintenance. In Proc. the 2002 IEEE Int. Conference on Software Maintenance, Oct. 2002, pp.12-21. DOI:10.1109/ICSM.2002.1167743.
[39] Lisper B, Masud A, Khanfar H. Static backward demanddriven slicing. In Proc. the 2015 Workshop on Partial Evaluation and Program Manipulation, January 2015, pp.115-126. DOI:10.1145/2678015.2682538.
[40] Srinivasan V, Reps T. An improved algorithm for slicing machine code. In Proc. the 2016 ACM SIGPLAN Int. Conf. Object-Oriented Programming, Systems, Languages, and Applications, October 2016, pp.378-393. DOI:10.1145/2983990.2984003.
[41] Meyers T, Binkley D. An empirical study of slice-based cohesion and coupling metrics. ACM Transactions on Software Methodology, 2007, 17(1):Article No. 2. DOI:10.1145/1314493.1314495.
[42] Zhang X, Gupta R, Zhang Y. Precise dynamic slicing algorithms. In Proc. the 25th Int. Conf. Software Engineering, May 2003, pp.319-329.
[1] Bo Guo, Young-Woo Kwon, Myoungkyu Song. Decomposing Composite Changes for Code Review and Regression Test Selection in Evolving Software [J]. Journal of Computer Science and Technology, 2019, 34(2): 416-436.
[2] Bi-Xin Li, Xiao-Cong Fan, Jun Pang, and Jian-Jun Zhao. Model for Slicing JAVA Programs Hierarchically [J]. , 2004, 19(6): 0-0.
[3] Tun Li, Yang Guo, and Si-Kun Li. Automatic Circuit Extractor for HDL Description Using Program Slicing [J]. , 2004, 19(5): 0-0.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[3] Li Wanxue;. Almost Optimal Dynamic 2-3 Trees[J]. , 1986, 1(2): 60 -71 .
[4] Wang Xuan; Lü Zhimin; Tang Yuhai; Xiang Yang;. A High Resolution Chinese Character Generator[J]. , 1986, 1(2): 1 -14 .
[5] C.Y.Chung; H.R.Hwa;. A Chinese Information Processing System[J]. , 1986, 1(2): 15 -24 .
[6] Zhang Cui; Zhao Qinping; Xu Jiafu;. Kernel Language KLND[J]. , 1986, 1(3): 65 -79 .
[7] Wang Jianchao; Wei Daozheng;. An Effective Test Generation Algorithm for Combinational Circuits[J]. , 1986, 1(4): 1 -16 .
[8] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[9] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[10] Zheng Guoliang; Li Hui;. The Design and Implementation of the Syntax-Directed Editor Generator(SEG)[J]. , 1986, 1(4): 39 -48 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved