DOI: 10.1145/3453483.3454086

Path-sensitive sparse analysis without path conditions

Published: 18 June 2021

Abstract

Sparse program analysis is fast because it propagates data-flow facts along data dependence, skipping unnecessary control flows. However, path-sensitively checking millions of lines of code is still prohibitively expensive, because a huge number of path conditions have to be computed and solved by an SMT solver. This paper presents Fusion, a fused approach to inter-procedurally path-sensitive sparse analysis. In Fusion, the SMT solver does not work as a standalone tool on path conditions but works directly on the program, together with the sparse analysis. This fused design determines path feasibility without explicitly computing path conditions, which not only saves the cost of computing them but also provides an opportunity to enhance the SMT solving algorithm. To the best of our knowledge, Fusion is the first approach to enable whole-program bug detection on millions of lines of code on a common personal computer with the precision of inter-procedural path-sensitivity. Compared to two state-of-the-art tools, Fusion is 10× faster while consuming only 10% of the memory on average. Fusion has detected over a hundred bugs in mature open-source software, some of which have been assigned CVE identifiers because of their security impact.
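
For context, the conventional pipeline that the abstract contrasts Fusion with can be sketched as follows. This is only an illustration, not Fusion's algorithm: each data-dependence edge on a value-flow path carries the branch condition guarding it, the path condition is the conjunction of those guards, and a solver decides whether it is satisfiable. The guards `g1` to `g4`, the helper `feasible`, and the brute-force search over a small integer range (a stand-in for a real SMT query) are all assumptions made for this sketch.

```python
from itertools import product

# Hypothetical fragment being modeled (not taken from the paper):
#   if (x > 0)  p = NULL;   -- guard g1
#   if (y == x) q = p;      -- guard g2
#   if (x < 0)  use(*q);    -- guard g3 (contradicts g1)
#   if (y > 0)  use(*q);    -- guard g4 (compatible with g1 and g2)


def feasible(guards, variables, domain=range(-4, 5)):
    """Return True if some assignment in the bounded domain satisfies all guards.

    Exhaustive search over a tiny domain stands in for an SMT solver here;
    a real solver reasons over the full value range.
    """
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(guard(env) for guard in guards):
            return True
    return False


g1 = lambda env: env["x"] > 0
g2 = lambda env: env["y"] == env["x"]
g3 = lambda env: env["x"] < 0
g4 = lambda env: env["y"] > 0

# Explicit path conditions: one conjunction of guards per value-flow path.
print(feasible([g1, g2, g3], ["x", "y"]))  # False: x > 0 and x < 0 conflict
print(feasible([g1, g2, g4], ["x", "y"]))  # True: e.g. x = 1, y = 1
```

The cost the abstract refers to comes from building and solving one such conjunction for every candidate path across a large code base; Fusion is described as avoiding the explicit construction step.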

    Published In

    PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
    June 2021
    1341 pages
    ISBN:9781450383912
    DOI:10.1145/3453483
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SMT solving
    2. Sparse analysis
    3. path sensitivity
    4. program dependence graph

    Qualifiers

    • Research-article

    Conference

    PLDI '21

    Acceptance Rates

    Overall Acceptance Rate 406 of 2,067 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 133
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 01 Jan 2025

    Citations

    Cited By

    • (2024) SPATA: Effective OS Bug Detection with Summary-Based, Alias-Aware, and Path-Sensitive Typestate Analysis. ACM Transactions on Computer Systems 42:3-4 (1-40). DOI: 10.1145/3695250. Online publication date: 6-Sep-2024
    • (2024) Accelerating Static Null Pointer Dereference Detection with Parallel Computing. Proceedings of the 15th Asia-Pacific Symposium on Internetware (135-144). DOI: 10.1145/3671016.3671385. Online publication date: 24-Jul-2024
    • (2024) Fast Graph Simplification for Path-Sensitive Typestate Analysis through Tempo-Spatial Multi-Point Slicing. Proceedings of the ACM on Software Engineering 1:FSE (494-516). DOI: 10.1145/3643749. Online publication date: 12-Jul-2024
    • (2024) Octopus: Scaling Value-Flow Analysis via Parallel Collection of Realizable Path Conditions. ACM Transactions on Software Engineering and Methodology 33:3 (1-33). DOI: 10.1145/3632743. Online publication date: 24-Jan-2024
    • (2024) Fast and Precise Static Null Exception Analysis With Synergistic Preprocessing. IEEE Transactions on Software Engineering 50:11 (3022-3036). DOI: 10.1109/TSE.2024.3466551. Online publication date: Nov-2024
    • (2023) Place your locks well. Proceedings of the 32nd USENIX Conference on Security Symposium (3727-3744). DOI: 10.5555/3620237.3620446. Online publication date: 9-Aug-2023
    • (2023) Anchor: Fast and Precise Value-flow Analysis for Containers via Memory Orientation. ACM Transactions on Software Engineering and Methodology 32:3 (1-39). DOI: 10.1145/3565800. Online publication date: 26-Apr-2023
    • (2023) Verifying Data Constraint Equivalence in FinTech Systems. Proceedings of the 45th International Conference on Software Engineering (1329-1341). DOI: 10.1109/ICSE48619.2023.00117. Online publication date: 14-May-2023
    • (2023) VALAR: Streamlining Alarm Ranking in Static Analysis with Value-Flow Assisted Active Learning. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (1940-1951). DOI: 10.1109/ASE56229.2023.00098. Online publication date: 11-Sep-2023
    • (2023) ConfTainter: Static Taint Analysis For Configuration Options. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (1640-1651). DOI: 10.1109/ASE56229.2023.00067. Online publication date: 11-Sep-2023