DOI: 10.1145/1480881.1480911
research-article

Semi-sparse flow-sensitive pointer analysis

Published: 21 January 2009

Abstract

Pointer analysis is a prerequisite for many program analyses, and the effectiveness of these analyses depends on the precision of the pointer information they receive. Two major axes of pointer analysis precision are flow-sensitivity and context-sensitivity, and while there has been significant recent progress regarding scalable context-sensitive pointer analysis, relatively little progress has been made in improving the scalability of flow-sensitive pointer analysis.
This paper presents a new interprocedural, flow-sensitive pointer analysis algorithm that combines two ideas, semi-sparse analysis and a novel use of BDDs, that arise from a careful understanding of the unique challenges that face flow-sensitive pointer analysis. We evaluate our algorithm on 12 C benchmarks ranging from 11K to 474K lines of code. Our fastest algorithm is on average 197x faster and uses 4.6x less memory than the state of the art, and it can analyze programs that are an order of magnitude larger than the previous state of the art.
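
To make the flow-sensitivity axis mentioned in the abstract concrete, the following is a minimal illustrative sketch in Python. It is not the paper's algorithm: the tiny statement format, the `PROGRAM` example, and both function names are assumptions made up for this illustration. It contrasts a flow-insensitive analysis (one points-to set per variable for the whole program) with a flow-sensitive one (points-to sets follow statement order, so a later assignment overwrites an earlier one) on a straight-line program.

```python
from collections import defaultdict

# Hypothetical straight-line program; each tuple is (operation, dst, src).
PROGRAM = [
    ("addr", "p", "a"),  # p = &a
    ("copy", "q", "p"),  # q = p
    ("addr", "p", "b"),  # p = &b
    ("copy", "r", "p"),  # r = p
]


def flow_insensitive(program):
    """Ignore statement order: iterate all statements to a fixed point."""
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for op, dst, src in program:
            new = {src} if op == "addr" else set(pts[src])
            if not new <= pts[dst]:
                pts[dst] |= new  # sets only ever grow
                changed = True
    return {var: set(targets) for var, targets in pts.items()}


def flow_sensitive(program):
    """Respect statement order: each assignment overwrites dst."""
    pts = {}
    for op, dst, src in program:
        new = {src} if op == "addr" else set(pts.get(src, set()))
        pts[dst] = new  # overwrite instead of accumulate
    return pts


if __name__ == "__main__":
    print("flow-insensitive:", flow_insensitive(PROGRAM))
    print("flow-sensitive:  ", flow_sensitive(PROGRAM))
```

In this sketch the flow-insensitive result reports that `q` may point to `b`, although `q` is assigned before `p = &b` executes; the flow-sensitive result avoids that imprecision at the cost of tracking information per program point.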

Reviews

Charles Robert Morgan

This paper describes techniques for dramatically improving the performance of flow-sensitive, context-insensitive pointer analysis. It is a combination of improved engineering, careful data structure decisions, and new algorithm optimizations. The algorithm is used to analyze 12 large C programs. Two of the programs (Ghostscript and GDB) require more resources than are available, but the other ten programs, including a version of GCC, are analyzed, showing improvements of two orders of magnitude over previous algorithms.

The engineering involves the insight that variables can be divided into three categories: variables that have nothing to do with pointers, variables whose address is never taken, and variables whose address is taken. The first set of variables can be ignored for pointer analysis. The second set of variables is efficiently analyzed using static single assignment form. The third set of variables uses def-use information that is not in static single assignment form.

The algorithm is organized to decrease memory and computational requirements. The analysis is a worklist algorithm, where the list is organized so that predecessor nodes are analyzed before successors, thus increasing the quality of the points-to information available when analyzing each node. Information is pruned from points-to information when it is no longer relevant, such as when returning from a procedure.

The paper analyzes two representations for points-to information: bit vectors and binary decision diagrams (BDDs). The paper concludes that BDDs are more efficient in time and space; however, the paper must address the issue that previous uses of BDDs could not handle strong updates, that is, situations where a store operation removes all previous information about stores using that variable as a pointer. The paper develops two techniques for determining when two variables have the same points-to information, allowing shared data structures.

This is a paper worth studying. It seems to be part of a PhD thesis. This algorithm provides significant improvements in the computation of points-to information. It will probably be even more effective for a strongly typed language such as Java or C#. The techniques are not yet strong enough to handle all systems programs, but Hardekopf and Lin hint at further progress that may lead to their analysis.

Online Computing Reviews Service
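
The worklist ordering described in the review (predecessor nodes analyzed before successors) can be sketched roughly as follows. This is an illustrative Python sketch only, not the paper's implementation: the control-flow graph encoding, the statement format, and the names `reverse_postorder` and `solve` are assumptions for this example. Nodes are ranked by reverse postorder and pulled from a priority queue, and each assignment overwrites the points-to set of its destination variable.

```python
import heapq


def reverse_postorder(cfg, entry):
    """Rank nodes so that, loops aside, predecessors rank before successors."""
    seen, order = set(), []

    def dfs(node):
        seen.add(node)
        for succ in cfg.get(node, []):
            if succ not in seen:
                dfs(succ)
        order.append(node)

    dfs(entry)
    order.reverse()
    return {node: rank for rank, node in enumerate(order)}


def solve(cfg, stmts, entry):
    """Priority worklist: always process the lowest-ranked pending node."""
    rank = reverse_postorder(cfg, entry)
    preds = {node: [] for node in rank}
    for node in rank:
        for succ in cfg.get(node, []):
            preds[succ].append(node)

    out = {node: {} for node in rank}  # node -> {variable: points-to set}
    heap, queued = [(rank[entry], entry)], {entry}
    done, visits = set(), []

    while heap:
        _, node = heapq.heappop(heap)
        queued.discard(node)
        visits.append(node)

        # Join: pointwise union of the predecessors' outgoing states.
        state = {}
        for pred in preds[node]:
            for var, targets in out[pred].items():
                state.setdefault(var, set()).update(targets)

        # Transfer: an assignment overwrites the destination's set.
        stmt = stmts.get(node)
        if stmt is not None:
            op, dst, src = stmt
            state[dst] = {src} if op == "addr" else set(state.get(src, set()))

        first_visit = node not in done
        done.add(node)
        if first_visit or state != out[node]:
            out[node] = state
            for succ in cfg.get(node, []):
                if succ not in queued:
                    queued.add(succ)
                    heapq.heappush(heap, (rank[succ], succ))
    return out, visits


# Hypothetical diamond-shaped CFG: node 0 branches to 1 and 2, which meet at 3.
CFG = {0: [1, 2], 1: [3], 2: [3], 3: []}
STMTS = {
    0: ("addr", "p", "a"),  # p = &a
    1: ("addr", "p", "b"),  # p = &b
    2: ("addr", "q", "c"),  # q = &c
    3: ("copy", "r", "p"),  # r = p
}

if __name__ == "__main__":
    result, visit_order = solve(CFG, STMTS, 0)
    print(result[3], visit_order)
```

Because node 3 is ranked after both branches, it is processed once, with the information from both predecessors already available; with an unordered worklist it could be processed early and then revisited.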

Published In

POPL '09: Proceedings of the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2009
464 pages
ISBN: 9781605583792
DOI: 10.1145/1480881
  • ACM SIGPLAN Notices, Volume 44, Issue 1
    POPL '09
    January 2009
    453 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/1594834
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. alias analysis
  2. pointer analysis

Qualifiers

  • Research-article

Conference

POPL09

Acceptance Rates

Overall Acceptance Rate 824 of 4,130 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 43
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 17 Oct 2024

Cited By

  • (2024) MEA2: A Lightweight Field-Sensitive Escape Analysis with Points-to Calculation for Golang. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (1362-1389). DOI: 10.1145/3689759. Online publication date: 8-Oct-2024
  • (2024) Don’t Write, but Return: Replacing Output Parameters with Algebraic Data Types in C-to-Rust Translation. Proceedings of the ACM on Programming Languages 8:PLDI (716-740). DOI: 10.1145/3656406. Online publication date: 20-Jun-2024
  • (2024) Automatically Inspecting Thousands of Static Bug Warnings with Large Language Model: How Far Are We? ACM Transactions on Knowledge Discovery from Data 18:7 (1-34). DOI: 10.1145/3653718. Online publication date: 26-Mar-2024
  • (2024) Evaluating the Effectiveness of Deep Learning Models for Foundational Program Analysis Tasks. Proceedings of the ACM on Programming Languages 8:OOPSLA1 (500-528). DOI: 10.1145/3649829. Online publication date: 29-Apr-2024
  • (2023) Rapid: Region-Based Pointer Disambiguation. Proceedings of the ACM on Programming Languages 7:OOPSLA2 (1729-1757). DOI: 10.1145/3622859. Online publication date: 16-Oct-2023
  • (2023) A Cocktail Approach to Practical Call Graph Construction. Proceedings of the ACM on Programming Languages 7:OOPSLA2 (1001-1033). DOI: 10.1145/3622833. Online publication date: 16-Oct-2023
  • (2023) BigDataflow: A Distributed Interprocedural Dataflow Analysis Framework. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1431-1443). DOI: 10.1145/3611643.3616348. Online publication date: 30-Nov-2023
  • (2023) A Preamble to Feminist Ecologies in HCI. Interactions 30:4 (20-23). DOI: 10.1145/3604914. Online publication date: 28-Jun-2023
  • (2023) Hidden in Plain Sight: Discreet User Interfaces. Interactions 30:4 (6-8). DOI: 10.1145/3604235. Online publication date: 28-Jun-2023
  • (2023) Access Work: Laboring with Non-Innocent Authorization. Interactions 30:4 (60-64). DOI: 10.1145/3603494. Online publication date: 28-Jun-2023
