DOI: 10.5555/2738600.2738629
research-article

Approximating flow-sensitive pointer analysis using frequent itemset mining

Published: 07 February 2015

Abstract

Pointer alias analysis is a well-researched problem in the area of compilers and program verification. Many recent works in this area have focused on flow-sensitivity because of the additional precision it offers. However, a flow-sensitive analysis is computationally expensive, which prevents its use on larger programs.
In this work, we observe that a number of object sets, each consisting of tens to hundreds of objects, frequently appear together in many points-to sets. By approximating each of these object sets with a single object, we can speed up the computation of points-to sets. Although the proposed approach incurs a slight loss in precision, it is shown to be safe. We use a well-known data mining technique called frequent itemset mining to find these frequently occurring object sets.
We compare our approximation to a fully flow-sensitive pointer analysis on a set of ten benchmarks. We measure precision loss using two common client analysis queries and report an average precision loss of 0.25% on one measure and 1.40% on the other. The proposed approach results in a speedup of up to 12.9x (and an average speedup of 6.2x) in computing the points-to sets.
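
The idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes points-to sets are plain Python sets of object names, uses a brute-force Apriori-style search in place of a tuned miner, and the names `frequent_itemsets`, `approximate`, `concretize`, and the summary object `"S"` are hypothetical. It treats each points-to set as a transaction, finds an object set that occurs together often, and replaces every member of that set with one summary object. Safety here is assumed to mean over-approximation: expanding the summary object back into its members always yields a superset of the original points-to set.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    result = dict(current)
    size = 2
    while current and size <= max_size:
        items = sorted({i for s in current for i in s})
        # Keep a candidate only if all of its (size-1)-subsets are frequent.
        candidates = [
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(sub) in current for sub in combinations(c, size - 1))
        ]
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        size += 1
    return result


def approximate(points_to, group, summary):
    """Replace every member of `group` with the single object `summary`."""
    return {
        p: frozenset(summary if o in group else o for o in objs)
        for p, objs in points_to.items()
    }


def concretize(objs, group, summary):
    """Expand the summary object back into the objects it stands for."""
    expanded = set()
    for o in objs:
        expanded |= group if o == summary else {o}
    return frozenset(expanded)


# Toy points-to sets (illustrative data, not taken from the paper).
points_to = {
    "p": {"a", "b", "c", "x"},
    "q": {"a", "b", "c"},
    "r": {"a", "b", "c", "y"},
    "s": {"b", "z"},
}

freq = frequent_itemsets(points_to.values(), min_support=3)
group = max(freq, key=len)  # largest frequent object set
approx = approximate(points_to, group, "S")
```

In this toy input the objects `a`, `b`, and `c` occur together in three points-to sets, so they are merged. The set for `s` shows the precision loss: it originally held only `b` and `z`, but after merging it is treated as possibly pointing to `a` and `c` as well.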



Published In

CGO '15: Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization
February 2015
280 pages
ISBN: 9781479981618

Publisher

IEEE Computer Society

United States

Qualifiers

  • Research-article

Conference

CGO '15
Acceptance Rates

CGO '15 Paper Acceptance Rate 24 of 88 submissions, 27%;
Overall Acceptance Rate 312 of 1,061 submissions, 29%
