DOI: 10.1145/2786805.2786851

A user-guided approach to program analysis

Published: 30 August 2015

Abstract

Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system EUGENE that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the rules using an off-the-shelf solver in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to real-world analyses and programs. We evaluate EUGENE on two different analyses with labeled output on a suite of seven Java programs of size 131–198 KLOC. We also report upon a user study involving nine users who employ EUGENE to guide an information-flow analysis on three Java micro-benchmarks. In our experiments, EUGENE significantly reduces misclassified reports upon providing limited amounts of feedback.
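
The hard-rule and soft-rule formulation in the abstract can be illustrated with a toy weighted-constraint problem. The sketch below is not EUGENE's implementation: the variable names (`flow`, `r1`, `r2`), the weights, and the brute-force search standing in for an off-the-shelf solver are all assumptions made for illustration. It shows how a hard rule is always satisfied, how soft rules are satisfied as far as their total weight allows, and how user feedback can be added as one more soft rule.

```python
from itertools import product

def solve(variables, hard, soft):
    """Return the assignment that satisfies every hard rule and
    maximizes the total weight of satisfied soft rules (brute force)."""
    best, best_score = None, -1
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # Hard rules are non-negotiable: skip any assignment violating one.
        if not all(rule(assignment) for rule in hard):
            continue
        score = sum(weight for weight, rule in soft if rule(assignment))
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

def implies(p, q):
    """Build the rule 'if p holds then q holds'."""
    return lambda a: (not a[p]) or a[q]

# Hypothetical facts: 'flow' is an approximate analysis fact,
# 'r1' and 'r2' are reports that follow from it.
variables = ["flow", "r1", "r2"]
hard = [implies("flow", "r1"), implies("flow", "r2")]
soft = [
    (5, lambda a: a["flow"]),     # the analysis prefers to keep its approximation
    (1, lambda a: not a["r1"]),   # mild preference for fewer reports
    (1, lambda a: not a["r2"]),
]

before, _ = solve(variables, hard, soft)

# Assumed user feedback: report r1 is a false alarm, encoded as a heavy soft rule.
feedback = [(10, lambda a: not a["r1"])]
after, _ = solve(variables, hard, soft + feedback)

print("before feedback:", before)
print("after feedback: ", after)
```

In this toy instance, rejecting `r1` makes it cheaper to give up the shared fact `flow`, so `r2` disappears as well. Exhaustive enumeration is only viable for a handful of variables; it is used here purely to keep the example self-contained.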

Published In

ESEC/FSE 2015: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering
August 2015
1068 pages
ISBN: 9781450336758
DOI: 10.1145/2786805
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. User feedback
  2. program analysis
  3. report classification

Qualifiers

  • Research-article

Conference

ESEC/FSE'15
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 41
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) Learning Abstraction Selection for Bayesian Program Analysis. Proceedings of the ACM on Programming Languages 8:OOPSLA1 (954-982). DOI: 10.1145/3649845. Online publication date: 29-Apr-2024
  • (2024) On the Effectiveness of Machine Learning-based Call Graph Pruning: An Empirical Study. Proceedings of the 21st International Conference on Mining Software Repositories (457-468). DOI: 10.1145/3643991.3644897. Online publication date: 15-Apr-2024
  • (2023) The Call Graph Chronicles: Unleashing the Power Within. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2210-2212). DOI: 10.1145/3611643.3617854. Online publication date: 30-Nov-2023
  • (2023) Mitigating False Positive Static Analysis Warnings: Progress, Challenges, and Opportunities. IEEE Transactions on Software Engineering 49:12 (5154-5188). DOI: 10.1109/TSE.2023.3329667. Online publication date: Dec-2023
  • (2023) APICAD: Augmenting API Misuse Detection through Specifications from Code and Documents. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE) (245-256). DOI: 10.1109/ICSE48619.2023.00032. Online publication date: May-2023
  • (2023) Tabby: Automated Gadget Chain Detection for Java Deserialization Vulnerabilities. 2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (179-192). DOI: 10.1109/DSN58367.2023.00028. Online publication date: Jun-2023
  • (2023) VALAR: Streamlining Alarm Ranking in Static Analysis with Value-Flow Assisted Active Learning. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (1940-1951). DOI: 10.1109/ASE56229.2023.00098. Online publication date: 11-Sep-2023
  • (2023) WINE. Information and Software Technology 155:C. DOI: 10.1016/j.infsof.2022.107109. Online publication date: 1-Mar-2023
  • (2022) AutoPruner: transformer-based call graph pruning. Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (520-532). DOI: 10.1145/3540250.3549175. Online publication date: 7-Nov-2022
  • (2022) Striking a balance. Proceedings of the 44th International Conference on Software Engineering (2043-2055). DOI: 10.1145/3510003.3510166. Online publication date: 21-May-2022
