DOI: 10.1145/2635868.2635890
research-article

Automatic mining of specifications from invocation traces and method invariants

Published: 11 November 2014

Abstract

Software library documentation often describes individual methods' APIs, but not the intended protocols and method interactions. This can lead to library misuse, and restrict runtime detection of protocol violations and automated verification of software that uses the library. Specification mining, if accurate, can help mitigate these issues, which has led to significant research into new model-inference techniques that produce FSM-based models from program invariants and execution traces. However, there is currently a lack of empirical studies that, in a principled way, measure the impact of the inference strategies on model quality. To this end, we identify four such strategies and systematically study the quality of the models they produce for nine off-the-shelf libraries. We find that (1) using invariants to infer an initial model significantly improves model quality, increasing precision by 4% and recall by 41%, on average; (2) effective invariant filtering is crucial for quality and scalability of strategies that use invariants; and (3) using traces in combination with invariants greatly improves robustness to input noise. We present our empirical evaluation, implement new and extend existing model-inference techniques, and make public our implementations, ground-truth models, and experimental data. Our work can lead to higher-quality model inference, and directly improve the techniques and tools that rely on model inference.
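To make the terminology concrete, the following is a minimal sketch of trace-based model inference and of how precision and recall can be measured against a ground-truth model. It is not one of the four strategies studied in the paper and it uses no invariants: each state is simply named after the last invoked method, which is an assumed simplification. The `open`/`read`/`close` library, the function names, and the bounded-enumeration definition of precision and recall are all hypothetical choices made for illustration.

```python
from collections import defaultdict

START = "<start>"  # assumed name for the initial state


def infer_model(traces):
    """Infer a transition relation: last invoked method -> allowed next methods."""
    transitions = defaultdict(set)
    for trace in traces:
        prev = START
        for method in trace:
            transitions[prev].add(method)
            prev = method
    return dict(transitions)


def accepts(model, trace):
    """Return True if every step of the trace follows a transition in the model."""
    prev = START
    for method in trace:
        if method not in model.get(prev, ()):
            return False
        prev = method
    return True


def enumerate_traces(model, max_len):
    """List all method sequences of length 1..max_len that the model allows."""
    result, frontier = [], [()]
    for _ in range(max_len):
        next_frontier = []
        for trace in frontier:
            prev = trace[-1] if trace else START
            for method in sorted(model.get(prev, ())):
                next_frontier.append(trace + (method,))
        result.extend(next_frontier)
        frontier = next_frontier
    return result


def precision_recall(inferred, truth, max_len):
    """Precision: share of inferred-model traces the ground truth accepts.
    Recall: share of ground-truth traces the inferred model accepts."""
    inferred_traces = enumerate_traces(inferred, max_len)
    truth_traces = enumerate_traces(truth, max_len)
    if not inferred_traces or not truth_traces:
        return 0.0, 0.0
    precision = sum(accepts(truth, t) for t in inferred_traces) / len(inferred_traces)
    recall = sum(accepts(inferred, t) for t in truth_traces) / len(truth_traces)
    return precision, recall


# Hypothetical observed invocation traces of a file-like library.
observed = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
]
model = infer_model(observed)

# Hypothetical ground truth: open, then zero or more reads, then close.
ground_truth = {
    START: {"open"},
    "open": {"read", "close"},
    "read": {"read", "close"},
}

precision, recall = precision_recall(model, ground_truth, max_len=3)
print(precision, recall)  # prints: 1.0 0.8
```

In this toy setting the inferred model never allows anything the ground truth forbids, so precision is 1.0, but it has never observed `open` followed directly by `close`, so recall is 0.8. A model like this can also flag protocol violations at run time: `accepts(model, ["read"])` is false because no `open` precedes the read.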

Published In

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
November 2014
856 pages
ISBN:9781450330565
DOI:10.1145/2635868

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Model inference
  2. execution traces
  3. log analysis

Conference

SIGSOFT/FSE'14

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 19
  • Downloads (last 6 weeks): 4

Reflects downloads up to 25 Dec 2024.

Cited By

  • (2024) Specification Mining Based on the Ordering Points to Identify the Clustering Structure Clustering Algorithm and Model Checking. Algorithms 17:1 (28). DOI: 10.3390/a17010028. Online publication date: 10-Jan-2024
  • (2024) Unearthing Semantic Checks for Cloud Infrastructure-as-Code Programs. Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles (574-589). DOI: 10.1145/3694715.3695974. Online publication date: 4-Nov-2024
  • (2024) Detecting and Explaining Anomalies Caused by Web Tamper Attacks via Building Consistency-based Normality. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (531-543). DOI: 10.1145/3691620.3695024. Online publication date: 27-Oct-2024
  • (2024) Rigorous Assessment of Model Inference Accuracy using Language Cardinality. ACM Transactions on Software Engineering and Methodology 33:4 (1-39). DOI: 10.1145/3640332. Online publication date: 16-Jan-2024
  • (2024) ROSInfer: Statically Inferring Behavioral Component Models for ROS-based Robotics Systems. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639206. Online publication date: 20-May-2024
  • (2024) Raisin: Identifying Rare Sensitive Functions for Bug Detection. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3639165. Online publication date: 20-May-2024
  • (2024) APP-Miner: Detecting API Misuses via Automatically Mining API Path Patterns. 2024 IEEE Symposium on Security and Privacy (SP) (4034-4052). DOI: 10.1109/SP54263.2024.00043. Online publication date: 19-May-2024
  • (2023) Blindspots in Python and Java APIs Result in Vulnerable Code. ACM Transactions on Software Engineering and Methodology 32:3 (1-31). DOI: 10.1145/3571850. Online publication date: 26-Apr-2023
  • (2023) PURLTL: Mining LTL Specification from Imperfect Traces in Testing. Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (1766-1770). DOI: 10.1109/ASE56229.2023.00202. Online publication date: 11-Nov-2023
  • (2023) An interview study about the use of logs in embedded software engineering. Empirical Software Engineering 28:2. DOI: 10.1007/s10664-022-10258-8. Online publication date: 11-Feb-2023
