DOI: 10.5555/2819009.2819107
research-article

Inferring behavioral specifications from large-scale repositories by leveraging collective intelligence

Published: 16 May 2015

Abstract

Despite their proven benefits, useful, comprehensible, and efficiently checkable specifications are not widely available. This is primarily because writing useful, non-trivial specifications from scratch is too hard and time-consuming, and requires expertise that is not broadly available. Furthermore, the lack of specifications for widely used libraries and frameworks, caused by the high cost of writing them, tends to have a snowball effect: core libraries lack specifications, which makes specifying applications that use them expensive. To contain the skyrocketing development and maintenance costs of high-assurance systems, this self-perpetuating cycle must be broken. The labor cost of specifying programs can be significantly decreased via advances in specification inference and synthesis; this has been attempted several times, but with limited success. We believe that practical specification inference and synthesis is an idea whose time has come. Fundamental breakthroughs in this area can be achieved by leveraging the collective intelligence available in software artifacts from millions of open source projects. Fine-grained access to such data sets was previously unavailable, but is now easily obtained. We identify research directions and report our preliminary results on advances in specification inference that can be had by using such data sets.



Published In

ICSE '15: Proceedings of the 37th International Conference on Software Engineering - Volume 2
May 2015
1058 pages

Publisher

IEEE Press

Qualifiers

  • Research-article

Conference

ICSE '15
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Cited By

  • (2018) Towards combining usage mining and implementation analysis to infer API preconditions. Proceedings of the 1st ACM SIGSOFT International Workshop on Automated Specification Inference, pp. 15-16. DOI: 10.1145/3278177.3278182. Online publication date: 9-Nov-2018
  • (2018) Using consensus to automatically infer post-conditions. Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, pp. 202-203. DOI: 10.1145/3183440.3195096. Online publication date: 27-May-2018
  • (2018) Collective program analysis. Proceedings of the 40th International Conference on Software Engineering, pp. 620-631. DOI: 10.1145/3180155.3180252. Online publication date: 27-May-2018
  • (2018) Leveraging project-specificity to find suitable specifications. Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pp. 1579-1580. DOI: 10.1145/3167132.3167455. Online publication date: 9-Apr-2018
