Typestate-based semantic code search over partial programs

Published: 19 October 2012

Abstract

We present a novel code search approach for answering queries focused on API-usage with code showing how the API should be used. To construct a search index, we develop new techniques for statically mining and consolidating temporal API specifications from code snippets. In contrast to existing semantic-based techniques, our approach handles partial programs in the form of code snippets. Handling snippets allows us to consume code from various sources such as parts of open source projects, educational resources (e.g. tutorials), and expert code sites. To handle code snippets, our approach (i) extracts a possibly partial temporal specification from each snippet using a relatively precise static analysis tracking a generalized notion of typestate, and (ii) consolidates the partial temporal specifications, combining consistent partial information to yield consolidated temporal specifications, each of which captures a full(er) usage scenario.
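As a rough illustration of the consolidation idea in step (ii), the sketch below joins two partial call sequences when the tail of one overlaps the head of the other. It is a deliberately simplified assumption, not PRIME's algorithm: a partial temporal specification is reduced to a plain list of method names, and the names `consolidate`, `snippet_a`, `snippet_b`, and the call names are hypothetical.

```python
from typing import List, Optional

# Simplifying assumption: a partial temporal specification is a single
# ordered list of method names observed on one object.
Spec = List[str]


def consolidate(first: Spec, second: Spec) -> Optional[Spec]:
    """Join two partial call sequences where the tail of `first`
    overlaps the head of `second`.

    Returns None when there is no overlap, i.e. no consistent
    partial information to combine.
    """
    # Try the longest overlap first so shared calls are not duplicated.
    for size in range(min(len(first), len(second)), 0, -1):
        if first[-size:] == second[:size]:
            return first + second[size:]
    return None


# Hypothetical snippets: each shows only part of a usage scenario.
snippet_a = ["new", "connect", "login"]
snippet_b = ["login", "download", "disconnect"]

merged = consolidate(snippet_a, snippet_b)
print(merged)  # ['new', 'connect', 'login', 'download', 'disconnect']
```

The two snippets agree on the `login` call, so the merged sequence captures a fuller scenario than either snippet alone.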
To answer a search query, we define a notion of relaxed inclusion matching a query against temporal specifications and their corresponding code snippets.
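One possible reading of relaxed inclusion, sketched under a strong simplifying assumption: if specifications and queries are both plain call sequences, a query matches a specification when its calls occur in the same order, with other calls allowed in between. The abstract does not give the formal definition, so the function `relaxed_match`, the `index` dictionary, and the call names are all hypothetical.

```python
from typing import Iterable, List


def relaxed_match(query: List[str], spec: Iterable[str]) -> bool:
    """True if every call in `query` appears in `spec` in the same
    order; unrelated calls may occur in between."""
    remaining = iter(spec)
    # `call in remaining` consumes the iterator up to the first hit,
    # so later query calls can only match later positions in the spec.
    return all(call in remaining for call in query)


# Hypothetical search index: snippet name -> (simplified) specification.
index = {
    "snippet-1": ["new", "connect", "login", "download", "disconnect"],
    "snippet-2": ["new", "login"],
}

query = ["connect", "download"]
hits = [name for name, spec in index.items() if relaxed_match(query, spec)]
print(hits)  # ['snippet-1']
```

The match is order-sensitive: a query listing `download` before `connect` would not match `snippet-1`.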
We have implemented our approach in a tool called PRIME and applied it to search for API usage of several challenging APIs. PRIME was able to analyze and consolidate thousands of snippets per tested API, and our results indicate that the combination of a relatively precise analysis and consolidation allowed PRIME to answer challenging queries effectively.

Information

Published In

ACM SIGPLAN Notices, Volume 47, Issue 10
OOPSLA '12
October 2012
1011 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2398857

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
October 2012
1052 pages
ISBN: 9781450315616
DOI: 10.1145/2384616

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. code search engine
  2. ranking code samples
  3. specification mining
  4. static analysis
  5. typestate

Qualifiers

  • Research-article
