Article
DOI:10.1145/1287624.1287629

Recommending random walks

Published: 07 September 2007

Abstract

We improve on previous recommender systems by taking advantage of the layered structure of software. We use a random-walk approach that mimics the more focused behavior of a developer who browses the caller-callee links in the callgraph of a large program, seeking routines that are likely to be related to a function of interest. Inspired by Kleinberg's work [10], we approximate the steady state of an infinite random walk on a subset of a callgraph and rank the functions by their steady-state probabilities. Surprisingly, this purely structural approach works quite well. Like Robillard's "Suade" algorithm [15] and earlier data mining approaches [13], our approach relies solely on the always-available current state of the code, rather than on other sources such as comments, documentation, or revision information. Using the Apache API documentation as an oracle, we perform a quantitative evaluation of our method and find that our algorithm dramatically improves upon Suade in this setting. We also find that the performance of traditional data mining approaches is complementary to ours; this leads naturally to an evidence-based combination of the two, which shows excellent performance on this task.
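
To make the ranking idea concrete, the following is a minimal sketch of steady-state ranking on a callgraph. It is not the algorithm from the paper: it assumes a random walk with restart (a personalized-PageRank-style walk) that follows caller-callee links in either direction and occasionally jumps back to the function of interest. The callgraph, the function names, the `restart` probability, and the iteration count are all hypothetical choices made for illustration.

```python
def steady_state(callgraph, seed, restart=0.15, iterations=100):
    """Approximate steady-state visit probabilities by power iteration.

    Assumption (not from the paper): the walker follows caller-callee
    links in either direction and returns to `seed` with probability
    `restart` at every step.
    """
    # Build undirected neighbor sets from the caller -> callees mapping.
    neighbors = {}
    for caller, callees in callgraph.items():
        neighbors.setdefault(caller, set())
        for callee in callees:
            neighbors.setdefault(callee, set())
            neighbors[caller].add(callee)
            neighbors[callee].add(caller)
    if seed not in neighbors:
        raise KeyError(f"unknown function: {seed}")

    # Start with all probability mass on the function of interest.
    prob = {f: 0.0 for f in neighbors}
    prob[seed] = 1.0
    for _ in range(iterations):
        nxt = {f: 0.0 for f in neighbors}
        nxt[seed] += restart  # total mass is 1, so this is the restart share
        for f, p in prob.items():
            links = neighbors[f]
            if not links:
                # A function with no links sends its mass back to the seed.
                nxt[seed] += (1.0 - restart) * p
                continue
            share = (1.0 - restart) * p / len(links)
            for g in links:
                nxt[g] += share
        prob = nxt
    return prob


def recommend(callgraph, seed, **kwargs):
    """Rank all functions except `seed` by descending probability."""
    prob = steady_state(callgraph, seed, **kwargs)
    ranked = [(f, p) for f, p in prob.items() if f != seed]
    return sorted(ranked, key=lambda item: -item[1])


# Hypothetical callgraph: each key calls the functions in its list.
CALLGRAPH = {
    "parse_config": ["open_file", "read_file", "close_file"],
    "open_file": ["check_path", "log_error"],
    "read_file": ["check_path", "log_error"],
    "close_file": ["log_error"],
    "format_date": [],
}

for name, p in recommend(CALLGRAPH, "open_file"):
    print(f"{name:14s} {p:.4f}")
```

Functions that share many callers or callees with the seed accumulate more probability, while a function that is unreachable from the seed (here `format_date`) stays at zero. The restart step is one simple way to keep the walk focused near the function of interest; the paper's own construction, based on Kleinberg's work [10], may differ.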

References

[1]
G. Ammons, R. Bodík, and J. Larus. Mining specifications. Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 4--16, 2002.
[2]
Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B, 57:289--300, 1995.
[3]
W. Cohen. Inductive specification recovery: Understanding software by learning from example behaviors. Automated Software Engineering, 2(2):107--129, 1995.
[4]
T. Corbi. Program Understanding: Challenge for the 1990s. IBM Systems Journal, 28(2):294--306, 1989.
[5]
D. Cubranic, G. Murphy, J. Singer, and K. Booth. Hipikat: a project memory for software development. Software Engineering, IEEE Transactions on, 31(6):446--465, 2005.
[6]
D. Engler, D. Chen, S. Hallem, A. Chou, and B. Chelf. Bugs as deviant behavior: a general approach to inferring errors in systems code. Proceedings of the eighteenth ACM symposium on Operating systems principles, pages 57--72, 2001.
[7]
M. Hahsler, B. Grün, and K. Hornik. A computational environment for mining association rules and frequent item sets. Journal of Statistical Software, 14:1--25, 2005.
[8]
M. Hollander and D. A. Wolfe. Nonparametric Statistical Methods. 2nd edition, 1999.
[9]
K. Inoue, R. Yokomori, H. Fujiwara, T. Yamamoto, M. Matsushita, and S. Kusumoto. Component rank: relative significance rank for software component search. Proceedings of the 25th international conference on Software engineering, pages 14--24, 2003.
[10]
J. Kleinberg. Authoritative sources in a hyperlinked environment. In Proceedings, 9th SIAM Symposium on Discrete Algorithms, New York, NY, 1998. ACM.
[11]
Z. Li and Y. Zhou. PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code. In ESEC/FSE-13: Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering, 2005.
[12]
V. B. Livshits and T. Zimmermann. Dynamine: Finding common error patterns by mining software revision histories. In Proceedings of the 13th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE-13), Sept. 2005.
[13]
A. Michail. Data mining library reuse patterns using generalized association rules. International Conference on Software Engineering, pages 167--176, 2000.
[14]
D. Poshyvanyk, Y. Gueheneuc, A. Marcus, G. Antoniol, and V. Rajlich. Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification. Proceedings of 14th IEEE International Conference on Program Comprehension (ICPC'06), Athens, Greece, pages 137--148, 2006.
[15]
M. Robillard. Automatic generation of suggestions for program investigation. ACM SIGSOFT Software Engineering Notes, 30(5):11--20, 2005.
[16]
M. P. Robillard. Automatic generation of suggestions for program investigation. In SIGSOFT Symposium on the Foundations of Software Engineering, 2005.
[17]
N. Wilde, M. Buckellew, H. Page, V. Rajlich, and L. Pounds. A comparison of methods for locating features in legacy software. Journal of Systems and Software, 65(2):105--114, 2003.
[18]
C. Williams and J. K. Hollingsworth. Automatic mining of source code repositories to improve bug finding techniques. IEEE Transactions on Software Engineering, 31, 2005.
[19]
T. Xie and J. Pei. MAPO: mining API usages from open source repositories. Proceedings of the 2006 international workshop on Mining software repositories, pages 54--57, 2006.
[20]
A. Ying, G. Murphy, R. Ng, and M. Chu-Carroll. Predicting source code changes by mining change history. IEEE Transactions on Software Engineering, 30(9):574--586, 2004.
[21]
C. Zhang and H.-A. Jacobsen. Efficiently mining crosscutting concerns through random walks. In AOSD '07: Proceedings of the 6th international conference on Aspect-oriented software development, pages 226--238, New York, NY, USA, 2007. ACM Press.
[22]
W. Zhao, L. Zhang, Y. Liu, J. Sun, and F. Yang. SNIAFL: Towards a static noninteractive approach to feature location. ACM Transactions on Software Engineering and Methodology (TOSEM), 15(2):195--226, 2006.
[23]
T. Zimmermann, A. Zeller, P. Weissgerber, and S. Diehl. Mining version histories to guide software changes. Software Engineering, IEEE Transactions on, 31(6):429--445, 2005.

Published In

ESEC-FSE '07: Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
September 2007
638 pages
ISBN:9781595938114
DOI:10.1145/1287624

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph theory
  2. recommender systems

Qualifiers

  • Article

Conference

ESEC/FSE07

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Cited By

  • (2022) Phrase2Set: Phrase-to-Set Machine Translation and Its Software Engineering Applications. 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 502-513. DOI:10.1109/SANER53432.2022.00068. Online publication date: Mar-2022
  • (2021) Locating Core Modules through the Association between Software Source Structure and Execution. Applied Sciences, 11:4 (1685). DOI:10.3390/app11041685. Online publication date: 13-Feb-2021
  • (2021) An Experimental Analysis of Graph-Distance Algorithms for Comparing API Usages. 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM), pages 214-225. DOI:10.1109/SCAM52516.2021.00034. Online publication date: Sep-2021
  • (2021) Guided pattern mining for API misuse detection by change-based code analysis. Automated Software Engineering, 28:2. DOI:10.1007/s10515-021-00294-x. Online publication date: 17-Aug-2021
  • (2018) Effective API recommendation without historical software repositories. Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 282-292. DOI:10.1145/3238147.3238216. Online publication date: 3-Sep-2018
  • (2018) Statistical Translation of English Texts to API Code Templates. 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 194-205. DOI:10.1109/ICSME.2018.00029. Online publication date: Sep-2018
  • (2018) An Efficient Graph-Search Algorithm for Full Link Application Suggestion. Computer Supported Cooperative Work and Social Computing, pages 536-545. DOI:10.1007/978-981-13-3044-5_41. Online publication date: 11-Dec-2018
  • (2017) Combining Word2Vec with revised vector space model for better code retrieval. Proceedings of the 39th International Conference on Software Engineering Companion, pages 183-185. DOI:10.1109/ICSE-C.2017.90. Online publication date: 20-May-2017
  • (2017) An effective change recommendation approach for supplementary bug fixes. Automated Software Engineering, 24:2, pages 455-498. DOI:10.1007/s10515-016-0204-z. Online publication date: 1-Jun-2017
  • (2016) API recommendation system for software development. Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, pages 896-899. DOI:10.1145/2970276.2975940. Online publication date: 25-Aug-2016
