Recommending random walks

Authors: Zachary M. Saul, Vladimir Filkov, Premkumar Devanbu, Christian Bird

ESEC-FSE '07: Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering

Pages 15 - 24

https://doi.org/10.1145/1287624.1287629

Published: 07 September 2007

Abstract

We improve on previous recommender systems by taking advantage of the layered structure of software. We use a random-walk approach that mimics the more focused behavior of a developer who browses the caller-callee links in the callgraph of a large program, seeking routines that are likely to be related to a function of interest. Inspired by Kleinberg's work [10], we approximate the steady state of an infinite random walk on a subset of a callgraph in order to rank the functions by their steady-state probabilities. Surprisingly, this purely structural approach works quite well. Our approach, like Robillard's "Suade" algorithm [15] and earlier data mining approaches [13], relies solely on the always-available current state of the code, rather than on other sources such as comments, documentation, or revision information. Using the Apache API documentation as an oracle, we perform a quantitative evaluation of our method, finding that our algorithm dramatically improves upon Suade in this setting. We also find that the performance of traditional data mining approaches is complementary to ours; this leads naturally to an evidence-based combination of the two, which shows excellent performance on this task.
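The ranking idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it assumes caller-callee links are followed in both directions, adds a restart probability (a personalized-PageRank-style device chosen here only so that plain power iteration converges), and uses hypothetical function names and parameters (`rank_by_random_walk`, `restart`, `iterations`).

```python
from collections import defaultdict


def rank_by_random_walk(calls, query, restart=0.15, iterations=100):
    """Rank functions by the approximate steady-state probability of a
    random walk over caller-callee links, starting from `query`.

    Assumptions (not taken from the paper): links are undirected, and the
    walker jumps back to `query` with probability `restart` at each step.
    """
    # Build an undirected neighbor map from (caller, callee) pairs.
    neighbors = defaultdict(set)
    for caller, callee in calls:
        neighbors[caller].add(callee)
        neighbors[callee].add(caller)
    neighbors[query]  # ensure the query is a node even if it has no links
    nodes = sorted(neighbors)

    # Start with all probability mass on the function of interest.
    prob = {n: 1.0 if n == query else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            moving = (1.0 - restart) * prob[n]
            if not neighbors[n]:
                nxt[query] += moving  # nowhere to go: return to the query
                continue
            share = moving / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        nxt[query] += restart  # total mass is 1, so the restart mass is `restart`
        prob = nxt

    # Highest probability first; the query itself is not a recommendation.
    return sorted(((n, p) for n, p in prob.items() if n != query),
                  key=lambda item: -item[1])


# Hypothetical callgraph fragment, for illustration only.
example_calls = [
    ("open_file", "alloc_pool"),
    ("open_file", "lock_file"),
    ("read_file", "lock_file"),
    ("read_file", "alloc_pool"),
    ("log_error", "format_time"),
]
for name, score in rank_by_random_walk(example_calls, "open_file"):
    print(f"{name}: {score:.3f}")
```

Functions that the walk cannot reach from the query keep a probability of zero, so they fall to the bottom of the ranking.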

References

[1]

G. Ammons, R. Bodík, and J. Larus. Mining specifications. Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 4--16, 2002.

[2]

Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B, 57:289--300, 1995.

[3]

W. Cohen. Inductive specification recovery: Understanding software by learning from example behaviors. Automated Software Engineering, 2(2):107--129, 1995.

[4]

T. Corbi. Program Understanding: Challenge for the 1990s. IBM Systems Journal, 28(2):294--306, 1989.

[5]

D. Čubranić, G. Murphy, J. Singer, and K. Booth. Hipikat: a project memory for software development. IEEE Transactions on Software Engineering, 31(6):446--465, 2005.

[6]

D. Engler, D. Chen, S. Hallem, A. Chou, and B. Chelf. Bugs as deviant behavior: a general approach to inferring errors in systems code. Proceedings of the eighteenth ACM symposium on Operating systems principles, pages 57--72, 2001.

[7]

M. Hahsler, B. Grün, and K. Hornik. A computational environment for mining association rules and frequent item sets. Journal of Statistical Software, 14:1--25, 2005.

[8]

M. Hollander and D. A. Wolfe. Nonparametric Statistical Methods. 2nd edition, 1999.

[9]

K. Inoue, R. Yokomori, H. Fujiwara, T. Yamamoto, M. Matsushita, and S. Kusumoto. Component rank: relative significance rank for software component search. Proceedings of the 25th international conference on Software engineering, pages 14--24, 2003.

[10]

J. Kleinberg. Authoritative sources in a hyperlinked environment. In Proceedings, 9th SIAM Symposium on Discrete Algorithms, New York, NY, 1998. ACM.

[11]

Z. Li and Y. Zhou. PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code. In ESEC/FSE-13: Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering, 2005.

[12]

V. B. Livshits and T. Zimmermann. Dynamine: Finding common error patterns by mining software revision histories. In Proceedings of the 13th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE-13), Sept. 2005.

[13]

A. Michail. Data mining library reuse patterns using generalized association rules. International Conference on Software Engineering, pages 167--176, 2000.

[14]

D. Poshyvanyk, Y. Guéhéneuc, A. Marcus, G. Antoniol, and V. Rajlich. Combining probabilistic ranking and latent semantic indexing for feature identification. Proceedings of the 14th IEEE International Conference on Program Comprehension (ICPC'06), Athens, Greece, pages 137--148, 2006.

[15]

M. Robillard. Automatic generation of suggestions for program investigation. ACM SIGSOFT Software Engineering Notes, 30(5):11--20, 2005.

[16]

M. P. Robillard. Automatic generation of suggestions for program investigation. In SIGSOFT Symposium on the Foundations of Software Engineering, 2005.

[17]

N. Wilde, M. Buckellew, H. Page, V. Rajlich, and L. Pounds. A comparison of methods for locating features in legacy software. Journal of Systems and Software, 65(2):105--114, 2003.

[18]

C. Williams and J. K. Hollingsworth. Automatic mining of source code repositories to improve bug finding techniques. IEEE Transactions on Software Engineering, 31, 2005.

[19]

T. Xie and J. Pei. MAPO: mining API usages from open source repositories. Proceedings of the 2006 international workshop on Mining software repositories, pages 54--57, 2006.

[20]

A. Ying, G. Murphy, R. Ng, and M. Chu-Carroll. Predicting source code changes by mining change history. IEEE Transactions on Software Engineering, 30(9):574--586, 2004.

[21]

C. Zhang and H.-A. Jacobsen. Efficiently mining crosscutting concerns through random walks. In AOSD '07: Proceedings of the 6th international conference on Aspect-oriented software development, pages 226--238, New York, NY, USA, 2007. ACM Press.

[22]

W. Zhao, L. Zhang, Y. Liu, J. Sun, and F. Yang. SNIAFL: Towards a static noninteractive approach to feature location. ACM Transactions on Software Engineering and Methodology (TOSEM), 15(2):195--226, 2006.

[23]

T. Zimmermann, A. Zeller, P. Weissgerber, and S. Diehl. Mining version histories to guide software changes. IEEE Transactions on Software Engineering, 31(6):429--445, 2005.

Index Terms

Recommending random walks
1. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Documentation

Published In

ESEC-FSE '07: Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering

September 2007

638 pages

ISBN: 9781595938114

DOI: 10.1145/1287624

General Chair: Ivica Crnkovic, Mälardalen University, Sweden

Program Chair: Antonia Bertolino, ISTI-CNR, Italy

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

ESEC/FSE07

ESEC/FSE07: Joint 11th European Software Engineering Conference 2007

September 3 - 7, 2007

Dubrovnik, Croatia

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 50
Total Downloads: 565
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 0

Reflects downloads up to 10 Oct 2024

Citations

Cited By

Nguyen T, Yadavally A, Nguyen T (2022). Phrase2Set: Phrase-to-Set Machine Translation and Its Software Engineering Applications. 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 502-513. Online publication date: Mar-2022.
https://doi.org/10.1109/SANER53432.2022.00068
Huh S, Kim W (2021). Locating Core Modules through the Association between Software Source Structure and Execution. Applied Sciences, 11:4 (1685). Online publication date: 13-Feb-2021.
https://doi.org/10.3390/app11041685
Nielebock S, Blockhaus P, Kruger J, Ortmeier F (2021). An Experimental Analysis of Graph-Distance Algorithms for Comparing API Usages. 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM), pages 214-225. Online publication date: Sep-2021.
https://doi.org/10.1109/SCAM52516.2021.00034
Nielebock S, Heumüller R, Schott K, Ortmeier F (2021). Guided pattern mining for API misuse detection by change-based code analysis. Automated Software Engineering, 28:2. Online publication date: 17-Aug-2021.
https://doi.org/10.1007/s10515-021-00294-x
Liu X, Huang L, Ng V, Huchard M, Kästner C, Fraser G (2018). Effective API recommendation without historical software repositories. Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 282-292. Online publication date: 3-Sep-2018.
https://dl.acm.org/doi/10.1145/3238147.3238216
Nguyen A, Rigby P, Nguyen T, Palani D, Karanfil M, Nguyen T (2018). Statistical Translation of English Texts to API Code Templates. 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 194-205. Online publication date: Sep-2018.
https://doi.org/10.1109/ICSME.2018.00029
Zhang H, Zeng N, Song H, Xu M, Wang H, Yang B, Lyu C (2018). An Efficient Graph-Search Algorithm for Full Link Application Suggestion. Computer Supported Cooperative Work and Social Computing, pages 536-545. Online publication date: 11-Dec-2018.
https://doi.org/10.1007/978-981-13-3044-5_41
Van Nguyen T, Nguyen A, Phan H, Nguyen T, Nguyen T, Uchitel S, Orso A, Robillard M (2017). Combining Word2Vec with revised vector space model for better code retrieval. Proceedings of the 39th International Conference on Software Engineering Companion, pages 183-185. Online publication date: 20-May-2017.
https://dl.acm.org/doi/10.1109/ICSE-C.2017.90
Xia X, Lo D (2017). An effective change recommendation approach for supplementary bug fixes. Automated Software Engineering, 24:2, pages 455-498. Online publication date: 1-Jun-2017.
https://dl.acm.org/doi/10.1007/s10515-016-0204-z
Thung F, Lo D, Apel S, Khurshid S (2016). API recommendation system for software development. Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, pages 896-899. Online publication date: 25-Aug-2016.
https://dl.acm.org/doi/10.1145/2970276.2975940
