DOI: 10.1145/3368089.3409764
research-article

Recommending stack overflow posts for fixing runtime exceptions using failure scenario matching

Published: 08 November 2020

Abstract

Using online Q&A forums such as Stack Overflow (SO) for guidance on resolving program bugs, among other development issues, is commonplace in modern software development practice. Runtime exceptions (REs) are one important class of bugs that is actively discussed on SO. In this work we present a technique and prototype tool called MAESTRO that can automatically recommend the SO post that is most relevant to a given Java RE in a developer's code. MAESTRO compares the exception-generating program scenario in the developer's code with that discussed in an SO post and returns the post with the closest match. To extract and compare the exception scenario effectively, MAESTRO first uses the answer code snippets in a post to implicate a subset of lines in the post's question code snippet as responsible for the exception, and then compares these lines with the developer's code in terms of their respective Abstract Program Graph (APG) representations. The APG, proposed in this work, is a simplified and abstracted derivative of an abstract syntax tree that allows an effective comparison of the functionality embodied in the high-level program structure while discarding many of the low-level syntactic or semantic differences. We evaluate MAESTRO on a benchmark of 78 instances of Java REs extracted from the top 500 Java projects on GitHub and show that MAESTRO returns either a highly relevant or somewhat relevant SO post for the exception instance in 71% of the cases, whereas four competitor tools based on state-of-the-art techniques return relevant posts in only 8% to 44% of the instances. We also conduct a user experience study of MAESTRO with 10 Java developers, in which the participants judge MAESTRO to have reported a highly relevant or somewhat relevant post in 80% of the instances. In some cases the post is judged to be even better than the one found manually by the participant.
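The two steps named in the abstract (implicating question lines via the answer code, then matching program structure) can be illustrated with a small Python sketch. This is not MAESTRO's implementation: the line diff stands in for the paper's implication step, the nested-tuple trees are a hypothetical stand-in for APGs, and the edge-overlap score is an assumed similarity measure chosen for brevity. All function names, node labels, and post ids below are invented for illustration.

```python
import difflib
from collections import Counter


def implicated_lines(question_lines, answer_lines):
    """Return question lines that the answer snippet rewrites or drops.

    Assumption: a plain line diff approximates "lines responsible for the
    exception"; the paper's actual analysis is not reproduced here.
    """
    matcher = difflib.SequenceMatcher(a=question_lines, b=answer_lines, autojunk=False)
    implicated = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            implicated.extend(question_lines[i1:i2])
    return implicated


def edge_bag(tree):
    """Count (parent label, child label) edges in a tree of nested tuples."""
    label, *children = tree
    bag = Counter()
    for child in children:
        bag[(label, child[0])] += 1
        bag.update(edge_bag(child))
    return bag


def similarity(tree_a, tree_b):
    """Multiset Jaccard overlap of edges: 1.0 for identical trees, 0.0 for disjoint ones."""
    a, b = edge_bag(tree_a), edge_bag(tree_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0


def recommend(dev_tree, posts):
    """Return the id of the post whose tree is closest to the developer's tree."""
    return max(posts, key=lambda post_id: similarity(dev_tree, posts[post_id]))


# Hypothetical question/answer snippets from one post.
QUESTION = ["String s = map.get(key);", "int n = s.length();"]
ANSWER = ["String s = map.get(key);", "int n = (s == null) ? 0 : s.length();"]

# Hypothetical abstracted trees: variable names are already dropped,
# only statement kinds and called APIs remain as labels.
DEV_TREE = ("Method", ("Assign", ("Call:Map.get",)), ("Call:String.length",))
POSTS = {
    "post-npe": ("Method", ("Assign", ("Call:Map.get",)),
                 ("Call:String.length",), ("Return",)),
    "post-index": ("Method", ("Loop", ("ArrayAccess",))),
}

if __name__ == "__main__":
    print(implicated_lines(QUESTION, ANSWER))
    print(recommend(DEV_TREE, POSTS))
```

Because the labels abstract away identifiers, two snippets that call the same APIs in the same structure score as close even when their variable names differ, which is the intuition behind comparing abstracted structure rather than raw text.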

Supplementary Material

Auxiliary Teaser Video (fse20main-p894-p-teaser.mp4)
This is a presentation video of my talk at ESEC/FSE 2020 for our paper accepted in the research track. In this paper, we present an automated technique, MAESTRO, for recommending relevant Stack Overflow posts to Java developers, which could assist them in fixing runtime exceptions in their code. MAESTRO finds the best post by comparing the exception-generating program scenario in the developer’s code with that discussed in the posts. To extract and compare the exception scenario effectively, MAESTRO identifies the failure-producing lines in the question code in a post and then compares these lines with the developer’s code in terms of their high-level representations, called Abstract Program Graphs (APGs). In the evaluation on 78 real-world instances, MAESTRO recommended a relevant Stack Overflow post in 71% of the cases, with a median time of only 2.6 seconds. In comparison, four competitor techniques reported a relevant post in only 8% to 44% of the cases.
Auxiliary Presentation Video (fse20main-p894-p-video.mp4)

    Published In

    ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    November 2020
    1703 pages
    ISBN: 9781450370431
    DOI: 10.1145/3368089
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. code search
    2. crowd intelligence
    3. runtime exceptions
    4. static analysis

    Qualifiers

    • Research-article

    Conference

    ESEC/FSE '20
    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%

    Cited By

    • (2024) FuEPRe: a fusing embedding method with attention for post recommendation. Service Oriented Computing and Applications 18:1 (67-79). DOI: 10.1007/s11761-024-00386-y. Online publication date: 1-Mar-2024
    • (2023) KG4CraSolver: Recommending Crash Solutions via Knowledge Graph. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1242-1254). DOI: 10.1145/3611643.3616317. Online publication date: 30-Nov-2023
    • (2023) A Programming Language Learning Service by Linking Stack Overflow with Textbooks. 2023 IEEE International Conference on Web Services (ICWS) (234-245). DOI: 10.1109/ICWS60048.2023.00043. Online publication date: Jul-2023
    • (2023) Software Entity Recognition with Noise-Robust Learning. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (484-496). DOI: 10.1109/ASE56229.2023.00203. Online publication date: 11-Sep-2023
    • (2022) An Exploration of npm Package Co-Usage Examples from Stack Overflow: A Case Study. IEICE Transactions on Information and Systems E105.D:1 (11-18). DOI: 10.1587/transinf.2021MPP0003. Online publication date: 1-Jan-2022
    • (2022) Answer Summarization for Technical Queries: Benchmark and New Approach. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-13). DOI: 10.1145/3551349.3560421. Online publication date: 10-Oct-2022
    • (2022) Debugging with stack overflow. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Software Engineering Education and Training (69-81). DOI: 10.1145/3510456.3514147. Online publication date: 21-May-2022
    • (2022) Do Developers Really Know How to Use Git Commands? A Large-scale Study Using Stack Overflow. ACM Transactions on Software Engineering and Methodology 31:3 (1-29). DOI: 10.1145/3494518. Online publication date: 9-Apr-2022
    • (2022) Providing Real-time Assistance for Repairing Runtime Exceptions using Stack Overflow Posts. 2022 IEEE Conference on Software Testing, Verification and Validation (ICST) (196-207). DOI: 10.1109/ICST53961.2022.00030. Online publication date: Apr-2022
    • (2022) Debugging with Stack Overflow: Web Search Behavior in Novice and Expert Programmers. 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET) (69-81). DOI: 10.1109/ICSE-SEET55299.2022.9794240. Online publication date: May-2022
