DOI: 10.1145/3213846.3213870
research-article

Comparing developer-provided to user-provided tests for fault localization and automated program repair

Published: 12 July 2018

Abstract

To realistically evaluate a software testing or debugging technique, it must be run on defects and tests that are characteristic of those a developer would encounter in practice. For example, to determine the utility of a fault localization or automated program repair technique, it could be run on real defects from a bug tracking system, using real tests that are committed to the version control repository along with the fixes. Although such a methodology uses real tests, it may not use tests that are characteristic of the information a developer or tool would have in practice. The tests that a developer commits after fixing a defect may encode more information than was available to the developer when initially diagnosing the defect.
This paper compares, both quantitatively and qualitatively, the developer-provided tests committed along with fixes (as found in the version control repository) versus the user-provided tests extracted from bug reports (as found in the issue tracker). It provides evidence that developer-provided tests are more targeted toward the defect and encode more information than user-provided tests. For fault localization, developer-provided tests overestimate a technique’s ability to rank a defective statement in the list of the top-n most suspicious statements. For automated program repair, developer-provided tests overestimate a technique’s ability to (efficiently) generate correct patches—user-provided tests lead to fewer correct patches and increased repair time. This paper also provides suggestions for improving the design and evaluation of fault localization and automated program repair techniques.
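The top-n claim for fault localization can be made concrete with a small sketch. The code below is illustrative only and is not the paper's experimental setup: it assumes a spectrum-based technique using the Ochiai formula (one common choice), and the statement ids, test names, and coverage sets are all hypothetical. It shows how a targeted failing test that covers few statements can place the defective statement at rank 1, while a broader failing test of the same defect leaves it tied with other statements.

```python
import math

def ochiai(coverage, outcomes):
    """Ochiai suspiciousness per statement.

    coverage: dict mapping test name -> set of covered statement ids.
    outcomes: dict mapping test name -> True if the test passed.
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        # ef / ep: failing / passing tests that execute this statement
        ef = sum(1 for t, cov in coverage.items() if stmt in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if stmt in cov and outcomes[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return scores

def rank_of(stmt, scores):
    """Worst-case rank: count of statements scoring at least as high."""
    return sum(1 for s in scores.values() if s >= scores[stmt])

def in_top_n(stmt, scores, n):
    return rank_of(stmt, scores) <= n

# Hypothetical data: statement 3 is the defective one.
outcomes = {"p1": True, "p2": True, "f": False}
passing = {"p1": {1, 2}, "p2": {1, 4, 5}}

# A targeted failing test (stand-in for a developer-provided test).
targeted = ochiai({**passing, "f": {1, 3}}, outcomes)
# A broad failing test (stand-in for a user-provided test).
broad = ochiai({**passing, "f": {1, 2, 3, 4, 5, 6, 7}}, outcomes)

print(rank_of(3, targeted), in_top_n(3, targeted, 1))
print(rank_of(3, broad), in_top_n(3, broad, 1))
```

In this toy setting the only thing that changes between the two runs is the coverage of the failing test, which is enough to move the defective statement out of the top-1 list. Real test suites differ in more ways than coverage breadth, so the sketch should be read as intuition rather than as a reproduction of the paper's results.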

    Published In

    ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
    July 2018
    379 pages
    ISBN:9781450356992
    DOI:10.1145/3213846
    General Chair: Frank Tip
    Program Chair: Eric Bodden
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Automated program repair
    2. Fault localization
    3. Test effectiveness

    Qualifiers

    • Research-article

    Conference

    ISSTA '18

    Acceptance Rates

    Overall Acceptance Rate 58 of 213 submissions, 27%

    Article Metrics

    • Downloads (Last 12 months): 20
    • Downloads (Last 6 weeks): 8
    Reflects downloads up to 05 Jan 2025

    Cited By

    • (2024) Exploring Data Cleanness in Defects4J and Its Influence on Fault Localization Efficiency. Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 386-387. DOI: 10.1145/3639478.3643125. Online publication date: 14-Apr-2024
    • (2024) Evaluating Diverse Large Language Models for Automatic and General Bug Reproduction. IEEE Transactions on Software Engineering 50:10, 2677-2694. DOI: 10.1109/TSE.2024.3450837. Online publication date: 1-Oct-2024
    • (2024) Hierarchy-Aware Regression Test Prioritization. 2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE), 343-354. DOI: 10.1109/ISSRE62328.2024.00041. Online publication date: 28-Oct-2024
    • (2024) A systematic mapping study of bug reproduction and localization. Information and Software Technology 165:C. DOI: 10.1016/j.infsof.2023.107338. Online publication date: 1-Jan-2024
    • (2024) A Systematic Exploration of Mutation‐Based Fault Localization Formulae. Software Testing, Verification and Reliability. DOI: 10.1002/stvr.1905. Online publication date: 11-Nov-2024
    • (2023) Impact analysis of bug localization accuracy oriented to bug report. Sixth International Conference on Advanced Electronic Materials, Computers, and Software Engineering (AEMCSE 2023), 42. DOI: 10.1117/12.3004582. Online publication date: 16-Aug-2023
    • (2023) T-Evos: A Large-Scale Longitudinal Study on CI Test Execution and Failure. IEEE Transactions on Software Engineering 49:4, 2352-2365. DOI: 10.1109/TSE.2022.3218264. Online publication date: 1-Apr-2023
    • (2023) Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), 2312-2323. DOI: 10.1109/ICSE48619.2023.00194. Online publication date: May-2023
    • (2023) Better Automatic Program Repair by Using Bug Reports and Tests Together. Proceedings of the 45th International Conference on Software Engineering, 1225-1237. DOI: 10.1109/ICSE48619.2023.00109. Online publication date: 14-May-2023
    • (2023) Software Fault Localization: an Overview of Research, Techniques, and Tools. Handbook of Software Fault Localization, 1-117. DOI: 10.1002/9781119880929.ch1. Online publication date: 21-Apr-2023