Human Competitiveness of Genetic Programming in Spectrum-Based Fault Localisation: Theoretical and Empirical Analysis

Published: 28 June 2017

Abstract

We report on the application of Genetic Programming to Software Fault Localisation, a problem in the area of Search-Based Software Engineering (SBSE). We give both empirical and theoretical evidence for the human competitiveness of the evolved fault localisation formulæ under the single fault scenario, compared to those generated by human ingenuity and reported in many papers, published over more than a decade. Though there have been previous human competitive results claimed for SBSE problems, this is the first time that evolved solutions have been formally proved to be human competitive. We further prove that no future human investigation could outperform the evolved solutions. We complement these proofs with an empirical analysis of both human and evolved solutions, which indicates that the evolved solutions are not only theoretically human competitive, but also convey similar practical benefits to human-evolved counterparts.
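For readers unfamiliar with the setting, the following is a minimal sketch of what a spectrum-based fault localisation formula does. It is not taken from the article and does not reproduce any of the evolved formulæ. It assumes the usual program spectrum of four counts per statement: `ef` and `ep` (failing and passing tests that execute the statement) and `nf` and `np_` (failing and passing tests that do not). Ochiai and Jaccard are two well-known human-designed formulæ. The spectrum values, statement names, and the `rank` helper are hypothetical.

```python
import math


def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def jaccard(ef, ep, nf, np_):
    """Jaccard suspiciousness: ef / (ef + nf + ep)."""
    denom = ef + nf + ep
    return ef / denom if denom else 0.0


def rank(spectrum, formula):
    """Return statements ordered from most to least suspicious."""
    scores = {stmt: formula(*counts) for stmt, counts in spectrum.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical spectrum: statement -> (ef, ep, nf, np).
# Every row is consistent with 2 failing and 3 passing tests.
spectrum = {
    "s1": (1, 3, 1, 0),
    "s2": (2, 1, 0, 2),
    "s3": (0, 2, 2, 1),
}

if __name__ == "__main__":
    for name, formula in (("Ochiai", ochiai), ("Jaccard", jaccard)):
        print(name, rank(spectrum, formula))
```

A Genetic Programming search in this setting would evolve the body of such a formula (an arithmetic expression over the four counts) instead of relying on a hand-written one. The ranking it produces is then judged by how early the faulty statement appears.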

      Information

      Published In

      ACM Transactions on Software Engineering and Methodology, Volume 26, Issue 1
      January 2017
      176 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3092955
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 28 June 2017
      Accepted: 01 March 2017
      Revised: 01 January 2017
      Received: 01 December 2015
      Published in TOSEM Volume 26, Issue 1

      Author Tags

      1. Spectrum-based fault localisation
      2. genetic programming
      3. search-based software engineering

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • National Research Foundation of Korea (NRF)
      • Korean government (MEST)

      Cited By

      • (2024) ReClues: Representing and indexing failures in parallel debugging with program variables. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3639098. Online publication date: 20-May-2024
      • (2024) Optimizing Mutation-Based Fault Localization Through Contribution-Based Test Case Reduction. International Journal of Software Engineering and Knowledge Engineering, 1-28. DOI: 10.1142/S021819402450027X. Online publication date: 5-Jul-2024
      • (2024) Combining Error Guessing and Logical Reasoning for Software Fault Localization via Deep Learning. International Journal of Software Engineering and Knowledge Engineering, 1-27. DOI: 10.1142/S0218194024500219. Online publication date: 28-May-2024
      • (2024) A systematic mapping study of bug reproduction and localization. Information and Software Technology 165:C. DOI: 10.1016/j.infsof.2023.107338. Online publication date: 1-Jan-2024
      • (2024) Delta4Ms: Improving mutation‐based fault localization by eliminating mutant bias. Software Testing, Verification and Reliability 34:4. DOI: 10.1002/stvr.1872. Online publication date: 16-Jan-2024
      • (2023) A Bayesian Framework for Automated Debugging. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 880-891. DOI: 10.1145/3597926.3598103. Online publication date: 12-Jul-2023
      • (2023) Adaptive Search-based Repair of Deep Neural Networks. Proceedings of the Genetic and Evolutionary Computation Conference, 1527-1536. DOI: 10.1145/3583131.3590477. Online publication date: 15-Jul-2023
      • (2023) Systematically Generated Formulas for Spectrum-Based Fault Localization. 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 344-352. DOI: 10.1109/ICSTW58534.2023.00065. Online publication date: Apr-2023
      • (2023) On The Efficiency Of Combination Of Program Slicing and Spectrum-Based Fault Localization. 2023 IEEE Conference on Software Testing, Verification and Validation (ICST), 499-501. DOI: 10.1109/ICST57152.2023.00061. Online publication date: Apr-2023
      • (2023) A Case Against Coverage-Based Program Spectra. 2023 IEEE Conference on Software Testing, Verification and Validation (ICST), 13-24. DOI: 10.1109/ICST57152.2023.00011. Online publication date: Apr-2023
