
Deep imperative mutations have less impact

Published: 03 December 2024

Abstract

Information theory and entropy loss predict that deeper, more hierarchical software will be more robust. This suggests that silent errors and equivalent mutations will be more common in deeper code and that highly structured code will be hard to test, which would explain the best-practice preference for unit testing of small methods rather than system-wide analysis. Using the genetic improvement (GI) tool MAGPIE, we measure the impact of source code mutations, and how that impact varies with execution depth, in two diverse multi-level nested software systems. gem5 is a million-line, single-threaded, state-of-the-art C++ discrete-time VLSI circuit simulator, whilst PARSEC VIPS is a non-deterministic, multi-threaded parallel image processing benchmark written in C. More than 28–53% of mutants compile and generate results identical to those of the original program. We observe 12% and 16% Failed Disruption Propagation (FDP). Excluding internal errors, exceptions and asserts, most faults below about 30 nested function levels that are Executed and Infect data or divert control flow are not Propagated to the output, i.e. these deep PIE changes have no visible external effect. This suggests that automatic software engineering on highly structured code will be hard.
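
The claim that deep disruptions rarely reach the output can be illustrated with a toy model. The sketch below is not the MAGPIE experiment on gem5 or VIPS; it is a minimal Python illustration in which the function names (`output`, `propagation_rate`), the input range, and the choice of integer halving as the information-losing step are all assumptions made for the example. A value is produced `depth` calls down, a hypothetical mutation adds 1 to it there (it is executed and it infects the data), and every enclosing level discards one bit on return.

```python
def output(x: int, depth: int, disrupt: bool = False) -> int:
    """Toy program: x is produced `depth` calls down; every return halves it.

    The halving stands in for any information-losing operation (an assumption
    of this sketch). `disrupt` models a mutation at the deepest level that is
    executed and infects the data by adding 1.
    """
    if depth == 0:
        return x + 1 if disrupt else x
    return output(x, depth - 1, disrupt) // 2


def propagation_rate(depth: int, inputs: range = range(1024)) -> float:
    """Fraction of inputs for which the deep disruption changes the output."""
    changed = sum(output(x, depth, True) != output(x, depth) for x in inputs)
    return changed / len(inputs)


if __name__ == "__main__":
    for depth in (0, 1, 5, 10):
        print(depth, propagation_rate(depth))
```

In this model the disruption reaches the output only when `x + 1` is a multiple of `2 ** depth`, so the propagation rate halves with each extra level: 1.0 at depth 0, 0.5 at depth 1, and 1/1024 at depth 10. Real code loses information less uniformly, so this shows only the direction of the effect, not its size.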

Published In

Automated Software Engineering  Volume 32, Issue 1
May 2025
334 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 03 December 2024
Accepted: 26 October 2024
Received: 12 June 2024

Author Tags

  1. Automatic code optimisation
  2. Failed disruption propagation (FDP)
  3. Genetic improvement (GI)
  4. Fault masking
  5. Software resilience
  6. Fitness landscape

Author Tag

  1. Information and Computing Sciences
  2. Computer Software

Qualifiers

  • Research-article
