Abstract
For many everyday devices, each newly released model offers more functionality. This technological advance relies heavily on software of increasing complexity, which creates novel challenges in the domain of software testing. Most prominently, while an ever higher number of test cases is required to meet quality demands, executing a large number of test cases frequently leads to a significant increase in development time and cost. To overcome this issue, agile development methods such as continuous integration usually execute only a subset of important test cases to meet both time and testing demands. One way of selecting such a subset is to assign priorities to all available test cases and then greedily pick the ones with the highest priority until the available time budget is spent. For this, in previous work, we presented a new machine learning approach based on a learning classifier system (LCS). In the present article, we summarize our earlier findings (which are spread over several publications) and provide insights into the most recent adaptations we made to the method. We also provide an extended experimental analysis that outlines in more detail how our approach compares to a state-of-the-art artificial neural network. We observe that the performance of our LCS-based approach is often much higher than that of the network. Since our work has already been deployed by a major company, we give an overview of the resulting product as well as several of its in-production quality attributes.
Availability of Data and Material
The data sets used may be found here: https://bitbucket.org/HelgeS/atcs-data/src/master/.
Change history
29 August 2022
Figures were not placed near their citations. The placement of the figures has now been corrected.
01 September 2022
A Correction to this paper has been published: https://doi.org/10.1007/s42979-022-01352-1
Notes
Note that, despite its name, NAPFD actually measures test case failures and not faults. Faults are system errors caused by bugs and the like, and each fault may result in multiple test cases failing.
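For reference, NAPFD can be computed as sketched below. This is a minimal sketch following the standard definition of the normalized APFD (Qu et al.); the function name and argument layout are ours, and undetected failures contribute a position of zero as usual.

```python
def napfd(fail_positions, n_tests, n_failures):
    """Normalized average percentage of faults detected, applied to failures.

    fail_positions: 1-based positions within the executed schedule at which
        each *detected* failure first occurs (undetected failures are simply
        omitted, which corresponds to a position contribution of 0).
    n_tests: number of test cases in the executed (possibly partial) schedule.
    n_failures: total number of failures in the full suite.
    """
    if n_failures == 0:
        # No failures to detect: the schedule is trivially optimal.
        return 1.0
    p = len(fail_positions) / n_failures  # fraction of failures detected
    return p - sum(fail_positions) / (n_tests * n_failures) + p / (2 * n_tests)
```

For example, a schedule of five tests that reveals both of two failures at positions 1 and 2 yields a NAPFD of 0.8, while detecting only one of them at position 1 yields 0.45.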
Spieker et al. [32] originally formulated the use case as a reinforcement learning one. We intend to provide a more high-level machine learning view.
The genetic operators are, of course, adapted to numbers. Mutating a number translates to drawing a new random number. Crossover consists of first performing an arithmetic crossover (for two numbers \(x, y\), this yields \(\zeta x + (1 - \zeta ) y\) and \(\zeta y + (1 - \zeta ) x\)) and then the same two-point crossover as used for the ternary subconditions.
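The numeric operators above can be sketched as follows; the function names, value ranges, and the random choice of \(\zeta\) are illustrative only (the subsequent two-point crossover is omitted for brevity).

```python
import random

def mutate_number(low, high):
    # Mutation draws a completely new random value from the allowed range.
    return random.uniform(low, high)

def arithmetic_crossover(x, y, zeta=None):
    # Blend two real-valued genes: zeta*x + (1 - zeta)*y and its mirror image.
    # Note that the two children always preserve the parents' sum.
    if zeta is None:
        zeta = random.random()
    return zeta * x + (1 - zeta) * y, zeta * y + (1 - zeta) * x
```

With \(\zeta = 0.25\), crossing 2.0 and 4.0 yields the children 3.5 and 2.5.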
Note that the normalization by \(\gamma\) and \(\Gamma\) is necessary to ensure that the result is indeed a probability distribution.
Our code is available here: https://github.com/LagLukas/transfer_learning.
The data sets can be downloaded here: https://bitbucket.org/HelgeS/atcs-data/src/master/.
Spieker et al.’s implementation of their NN-based approach can be found here: https://bitbucket.org/HelgeS/retecs.
We take the average over three successive values. We consider disjoint CI cycle sets with indices \(\{3k, 3k+1, 3k+2\}\).
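This averaging over disjoint triples of CI cycles can be sketched as follows (the function name is ours; any trailing cycles that do not fill a complete triple are dropped):

```python
def smooth_by_triples(values):
    # Average disjoint blocks {3k, 3k+1, 3k+2} of per-cycle results.
    usable = len(values) - len(values) % 3  # drop an incomplete final block
    return [sum(values[i:i + 3]) / 3 for i in range(0, usable, 3)]
```

For instance, the series [1, 2, 3, 4, 5, 6] is smoothed to [2.0, 5.0].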
We used one-sided Wilcoxon tests to compare each combination of one of the three ER methods with one of the failure count or time-ranked value functions against each combination of the three ER methods with the test case failure value function (a total of \((3 \times 2) \times (3 \times 1) = 18\) comparisons), and for each we tested the null hypothesis that the first performs worse than the second. Since all the p-values are less than \(10^{-21}\), we conclude that the failure count and time-ranked value functions yield significantly better results.
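A single such one-sided comparison can be sketched with SciPy as below; the paired scores are synthetic stand-ins for illustration (the actual data are the per-cycle results described in the text), and the direction of the alternative hypothesis is the one we checked.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Hypothetical paired per-cycle scores of two configurations on the same data.
scores_a = rng.uniform(0.6, 0.9, size=50)
scores_b = scores_a - rng.uniform(0.0, 0.1, size=50)  # b tends to be worse

# One-sided paired test: alternative='greater' asks whether the paired
# differences (a - b) are shifted above zero, i.e. whether a outperforms b.
stat, p = wilcoxon(scores_a, scores_b, alternative="greater")
```

A small p-value then lets us reject the hypothesis that the first configuration performs worse than the second.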
For the failure count value we can observe similar results; the corresponding plots can be found in Appendix A.
We examined null hypotheses of the form: Our transfer learning approach leads to worse results than the raw XCSF-ER on data set x with value function y.
References
Anand S, Burke EK, Chen TY, Clark J, Cohen MB, Grieskamp W, Harman M, Harrold MJ, McMinn P, Bertolino A, Li JJ, Zhu H. An orchestrated survey of methodologies for automated software test case generation. J Syst Softw. 2013;86(8):1978–2001.
Arrieta A, Wang S, Arruabarrena A, Markiegi U, Sagardui G, Etxeberria L. Multi-objective black-box test case selection for cost-effectively testing simulation models. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’18, 2018. New York: Association for Computing Machinery, p. 1411–8.
Butz MV, Wilson SW. An algorithmic description of XCS. In: Lanzi PL, Stolzmann W, Wilson SW, editors. Advances in learning classifier systems. Berlin: Springer; 2001. p. 253–72.
Dijkstra EW. Chapter I: notes on structured programming. GBR: Academic Press Ltd.; 1972. p. 1–82.
Fedus W, Ramachandran P, Agarwal R, Bengio Y, Larochelle H, Rowland M, Dabney W. Revisiting fundamentals of experience replay. CoRR. http://arxiv.org/abs/2007.06700, 2020.
International Organization for Standardization. ISO/IEC 25010. https://iso25000.com/index.php/en/iso-25000-standards/iso-25010, 2014. Accessed 15 Jun 2021.
Fowler M. Continuous integration. https://www.martinfowler.com/articles/continuousIntegration.html, 2006. Accessed 21 Feb 2021.
Fraser G, Wotawa F. Redundancy based test-suite reduction. In: Dwyer MB, Lopes A, editors. Fundamental approaches to software engineering. Berlin: Springer; 2007. p. 291–305.
Hsu H-Y, Orso A. MINTS: a general framework and tool for supporting test-suite minimization. In: 2009 IEEE 31st International Conference on Software Engineering, 2009. p. 419–29.
Huang R, Sun W, Xu Y, Chen H, Towey D, Xia X. A survey on adaptive random testing. IEEE Trans Softw Eng. 2021;47(10):2052–83.
Hunter JD. Matplotlib: a 2D graphics environment. Comput Sci Eng. 2007;9(3):90–5.
Kirdey S, Cureton K, Rick S, Ramanathan S, Mrinal S. Lerner—using RL agents for test case scheduling. https://netflixtechblog.com/lerner-using-rl-agents-for-test-case-scheduling-3e0686211198, 2019. Accessed 21 Feb 2021.
Kruskal WH, Wallis WA. Use of ranks in one-criterion variance analysis. J Am Stat Assoc. 1952;47(260):583–621.
Lachmann R, Felderer M, Nieke M, Schulze S, Seidl C, Schaefer I. Multi-objective black-box test case selection for system testing. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’17. 2017. New York: Association for Computing Machinery, p. 1311–8.
Lin L-J. Reinforcement Learning for Robots Using Neural Networks. PhD thesis, Pittsburgh, PA, USA, 1992. UMI Order No. GAX93-22750.
Lukasczyk S, Kroiß F, Fraser G. Automated unit test generation for Python. CoRR, abs/2007.14049, 2020.
Müller-Schloer C, Tomforde S. Organic computing—technical systems for survival in the real world. In: Autonomic Systems, 2017.
Papadakis M, Kintis M, Zhang J, Jia Y, Le Traon Y, Harman M. Chapter six—mutation testing advances: an analysis and survey. In: Advances in Computers, vol. 112. Elsevier; 2019. p. 275–378.
Pätzel D, Heider M, Wagner ARM. An overview of LCS research from 2020 to 2021. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO ’21. New York: Association for Computing Machinery, 2021, pp. 1648–56.
Pätzel D, Stein A, Nakata M. An overview of LCS research from IWLCS 2019–2020. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, GECCO ’20. New York: Association for Computing Machinery, 2020, pp. 1782–8.
Prothmann H, Tomforde S, Branke J, Hähner J, Müller-Schloer C, Schmeck H. Organic traffic control; 2011.
Qu X, Cohen MB, Woolf KM. Combinatorial interaction regression testing: a study of test case generation and prioritization. In: 2007 IEEE International Conference on Software Maintenance. 2007, p. 255–64.
Richards M, Ford N. Fundamentals of software architecture: an engineering approach. London: O’Reilly Media Incorporated; 2019.
Rosenbauer L, Stein A, Hähner J. An artificial immune system for adaptive test selection. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2020; p. 2940–7.
Rosenbauer L, Pätzel D, Stein A, Hähner J. Transfer learning for automated test case prioritization using XCSF. In: EvoApplications: 24th International Conference on the Applications of Evolutionary Computation as part of EvoStar 2021, April 2021, Seville, Spain, 2021.
Rosenbauer L, Pätzel D, Stein A, Hähner J. An organic computing system for automated testing. In: Bauer L, Pionteck T, editors. Architecture of computing systems—ARCS 2021. Cham: Springer International Publishing; 2021.
Rosenbauer L, Stein A, Hähner J. An artificial immune system for black box test case selection. In: EvoCOP: 21st European Conference on Evolutionary Computation in Combinatorial Optimisation as part of EvoStar 2021, April 2021, Seville, Spain, 2021.
Rosenbauer L, Stein A, Maier R, Pätzel D, Hähner J. XCS as a reinforcement learning approach to automatic test case prioritization. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, GECCO ’20. New York: Association for Computing Machinery, 2020, p. 1798–806.
Rosenbauer L, Stein A, Pätzel D, Hähner J. XCSF for automatic test case prioritization. In: Merelo JJ, Garibaldi J, Wagner C, Bäck T, Madani K, Warwick K (eds) Proceedings of the 12th International Joint Conference on Computational Intelligence (ECTA), November 2–4, 2020, 2020.
Rosenbauer L, Stein A, Pätzel D, Hähner J. XCSF with experience replay for automatic test case prioritization. In: Abbass H, Coello Coello CA, Singh HK (eds) 2020 IEEE Symposium Series on Computational Intelligence (SSCI), virtual event, Canberra, Australia, 1–4 December 2020, 2020.
Smart JF. Jenkins: the Definitive Guide. Beijing: O’Reilly; 2011.
Spieker H, Gotlieb A, Marijan D, Mossige M. Reinforcement learning for automatic test case prioritization and selection in continuous integration. CoRR, abs/1811.04122, 2018.
Stein A, Maier R, Rosenbauer L, Hähner J. XCS classifier system with experience replay. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, GECCO ’20. New York: Association for Computing Machinery, 2020, p. 404–13.
Stein A, Menssen S, Hähner J. What about interpolation? A radial basis function approach to classifier prediction modeling in XCSF. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’18. New York: Association for Computing Machinery, 2018, p. 537–44.
Stein A, Rudolph S, Tomforde S, Hähner J. Self-learning smart cameras—harnessing the generalisation capability of XCS. In: Proceedings of the 9th International Joint Conference on Computational Intelligence, Funchal, Portugal, 2017.
Ståhl D, Bosch J. Modeling continuous integration practice differences in industry software development. J Syst Softw. 2014;87:48–59.
Urbanowicz RJ, Browne WN. Introduction to learning classifier systems. 1st ed. Springer; 2017.
Wilson SW. Classifiers that approximate functions. Nat Comput. 2002;1(2–3):211–34.
Wilson SW. Classifier fitness based on accuracy. Evol Comput. 1995;3(2):149–75.
Yoo S, Harman M. Regression testing minimization, selection and prioritization: a survey. Softw Test Verif Reliab. 2012;22(2):67–120.
Yu Y, Jones JA, Harrold MJ. An empirical study of the effects of test-suite reduction on fault localization. In: Proceedings of the 30th International Conference on Software Engineering, ICSE ’08. New York: Association for Computing Machinery, 2008, p. 201–10.
Funding
Not applicable.
Author information
Authors and Affiliations
Contributions
Not applicable.
Corresponding author
Ethics declarations
Conflict of interest
Not applicable.
Code Availability
The source code for the ML approaches etc. can be retrieved from here: https://github.com/LagLukas/transfer_learning.
Consent to Participate
Not applicable (no medical study).
Consent to Publish
Not applicable (no medical study).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised because the given names and family names of the authors were incorrect in all references; they have now been corrected.
This article is part of the topical collection “Computational Intelligence” guest edited by Kurosh Madani, Kevin Warwick, Juan Julian Merelo, Thomas Bäck and Anna Kononova.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Rosenbauer, L., Pätzel, D., Stein, A. et al. A Learning Classifier System for Automated Test Case Prioritization and Selection. SN COMPUT. SCI. 3, 373 (2022). https://doi.org/10.1007/s42979-022-01255-1