Using Mutation Analysis for Assessing and Comparing Testing Coverage Criteria

Published: 01 August 2006

Abstract

The empirical assessment of test techniques plays an important role in software testing research. One common practice is to seed faults in subject software, either manually or by using a program that generates all possible mutants based on a set of mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults, thus facilitating the statistical analysis of the fault detection effectiveness of test suites; however, we do not know whether empirical results obtained this way lead to valid, representative conclusions. Focusing on four common control and data flow criteria (Block, Decision, C-Use, and P-Use), this paper investigates this important issue based on a middle-sized industrial program with a comprehensive pool of test cases and known faults. Based on the data available thus far, the results are consistent across the investigated criteria and show that the use of mutation operators yields trustworthy results: generated mutants can be used to predict the detection effectiveness of real faults. Applying such a mutation analysis, we then investigate the relative cost and effectiveness of the above-mentioned criteria by revisiting fundamental questions regarding the relationships between fault detection, test suite size, and control/data flow coverage. Although such questions have been partially investigated in previous studies, we are able to use a large number of mutants, which decreases the impact of random variation in our analysis and allows us to use a different analysis approach. Our results are then compared with published studies, plausible reasons for the differences are provided, and the research leads us to suggest a way to tune the mutation analysis process to possible differences in fault detection probabilities in a specific environment.
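
To make the procedure described above concrete, the sketch below (Python, illustrative only) shows how a mutation score might be computed for a test suite: every generated mutant is executed against the suite and counted as killed when at least one test produces an output that differs from the original program's. The toy model of programs as functions, the helper names (kills, mutation_score), and the manually supplied equivalent-mutant count are assumptions of this sketch, not the tooling used in the paper.

# Minimal sketch of mutation-score computation (illustrative only; not the
# paper's tooling). A mutant is "killed" when some test input makes its
# output differ from the original program's output.

from typing import Callable, Iterable, List

Program = Callable[[int], int]   # toy model: a program maps one input to one output

def kills(test_input: int, original: Program, mutant: Program) -> bool:
    """True if this test input distinguishes the mutant from the original."""
    try:
        return mutant(test_input) != original(test_input)
    except Exception:
        return True              # a crash also counts as detection

def mutation_score(original: Program, mutants: Iterable[Program],
                   suite: List[int], num_equivalent: int = 0) -> float:
    """Killed mutants divided by non-equivalent mutants (num_equivalent is
    assumed to be known, e.g. from manual inspection)."""
    mutants = list(mutants)
    killed = sum(1 for m in mutants
                 if any(kills(t, original, m) for t in suite))
    return killed / (len(mutants) - num_equivalent)

# Toy example: the original program computes abs(x); each "mutant" mimics the
# effect of applying one mutation operator to it.
original = lambda x: abs(x)
mutants = [lambda x: -abs(x),     # operator mutation
           lambda x: abs(x) + 1,  # constant mutation
           lambda x: abs(x)]      # equivalent mutant (behaviour unchanged)

print(mutation_score(original, mutants, suite=[-3, 0, 7], num_equivalent=1))  # 1.0

In the study itself, such per-suite scores are related to the coverage achieved under the Block, Decision, C-Use, and P-Use criteria and to test suite size; the sketch covers only the scoring step.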


    Published In

    IEEE Transactions on Software Engineering, Volume 32, Issue 8
    August 2006
    96 pages

    Publisher

    IEEE Press

    Author Tags

    1. Testing and debugging
    2. Experimental design
    3. Test coverage of code
    4. Testing strategies

    Qualifiers

    • Research-article

