Testing randomized software by means of statistical hypothesis tests

Authors: Ralph Guderlei, Johannes Mayer, Christoph Schneckenburger, Frank Fleischer

SOQUA '07: Fourth international workshop on Software quality assurance: in conjunction with the 6th ESEC/FSE joint meeting

Pages 46 - 54

https://doi.org/10.1145/1295074.1295084

Published: 03 September 2007

Abstract

Software testing research has so far focused mostly on deterministic software systems. In practice, however, randomized software systems (i.e., software systems with random output) also play an important role, e.g., for simulation purposes. Evaluating test results is a real problem in that case. Previous work has already used statistical hypothesis tests, but the test decisions were not interpreted. Furthermore, those tests were applied only when theoretical values for the distribution of program outputs were available, and not when a golden implementation was available instead. In the present paper, we propose a general approach to applying statistical hypothesis tests in order to test randomized software systems. We determine exactly the confidence gained through these tests. We show that after passing a statistical hypothesis test, it can be guaranteed that at least the tested characteristics of the system under test are correct with a certain probability and accuracy of the result. Our approach is also applicable when a golden implementation is available. In that situation, knowledge about the distribution of the outputs is not necessary, which is a great advantage. Two case studies were conducted to assess the proposed approach. One of them is based on a software system for the simulation of stochastic geometric models (among others) that evolved from the GeoStoch research project and is now used at France Télécom R&D, Paris, to calculate costs for communication networks and to plan new network structures.
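
The two situations named in the abstract (a known theoretical value versus a golden implementation) can be illustrated with a small sketch. This is not the authors' procedure: the abstract does not say which hypothesis tests the paper uses, so a plain large-sample z-test on the mean is swapped in here, and the confidence analysis of the paper is not reproduced. The functions `golden_die`, `sut_die`, and `faulty_die`, the sample size, and the significance level are all hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def one_sample_p_value(outputs, expected_mean):
    """Two-sided large-sample z-test of H0: mean output == expected_mean.

    Assumes many independent runs and non-constant output.
    """
    standard_error = stdev(outputs) / sqrt(len(outputs))
    z = (fmean(outputs) - expected_mean) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def two_sample_p_value(outputs_a, outputs_b):
    """Two-sided large-sample z-test of H0: both samples have the same mean.

    No theoretical value is needed; the second sample can come from a
    golden implementation.
    """
    standard_error = sqrt(
        stdev(outputs_a) ** 2 / len(outputs_a)
        + stdev(outputs_b) ** 2 / len(outputs_b)
    )
    z = (fmean(outputs_a) - fmean(outputs_b)) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def passes(p_value, alpha=0.01):
    """Test verdict: pass unless H0 is rejected at significance level alpha."""
    return p_value >= alpha


# Hypothetical randomized programs, used only to exercise the tests.
def golden_die(rng):
    return rng.randint(1, 6)  # trusted reference implementation


def sut_die(rng):
    return rng.randint(1, 6)  # system under test, intended behaviour


def faulty_die(rng):
    return rng.randint(1, 5)  # seeded fault: never rolls a six


rng = random.Random(2007)  # fixed seed so the run is repeatable
runs = 5000
golden_outputs = [golden_die(rng) for _ in range(runs)]
sut_outputs = [sut_die(rng) for _ in range(runs)]
faulty_outputs = [faulty_die(rng) for _ in range(runs)]

# Case 1: the theoretical mean of a fair die (3.5) is known.
print("SUT vs. theory:   ", passes(one_sample_p_value(sut_outputs, 3.5)))
print("Faulty vs. theory:", passes(one_sample_p_value(faulty_outputs, 3.5)))

# Case 2: only a golden implementation is available.
print("SUT vs. golden:   ", passes(two_sample_p_value(sut_outputs, golden_outputs)))
print("Faulty vs. golden:", passes(two_sample_p_value(faulty_outputs, golden_outputs)))
```

A test on the mean only checks one characteristic of the output distribution. A correct program still fails such a test with probability of roughly alpha, which is why the interpretation of a passing or failing verdict matters.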



Published In

SOQUA '07: Fourth international workshop on Software quality assurance: in conjunction with the 6th ESEC/FSE joint meeting

September 2007

120 pages

ISBN:9781595937247

DOI:10.1145/1295074

General Chair:
Mauro Pezzè
University of Lugano, Switzerland, and University of Milano-Bicocca, Italy

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
CEPIS: The Council of European Professional Informatics Societies

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

ESEC/FSE07

ESEC/FSE07: Joint 11th European Software Engineering Conference 2007

September 3 - 4, 2007

Dubrovnik, Croatia

Bibliometrics

Total Citations: 6
Total Downloads: 520
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 1

Reflects downloads up to 03 Oct 2024

Cited By

Karanikolas C, Dimitroulakos G, Masselos K (2023). Simulating Software Evolution to Evaluate the Reliability of Early Decision-making among Design Alternatives toward Maintainability. ACM Transactions on Software Engineering and Methodology 32:3 (1-38). Online publication date: 26-Apr-2023.
https://dl.acm.org/doi/10.1145/3569931
Al-tekreeti M, Naik K, Abdrabou A, Zaman M, Srivastava P (2019). A Methodology for Generating Tests for Evaluating User-Centric Performance of Mobile Streaming Applications. Model-Driven Engineering and Software Development (406-429). Online publication date: 1-Feb-2019.
https://doi.org/10.1007/978-3-030-11030-7_18
Al-tekreeti M, Abdrabou A, Naik K (2019). An end-user-centric test generation methodology for performance evaluation of mobile networked applications. Software Testing, Verification and Reliability 29:6-7. Online publication date: 7-Oct-2019.
https://doi.org/10.1002/stvr.1713
Nakajima S (2018). Generalized Oracle for Testing Machine Learning Computer Programs. Software Engineering and Formal Methods (174-179). Online publication date: 2-Feb-2018.
https://doi.org/10.1007/978-3-319-74781-1_13
Guderlei R, Just R, Schneckenburger C (2008). Benchmarking Testing Strategies with Tools from Mutation Analysis. Proceedings of the 2008 IEEE International Conference on Software Testing Verification and Validation Workshop (360-364). Online publication date: 9-Apr-2008.
https://dl.acm.org/doi/10.1109/ICSTW.2008.11
Vetterli M, Ligtenberg A (2006). A Discrete Fourier-Cosine Transform Chip. IEEE Journal on Selected Areas in Communications 4:1 (49-61). Online publication date: 1-Sep-2006.
https://dl.acm.org/doi/10.1109/JSAC.1986.1146289
