DOI: 10.1145/2351676.2351697
Article

Puzzle-based automatic testing: bringing humans into the loop by solving puzzles

Published: 03 September 2012

Abstract

Recently, many automatic test generation techniques have been proposed, such as Randoop, Pex, and jCUTE. However, the test coverage achieved by these techniques has typically been only around 50-60%, due to several challenges, such as 1) the object mutation problem, where test generators cannot create or modify test inputs to reach desired object states; and 2) the constraint solving problem, where test generators fail to solve the path conditions needed to cover certain branches. By analyzing branches left uncovered by state-of-the-art techniques, we noticed that these challenges might not be so difficult for humans.
To verify this hypothesis, we propose a Puzzle-based Automatic Testing environment (PAT), which decomposes object mutation and complex constraint solving problems into small puzzles for humans to solve. We generated PAT puzzles for two open-source projects and asked different groups of people to solve them. The puzzles proved to be effectively solvable by humans: 231 out of 400 puzzles were solved, at an average speed of one minute per puzzle. The 231 puzzle solutions helped cover 534 and 308 additional branches (7.0% and 5.8% coverage improvements) in the two open-source projects, on top of the saturated branch coverage achieved by two state-of-the-art test generation techniques.
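
For illustration only (not an example from the paper): the following minimal Java sketch shows the two coverage challenges the abstract describes. The class, field, and method names are invented; the point is that reaching the last branch requires both mutating an object into a specific state (object mutation problem) and satisfying a non-linear path condition (constraint solving problem), which is the kind of condition PAT would hand to a human as a small puzzle.

// Hypothetical illustration (not from the paper) of the two coverage challenges
// described in the abstract. All names are invented for this sketch.
public class Session {

    private boolean authenticated; // reachable only through a specific call sequence
    private int retries;

    public void login(String user, String password) {
        // Object mutation problem: an automated generator must discover that only
        // this particular call sequence moves the object into the authenticated state.
        if ("admin".equals(user) && "s3cret".equals(password)) {
            authenticated = true;
        } else {
            retries++;
        }
    }

    public String audit(int x, int y) {
        // Constraint solving problem: covering the branch below requires an
        // authenticated Session AND inputs satisfying a non-linear path condition,
        // which random generation rarely hits and some solvers handle poorly.
        if (authenticated && x * x + y * y == 625) {
            return "privileged-audit";
        }
        return "denied";
    }
}

A person solving the corresponding puzzle might supply login("admin", "s3cret") followed by audit(7, 24), since 7*7 + 24*24 = 625, thereby covering the branch an automated generator missed.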

References

[1] D. Babic and A. J. Hu. Calysto: scalable and precise extended static checking. In Proc. ICSE, New York, NY, USA, 2008. ACM.
[2] C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In Proc. ISSTA, 2002.
[3] S. Chandra, S. J. Fink, and M. Sridharan. Snugglebug: a powerful approach to weakest preconditions. In Proc. PLDI, 2009.
[4] L. A. Clarke. A system to generate test data and symbolically execute programs. IEEE Trans. Softw. Eng., 2(3):215–222, 1976.
[5] ObjectWeb Consortium. ASM. http://asm.objectweb.org.
[6] S. Cooper, A. Treuille, J. Barbero, A. Leaver-Fay, K. Tuite, F. Khatib, A. C. Snyder, M. Beenen, D. Salesin, D. Baker, and Z. Popović. The challenge of designing scientific discovery games. In Proc. FDG, 2010.
[7] R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst., 13(4):451–490, 1991.
[8] B. Daniel and M. Boshernitsan. Predicting effectiveness of automatic testing tools. In Proc. ASE, pages 363–366, Washington, DC, USA, 2008.
[9] L. M. de Moura and N. Bjørner. Z3: an efficient SMT solver. In Proc. TACAS, pages 337–340, 2008.
[10] R. A. DeMillo and A. J. Offutt. Constraint-based automatic test data generation. IEEE Trans. Softw. Eng., 17(9):900–910, 1991.
[11] W. Dietl, S. Dietzel, M. D. Ernst, N. Mote, B. Walker, S. Cooper, T. Pavlik, and Z. Popović. Verification games: making verification fun. In FTfJP 2012: 14th Workshop on Formal Techniques for Java-like Programs, Beijing, China, June 2012.
[12] E. Dijkstra. A Discipline of Programming. Prentice Hall, Englewood Cliffs, 1976.
[13] B. Dutertre and L. de Moura. System description: Yices 1.0. In Proc. SMT-COMP, 2006.
[14] I. Erete and A. Orso. Optimizing constraint solving to better support symbolic execution. In Proc. CSTVA, pages 310–315, March 2011.
[15] P. Godefroid, N. Klarlund, and K. Sen. DART: directed automated random testing. In Proc. PLDI, pages 213–223, 2005.
[16] A. Gotlieb, B. Botella, and M. Rueher. Automatic test data generation using constraint solving techniques. In Proc. ISSTA, pages 53–62, 1998.
[17] F. Heidenreich, J. Johannes, M. Seifert, and C. Wende. Closing the gap between modelling and Java. In Proc. SLE, volume 5969 of Lecture Notes in Computer Science, pages 374–383. Springer, 2009.
[18] IBM. T. J. Watson Libraries for Analysis (WALA). http://wala.sf.net.
[19] H. Jaygarl, S. Kim, T. Xie, and C. K. Chang. OCAT: object capture-based automated testing. In Proc. ISSTA, July 2010.
[20] C. C. Michael, G. McGraw, and M. A. Schatz. Generating software test data by evolution. IEEE Trans. Softw. Eng., 27:1085–1110, December 2001.
[21] M. G. Nanda and S. Sinha. Accurate interprocedural null-dereference analysis for Java. In Proc. ICSE, pages 133–143, Washington, DC, USA, 2009. IEEE Computer Society.
[22] C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedback-directed random test generation. In Proc. ICSE, pages 75–84, Minneapolis, MN, USA, May 2007.
[23] R. Pandita, T. Xie, N. Tillmann, and J. de Halleux. Guided test generation for coverage criteria. In Proc. ICSM, Sept. 2010.
[24] C.-S. Park and K. Sen. Randomized active atomicity violation detection in concurrent programs. In Proc. FSE, pages 135–145, New York, NY, USA, 2008. ACM.
[25] C. S. Păsăreanu, N. Rungta, and W. Visser. Symbolic execution with mixed concrete-symbolic solving. In Proc. ISSTA, pages 34–44, New York, NY, USA, 2011. ACM.
[26] K. Sen and G. Agha. CUTE and jCUTE: concolic unit testing and explicit path model-checking tools. In Proc. CAV, 2006.
[27] K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In Proc. ESEC/FSE, pages 263–272, 2005.
[28] S. Thummalapenta, T. Xie, N. Tillmann, P. de Halleux, and W. Schulte. MSeqGen: object-oriented unit-test generation via mining source code. In Proc. ESEC/FSE, August 2009.
[29] N. Tillmann and J. de Halleux. Pex: white box test generation for .NET. In Proc. TAP, pages 134–153, 2008.
[30] N. Tillmann, J. de Halleux, and T. Xie. Pex4fun: teaching and learning computer science via social gaming. In Proc. CSEET, 2011.
[31] L. von Ahn and L. Dabbish. Labeling images with a computer game. In Proc. CHI, April 2004.
[32] E. Y. C. Wong, A. T. S. Chan, and H.-V. Leong. Efficient management of XML contents over wireless environment by Xstream. In Proc. SAC, pages 1122–1127, 2004.
[33] X. Xiao, T. Xie, N. Tillmann, and J. de Halleux. Precise identification of problems for structural test generation. In Proc. ICSE, May 2011.
[34] L. Zhang, T. Xie, L. Zhang, N. Tillmann, J. de Halleux, and H. Mei. Test generation via dynamic symbolic execution for mutation testing. In Proc. ICSM, Sept. 2010.


Published In

cover image ACM Conferences
ASE '12: Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
September 2012
409 pages
ISBN:9781450312042
DOI:10.1145/2351676

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Code Coverage
  2. Human Computation
  3. Testing


Conference

ASE'12

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%


