Etna: An Evaluation Platform for Property-Based Testing (Experience Report)

Authors:

Harrison Goldstein,

Benjamin C. Pierce,

Leonidas Lampropoulos

Proceedings of the ACM on Programming Languages, Volume 7, Issue ICFP

Article No.: 218, Pages 878 - 894

https://doi.org/10.1145/3607860

Published: 31 August 2023

Abstract

Property-based testing is a mainstay of functional programming, boasting a rich literature, an enthusiastic user community, and an abundance of tools — so many, indeed, that new users may have difficulty choosing. Moreover, any given framework may support a variety of strategies for generating test inputs; even experienced users may wonder which are better in a given situation. Sadly, the PBT literature, though long on creativity, is short on rigorous comparisons to help answer such questions.

We present Etna, a platform for empirical evaluation and comparison of PBT techniques. Etna incorporates a number of popular PBT frameworks and testing workloads from the literature, and its extensible architecture makes adding new ones easy, while handling the technical drudgery of performance measurement. To illustrate its benefits, we use Etna to carry out several experiments with popular PBT approaches in both Coq and Haskell, allowing users to more clearly understand best practices and tradeoffs.


Published In

Proceedings of the ACM on Programming Languages, Volume 7, Issue ICFP, August 2023, 981 pages

EISSN: 2475-1421

Issue DOI: 10.1145/3554311

Editor: Michael Hicks, Amazon, USA

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation
