DOI: 10.1145/2610384.2610393

Performance regression testing of concurrent classes

Published: 21 July 2014

Abstract

Developers of thread-safe classes struggle with two opposing goals. The class must be correct, which requires synchronizing concurrent accesses, and the class should provide reasonable performance, which is difficult to realize in the presence of unnecessary synchronization. Validating the performance of a thread-safe class is challenging because it requires diverse workloads that use the class, because existing performance analysis techniques focus on individual bottleneck methods, and because reliably measuring the performance of concurrent executions is difficult. This paper presents SpeedGun, an automatic performance regression testing technique for thread-safe classes. The key idea is to generate multi-threaded performance tests and to compare two versions of a class with each other. The analysis notifies developers when changing a thread-safe class significantly influences the performance of clients of this class. An evaluation with 113 pairs of classes from popular Java projects shows that the analysis effectively identifies 13 performance differences, including performance regressions that the respective developers were not aware of.
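
To make the key idea concrete, the following is a minimal Java sketch of what a generated multi-threaded performance test and its measurement harness might look like. It is not the authors' SpeedGun implementation: the Counter interface and its two versions (SynchronizedCounter with coarse-grained locking, AtomicCounter with a lock-free update) are hypothetical stand-ins for the old and new revision of a thread-safe class, and the statistics are reduced to a simple mean over repeated runs.

import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// A minimal sketch of a generated multi-threaded performance test; NOT the authors'
// SpeedGun implementation. Counter, SynchronizedCounter, and AtomicCounter are
// hypothetical stand-ins for the old and new revision of a thread-safe class.
public class ConcurrentPerfTestSketch {

    interface Counter {
        void increment();
        long get();
    }

    // Hypothetical "old" version: coarse-grained locking.
    static class SynchronizedCounter implements Counter {
        private long value;
        public synchronized void increment() { value++; }
        public synchronized long get() { return value; }
    }

    // Hypothetical "new" version: lock-free update.
    static class AtomicCounter implements Counter {
        private final AtomicLong value = new AtomicLong();
        public void increment() { value.incrementAndGet(); }
        public long get() { return value.get(); }
    }

    // One generated performance test: several threads concurrently call methods of a
    // shared instance; the harness measures wall-clock time from a common start
    // barrier until all workers have finished.
    static long runOnceNanos(Supplier<Counter> version, int threads, int opsPerThread)
            throws Exception {
        Counter shared = version.get();
        CyclicBarrier start = new CyclicBarrier(threads + 1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await();                      // start all workers together
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
                for (int i = 0; i < opsPerThread; i++) {
                    shared.increment();
                    shared.get();
                }
            });
            workers[t].start();
        }
        start.await();                                  // release the workers
        long begin = System.nanoTime();
        for (Thread w : workers) {
            w.join();
        }
        return System.nanoTime() - begin;
    }

    // Repeat the measurement several times to smooth out scheduler and JIT noise,
    // then report the mean execution time of each version.
    public static void main(String[] args) throws Exception {
        int threads = 4, ops = 1_000_000, repetitions = 10;
        for (int i = 0; i < 3; i++) {                   // rough JIT warm-up
            runOnceNanos(SynchronizedCounter::new, threads, ops);
            runOnceNanos(AtomicCounter::new, threads, ops);
        }
        double oldMeanMs = 0, newMeanMs = 0;
        for (int r = 0; r < repetitions; r++) {
            oldMeanMs += runOnceNanos(SynchronizedCounter::new, threads, ops) / 1e6;
            newMeanMs += runOnceNanos(AtomicCounter::new, threads, ops) / 1e6;
        }
        oldMeanMs /= repetitions;
        newMeanMs /= repetitions;
        System.out.printf("old version: %.1f ms, new version: %.1f ms%n", oldMeanMs, newMeanMs);
        // A real analysis would apply a statistical test (e.g., comparing confidence
        // intervals across repetitions) before reporting a regression or speedup.
    }
}

In the same spirit as the paper's approach, the sketch exercises both versions under an identical concurrent workload and repeats the measurement, since single runs of concurrent code are too noisy to compare reliably.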

Published In

ISSTA 2014: Proceedings of the 2014 International Symposium on Software Testing and Analysis
July 2014, 460 pages
ISBN: 9781450326452
DOI: 10.1145/2610384
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Performance measurement
  2. Test generation
  3. Thread safety

Qualifiers

  • Research-article

Conference

ISSTA '14

Acceptance Rates

Overall acceptance rate: 58 of 213 submissions (27%)


Article Metrics

  • Downloads (last 12 months): 24
  • Downloads (last 6 weeks): 1
Reflects downloads up to 25 Jan 2025.

Cited By

  • (2024) Enhancing Performance Bug Prediction Using Performance Code Metrics. Proceedings of the 21st International Conference on Mining Software Repositories, 50-62. DOI: 10.1145/3643991.3644920. Online publication date: 15-Apr-2024
  • (2024) A Platform-Agnostic Framework for Automatically Identifying Performance Issue Reports with Heuristic Linguistic Patterns. IEEE Transactions on Software Engineering, 1-22. DOI: 10.1109/TSE.2024.3390623. Online publication date: 2024
  • (2024) Evaluating Search-Based Software Microbenchmark Prioritization. IEEE Transactions on Software Engineering, 50(7), 1687-1703. DOI: 10.1109/TSE.2024.3380836. Online publication date: 1-Jul-2024
  • (2023) When Database Meets New Storage Devices: Understanding and Exposing Performance Mismatches via Configurations. Proceedings of the VLDB Endowment, 16(7), 1712-1725. DOI: 10.14778/3587136.3587145. Online publication date: 8-May-2023
  • (2023) A Large-Scale Empirical Study of Real-Life Performance Issues in Open Source Projects. IEEE Transactions on Software Engineering, 49(2), 924-946. DOI: 10.1109/TSE.2022.3167628. Online publication date: 1-Feb-2023
  • (2023) Understanding Software Performance Challenges: An Empirical Study on Stack Overflow. 2023 International Conference on Code Quality (ICCQ), 1-15. DOI: 10.1109/ICCQ57276.2023.10114662. Online publication date: 22-Apr-2023
  • (2023) A Systematic Mapping Study of Software Performance Research. Software: Practice and Experience, 53(5), 1249-1270. DOI: 10.1002/spe.3185. Online publication date: 2-Jan-2023
  • (2022) FADATest. Proceedings of the 44th International Conference on Software Engineering, 896-908. DOI: 10.1145/3510003.3510169. Online publication date: 21-May-2022
  • (2022) PerfJIT: Test-Level Just-in-Time Prediction for Performance Regression Introducing Commits. IEEE Transactions on Software Engineering, 48(5), 1529-1544. DOI: 10.1109/TSE.2020.3023955. Online publication date: 1-May-2022
  • (2022) Automated Identification of Performance Changes at Code Level. 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), 916-925. DOI: 10.1109/QRS57517.2022.00096. Online publication date: Dec-2022
