DOI: 10.1145/2568225.2568232

Performance regression testing target prioritization via performance risk analysis

Published: 31 May 2014

Abstract

As software evolves, problematic changes can significantly degrade software performance, i.e., introduce performance regressions. Performance regression testing is an effective way to reveal such issues at an early stage. Yet because of its high overhead, this activity is usually performed infrequently. Consequently, when a performance regression is spotted at a certain point, multiple commits may have been merged since the last test run, and developers have to spend extra time and effort narrowing down which commit caused the problem. Existing efforts try to improve performance regression testing efficiency through test case reduction or prioritization.
In this paper, we propose a new lightweight, white-box approach, performance risk analysis (PRA), to improve performance regression testing efficiency via testing target prioritization. The analysis statically evaluates a given source code commit's risk of introducing performance regression. Performance regression testing can leverage the analysis result to test high-risk commits first while delaying or skipping testing on low-risk commits.
To validate the feasibility of this idea, we conduct a study on 100 real-world performance regression issues from three widely used open-source software projects. Guided by insights from the study, we design PRA and build a tool, PerfScope. Evaluation on the examined problematic commits shows that our tool can successfully flag 91% of them. Moreover, on 600 randomly picked new commits from six large-scale software projects, developers using our tool need to test only 14-22% of the 600 commits and are still alerted to 87-95% of the commits with performance regressions.
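
To give a concrete, if simplified, picture of the prioritization step described above, the Python sketch below shows one way a regression-testing pipeline could consume PRA-style risk scores. It is a minimal illustration under assumed inputs, not the PerfScope implementation; the Commit structure, the example risk values, and the 0.7/0.2 thresholds are hypothetical.

    # Minimal sketch (not PerfScope) of consuming per-commit risk scores to
    # decide which commits get performance-tested first. The risk values are
    # assumed to come from a PRA-style static analysis; higher means riskier.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class Commit:
        sha: str
        risk: float  # hypothetical risk score in [0, 1] from static analysis

    def prioritize(commits: List[Commit],
                   high: float = 0.7,   # hypothetical "test now" threshold
                   low: float = 0.2     # hypothetical "skip" threshold
                   ) -> Tuple[List[Commit], List[Commit], List[Commit]]:
        """Bucket commits into test-now, test-later, and skip groups by risk."""
        ranked = sorted(commits, key=lambda c: c.risk, reverse=True)
        test_now = [c for c in ranked if c.risk >= high]
        test_later = [c for c in ranked if low <= c.risk < high]
        skipped = [c for c in ranked if c.risk < low]
        return test_now, test_later, skipped

    if __name__ == "__main__":
        history = [Commit("a1b2c3d", 0.85), Commit("e4f5a6b", 0.35),
                   Commit("c7d8e9f", 0.05)]
        now, later, skipped = prioritize(history)
        print("test now:  ", [c.sha for c in now])
        print("test later:", [c.sha for c in later])
        print("skip:      ", [c.sha for c in skipped])
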



Information

Published In

ICSE 2014: Proceedings of the 36th International Conference on Software Engineering
May 2014
1139 pages
ISBN:9781450327565
DOI:10.1145/2568225
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

  • TCSE: IEEE Computer Society's Tech. Council on Software Engin.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 May 2014


Author Tags

  1. Performance regression
  2. cost modeling
  3. performance risk analysis

Qualifiers

  • Research-article

Conference

ICSE '14

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%



Cited By

  • (2024) Evaluating Search-Based Software Microbenchmark Prioritization. IEEE Transactions on Software Engineering 50:7, 1687-1703. https://doi.org/10.1109/TSE.2024.3380836
  • (2024) Behind the Intent of Extract Method Refactoring: A Systematic Literature Review. IEEE Transactions on Software Engineering 50:4, 668-694. https://doi.org/10.1109/TSE.2023.3345800
  • (2023) PERSEUS. Proceedings of the 21st USENIX Conference on File and Storage Technologies, 49-63. https://doi.org/10.5555/3585938.3585942
  • (2023) From Missteps to Milestones: A Journey to Practical Fail-Slow Detection. ACM Transactions on Storage 19:4, 1-28. https://doi.org/10.1145/3617690
  • (2023) Performance Bug Analysis and Detection for Distributed Storage and Computing Systems. ACM Transactions on Storage 19:3, 1-33. https://doi.org/10.1145/3580281
  • (2023) PerfoRT: A Tool for Software Performance Regression. Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, 119-120. https://doi.org/10.1145/3578245.3584928
  • (2023) Automated Generation and Evaluation of JMH Microbenchmark Suites From Unit Tests. IEEE Transactions on Software Engineering 49:4, 1704-1725. https://doi.org/10.1109/TSE.2022.3188005
  • (2023) Demystifying Performance Regressions in String Solvers. IEEE Transactions on Software Engineering 49:3, 947-961. https://doi.org/10.1109/TSE.2022.3168373
  • (2023) Faster or Slower? Performance Mystery of Python Idioms Unveiled with Empirical Evidence. Proceedings of the 45th International Conference on Software Engineering, 1495-1507. https://doi.org/10.1109/ICSE48619.2023.00130
  • (2023) Understanding Software Performance Challenges: An Empirical Study on Stack Overflow. 2023 International Conference on Code Quality (ICCQ), 1-15. https://doi.org/10.1109/ICCQ57276.2023.10114662
