DOI: 10.1145/2568225.2568232

Performance regression testing target prioritization via performance risk analysis

Published: 31 May 2014

Abstract

As software evolves, problematic changes can significantly degrade software performance, i.e., introduce performance regressions. Performance regression testing is an effective way to reveal such issues at an early stage. Yet because of its high overhead, this activity is usually performed infrequently. Consequently, when a performance regression is spotted at a certain point, multiple commits may have been merged since the last test run, and developers have to spend extra time and effort narrowing down which commit caused the problem. Existing efforts try to improve performance regression testing efficiency through test case reduction or prioritization.
In this paper, we propose a new lightweight, white-box approach, performance risk analysis (PRA), to improve performance regression testing efficiency via testing target prioritization. The analysis statically evaluates a given source code commit's risk of introducing performance regression. Performance regression testing can leverage the analysis result to test high-risk commits first while delaying or skipping testing on low-risk commits.
To validate the feasibility of this idea, we conduct a study on 100 real-world performance regression issues from three widely used open-source software projects. Guided by insights from the study, we design PRA and build a tool, PerfScope. Evaluation on the examined problematic commits shows that our tool can successfully flag 91% of them. Moreover, on 600 randomly picked new commits from six large-scale software projects, developers using our tool need to test only 14-22% of the 600 commits and are still alerted to 87-95% of the commits with performance regressions.
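
To give a concrete, if simplified, picture of the prioritization step described above, the Python sketch below shows one way a regression-testing pipeline could consume PRA-style risk scores. It is a minimal illustration under assumed inputs, not the PerfScope implementation; the Commit structure, the example risk values, and the 0.7/0.2 thresholds are hypothetical.

    # Minimal sketch (not PerfScope) of consuming per-commit risk scores to
    # decide which commits get performance-tested first. The risk values are
    # assumed to come from a PRA-style static analysis; higher means riskier.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class Commit:
        sha: str
        risk: float  # hypothetical risk score in [0, 1] from static analysis

    def prioritize(commits: List[Commit],
                   high: float = 0.7,   # hypothetical "test now" threshold
                   low: float = 0.2     # hypothetical "skip" threshold
                   ) -> Tuple[List[Commit], List[Commit], List[Commit]]:
        """Bucket commits into test-now, test-later, and skip groups by risk."""
        ranked = sorted(commits, key=lambda c: c.risk, reverse=True)
        test_now = [c for c in ranked if c.risk >= high]
        test_later = [c for c in ranked if low <= c.risk < high]
        skipped = [c for c in ranked if c.risk < low]
        return test_now, test_later, skipped

    if __name__ == "__main__":
        history = [Commit("a1b2c3d", 0.85), Commit("e4f5a6b", 0.35),
                   Commit("c7d8e9f", 0.05)]
        now, later, skipped = prioritize(history)
        print("test now:  ", [c.sha for c in now])
        print("test later:", [c.sha for c in later])
        print("skip:      ", [c.sha for c in skipped])
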



Information

Published In

ICSE 2014: Proceedings of the 36th International Conference on Software Engineering
May 2014
1139 pages
ISBN:9781450327565
DOI:10.1145/2568225
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

  • TCSE: IEEE Computer Society's Tech. Council on Software Engin.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 May 2014


Author Tags

  1. Performance regression
  2. cost modeling
  3. performance risk analysis

Qualifiers

  • Research-article

Conference

ICSE '14

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%



Cited By

  • (2024) Evaluating Search-Based Software Microbenchmark Prioritization. IEEE Transactions on Software Engineering 50:7, 1687-1703. https://doi.org/10.1109/TSE.2024.3380836
  • (2024) Behind the Intent of Extract Method Refactoring: A Systematic Literature Review. IEEE Transactions on Software Engineering 50:4, 668-694. https://doi.org/10.1109/TSE.2023.3345800
  • (2023) PERSEUS. Proceedings of the 21st USENIX Conference on File and Storage Technologies, 49-63. https://doi.org/10.5555/3585938.3585942
  • (2023) From Missteps to Milestones: A Journey to Practical Fail-Slow Detection. ACM Transactions on Storage 19:4, 1-28. https://doi.org/10.1145/3617690
  • (2023) Performance Bug Analysis and Detection for Distributed Storage and Computing Systems. ACM Transactions on Storage 19:3, 1-33. https://doi.org/10.1145/3580281
  • (2023) PerfoRT: A Tool for Software Performance Regression. Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, 119-120. https://doi.org/10.1145/3578245.3584928
  • (2023) Automated Generation and Evaluation of JMH Microbenchmark Suites From Unit Tests. IEEE Transactions on Software Engineering 49:4, 1704-1725. https://doi.org/10.1109/TSE.2022.3188005
  • (2023) Demystifying Performance Regressions in String Solvers. IEEE Transactions on Software Engineering 49:3, 947-961. https://doi.org/10.1109/TSE.2022.3168373
  • (2023) Faster or Slower? Performance Mystery of Python Idioms Unveiled with Empirical Evidence. Proceedings of the 45th International Conference on Software Engineering, 1495-1507. https://doi.org/10.1109/ICSE48619.2023.00130
  • (2023) Understanding Software Performance Challenges: An Empirical Study on Stack Overflow. 2023 International Conference on Code Quality (ICCQ), 1-15. https://doi.org/10.1109/ICCQ57276.2023.10114662
