
Coverage Prediction for Accelerating Compiler Testing

Published: 01 February 2021

Abstract

Compilers are among the most fundamental software systems, and because of their crucial role they must be thoroughly tested. Automated compiler testing techniques, such as those based on randomly generated programs, tend to run a large number of test programs (the test inputs of compilers). Compiling and executing these test programs is costly, so such techniques can take a long time to detect a relatively small number of compiler bugs. This causes practical problems: substantial time and financial costs, and delays to the development/release cycle. Recently, several approaches have been proposed to accelerate compiler testing by executing earlier, according to some criterion, those test programs that are more likely to trigger compiler bugs. However, these approaches ignore an important aspect of compiler testing: different test programs may have similar test capabilities (i.e., they test similar compiler functionalities, or even detect the same compiler bug), which can largely discount acceleration effectiveness if test programs with similar capabilities keep being executed. Test coverage is a useful approximation for distinguishing such programs, but collecting coverage dynamically is infeasible in compiler testing, since most test programs are generated on the fly by automatic test-generation tools such as Csmith. In this paper, we propose the first method to predict test coverage statically for compilers, and then prioritize test programs by clustering them according to the predicted coverage. We call this novel approach to accelerating compiler testing through coverage prediction COP (short for COverage Prediction).
Our evaluation on GCC and LLVM demonstrates that COP significantly accelerates compiler testing, achieving an average speedup of 51.01 percent in test execution time on an existing dataset containing three old release versions of the compilers, and an average speedup of 68.74 percent on a new dataset containing 12 recent release versions. Moreover, COP significantly outperforms the state-of-the-art acceleration approach, improving speedups by 17.16%–82.51% on average across different settings.
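The core idea behind COP's prioritization step, clustering test programs by their predicted coverage vectors and then scheduling across clusters so that programs with similar test capabilities are not executed back to back, can be sketched as follows. This is a minimal illustration under assumed names and a plain k-means helper, not the authors' implementation; the predicted-coverage vectors are taken as given, whereas COP obtains them from a learned static model.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (small helper to avoid external dependencies)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center for an empty cluster.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def prioritize(X, k=2):
    """Order test programs round-robin across coverage clusters, so that
    programs with dissimilar predicted coverage are executed early."""
    labels = kmeans(X, k)
    buckets = [list(np.flatnonzero(labels == c)) for c in range(k)]
    order = []
    while any(buckets):
        # One program from each non-empty cluster per round.
        for b in buckets:
            if b:
                order.append(int(b.pop(0)))
    return order

# Toy predicted-coverage vectors (rows = test programs, cols = compiler modules).
X = np.array([[0.9, 0.1, 0.0],
              [0.8, 0.2, 0.1],
              [0.1, 0.9, 0.8],
              [0.0, 0.8, 0.9]])
print(prioritize(X, k=2))  # a permutation interleaving the two clusters
```

The round-robin scheduling is the part that addresses the "similar test capabilities" problem described above: even if one cluster is large, its members are spread across the execution order rather than run consecutively.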




Published In

IEEE Transactions on Software Engineering, Volume 47, Issue 2
February 2021
211 pages

Publisher

IEEE Press


Qualifiers

  • Research-article


Cited By

  • (2024) Compiler Bug Isolation via Enhanced Test Program Mutation, in Proc. 39th IEEE/ACM Int. Conf. Automated Software Engineering, Oct. 2024, pp. 819–830. doi:10.1145/3691620.3695074
  • (2024) Test Input Prioritization for 3D Point Clouds, ACM Trans. Softw. Eng. Methodol., vol. 33, no. 5, pp. 1–44, Jun. 2024. doi:10.1145/3643676
  • (2024) Test Input Prioritization for Graph Neural Networks, IEEE Trans. Softw. Eng., vol. 50, no. 6, pp. 1396–1424, Apr. 2024. doi:10.1109/TSE.2024.3385538
  • (2024) Test Input Prioritization for Machine Learning Classifiers, IEEE Trans. Softw. Eng., vol. 50, no. 3, pp. 413–442, Mar. 2024. doi:10.1109/TSE.2024.3350019
  • (2023) GraphPrior: Mutation-based Test Input Prioritization for Graph Neural Networks, ACM Trans. Softw. Eng. Methodol., vol. 33, no. 1, pp. 1–40, Jul. 2023. doi:10.1145/3607191
  • (2023) Program Reconditioning: Avoiding Undefined Behaviour When Finding and Reducing Compiler Bugs, Proc. ACM Program. Lang., vol. 7, no. PLDI, pp. 1801–1825, Jun. 2023. doi:10.1145/3591294
  • (2023) Exploring Better Black-Box Test Case Prioritization via Log Analysis, ACM Trans. Softw. Eng. Methodol., vol. 32, no. 3, pp. 1–32, Apr. 2023. doi:10.1145/3569932
  • (2023) Compiler Test-Program Generation via Memoized Configuration Search, in Proc. 45th Int. Conf. Software Engineering, May 2023, pp. 2035–2047. doi:10.1109/ICSE48619.2023.00172
  • (2022) Learning to Construct Better Mutation Faults, in Proc. 37th IEEE/ACM Int. Conf. Automated Software Engineering, Oct. 2022, pp. 1–13. doi:10.1145/3551349.3556949
  • (2022) Detecting Simulink Compiler Bugs via Controllable Zombie Blocks Mutation, in Proc. 30th ACM Joint European Software Engineering Conf. and Symp. Foundations of Software Engineering, Nov. 2022, pp. 1061–1072. doi:10.1145/3540250.3549159
