DOI: 10.1145/3578245.3584692
short-paper
Open access

Analysing Static Source Code Features to Determine a Correlation to Steady State Performance in Java Microbenchmarks

Published: 15 April 2023

Abstract

Source code analysis is an important aspect of software development that provides insight into a program's quality, security and performance. There are few methods for consistently predicting or determining when a piece of code will end its warm-up state and proceed to a steady state. In this study, we use the data gathered by the SEALABQualityGroup at the University of L'Aquila and Charles University and extend their research on steady state analysis to determine whether certain source code features could give developers a basis for more informed predictions of when a steady state will occur. We explore whether source code features correlate directly with the time a Java microbenchmark takes to reach a steady state and with its ability to reach one, with the aim of building a machine learning-based approach for steady-state prediction. We found that the correlation between source code features and the probability of reaching a steady state is as high as 10.9% for Pearson's correlation coefficient, whereas the correlation between source code features and the time it takes to reach a steady state is as high as 21.6% for Spearman's correlation coefficient. Our results also show that a K-Nearest Neighbour classifier with features selected using either Spearman's or Kendall's correlation coefficient achieves an accuracy of 78.6%.
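
The abstract does not describe the pipeline in detail, so the following is only a minimal sketch of the general idea: rank features by the strength of their Spearman correlation with a steady-state label, then classify with a K-Nearest Neighbour vote on the selected features. Every name here (`pearson`, `ranks`, `spearman`, `select_features`, `knn_predict`) and the toy data are assumptions made for illustration; they are not the authors' code, features, or results.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant column is scored as uncorrelated
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def select_features(rows, labels, k):
    """Indices of the k columns with the largest |rho| against the labels."""
    n_cols = len(rows[0])
    scores = [abs(spearman([r[c] for r in rows], labels)) for c in range(n_cols)]
    return sorted(range(n_cols), key=lambda c: scores[c], reverse=True)[:k]


def knn_predict(train_rows, train_labels, query, cols, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    def dist(row):
        return math.sqrt(sum((row[c] - query[c]) ** 2 for c in cols))
    nearest = sorted(range(len(train_rows)), key=lambda i: dist(train_rows[i]))[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy data: two made-up static features per benchmark,
# label 1 = "reached a steady state", 0 = "did not".
rows = [[1, 5], [2, 3], [3, 9], [4, 1], [8, 4], [9, 8], [10, 2], [11, 7]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_features(rows, labels, k=1)
prediction = knn_predict(rows, labels, [9.5, 5], selected, k=3)
```

In practice a library such as SciPy or scikit-learn would normally replace these hand-written helpers; the standard-library version is used here only so the sketch stays self-contained.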


    Published In

    ICPE '23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering
    April 2023
    421 pages
    ISBN: 9798400700729
    DOI: 10.1145/3578245

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ANTLR
    2. Java microbenchmark
    3. Kendall's tau
    4. Pearson's r
    5. Spearman's rho
    6. correlation coefficient
    7. correlation study
    8. machine-learning
    9. static source code analysis
    10. steady state

    Qualifiers

    • Short-paper

    Conference

    ICPE '23

    Acceptance Rates

    Overall Acceptance Rate 252 of 851 submissions, 30%
