DOI:10.1145/3507548.3507588

Use Machine Learning to Predict the Running Time of the Program

Published: 09 March 2022 Publication History

Abstract

Predicting program running time can improve the scheduling performance of distributed systems. In 2011, Google released a data set that records a large amount of information about a Google cluster. However, most existing running-time prediction models consider only coarse-grained characteristics of the running environment and ignore the influence of the environment's time series data on the prediction results. To address this, this paper proposes a model that predicts a program's future running time from historical information. We also propose a new data processing and feature extraction scheme for the Google cluster data set. The results show that our model greatly outperforms the classical model on the Google cluster data set: the root-mean-square error of the predicted running time under the different prediction modes is reduced by more than 60% and 40%, respectively. We hope that the model proposed in this paper can provide new research ideas for cloud computing system design.
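As a rough illustration of the evaluation metric named in the abstract, the sketch below computes the root-mean-square error (RMSE) of two sets of predictions and the relative reduction between them. The running times, the predictions, and the helper names `rmse` and `relative_reduction` are hypothetical and are not taken from the paper; the abstract does not describe how the authors computed their figures beyond naming the metric.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error is lower than baseline_error."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical running times in seconds (illustrative only, not paper data).
actual = [10.0, 20.0, 30.0, 40.0]
baseline_pred = [15.0, 15.0, 35.0, 35.0]  # every prediction is off by 5 s
model_pred = [12.0, 18.0, 32.0, 38.0]     # every prediction is off by 2 s

baseline_rmse = rmse(actual, baseline_pred)
model_rmse = rmse(actual, model_pred)
reduction = relative_reduction(baseline_rmse, model_rmse)

print(f"baseline RMSE: {baseline_rmse:.2f} s")
print(f"model RMSE:    {model_rmse:.2f} s")
print(f"reduction:     {reduction:.0%}")
```

In this toy case the RMSE falls from 5 s to 2 s, a 60% reduction, which is how a statement such as "reduced by more than 60%" would typically be read.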

Published In

ACM Other conferences
CSAI '21: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence
December 2021
437 pages
ISBN:9781450384155
DOI:10.1145/3507548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cloud computing
  2. Machine learning
  3. Runtime prediction
  4. Time series

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Youth Program of National Natural Science Foundation of China
  • Science and Technology Commission of Shanghai Municipality Grant

Conference

CSAI 2021
