Identification of Energy Hotspots: A Case Study of the Very Fast Decision Tree

  • Conference paper
Green, Pervasive, and Cloud Computing (GPC 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10232)

Abstract

Large-scale data centers account for a significant share of the energy consumption in many countries. Machine learning workloads are computationally intensive and therefore drive demand for power and cooling capacity in data centers, so it is time to explore green machine learning. The aim of this paper is to profile a machine learning algorithm with respect to its energy consumption and to determine the causes of that consumption. The algorithm studied is the Very Fast Decision Tree (VFDT), the first scalable machine learning algorithm able to handle large volumes of streaming data, which produces results competitive with those of algorithms that analyze static datasets. Our objectives are to: (i) establish a methodology that profiles the energy consumption of decision trees at the function level, (ii) apply this methodology in an experiment to obtain the energy consumption of the VFDT, (iii) conduct a fine-grained analysis of the functions that consume most of the energy, providing an understanding of that consumption, and (iv) analyze how different parameter settings can significantly reduce energy consumption. The results show that, by addressing the most energy-intensive part of the VFDT, energy consumption can be reduced by up to 74.3%.
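For readers unfamiliar with the VFDT, the following minimal Python sketch illustrates the Hoeffding-bound split test on which the algorithm is built, and why parameters such as the confidence level and the number of observed examples influence how often the tree does work at a leaf. It is an illustration under stated assumptions, not the code profiled in the paper: the function names `hoeffding_bound` and `should_split`, and all numeric values, are hypothetical and chosen only for this example.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2 * n)).

    value_range (R): range of the split metric (assumed 1.0 in the demo below).
    delta: allowed probability of choosing the wrong attribute.
    n: number of examples observed at the leaf.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the gap between the two best attributes exceeds epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for n in (200, 800, 3200):
        eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=n)
        print(f"n={n:5d}  epsilon={eps:.4f}")
    print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=200))
```

The bound shrinks as more examples arrive at a leaf, so a looser `delta` or a larger batch between checks changes how many split evaluations are performed; this is one plausible route by which parameter settings affect the cost of the algorithm.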

Notes

  1. http://www.theverge.com/2016/7/21/12246258/google-deepmind-ai-data-center-cooling

  2. https://team.inria.fr/spirals/

Acknowledgments

This work is part of the research project “Scalable resource-efficient systems for big data analytics” funded by the Knowledge Foundation (grant: 20140032) in Sweden.

Author information

Corresponding author

Correspondence to Eva Garcia-Martin.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Garcia-Martin, E., Lavesson, N., Grahn, H. (2017). Identification of Energy Hotspots: A Case Study of the Very Fast Decision Tree. In: Au, M., Castiglione, A., Choo, KK., Palmieri, F., Li, KC. (eds) Green, Pervasive, and Cloud Computing. GPC 2017. Lecture Notes in Computer Science, vol 10232. Springer, Cham. https://doi.org/10.1007/978-3-319-57186-7_21

  • DOI: https://doi.org/10.1007/978-3-319-57186-7_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57185-0

  • Online ISBN: 978-3-319-57186-7

  • eBook Packages: Computer Science (R0)
