DOI: 10.1145/3370748.3406566

How to cultivate a green decision tree without loss of accuracy?

Published: 10 August 2020

Abstract

The decision tree is the core algorithm of random forest learning, which has been widely applied to classification and regression problems in machine learning. To avoid underfitting, a decision tree algorithm stops growing its tree model only when the model is a fully grown tree. However, a fully grown tree causes overfitting, which reduces the accuracy of the decision tree. To resolve this dilemma, post-pruning strategies have been proposed to reduce the model complexity of the fully grown decision tree. Nevertheless, such a process is very energy-inefficient on a non-volatile-memory-based (NVM-based) system because NVM generally has high write costs (i.e., energy consumption and I/O latency): the parts of the fully grown tree that are later pruned are written to memory and then discarded. Such unnecessary data induces high write energy consumption and long I/O latency on NVM-based architectures, especially in low-power-oriented embedded systems. To establish a green decision tree (i.e., a tree model with minimized construction energy consumption), this study rethinks the pruning algorithm and proposes a duo-phase pruning framework, which can significantly decrease energy consumption on NVM-based computing systems without loss of accuracy.
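The grow-then-prune process described in the abstract can be illustrated with a small sketch. This is not the paper's duo-phase pruning framework, which the abstract does not specify. It is a minimal example of conventional post-pruning (reduced-error pruning) on a fully grown tree, written in plain Python. The toy dataset, the `Node` class, and all function names are assumptions made for illustration, and counting discarded nodes is only a rough stand-in for unnecessary NVM writes, not an energy model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple

Sample = Tuple[Tuple[float, ...], int]  # (feature vector, class label)


@dataclass
class Node:
    label: int                     # majority training label at this node
    feature: Optional[int] = None  # split feature index; None means leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.feature is None


def gini(samples: List[Sample]) -> float:
    n = len(samples)
    counts = Counter(y for _, y in samples)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def grow(samples: List[Sample]) -> Node:
    """Grow a full tree: keep splitting until every leaf is pure."""
    node = Node(label=Counter(y for _, y in samples).most_common(1)[0][0])
    if gini(samples) == 0.0:
        return node
    best = None  # (weighted impurity, feature, threshold, left, right)
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [s for s in samples if s[0][f] <= t]
            right = [s for s in samples if s[0][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical feature vectors with mixed labels
        return node
    _, node.feature, node.threshold, left, right = best
    node.left, node.right = grow(left), grow(right)
    return node


def predict(node: Node, x: Tuple[float, ...]) -> int:
    while not node.is_leaf:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def errors(node: Node, samples: List[Sample]) -> int:
    return sum(predict(node, x) != y for x, y in samples)


def count_nodes(node: Optional[Node]) -> int:
    return 0 if node is None else 1 + count_nodes(node.left) + count_nodes(node.right)


def prune(node: Node, validation: List[Sample]) -> None:
    """Reduced-error pruning, bottom-up: collapse a subtree into a leaf
    when doing so does not increase the error on the validation samples."""
    if node.is_leaf:
        return
    prune(node.left, [s for s in validation if s[0][node.feature] <= node.threshold])
    prune(node.right, [s for s in validation if s[0][node.feature] > node.threshold])
    leaf_errors = sum(y != node.label for _, y in validation)
    if leaf_errors <= errors(node, validation):
        node.feature, node.left, node.right = None, None, None


# Toy 1-D data (hypothetical): class 0 below 5, class 1 above, one noisy label at x=3.
train = [((1.0,), 0), ((2.0,), 0), ((3.0,), 1), ((4.0,), 0),
         ((6.0,), 1), ((7.0,), 1), ((8.0,), 1), ((9.0,), 1)]
validation = [((1.5,), 0), ((2.5,), 0), ((3.5,), 0), ((6.5,), 1), ((8.5,), 1)]

tree = grow(train)
full_size = count_nodes(tree)          # nodes written while growing the full tree
errors_before = errors(tree, validation)
prune(tree, validation)
kept = count_nodes(tree)               # nodes that survive post-pruning
print(f"grown: {full_size}, kept: {kept}, written then discarded: {full_size - kept}")
print(f"validation errors before/after pruning: {errors_before}/{errors(tree, validation)}")
```

In this sketch every node of the fully grown tree is constructed before pruning decides which ones to keep, so the nodes that fit the noisy sample are built and then thrown away. On a memory with expensive writes, that discarded work is the overhead the abstract refers to.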

Supplementary Material

MP4 File (3370748.3406566.mp4)
This is the presentation video of paper 13 at ISLPED 2020.



Published In

ISLPED '20: Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design
August 2020
263 pages
ISBN: 9781450370530
DOI: 10.1145/3370748

Sponsors

In-Cooperation

  • IEEE CAS

Publisher

Association for Computing Machinery

New York, NY, United States


Badges

  • Best Paper

Author Tags

  1. decision tree
  2. multi-write NVRAM
  3. pruning strategy

Qualifiers

  • Research-article

Conference

ISLPED '20

Acceptance Rates

Overall Acceptance Rate 398 of 1,159 submissions, 34%
