Accelerating Decision Tree Ensemble with Guided Branch Approximation

Authors:

Keisuke Kamahori,

Shinya Takamaeda-Yamazaki

HEART '22: Proceedings of the 12th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies

Pages 24 - 32

https://doi.org/10.1145/3535044.3535048

Published: 09 June 2022

Abstract

Processing lightweight machine learning (ML) algorithms, such as decision tree ensembles (DTEs), on low-power edge devices is beneficial; however, these devices usually have limited resources, and domain-specific accelerators are not readily available. Energy- and resource-efficient acceleration mechanisms are therefore desired for ML workloads on lightweight embedded microcontrollers that have no additional hardware accelerators. When a DTE runs on a conventional in-order pipelined processor, however, branch misprediction penalties can become a performance bottleneck. This study proposes Guided Branch Approximation (GBA), an approximate computing approach that improves DTE performance on lightweight general-purpose processors by selectively ignoring the correctness of branch instructions. GBA speculatively executes selected branch instructions and performs no rollback when they are mispredicted. It lets programmers and high-level ML frameworks annotate approximable branch instructions while ensuring the target application's quality of service (QoS). GBA comprises two components: 1) the approximate branch instruction format, a new type of branch instruction that ignores mispredictions by the branch predictor, and 2) a hardware-based QoS mechanism that dynamically manages the execution of approximable branch instructions to prevent undesirable QoS degradation. We evaluate the proposed idea on an in-order pipelined processor using a software simulator. Experiments show that, in the best-case scenario and with only a slight hardware modification, GBA reduces total execution time by more than 15% while preserving the QoS of the DTE algorithm.
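
To make the trade-off described in the abstract concrete, the following is a minimal toy model, not the paper's implementation. It walks one decision tree with a static "always left" predictor and charges a flush penalty on each mispredicted precise branch, whereas a mispredicted approximable branch simply stays on the predicted path. Everything named here is an assumption for illustration: the `Node` layout, the 3-cycle `MISPREDICT_PENALTY`, the static predictor, and the `budget` parameter, which is only a simple stand-in for a QoS guard and does not represent the hardware mechanism proposed in the paper.

```python
from dataclasses import dataclass
from typing import Optional

MISPREDICT_PENALTY = 3  # assumed flush cost in cycles (hypothetical value)


@dataclass
class Node:
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None    # followed when x[feature] <= threshold
    right: Optional["Node"] = None
    label: Optional[int] = None      # set only on leaves
    approximable: bool = False       # annotation from programmer or framework


def leaf(label):
    return Node(label=label)


def infer(root, x, predict_left=True, budget=0):
    """Walk one tree; return (label, cycles, wrong_paths_committed).

    `budget` is a hypothetical QoS guard: the maximum number of
    mispredicted approximable branches committed without rollback.
    """
    node, cycles, committed = root, 0, 0
    while node.label is None:
        cycles += 1                                  # one cycle per branch
        actual_left = x[node.feature] <= node.threshold
        go_left = actual_left
        if actual_left != predict_left:              # static predictor was wrong
            if node.approximable and committed < budget:
                go_left = predict_left               # keep speculative path, no flush
                committed += 1
            else:
                cycles += MISPREDICT_PENALTY         # precise branch: flush, redirect
        node = node.left if go_left else node.right
    return node.label, cycles, committed


# Toy tree: only the second-level branch is annotated as approximable.
tree = Node(
    feature=0, threshold=0.5,
    left=Node(feature=1, threshold=0.5, left=leaf(0), right=leaf(1),
              approximable=True),
    right=leaf(1),
)

print(infer(tree, [0.2, 0.9], budget=0))  # precise:     (1, 5, 0)
print(infer(tree, [0.2, 0.9], budget=1))  # approximate: (0, 2, 1)
```

In this toy run the approximate walk saves three cycles but returns a different label, which is exactly the kind of error a QoS mechanism has to bound. In an ensemble, a majority vote across trees may absorb some of these errors, although that is not modeled here.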

Published In

HEART '22: Proceedings of the 12th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies

June 2022

114 pages

ISBN: 9781450396608

DOI: 10.1145/3535044

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

JSPS 18H05288
JSPS 19H04075

Conference

HEART2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies

June 9 - 10, 2022

Tsukuba, Japan

Acceptance Rates

HEART '22 Paper Acceptance Rate: 10 of 21 submissions, 48%

Overall Acceptance Rate: 22 of 50 submissions, 44%
