DOI: 10.1145/3240765.3243479

Hardware-Aware Machine Learning: Modeling and Optimization

Published: 05 November 2018

Abstract

Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern computing system. The increased popularity of DL applications deployed on a wide spectrum of platforms (from mobile devices to datacenters) has resulted in a plethora of design challenges related to the constraints introduced by the hardware itself. "What is the latency or energy cost of an inference made by a Deep Neural Network (DNN)?" "Is it possible to predict this latency or energy consumption before a model is even trained?" "If so, how can machine learners take advantage of such models to design the hardware-optimal DNN for deployment?" From lengthening the battery life of mobile devices to reducing the runtime requirements of DL models executing in the cloud, the answers to these questions have drawn significant attention. One cannot optimize what is not properly modeled. Therefore, it is important to understand the hardware efficiency of DL models at inference time, before the model is even trained. This key observation has motivated the use of predictive models to capture the hardware performance or energy efficiency of ML applications. Furthermore, ML practitioners are currently challenged with the task of designing the DNN model, i.e., of tuning the hyper-parameters of the DNN architecture, while optimizing for both the accuracy of the DL model and its hardware efficiency. Therefore, state-of-the-art methodologies have proposed hardware-aware hyper-parameter optimization techniques. In this paper, we provide a comprehensive assessment of state-of-the-art work and selected results on hardware-aware modeling and optimization for ML applications. We also highlight several open questions that are poised to give rise to novel hardware-aware designs in the next few years, as DL applications continue to significantly impact associated hardware systems and platforms.
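
To make the two ingredients above concrete, the Python sketch below illustrates the workflow in miniature: fit a simple linear per-layer latency predictor from profiled convolution hyper-parameters, then use it to pick hyper-parameters that respect a latency budget. This is not the paper's method; all layer configurations, latency numbers, feature choices, and the toy search space are hypothetical and for illustration only.

```python
# Minimal, self-contained sketch (hypothetical numbers and feature choices):
# (1) fit a per-layer latency predictor from convolution hyper-parameters,
# (2) use it to pick hyper-parameters that meet a latency budget.
import itertools
import numpy as np

# --- 1. Hardware-aware predictive model -----------------------------------
# Each row: [output channels, kernel size, output feature-map width].
# The latency column is made-up "profiling data" for illustration only.
layer_configs = np.array([
    [32,  3, 56],
    [64,  3, 56],
    [64,  5, 28],
    [128, 3, 28],
    [128, 5, 14],
    [256, 3, 14],
], dtype=float)
measured_latency_ms = np.array([0.9, 1.7, 2.1, 2.4, 2.9, 3.8])

def features(cfg):
    """Hand-crafted features: a MAC-count proxy plus the raw hyper-parameters."""
    c_out, k, fmap = cfg
    macs = c_out * (k ** 2) * (fmap ** 2)  # proportional to multiply-accumulate count
    return np.array([1.0, macs, c_out, k, fmap])

X = np.stack([features(c) for c in layer_configs])
coef, *_ = np.linalg.lstsq(X, measured_latency_ms, rcond=None)

def predict_latency_ms(cfg):
    """Predicted latency of one layer under the fitted linear model."""
    return float(features(cfg) @ coef)

# --- 2. Hardware-aware hyper-parameter selection ---------------------------
# Enumerate a toy search space of two-layer networks and keep the widest one
# whose predicted total latency fits the budget; total width stands in for
# the accuracy objective, which would normally come from training/validation.
latency_budget_ms = 6.0
best = None
for widths in itertools.product([32, 64, 128], repeat=2):
    net = [[w, 3, 28] for w in widths]
    total = sum(predict_latency_ms(layer) for layer in net)
    if total <= latency_budget_ms and (best is None or sum(widths) > best[0]):
        best = (sum(widths), widths, total)

if best is None:
    print("no configuration fits the %.1f ms budget" % latency_budget_ms)
else:
    print("selected widths:", best[1], "| predicted latency: %.2f ms" % best[2])
```

In practice, the accuracy objective would come from training or a validation proxy, the latency or energy predictor would be fit on measurements from the target device, and the exhaustive grid would be replaced by Bayesian optimization or neural architecture search over a much larger hyper-parameter space.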


        Information & Contributors

        Information

        Published In

        cover image Guide Proceedings
        2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
        Nov 2018
        939 pages

        Publisher

        IEEE Press

        Publication History

        Published: 05 November 2018

        Permissions

        Request permissions for this article.
