Research article · Open access

The Art of Getting Deep Neural Networks in Shape

Published: 08 January 2019
    Abstract

    Training a deep neural network (DNN) involves selecting a set of hyperparameters that define the network topology and influence the accuracy of the resulting network. Often, the goal is to maximize prediction accuracy on a given dataset. However, non-functional requirements of the trained network, such as inference speed, size, and energy consumption, can be just as important. In this article, we aim to automate the process of selecting an appropriate DNN topology that fulfills both the functional and the non-functional requirements of the application. Specifically, we focus on tuning two important hyperparameters, depth and width, which together define the shape of the resulting network and directly affect its accuracy, speed, size, and energy consumption. To reduce the time needed to search the design space, we train only a fraction of the candidate DNNs and build a model to predict the performance of the remaining ones. We are able to produce tuned ResNets that are up to 4.22 times faster than the original depth-scaled ResNets on a batch of 128 images while matching their accuracy.
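
    The search strategy described above lends itself to a short illustration. The following sketch is a minimal, hypothetical rendering of the idea, not the article's implementation: the synthetic train_and_measure() helper, the random-forest surrogate, and the 0.93 accuracy target are all assumptions made for this example. It trains only a sampled fraction of the (depth, width) design space and uses surrogate models to predict the metrics of the remaining configurations.

        # Toy illustration of sampled search over network "shape":
        # train a fraction of the (depth, width) candidates, fit surrogate
        # models on the measured results, and predict the rest of the space.
        import itertools
        import math
        import random

        from sklearn.ensemble import RandomForestRegressor

        depths = [8, 14, 20, 32, 44, 56]  # number of layers
        widths = [16, 32, 64, 128]        # channels in the first stage
        space = list(itertools.product(depths, widths))

        def train_and_measure(depth, width):
            # Synthetic stand-in returning (accuracy, latency); a real run
            # would train the network and measure it on the target device.
            accuracy = 0.90 + 0.05 * math.log(depth * width) / math.log(56 * 128)
            latency = 1e-4 * depth * width
            return accuracy, latency

        # Train only a quarter of the candidate shapes...
        sampled = random.sample(space, k=len(space) // 4)
        X = [[d, w] for d, w in sampled]
        acc, lat = zip(*(train_and_measure(d, w) for d, w in sampled))

        # ...and fit one surrogate model per metric of interest.
        acc_model = RandomForestRegressor(random_state=0).fit(X, acc)
        lat_model = RandomForestRegressor(random_state=0).fit(X, lat)

        # Predict the remaining configurations without training them.
        rest = [[d, w] for d, w in space if (d, w) not in set(sampled)]
        pred_acc = acc_model.predict(rest)
        pred_lat = lat_model.predict(rest)

        # Keep the predicted-fastest shape that meets an accuracy target.
        feasible = [(t, s) for s, a, t in zip(rest, pred_acc, pred_lat) if a >= 0.93]
        best_shape = min(feasible)[1] if feasible else None
        print("chosen (depth, width):", best_shape)

    The same pattern extends to the other non-functional metrics the abstract mentions (model size and energy consumption) by fitting one surrogate per metric and filtering the predicted design space accordingly.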




    Published In

    ACM Transactions on Architecture and Code Optimization, Volume 15, Issue 4
    December 2018
    706 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/3284745

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 January 2019
    Accepted: 01 October 2018
    Revised: 01 September 2018
    Received: 01 May 2018
    Published in TACO Volume 15, Issue 4


    Author Tags

    1. Deep neural networks
    2. computer vision
    3. parallel processing

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Hessian LOEWE initiative within the Software-Factory 4.0 project
    • Graduate School of Excellence Computational Engineering

    Article Metrics

    • Downloads (last 12 months): 203
    • Downloads (last 6 weeks): 26
    Reflects downloads up to 26 Jul 2024.

    Cited By
    • (2023) Analysis and Interpretation of Deep Convolutional Features Using Self-organizing Maps. Innovations in Machine and Deep Learning, 213-229. https://doi.org/10.1007/978-3-031-40688-1_10. Online publication date: 29-Sep-2023.
    • (2021) Advancing Design and Runtime Management of AI Applications with AI-SPRINT (Position Paper). 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), 1455-1462. https://doi.org/10.1109/COMPSAC51774.2021.00216. Online publication date: Jul-2021.
    • (2021) A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle Avoidance. Intelligent Systems and Applications, 115-128. https://doi.org/10.1007/978-3-030-82193-7_8. Online publication date: 4-Aug-2021.
    • (2020) Inference and Energy Efficient Design of Deep Neural Networks for Embedded Devices. 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 36-41. https://doi.org/10.1109/ISVLSI49217.2020.00017. Online publication date: Jul-2020.
    • (2020) Deep CNN for Contrast-Enhanced Ultrasound Focal Liver Lesions Diagnosis. 2020 International Symposium on Electronics and Telecommunications (ISETC), 1-4. https://doi.org/10.1109/ISETC50328.2020.9301116. Online publication date: 5-Nov-2020.
    • (2020) Efficient Anomaly Detection in Surveillance Videos Based on Multi Layer Perception Recurrent Neural Network. Microprocessors and Microsystems, 103303. https://doi.org/10.1016/j.micpro.2020.103303. Online publication date: Oct-2020.
    • (2019) Intelligent video surveillance: a review through deep learning techniques for crowd analysis. Journal of Big Data 6:1. https://doi.org/10.1186/s40537-019-0212-5. Online publication date: 6-Jun-2019.
