
S3NAS: Fast Hardware-Aware Neural Architecture Search Methodology

Published: 01 November 2022

Abstract

Recently, automated neural architecture search (NAS) has emerged as the default technique for finding a state-of-the-art (SOTA) convolutional neural network (CNN) architecture with higher accuracy than manually designed architectures for image classification. In this article, we present a fast hardware-aware NAS methodology, called S3NAS, reflecting the latest research results. It consists of three steps: 1) supernet design; 2) Single-Path NAS for fast architecture exploration; and 3) scaling and post-processing. In the first step, we design a supernet, a superset of candidate networks, with two features: stages may contain different numbers of blocks, and blocks may contain parallel layers of different kernel sizes (MixConv). Next, we perform a differentiable search by extending the Single-Path NAS technique to support the MixConv layer and by adding a latency-aware loss term that reduces the hyperparameter search overhead. Finally, we use compound scaling to scale up the network maximally within the latency constraint. In the post-processing step, we additionally insert squeeze-and-excitation (SE) blocks and h-swish activation functions where beneficial. Experiments on four different hardware platforms demonstrate the effectiveness of the proposed methodology: it finds networks with a better latency–accuracy tradeoff than SOTA networks, and the network search completes within 4 h on a TPUv3.
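
To make the abstract's terms concrete, the sketch below gives a minimal, hedged PyTorch illustration (not the authors' code) of four ingredients the methodology names: a MixConv layer with parallel depthwise kernels of different sizes, the h-swish activation, a latency-aware loss term, and compound scaling. The kernel sizes, channel split, exponent w, and scaling coefficients are illustrative assumptions; the exact formulations appear in the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MixConv(nn.Module):
        # Depthwise convolutions with mixed kernel sizes applied to disjoint
        # channel groups, then concatenated (Tan and Le, BMVC 2019). The
        # 3/5/7 kernels and the near-even channel split are illustrative.
        def __init__(self, channels, kernel_sizes=(3, 5, 7)):
            super().__init__()
            n = len(kernel_sizes)
            self.splits = [channels // n] * n
            self.splits[0] += channels - sum(self.splits)  # absorb remainder
            self.convs = nn.ModuleList(
                nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False)
                for c, k in zip(self.splits, kernel_sizes)
            )

        def forward(self, x):
            chunks = torch.split(x, self.splits, dim=1)
            return torch.cat(
                [conv(c) for conv, c in zip(self.convs, chunks)], dim=1
            )

    def h_swish(x):
        # h-swish(x) = x * ReLU6(x + 3) / 6 (Howard et al., ICCV 2019),
        # the activation the post-processing step may add when beneficial.
        return x * F.relu6(x + 3.0) / 6.0

    def latency_aware_loss(ce_loss, est_latency, target_latency, w=0.06):
        # Illustrative single-objective loss folding an estimated-latency
        # penalty into the task loss; w = 0.06 is a hypothetical default,
        # and the exact latency-aware term used by S3NAS is in the paper.
        return ce_loss * (est_latency / target_latency) ** w

    def compound_scale(depth, width, resolution, phi,
                       alpha=1.2, beta=1.1, gamma=1.15):
        # EfficientNet-style compound scaling (Tan and Le, ICML 2019):
        # grow depth, width, and resolution jointly by alpha, beta, gamma
        # raised to a shared exponent phi (coefficients are EfficientNet's).
        return (round(depth * alpha ** phi),
                round(width * beta ** phi),
                round(resolution * gamma ** phi))

In a differentiable search such as Single-Path NAS, est_latency would come from a per-layer latency model of the target hardware, so the latency penalty remains end-to-end trainable alongside the cross-entropy loss.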


Cited By

  • Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision. ACM Transactions on Embedded Computing Systems, 24(1):1–100, Oct. 2024. https://doi.org/10.1145/3701728
  • Software Optimization and Design Methodology for Low Power Computer Vision Systems. ACM Transactions on Embedded Computing Systems, 24(1):1–31, Aug. 2024. https://doi.org/10.1145/3687310
  • CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads. ACM Transactions on Embedded Computing Systems, 23(4):1–32, Jun. 2024. https://doi.org/10.1145/3665868
  • M²NAS: Joint Neural Architecture Optimization System With Network Transmission. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(8):2631–2642, Aug. 2023. https://doi.org/10.1109/TCAD.2022.3223852
  • Eight years of AutoML: categorisation, review and trends. Knowledge and Information Systems, 65(12):5097–5149, Dec. 2023. https://doi.org/10.1007/s10115-023-01935-1

Published In

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 41, Issue 11, Nov. 2022, 1569 pages.

Publisher

IEEE Press