
S3NAS: Fast Hardware-Aware Neural Architecture Search Methodology

Published: 01 November 2022


Recently, automated neural architecture search (NAS) has emerged as the default technique for finding a state-of-the-art (SOTA) convolutional neural network (CNN) architecture with higher accuracy than manually designed architectures for image classification. In this article, we present a fast hardware-aware NAS methodology, called S3NAS, reflecting the latest research results. It consists of three steps: 1) supernet design; 2) Single-Path NAS for fast architecture exploration; and 3) scaling and post-processing. In the first step, we design a supernet, a superset of candidate networks, with two features: stages may have different numbers of blocks, and blocks may have parallel layers of different kernel sizes (MixConv). Next, we perform a differentiable search by extending the Single-Path NAS technique to support the MixConv layer and by adding a latency-aware loss term to reduce the hyperparameter search overhead. Finally, we use compound scaling to scale up the network as far as the latency constraint allows. In addition, we add squeeze-and-excitation (SE) blocks and h-swish activation functions in the post-processing step where they are beneficial. Experiments on four different hardware platforms demonstrate the effectiveness of the proposed methodology. It finds networks with a better latency–accuracy tradeoff than SOTA networks, and the network search can be done within 4 h using TPUv3.
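
To make the third step more concrete, the following Python sketch illustrates two ingredients named in the abstract: the h-swish activation and compound scaling under a latency limit. It is only an illustration under stated assumptions, not the authors' implementation. The h-swish formula is the standard one from the MobileNetV3 work; the scaling coefficients (1.2, 1.1, 1.15) are the values reported for EfficientNet; the cost model `toy_latency_ms` and the grid search `largest_phi_within` are hypothetical stand-ins, since the abstract does not say how latency is estimated or how the scaling factor is chosen.

```python
from typing import Optional, Tuple


def relu6(x: float) -> float:
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard swish: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def compound_scale(phi: float, alpha: float = 1.2, beta: float = 1.1,
                   gamma: float = 1.15) -> Tuple[float, float, float]:
    """Return (depth, width, resolution) multipliers for scaling factor phi.

    The default coefficients are the EfficientNet values and are an
    assumption here; the paper may use different ones.
    """
    return alpha ** phi, beta ** phi, gamma ** phi


def toy_latency_ms(base_ms: float, depth: float, width: float,
                   resolution: float) -> float:
    """Hypothetical cost model: latency grows like FLOPs, that is,
    linearly in depth and quadratically in width and resolution."""
    return base_ms * depth * width ** 2 * resolution ** 2


def largest_phi_within(base_ms: float, limit_ms: float, step: float = 0.1,
                       max_phi: float = 10.0) -> Optional[float]:
    """Return the largest phi on a grid whose toy latency stays within
    limit_ms, or None if even the unscaled network is too slow."""
    best = None
    k = 0
    while k * step <= max_phi:
        phi = k * step
        depth, width, resolution = compound_scale(phi)
        if toy_latency_ms(base_ms, depth, width, resolution) > limit_ms:
            break  # latency is monotone in phi, so no larger phi can fit
        best = phi
        k += 1
    return best


if __name__ == "__main__":
    # A 10 ms baseline with a 20 ms budget leaves room for about one
    # doubling of cost under the toy model.
    print(largest_phi_within(base_ms=10.0, limit_ms=20.0))
    print(h_swish(1.0))
```

On a real platform, the toy cost model would be replaced by measured or predicted latency for the target hardware; the search loop itself would stay the same as long as latency increases with the scaling factor.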


Published In

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 41, Issue 11
Nov. 2022
1569 pages

IEEE Press

  • Research-article


Cited By

  • (2024) Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision. ACM Transactions on Embedded Computing Systems, 24:1 (1-100). DOI: 10.1145/3701728. Online publication date: 24-Oct-2024
  • (2024) Software Optimization and Design Methodology for Low Power Computer Vision Systems. ACM Transactions on Embedded Computing Systems, 24:1 (1-31). DOI: 10.1145/3687310. Online publication date: 7-Aug-2024
  • (2024) CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads. ACM Transactions on Embedded Computing Systems, 23:4 (1-32). DOI: 10.1145/3665868. Online publication date: 29-Jun-2024
  • (2023) M²NAS: Joint Neural Architecture Optimization System With Network Transmission. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:8 (2631-2642). DOI: 10.1109/TCAD.2022.3223852. Online publication date: 1-Aug-2023
  • (2023) Eight years of AutoML: categorisation, review and trends. Knowledge and Information Systems, 65:12 (5097-5149). DOI: 10.1007/s10115-023-01935-1. Online publication date: 1-Dec-2023
