
S3NAS: Fast Hardware-Aware Neural Architecture Search Methodology

Published: 01 November 2022

Abstract

Recently, automated neural architecture search (NAS) has emerged as the default technique for finding a state-of-the-art (SOTA) convolutional neural network (CNN) architecture with higher accuracy than manually designed architectures for image classification. In this article, we present a fast hardware-aware NAS methodology, called S3NAS, reflecting the latest research results. It consists of three steps: 1) supernet design; 2) Single-Path NAS for fast architecture exploration; and 3) scaling and post-processing. In the first step, we design a supernet, a superset of candidate networks, with two features: stages may contain different numbers of blocks, and blocks may contain parallel layers of different kernel sizes (MixConv). Next, we perform a differentiable search by extending the Single-Path NAS technique to support the MixConv layer and by adding a latency-aware loss term that reduces the hyperparameter search overhead. Finally, we use compound scaling to scale up the network maximally within the latency constraint. In the post-processing step, we additionally insert squeeze-and-excitation (SE) blocks and h-swish activation functions where beneficial. Experiments on four different hardware platforms demonstrate the effectiveness of the proposed methodology: it finds networks with a better latency–accuracy tradeoff than SOTA networks, and the network search completes within 4 h on a TPUv3.
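
To make the abstract's terms concrete, the sketch below gives a minimal, hedged PyTorch illustration (not the authors' code) of four ingredients the methodology names: a MixConv layer with parallel depthwise kernels of different sizes, the h-swish activation, a latency-aware loss term, and compound scaling. The kernel sizes, channel split, exponent w, and scaling coefficients are illustrative assumptions; the exact formulations appear in the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MixConv(nn.Module):
        # Depthwise convolutions with mixed kernel sizes applied to disjoint
        # channel groups, then concatenated (Tan and Le, BMVC 2019). The
        # 3/5/7 kernels and the near-even channel split are illustrative.
        def __init__(self, channels, kernel_sizes=(3, 5, 7)):
            super().__init__()
            n = len(kernel_sizes)
            self.splits = [channels // n] * n
            self.splits[0] += channels - sum(self.splits)  # absorb remainder
            self.convs = nn.ModuleList(
                nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False)
                for c, k in zip(self.splits, kernel_sizes)
            )

        def forward(self, x):
            chunks = torch.split(x, self.splits, dim=1)
            return torch.cat(
                [conv(c) for conv, c in zip(self.convs, chunks)], dim=1
            )

    def h_swish(x):
        # h-swish(x) = x * ReLU6(x + 3) / 6 (Howard et al., ICCV 2019),
        # the activation the post-processing step may add when beneficial.
        return x * F.relu6(x + 3.0) / 6.0

    def latency_aware_loss(ce_loss, est_latency, target_latency, w=0.06):
        # Illustrative single-objective loss folding an estimated-latency
        # penalty into the task loss; w = 0.06 is a hypothetical default,
        # and the exact latency-aware term used by S3NAS is in the paper.
        return ce_loss * (est_latency / target_latency) ** w

    def compound_scale(depth, width, resolution, phi,
                       alpha=1.2, beta=1.1, gamma=1.15):
        # EfficientNet-style compound scaling (Tan and Le, ICML 2019):
        # grow depth, width, and resolution jointly by alpha, beta, gamma
        # raised to a shared exponent phi (coefficients are EfficientNet's).
        return (round(depth * alpha ** phi),
                round(width * beta ** phi),
                round(resolution * gamma ** phi))

In a differentiable search such as Single-Path NAS, est_latency would come from a per-layer latency model of the target hardware, so the latency penalty remains end-to-end trainable alongside the cross-entropy loss.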


Cited By

  • Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision. ACM Transactions on Embedded Computing Systems, 24(1):1–100, Oct. 2024. https://doi.org/10.1145/3701728
  • Software Optimization and Design Methodology for Low Power Computer Vision Systems. ACM Transactions on Embedded Computing Systems, 24(1):1–31, Aug. 2024. https://doi.org/10.1145/3687310
  • CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads. ACM Transactions on Embedded Computing Systems, 23(4):1–32, Jun. 2024. https://doi.org/10.1145/3665868
  • M²NAS: Joint Neural Architecture Optimization System With Network Transmission. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(8):2631–2642, Aug. 2023. https://doi.org/10.1109/TCAD.2022.3223852
  • Eight years of AutoML: categorisation, review and trends. Knowledge and Information Systems, 65(12):5097–5149, Dec. 2023. https://doi.org/10.1007/s10115-023-01935-1

Published In

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 41, Issue 11, Nov. 2022, 1569 pages.

Publisher

IEEE Press