DOI: 10.1109/DAC18074.2021.9586250

NAAS: Neural Accelerator Architecture Search

Published: 05 December 2021

Abstract

Data-driven, automatic design space exploration of neural accelerator architectures is desirable for specialization and productivity. Previous frameworks focus on sizing the numerical architectural hyper-parameters while neglecting the search over PE connectivity and compiler mappings. To tackle this challenge, we propose Neural Accelerator Architecture Search (NAAS), which holistically searches the neural network architecture, the accelerator architecture, and the compiler mapping in one optimization loop. NAAS composes highly matched architectures together with efficient mappings. As a data-driven approach, NAAS rivals the human-designed Eyeriss, achieving a $4.4\times$ EDP reduction with a 2.7% accuracy improvement on ImageNet under the same computation resources, and offers a $1.4\times$ to $3.5\times$ EDP reduction over sizing the architectural hyper-parameters alone.
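The abstract describes a single optimization loop that jointly evaluates accelerator architecture choices (including PE connectivity) and compiler mappings, scoring each candidate by metrics such as EDP. The sketch below illustrates only the general shape of such a loop; the design spaces, parameter names, and toy cost model are illustrative assumptions, not the NAAS implementation, which in practice pairs a data-driven optimizer with real energy/latency models (e.g., Timeloop/Accelergy, references [26], [27]).

```python
# Minimal sketch of a joint accelerator/mapping search loop.
# All design spaces and the cost model below are hypothetical placeholders.
import random

# Hypothetical candidate spaces: array sizing, dataflow/connectivity, mapping.
PE_ARRAY_SIZES = [(8, 8), (16, 16), (32, 8)]
DATAFLOWS = ["row_stationary", "weight_stationary", "output_stationary"]
LOOP_ORDERS = ["NKCHW", "NCHWK", "KNCHW"]
TILE_SIZES = [16, 32, 64]

def sample_candidate():
    """Draw one joint (accelerator, mapping) candidate."""
    return {
        "pe_array": random.choice(PE_ARRAY_SIZES),
        "dataflow": random.choice(DATAFLOWS),
        "loop_order": random.choice(LOOP_ORDERS),
        "tile": random.choice(TILE_SIZES),
    }

def evaluate_edp(cand):
    """Placeholder energy-delay product estimate.
    A real flow would invoke an accelerator simulator here instead."""
    rows, cols = cand["pe_array"]
    pes = rows * cols
    # Toy model: more PEs cut latency but raise energy; mapping shifts both.
    latency = 1e6 / (pes * (1.2 if cand["dataflow"] == "row_stationary" else 1.0))
    energy = pes * cand["tile"] * (0.9 if cand["loop_order"] == "NKCHW" else 1.0)
    return latency * energy

def search(iterations=200):
    """One optimization loop over architecture and mapping jointly."""
    best, best_edp = None, float("inf")
    for _ in range(iterations):
        cand = sample_candidate()
        edp = evaluate_edp(cand)
        if edp < best_edp:
            best, best_edp = cand, edp
    return best, best_edp

if __name__ == "__main__":
    best, edp = search()
    print(f"best candidate: {best}, estimated EDP: {edp:.1f}")
```

Random sampling is used here only to keep the sketch short; a stronger search strategy (e.g., the evolution strategy of reference [17]) would replace `sample_candidate` while keeping the same evaluate-and-select structure.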

References

[1]
H. Cai, L. Zhu, and S. Han, “ProxylessNAS: Direct neural architecture search on target task and hardware,” in ICLR, 2019.
[2]
Q. Lu et al., “On neural architecture search for resource-constrained hardware platforms,” 2019.
[3]
K. Wang et al., “HAQ: Hardware-aware automated quantization with mixed precision,” in CVPR, 2019, pp. 8612–8620.
[4]
H. Cai et al., “Once for all: Train one network and specialize it for efficient deployment,” in ICLR, 2020.
[5]
T. Chen et al., “Learning to optimize tensor programs,” in NeurIPS, 2018, pp. 3389–3400.
[6]
S.-C. Kao and T. Krishna, “GAMMA: Mapping space exploration via genetic algorithm,” in ICCAD, 2020.
[7]
C. Hao et al., “FPGA/DNN co-design: An efficient design methodology for IoT intelligence on the edge,” in DAC, 2019, pp. 1–6.
[8]
Y. Li et al., “EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions,” in DAC, 2020.
[9]
X. Zhang et al., “SkyNet: A hardware-efficient method for object detection and tracking on embedded systems,” 2020.
[10]
S.-C. Kao et al., “ConfuciuX: Autonomous hardware resource assignment for DNN accelerators using reinforcement learning,” in MICRO, 2020.
[11]
L. Yang et al., “Co-exploration of neural architectures and heterogeneous ASIC accelerator designs targeting multiple tasks,” 2020.
[12]
Y. Lin et al., “Neural hardware architecture search,” in NeurIPS Workshop, 2019.
[13]
W. Jiang et al., “Hardware/software co-exploration of neural architectures,” TCAD, 2020.
[14]
Y.-H. Chen et al., “Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks,” JSSC, vol. 52, no. 1, 2016.
[15]
NVIDIA, “NVDLA deep learning accelerator,” 2017. [Online]. Available: http://nvdla.org
[16]
J. Albericio et al., “Cnvlutin: Ineffectual-neuron-free deep neural network computing,” ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 1–13, 2016.
[17]
N. Hansen, “The CMA evolution strategy: a comparing review,” in Towards a new evolutionary computation. Springer, 2006, pp. 75–102.
[18]
Z. Du et al., “ShiDianNao: Shifting vision processing closer to the sensor,” in ISCA, 2015, pp. 92–104.
[19]
E. Strubell et al., “Energy and policy considerations for deep learning in NLP,” in ACL, 2019, pp. 3645–3650.
[20]
M. Motamedi et al., “Design space exploration of FPGA-based deep convolutional neural networks,” in ASP-DAC, 2016, pp. 575–580.
[21]
G. Zhong et al., “Design space exploration of FPGA-based accelerators with multi-level parallelism,” in DATE, 2017, pp. 1141–1146.
[22]
Y. Chen et al., “Cloud-DNN: An open framework for mapping DNN models to cloud FPGAs,” in FPGA, 2019, pp. 73–82.
[23]
T. Chen et al., “TVM: An automated end-to-end optimizing compiler for deep learning,” in OSDI, 2018, pp. 578–594.
[24]
J. Mu et al., “A history-based auto-tuning framework for fast and high-performance DNN design on GPU,” in DAC, 2020, pp. 1–6.
[25]
Y. S. Shao et al., “The Aladdin approach to accelerator design and modeling,” IEEE Micro, vol. 35, no. 3, pp. 58–70, 2015.
[26]
Y. N. Wu and V. Sze, “Accelergy: An architecture-level energy estimation methodology for accelerator designs,” in ICCAD, 2019.
[27]
A. Parashar et al., “Timeloop: A systematic approach to DNN accelerator evaluation,” in ISPASS, 2019, pp. 304–315.
[28]
H. Kwon et al., “Understanding reuse, performance, and hardware cost of DNN dataflow: A data-centric approach,” in MICRO, 2019, pp. 754–768.
[29]
M. Tan and Q. Le, “EfficientNet: Rethinking model scaling for convolutional neural networks,” in ICML, 2019, pp. 6105–6114.
[30]
B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning,” in ICLR, 2017.
[31]
Z. Guo et al., “Single path one-shot neural architecture search with uniform sampling,” arXiv preprint, 2019.

Cited By

  • (2024) Mobile Foundation Model as Firmware. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, pp. 279–295. https://doi.org/10.1145/3636534.3649361
  • (2023) Explainable-DSE: An Agile and Explainable Exploration of Efficient HW/SW Codesigns of Deep Learning Accelerators Using Bottleneck Analysis. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, pp. 87–107. https://doi.org/10.1145/3623278.3624772
  • (2023) UNICO: Unified Hardware Software Co-Optimization for Robust Neural Network Acceleration. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 77–90. https://doi.org/10.1145/3613424.3614282
  • (2023) CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework. ACM Transactions on Embedded Computing Systems, vol. 22, no. 3, pp. 1–30. https://doi.org/10.1145/3575798
  • (2022) Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications. ACM Transactions on Design Automation of Electronic Systems, vol. 27, no. 3, pp. 1–50. https://doi.org/10.1145/3486618


Published In

2021 58th ACM/IEEE Design Automation Conference (DAC)
Dec 2021
1380 pages

Publisher

IEEE Press
