DOI: 10.1145/3558481.3591311

Brief Announcement: Accelerate CNN Inference with Zoning Graph at Dynamic Granularity

Published: 17 June 2023

Abstract

Partitioning a CNN and executing inference in parallel across multiple IoT devices has gained popularity as a way to meet real-time requirements without sacrificing model accuracy. However, existing algorithms struggle to find the optimal model-partitioning granularity for complex CNNs, and scheduling inference across heterogeneous IoT devices is NP-hard when the CNN's structure is a directed acyclic graph (DAG) rather than a chain. In this paper, we introduce DeepZoning, a versatile and cooperative inference framework that combines model and data parallelism to accelerate CNN inference. DeepZoning employs two algorithms at different levels: (1) a low-level Adaptive Workload Partition algorithm that uses linear programming and incorporates both the spatial and channel dimensions into the optimization when searching for a feature-map distribution across heterogeneous devices, and (2) a high-level Model Partition algorithm that finds the optimal model granularity and organizes a complex CNN into sequential zones to balance communication and computation during execution.
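The abstract describes the low-level Adaptive Workload Partition only at a high level. The sketch below is a hypothetical, simplified illustration of the LP idea: splitting a feature map's rows (the spatial dimension only) across heterogeneous devices so that the compute time of the slowest device is minimized. The function name, per-row cost model, and example numbers are assumptions for illustration, not the paper's formulation, which also covers the channel dimension and communication costs.

# Hypothetical sketch (not the paper's actual Adaptive Workload Partition):
# split the rows (spatial dimension) of a feature map across heterogeneous
# devices by solving a small linear program that minimizes the makespan,
# i.e. the compute time of the slowest device.
from scipy.optimize import linprog

def partition_rows(total_rows, per_row_cost):
    """per_row_cost[i]: assumed seconds device i needs per feature-map row."""
    n = len(per_row_cost)
    # Decision variables: x = [rows_0, ..., rows_{n-1}, T], where T is the makespan.
    c = [0.0] * n + [1.0]                                  # minimize T
    # per_row_cost[i] * rows_i - T <= 0 for every device i
    A_ub = [[per_row_cost[i] if j == i else (-1.0 if j == n else 0.0)
             for j in range(n + 1)] for i in range(n)]
    b_ub = [0.0] * n
    A_eq = [[1.0] * n + [0.0]]                             # rows must sum to the full height
    b_eq = [float(total_rows)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[:n], res.x[n]                             # per-device rows, makespan

# Example: a 224-row feature map split across three devices of different speed.
rows, makespan = partition_rows(224, [0.004, 0.002, 0.001])
print("rows per device:", [round(r, 1) for r in rows])
print("estimated makespan (s):", round(makespan, 4))

In a real deployment the fractional row counts would be rounded to integers, and inter-device transfer costs would enter the objective; the LP above only balances compute.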

Supplemental Material

MP4 File
To execute inference in parallel on IoT edge clusters, we introduce DeepZoning, a versatile and cooperative inference framework that combines model and data parallelism to accelerate CNN inference.



Published In

SPAA '23: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures
June 2023
504 pages
ISBN:9781450395458
DOI:10.1145/3558481
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 June 2023


Author Tags

  1. cooperative cnn inference
  2. edge computing
  3. graph partition
  4. model deployment

Qualifiers

  • Abstract

Data Availability

To execute inference in parallel on IoT edge clusters, we introduce DeepZoning, a versatile and cooperative inference framework that combines model and data parallelism to accelerate CNN inference. https://dl.acm.org/doi/10.1145/3558481.3591311#SPAA23-ba064.mp4

Funding Sources

  • National Natural Science Foundation of China

Conference

SPAA '23

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%


