DOI: 10.1145/3583740.3628437
Research article · Open access

Bang for the Buck: Evaluating the cost-effectiveness of Heterogeneous Edge Platforms for Neural Network Workloads

Published: 07 August 2024

Abstract

Machine learning (ML) applications have experienced remarkable growth and integration into various domains. However, challenges with cloud-based deployments, such as latency, privacy, reliability, bandwidth, and connectivity, have driven the popularity of deploying ML on edge devices. The ML application deployment stack consists of several components: neural network models, input frameworks, software runtime libraries, and the hardware architecture. Understanding how each component of the ML stack affects deployment effectiveness, particularly cost-effectiveness, remains a challenge. In this work, we systematically analyze the choices available for each component of the ML stack and their influence on deployment performance. We empirically evaluate eight heterogeneous edge platforms and eight software runtime libraries, considering hardware components such as CPUs, GPUs, NPUs, and VPUs for ML inference. Our findings contribute to a better understanding of how to optimize cost-effectiveness in ML deployments on edge platforms, aiding decision-making for application developers and stakeholders.
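The paper's evaluation harness is not reproduced on this page, but the core cost-effectiveness metric the abstract describes (inference throughput normalized by platform price) can be sketched in plain Python. All names here, the dummy workload, and the $99 price are illustrative placeholders, not values or code from the paper:

```python
import time

def throughput_per_dollar(run_inference, n_warmup=5, n_runs=50, price_usd=99.0):
    """Estimate inferences/sec per dollar for one (model, device) pair.

    run_inference: zero-argument callable performing one forward pass.
    price_usd: purchase price of the platform (hypothetical value here).
    """
    for _ in range(n_warmup):           # discard cold-start effects (caches, JIT, clocks)
        run_inference()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_inference()
    elapsed = time.perf_counter() - start
    throughput = n_runs / elapsed       # inferences per second
    return throughput / price_usd       # inferences/sec per USD spent

# Stand-in for a real forward pass (e.g. a TFLite, TensorRT, or OpenVINO invocation).
def dummy_model():
    sum(i * i for i in range(10_000))

score = throughput_per_dollar(dummy_model, price_usd=99.0)
print(f"{score:.4f} inferences/sec per USD")
```

Repeating this measurement per model, runtime library, and hardware platform yields the kind of throughput-per-dollar comparison the abstract refers to; real deployments would also need to account for quantization effects on accuracy and for power draw.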


Cited By

  • (2024) "Stress-Testing USB Accelerators for Efficient Edge Inference." In 2024 IEEE/ACM Symposium on Edge Computing (SEC), 1-14. DOI: 10.1109/SEC62691.2024.00015. Online publication date: 4 Dec 2024.
  • (2024) "Energy Modeling of Inference Workloads with AI Accelerators at the Edge: A Benchmarking Study." In 2024 IEEE International Conference on Cloud Engineering (IC2E), 189-196. DOI: 10.1109/IC2E61754.2024.00028. Online publication date: 24 Sep 2024.
  • (2024) "Flow Control Solution to Avoid Bottlenecks in Edge Computing for Video Analytics." In 2024 9th International Conference on Fog and Mobile Edge Computing (FMEC), 74-81. DOI: 10.1109/FMEC62297.2024.10710217. Online publication date: 2 Sep 2024.

Published In

SEC '23: Proceedings of the Eighth ACM/IEEE Symposium on Edge Computing
December 2023, 405 pages
ISBN: 9798400701238
DOI: 10.1145/3583740
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. neural networks
        2. edge computing
        3. accelerators


Conference

SEC '23: Eighth ACM/IEEE Symposium on Edge Computing
December 6-9, 2023, Wilmington, DE, USA

Acceptance Rates

Overall acceptance rate: 40 of 100 submissions (40%)
