DOI: 10.1145/3528535.3565254
research-article
Open access

MicroEdge: a multi-tenant edge cluster system architecture for scalable camera processing

Published: 08 November 2022

Abstract

With the proliferation of high-bandwidth cameras and AR/VR devices, and their increasing use in situation awareness applications, edge computing is gaining prominence to meet the throughput requirements of such applications. This work focuses on camera applications that perform real-time machine learning (ML) inference on camera frames. We find that ML-based camera applications suffer from hardware resource fragmentation due to models under-utilizing or over-utilizing the accelerator. Meanwhile, it is challenging to support fine-grained resource sharing for accelerators such as TPUs because they can only process requests sequentially in a run-to-completion fashion. We present MicroEdge, a multi-tenant, low-cost edge cluster for camera processing applications running at the edge. MicroEdge provides multi-tenancy support for Coral TPUs by extending K3s, an edge-specific distribution of Kubernetes. Through an admission control algorithm, it allows for fractional assignment of TPU resources commensurate with the application pipeline requirements to ensure that the TPUs are fully utilized. Using real-time camera processing applications and a real-world trace, we show that MicroEdge can support up to 2.8x as many camera streams for a given hardware configuration compared to vanilla K3s, while maintaining scalability and performance requirements.
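The abstract does not spell out the admission control algorithm, so the following is only a minimal sketch of the general idea of fractional accelerator assignment, not MicroEdge's actual method. It assumes a simple first-fit bin-packing policy and a hypothetical capacity unit of 1000 "milli-TPU" per device; the names `Tpu`, `admit`, and the `cam-*` applications are illustrative and do not come from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical unit: each TPU offers 1000 "milli-TPU" of capacity.
TPU_CAPACITY = 1000


@dataclass
class Tpu:
    """One accelerator with an integer budget of free capacity (assumed model)."""
    name: str
    free: int = TPU_CAPACITY
    tenants: dict = field(default_factory=dict)


def admit(tpus, app, demand):
    """First-fit admission: place `app` on the first TPU with enough free share.

    Returns the chosen TPU name, or None when the request is rejected.
    """
    if not 0 < demand <= TPU_CAPACITY:
        raise ValueError("demand must be in (0, TPU_CAPACITY]")
    for tpu in tpus:
        if tpu.free >= demand:
            tpu.free -= demand          # reserve the fractional share
            tpu.tenants[app] = demand   # remember who holds it
            return tpu.name
    return None  # no TPU can host the request: reject it


cluster = [Tpu("tpu-0"), Tpu("tpu-1")]
requests = [("cam-a", 600), ("cam-b", 600), ("cam-c", 400),
            ("cam-d", 400), ("cam-e", 100)]
placements = {app: admit(cluster, app, share) for app, share in requests}
print(placements)
```

In this toy run, four applications share two TPUs with no capacity left over, and the fifth request is rejected. Whole-device assignment would have admitted only two of them, which illustrates the fragmentation problem the abstract describes.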

Published In

Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference
November 2022
110 pages
ISBN: 9781450393409
DOI: 10.1145/3528535
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Paper

Author Tags

  1. camera processing
  2. edge TPU
  3. edge computing
  4. machine learning inference
  5. resource aware scheduling

Qualifiers

  • Research-article

Conference

Middleware '22
Middleware '22: 23rd International Middleware Conference
November 7 - 11, 2022
QC, Quebec, Canada

Acceptance Rates

Middleware '22 Paper Acceptance Rate 8 of 21 submissions, 38%;
Overall Acceptance Rate 203 of 948 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 437
  • Downloads (Last 6 weeks): 40
Reflects downloads up to 30 Aug 2024

Cited By

  • (2024) FrameFeedback: A Closed-Loop Control System for Dynamic Offloading Real-Time Edge Inference. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 584-591. DOI: 10.1109/IPDPSW63119.2024.00116. Online publication date: 27-May-2024
  • (2023) Internet-of-Things Edge Computing Systems for Streaming Video Analytics: Trails Behind and the Paths Ahead. IoT 4:4, 486-513. DOI: 10.3390/iot4040021. Online publication date: 24-Oct-2023
  • (2023) Better Orchestration for SLO-Oriented Cross-site Microservices in Multi-tenant Cloud/Edge Continuum. Proceedings of the 24th International Middleware Conference: Demos, Posters and Doctoral Symposium, 9-10. DOI: 10.1145/3626564.3629091. Online publication date: 11-Dec-2023
  • (2023) Compliant multimedia storage and data extraction from the untrusted and privacy-sensitive edge. 2023 International Conference on Multimedia Computing, Networking and Applications (MCNA), 123-130. DOI: 10.1109/MCNA59361.2023.10185828. Online publication date: 19-Jun-2023
  • (2023) Towards a Benchmark for Fog Data Processing. 2023 IEEE International Conference on Cloud Engineering (IC2E), 92-98. DOI: 10.1109/IC2E59103.2023.00018. Online publication date: 25-Sep-2023
