research-article

Open access

MicroEdge: a multi-tenant edge cluster system architecture for scalable camera processing

Authors:

Enrique Saurez,

Tushar Krishna,

Umakishore Ramachandran

Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference

Pages 322 - 334

https://doi.org/10.1145/3528535.3565254

Published: 08 November 2022

Abstract

With the proliferation of high-bandwidth cameras and AR/VR devices, and their increasing use in situation awareness applications, edge computing is gaining prominence as a way to meet the throughput requirements of such applications. This work focuses on camera applications that perform real-time machine learning inference on camera frames. We find that machine-learning-based camera applications suffer from hardware resource fragmentation because models either under-utilize or over-utilize the accelerator. Meanwhile, it is challenging to support fine-grained resource sharing for accelerators such as TPUs because they can only process requests sequentially in a run-to-completion fashion. We present MicroEdge, a multi-tenant, low-cost edge cluster for camera processing applications running at the edge. MicroEdge provides multi-tenancy support for Coral TPUs by extending K3s, an edge-specific distribution of Kubernetes. Through an admission control algorithm, it allows fractional assignment of TPU resources commensurate with the application pipeline requirements, so that the TPUs are fully utilized. Using real-time camera processing applications and a real-world trace, we show that MicroEdge can support up to 2.8x as many camera streams for a given hardware configuration as vanilla K3s, while maintaining scalability and performance requirements.
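The idea of fractional TPU assignment through admission control can be illustrated with a small sketch. This is not the MicroEdge algorithm, which the abstract does not spell out; it is a plain first-fit placement under the assumption that each TPU has a capacity of 1 and each application declares the fraction it needs. The names `Tpu`, `admit`, and the `cam-*` applications are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Dict, List, Optional


@dataclass
class Tpu:
    """One accelerator whose total capacity is 1 (an assumed unit)."""
    name: str
    allocations: Dict[str, Fraction] = field(default_factory=dict)

    @property
    def free(self) -> Fraction:
        # Remaining share after subtracting every admitted application.
        return Fraction(1) - sum(self.allocations.values(), Fraction(0))


def admit(tpus: List[Tpu], app: str, demand: Fraction) -> Optional[str]:
    """First-fit admission: place `app` on the first TPU with enough free share.

    Returns the chosen TPU's name, or None when the request is rejected.
    """
    if not (Fraction(0) < demand <= Fraction(1)):
        raise ValueError("demand must be in (0, 1]")
    for tpu in tpus:
        if tpu.free >= demand:
            tpu.allocations[app] = demand
            return tpu.name
    return None  # no TPU can host the request without oversubscription


# Hypothetical workload: five camera pipelines on a two-TPU cluster.
cluster = [Tpu("tpu-0"), Tpu("tpu-1")]
placements = [
    admit(cluster, "cam-a", Fraction(3, 5)),
    admit(cluster, "cam-b", Fraction(1, 2)),
    admit(cluster, "cam-c", Fraction(2, 5)),
    admit(cluster, "cam-d", Fraction(1, 2)),
    admit(cluster, "cam-e", Fraction(1, 4)),
]
print(placements)
```

With whole-device assignment, two TPUs would host only two of these pipelines; with fractional shares, four are admitted and the fifth is rejected because both devices are already full.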

Index Terms

MicroEdge: a multi-tenant edge cluster system architecture for scalable camera processing
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
  2. Real-time systems
    1. Real-time system architecture
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Information

Published In

Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference

November 2022

110 pages

ISBN:9781450393409

DOI:10.1145/3528535

General Chairs: Paolo Bellavista (University of Bologna), Kaiwen Zhang (Ecole de technologie supérieure), Abdelouahed Gherbi (Ecole de technologie supérieure)

Program Chairs: Saurabh Bagchi (Purdue University), Marta Patiño (Universidad Politécnica de Madrid, Spain)

Publications Chairs: Giuseppe Di Modica (University of Bologna), Julien Gascon-Samson (Ecole de technologie supérieure)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Best Paper

Qualifiers

Research-article

Funding Sources

NSF (National Science Foundation)

Conference

Middleware '22

Sponsor: ACM

Middleware '22: 23rd International Middleware Conference

November 7 - 11, 2022

QC, Quebec, Canada

Acceptance Rates

Middleware '22 Paper Acceptance Rate 8 of 21 submissions, 38%;

Overall Acceptance Rate 203 of 948 submissions, 21%

Cited By

Jackson M, Ji B, Nikolopoulos D. (2024). FrameFeedback: A Closed-Loop Control System for Dynamic Offloading Real-Time Edge Inference. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 584-591. Online publication date: 27-May-2024. https://doi.org/10.1109/IPDPSW63119.2024.00116

Ravindran A. (2023). Internet-of-Things Edge Computing Systems for Streaming Video Analytics: Trails Behind and the Paths Ahead. IoT 4:4, 486-513. Online publication date: 24-Oct-2023. https://doi.org/10.3390/iot4040021

Cao Y. (2023). Better Orchestration for SLO-Oriented Cross-site Microservices in Multi-tenant Cloud/Edge Continuum. Proceedings of the 24th International Middleware Conference: Demos, Posters and Doctoral Symposium, 9-10. Online publication date: 11-Dec-2023. https://dl.acm.org/doi/10.1145/3626564.3629091

Ovesen A, Nordmo T, Johansen D. (2023). Compliant multimedia storage and data extraction from the untrusted and privacy-sensitive edge. 2023 International Conference on Multimedia Computing, Networking and Applications (MCNA), 123-130. Online publication date: 19-Jun-2023. https://doi.org/10.1109/MCNA59361.2023.10185828

Pfandzelter T, Bermbach D. (2023). Towards a Benchmark for Fog Data Processing. 2023 IEEE International Conference on Cloud Engineering (IC2E), 92-98. Online publication date: 25-Sep-2023. https://doi.org/10.1109/IC2E59103.2023.00018
