DOI: 10.1145/3452413.3464785

Efficient GPU Sharing for Serverless Workflows

Published: 18 June 2021

    Abstract

    Serverless computing has emerged as a new cloud computing paradigm in which an application consists of individual functions that can be managed and executed separately. However, the function execution environments of all current serverless computing frameworks are CPU-based. In this paper, we propose to extend the open-source KNIX high-performance serverless framework so that it can execute functions on shared GPU cluster resources. We evaluate the performance impact on the extended KNIX system by measuring the overheads and penalties incurred under different deep learning frameworks.
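    The core idea in the abstract, multiplexing a shared GPU across concurrently executing serverless functions, can be sketched in a few lines. The sketch below is illustrative only: `GpuSlicePool` and `run_function` are hypothetical names, not the actual KNIX API, and a production runtime would enforce GPU memory partitioning at the container or driver level rather than with a Python semaphore.

```python
# Hypothetical sketch: sharing one physical GPU among concurrent
# serverless function invocations by dividing its memory into
# fixed-size slices. Not the KNIX implementation.
import threading


class GpuSlicePool:
    """Tracks free capacity on a shared GPU in fixed-size memory slices."""

    def __init__(self, total_mem_mb: int, slice_mb: int):
        self.slice_mb = slice_mb
        # One semaphore permit per slice of GPU memory.
        self.slices = threading.Semaphore(total_mem_mb // slice_mb)

    def acquire(self, need_mb: int) -> int:
        # Round the request up to whole slices and block until
        # enough capacity is free on the shared GPU.
        n = -(-need_mb // self.slice_mb)  # ceiling division
        for _ in range(n):
            self.slices.acquire()
        return n

    def release(self, n: int) -> None:
        for _ in range(n):
            self.slices.release()


def run_function(pool: GpuSlicePool, need_mb: int, fn, *args):
    """Run a serverless function body while holding a GPU memory reservation."""
    n = pool.acquire(need_mb)
    try:
        return fn(*args)
    finally:
        pool.release(n)  # free the slices even if the function raises


if __name__ == "__main__":
    pool = GpuSlicePool(total_mem_mb=16_000, slice_mb=1_000)
    print(run_function(pool, 2_500, lambda x: x * 2, 21))  # prints 42
```

    With this kind of admission control, a function that cannot fit its working set on the shared GPU simply blocks until earlier invocations release their slices, which is one source of the queuing penalty the paper measures.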



    Published In
    HiPS '21: Proceedings of the 1st Workshop on High Performance Serverless Computing
    June 2021, 46 pages
    ISBN: 9781450383882
    DOI: 10.1145/3452413
    General Chairs: Yadu Babuji, Kyle Chard
    Program Chairs: Ian Foster, Zhuozhao Li

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. deep learning
    2. gpu
    3. image processing
    4. neural networks
    5. serverless

    Qualifiers

    • Research-article

    Conference

    HPDC '21

    Article Metrics

    • Downloads (last 12 months): 100
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 11 Aug 2024

    Cited By

    • Paldia: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 100-113. DOI: 10.1109/IPDPS57955.2024.00018 (27 May 2024).
    • GPU-enabled Function-as-a-Service for Machine Learning Inference. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 918-928. DOI: 10.1109/IPDPS54959.2023.00096 (May 2023).
    • Improving Memory Utilization by Sharing DNN Models for Serverless Inference. 2023 IEEE International Conference on Consumer Electronics (ICCE), 1-6. DOI: 10.1109/ICCE56470.2023.10043587 (6 Jan 2023).
    • Performance Evaluation of Open-Source Serverless Platforms for Kubernetes. Algorithms 15(7), 234. DOI: 10.3390/a15070234 (2 Jul 2022).
    • HARDLESS: A Generalized Serverless Compute Architecture for Hardware Processing Accelerators. 2022 IEEE International Conference on Cloud Engineering (IC2E), 79-84. DOI: 10.1109/IC2E55432.2022.00016 (Sep 2022).
    • KubeGPU: Efficient Sharing and Isolation Mechanisms for GPU Resource Management in Container Cloud. The Journal of Supercomputing 79(1), 591-625. DOI: 10.1007/s11227-022-04682-2 (14 Jul 2022).
    • Optimizing Goodput of Real-time Serverless Functions using Dynamic Slicing with vGPUs. 2021 IEEE International Conference on Cloud Engineering (IC2E), 60-70. DOI: 10.1109/IC2E52221.2021.00020 (Oct 2021).
