DOI: 10.1145/3452413.3464785

Efficient GPU Sharing for Serverless Workflows

Published: 18 June 2021

    Abstract

    Serverless computing has emerged as a new cloud computing paradigm in which an application consists of individual functions that can be managed and executed separately. However, the function execution environments of all current serverless computing frameworks are CPU-based. In this paper, we propose to extend the open-source KNIX high-performance serverless framework so that it can execute functions on shared GPU cluster resources. We evaluate the performance impact on the extended KNIX system by measuring the overheads and penalties incurred under different deep learning frameworks.
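    The core idea in the abstract, multiplexing a shared GPU across concurrently executing serverless functions, can be sketched in a few lines. The sketch below is illustrative only: `GpuSlicePool` and `run_function` are hypothetical names, not the actual KNIX API, and a production runtime would enforce GPU memory partitioning at the container or driver level rather than with a Python semaphore.

```python
# Hypothetical sketch: sharing one physical GPU among concurrent
# serverless function invocations by dividing its memory into
# fixed-size slices. Not the KNIX implementation.
import threading


class GpuSlicePool:
    """Tracks free capacity on a shared GPU in fixed-size memory slices."""

    def __init__(self, total_mem_mb: int, slice_mb: int):
        self.slice_mb = slice_mb
        # One semaphore permit per slice of GPU memory.
        self.slices = threading.Semaphore(total_mem_mb // slice_mb)

    def acquire(self, need_mb: int) -> int:
        # Round the request up to whole slices and block until
        # enough capacity is free on the shared GPU.
        n = -(-need_mb // self.slice_mb)  # ceiling division
        for _ in range(n):
            self.slices.acquire()
        return n

    def release(self, n: int) -> None:
        for _ in range(n):
            self.slices.release()


def run_function(pool: GpuSlicePool, need_mb: int, fn, *args):
    """Run a serverless function body while holding a GPU memory reservation."""
    n = pool.acquire(need_mb)
    try:
        return fn(*args)
    finally:
        pool.release(n)  # free the slices even if the function raises


if __name__ == "__main__":
    pool = GpuSlicePool(total_mem_mb=16_000, slice_mb=1_000)
    print(run_function(pool, 2_500, lambda x: x * 2, 21))  # prints 42
```

    With this kind of admission control, a function that cannot fit its working set on the shared GPU simply blocks until earlier invocations release their slices, which is one source of the queuing penalty the paper measures.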



    Published In
    HiPS '21: Proceedings of the 1st Workshop on High Performance Serverless Computing
    June 2021, 46 pages
    ISBN: 9781450383882
    DOI: 10.1145/3452413
    General Chairs: Yadu Babuji, Kyle Chard
    Program Chairs: Ian Foster, Zhuozhao Li

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. deep learning
    2. gpu
    3. image processing
    4. neural networks
    5. serverless

    Qualifiers

    • Research-article

    Conference

    HPDC '21

    Article Metrics

    • Downloads (last 12 months): 100
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 11 Aug 2024

    Cited By

    • Paldia: Enabling SLO-Compliant and Cost-Effective Serverless Computing on Heterogeneous Hardware. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 100-113. DOI: 10.1109/IPDPS57955.2024.00018 (27 May 2024).
    • GPU-enabled Function-as-a-Service for Machine Learning Inference. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 918-928. DOI: 10.1109/IPDPS54959.2023.00096 (May 2023).
    • Improving Memory Utilization by Sharing DNN Models for Serverless Inference. 2023 IEEE International Conference on Consumer Electronics (ICCE), 1-6. DOI: 10.1109/ICCE56470.2023.10043587 (6 Jan 2023).
    • Performance Evaluation of Open-Source Serverless Platforms for Kubernetes. Algorithms 15(7), 234. DOI: 10.3390/a15070234 (2 Jul 2022).
    • HARDLESS: A Generalized Serverless Compute Architecture for Hardware Processing Accelerators. 2022 IEEE International Conference on Cloud Engineering (IC2E), 79-84. DOI: 10.1109/IC2E55432.2022.00016 (Sep 2022).
    • KubeGPU: Efficient Sharing and Isolation Mechanisms for GPU Resource Management in Container Cloud. The Journal of Supercomputing 79(1), 591-625. DOI: 10.1007/s11227-022-04682-2 (14 Jul 2022).
    • Optimizing Goodput of Real-time Serverless Functions using Dynamic Slicing with vGPUs. 2021 IEEE International Conference on Cloud Engineering (IC2E), 60-70. DOI: 10.1109/IC2E52221.2021.00020 (Oct 2021).
