DOI: 10.1145/3589334.3645383
research-article

λGrapher: A Resource-Efficient Serverless System for GNN Serving through Graph Sharing

Published: 13 May 2024

Abstract

Graph Neural Networks (GNNs) have been increasingly adopted for graph analysis in web applications such as social networks. Yet, efficient GNN serving remains a critical challenge due to high workload fluctuations and intricate GNN operations. Serverless computing, thanks to its flexibility and agility, offers on-demand serving of GNN inference requests. Alas, the request-centric serverless model is still too coarse-grained to avoid resource waste.
Observing the significant data locality in the computation graphs of requests, we propose λGrapher, a serverless system for GNN serving that achieves resource efficiency through graph sharing and fine-grained resource allocation. λGrapher features the following designs: (1) adaptive timeout for request buffering to balance resource efficiency and inference latency, (2) graph-centric scheduling to minimize computation and memory redundancy, and (3) resource-centric function management, with fine-grained resource allocation catered to the resource sensitivities of GNN operations and function orchestration optimized to hide communication latency. We implement a prototype of λGrapher based on the representative open-source serverless platform Knative and evaluate it with real-world traces from various web applications. Our results show that λGrapher saves on average 61.5% of memory resources and 47.2% of computing resources compared with state-of-the-art systems while preserving GNN inference latency.
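
To make the data-locality observation concrete, the following is a minimal sketch, not the λGrapher implementation. It assumes each request targets one node and needs that node's k-hop neighborhood as its computation graph, and it compares the number of nodes loaded when buffered requests are served separately against serving them over one merged graph. The function names, the toy graph, and the choice of BFS are illustrative assumptions.

```python
from collections import deque


def k_hop_nodes(adj, target, hops):
    """Return the set of nodes within `hops` edges of `target` (plain BFS)."""
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the requested number of hops
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def sharing_savings(adj, targets, hops):
    """Compare node counts: one graph per request vs. one shared graph.

    Returns (separate, shared), where `separate` sums the per-request
    neighborhood sizes and `shared` counts each node once across the batch.
    """
    per_request = [k_hop_nodes(adj, t, hops) for t in targets]
    separate = sum(len(nodes) for nodes in per_request)
    shared = len(set().union(*per_request))
    return separate, shared


# Hypothetical toy graph: requests for "a" and "b" overlap on c, d, and e.
adj = {"a": ["c", "d"], "b": ["c", "d"], "c": ["e"], "d": ["e"], "e": []}
separate, shared = sharing_savings(adj, ["a", "b"], hops=2)
print(separate, shared)  # 8 nodes when served separately, 5 when shared
```

In this toy case the two buffered requests touch 8 nodes in total when handled independently but only 5 distinct nodes when their computation graphs are merged, which is the kind of redundancy that buffering and graph-centric scheduling aim to remove.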

    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
    May 2024
    4826 pages
    ISBN: 9798400701719
    DOI: 10.1145/3589334
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 May 2024

    Author Tags

    1. graph neural networks
    2. model serving
    3. serverless computing

    Qualifiers

    • Research-article

    Funding Sources

    • Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)
    • National Key Research & Development (R&D) Plan
    • The Major Key Project of PCL
    • NSFC

    Conference

    WWW '24: The ACM Web Conference 2024
    May 13 - 17, 2024
    Singapore, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2024) Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading. Proceedings of the 2024 ACM Symposium on Cloud Computing, 178-195. DOI: 10.1145/3698038.3698509. Online publication date: 20-Nov-2024
    • (2024) ComboFunc: Joint Resource Combination and Container Placement for Serverless Function Scaling with Heterogeneous Container. IEEE Transactions on Parallel and Distributed Systems, 1-17. DOI: 10.1109/TPDS.2024.3454071. Online publication date: 2024
    • (2024) Towards Efficient Graph Processing in Geo-Distributed Data Centers. IEEE Transactions on Parallel and Distributed Systems 35:11, 2147-2160. DOI: 10.1109/TPDS.2024.3453872. Online publication date: Nov-2024
    • (2024) BaaSLess: Backend-as-a-Service (BaaS)-Enabled Workflows in Federated Serverless Infrastructures. IEEE Transactions on Cloud Computing 12:4, 1088-1102. DOI: 10.1109/TCC.2024.3439268. Online publication date: Oct-2024
