
Vineyard: Optimizing Data Sharing in Data-Intensive Analytics

Published: 20 June 2023
  • Abstract

    Modern data analytics and AI jobs are becoming increasingly complex and involve multiple tasks performed on specialized systems. Sharing intermediate data between different systems is often a significant bottleneck in such jobs. When the intermediate data is large, it is mostly exchanged through files in standard formats (e.g., CSV and ORC), causing high I/O and (de)serialization overheads. To solve these problems, we develop Vineyard, a high-performance, extensible, and cloud-native object store that aims to give users an intuitive way to share data across systems in complex real-life workflows. Different systems usually work on data structures (e.g., dataframes, graphs, hashmaps) with similar interfaces, and their computation logic is often loosely coupled with how such interfaces are implemented over specific memory layouts; this enables Vineyard to conduct data sharing efficiently at a high level via memory mapping and method sharing. Vineyard provides an IDL named VCDL that lets users register their own intermediate data types in Vineyard, so that objects of the registered types can be efficiently shared across systems in a polyglot workflow. As a cloud-native system, Vineyard is designed to work closely with Kubernetes and to achieve fault tolerance and high performance in production environments. Evaluations on real-life datasets and data analytics jobs show that these optimizations can significantly improve the end-to-end performance of data analytics jobs, reducing their data-sharing time by up to 68.4x.
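
The idea of sharing through memory mapping rather than through serialized files can be illustrated with a minimal sketch. It uses only Python's standard `multiprocessing.shared_memory` module and is not Vineyard's API; the helper names `publish`, `consume_sum`, and `demo` are hypothetical, and the consumer here runs in the same process purely to keep the example self-contained.

```python
from multiprocessing import shared_memory

ITEM_FORMAT = "q"  # signed 64-bit integers
ITEM_SIZE = 8      # bytes per item


def publish(values):
    """Producer: write the values once into a named shared-memory segment."""
    nbytes = ITEM_SIZE * len(values)
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    # Cast the raw byte buffer to a typed view; the view is released on exit
    # so the segment can be closed cleanly later.
    with shm.buf[:nbytes].cast(ITEM_FORMAT) as view:
        for i, value in enumerate(values):
            view[i] = value
    return shm


def consume_sum(name, count):
    """Consumer: attach by name and read the integers in place.

    No file is written and nothing is parsed; the consumer reads the same
    memory the producer filled.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        with shm.buf[:ITEM_SIZE * count].cast(ITEM_FORMAT) as view:
            return sum(view)
    finally:
        shm.close()


def demo():
    values = list(range(1000))
    segment = publish(values)
    try:
        return consume_sum(segment.name, len(values))
    finally:
        segment.close()
        segment.unlink()  # free the segment once every reader is done


total = demo()
print(total)
```

In a real workflow the consumer would be a separate process or engine that receives only the segment name and the element count, which is the role the object store's metadata plays.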

    Supplemental Material

    MP4 File
    Vineyard is an in-memory object manager developed by Alibaba Group's DAMO Academy that aims to improve data sharing in data-intensive analytics. The work examines real-life scenarios and shows how big data has surpassed the capacity of single compute systems, leading to a need for specialized systems and federated computing platforms. The intermediate data exchanged between these systems becomes a bottleneck when it is shared through external file systems. Vineyard proposes to reduce intermediate data sharing time and to bridge different systems efficiently through decoupled intermediate data exchange. By representing objects as metadata plus a set of blobs, Vineyard enables efficient, zero-copy sharing of complex objects such as graphs. Vineyard not only improves performance but also eases cross-engine and cross-language integration through its object composability. In practice, Vineyard has been integrated into various data-intensive systems and can speed up end-to-end execution by up to 9x.
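
The "metadata plus a set of blobs" representation and the composability it allows can be sketched roughly as follows. This is an illustrative toy model under stated assumptions, not Vineyard's actual data model or API: the `Meta` and `BlobStore` classes and the helper functions are hypothetical names, and an in-process dictionary stands in for shared memory.

```python
from array import array
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Meta:
    """Hypothetical object metadata: a type name, plain attributes,
    references to payload blobs, and references to member objects."""
    typename: str
    attrs: Dict[str, object] = field(default_factory=dict)
    blobs: Dict[str, int] = field(default_factory=dict)      # field -> blob id
    members: Dict[str, "Meta"] = field(default_factory=dict)


class BlobStore:
    """Toy stand-in for a blob store: maps blob ids to immutable bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[int, bytes] = {}

    def put(self, payload: bytes) -> int:
        blob_id = len(self._blobs)
        self._blobs[blob_id] = payload
        return blob_id

    def view(self, blob_id: int) -> memoryview:
        # A memoryview exposes the stored bytes without copying them.
        return memoryview(self._blobs[blob_id])

    def blob_count(self) -> int:
        return len(self._blobs)


def make_column(store, name, values):
    """Store the int64 payload as one blob and describe it with metadata."""
    data = array("q", values).tobytes()
    return Meta("column", {"name": name, "length": len(values)},
                {"data": store.put(data)})


def make_composite(typename, **members):
    """Compose a larger object purely from references to existing objects."""
    return Meta(typename, members=dict(members))


def column_values(store, column):
    """Resolve a column's blob into a typed, copy-free view."""
    return store.view(column.blobs["data"]).cast("q")


store = BlobStore()
src = make_column(store, "src", [0, 1, 2])
dst = make_column(store, "dst", [1, 2, 0])
edges = make_composite("table", src=src, dst=dst)  # a table of two columns
graph = make_composite("graph", edges=edges)       # a graph reusing the table

src_view = column_values(store, graph.members["edges"].members["src"])
print(src_view.tolist(), store.blob_count())
```

Building the table and the graph adds only metadata; the two column blobs are the only payloads ever stored, which is the sense in which a composite object can be assembled without copying data.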


    Cited By

    • (2024) GraphScope Flex: LEGO-like Graph Computing Stack. Companion of the 2024 International Conference on Management of Data, 386-399. DOI: 10.1145/3626246.3653383. Online publication date: 9-Jun-2024
    • (2024) Xorbits: Automating Operator Tiling for Distributed Data Science. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 5211-5223. DOI: 10.1109/ICDE60146.2024.00392. Online publication date: 13-May-2024
    • (2023) Fast Maximal Quasi-clique Enumeration: A Pruning and Branching Co-Design Approach. Proceedings of the ACM on Management of Data 1:3, 1-26. DOI: 10.1145/3617331. Online publication date: 13-Nov-2023

    Information & Contributors

    Information

    Published In

    Proceedings of the ACM on Management of Data  Volume 1, Issue 2
    PACMMOD
    June 2023
    2310 pages
    EISSN:2836-6573
    DOI:10.1145/3605748
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 June 2023
    Published in PACMMOD Volume 1, Issue 2

    Author Tags

    1. data sharing
    2. in-memory object store

    Qualifiers

    • Research-article
