research-article
Open access

Automating Vectorized Distributed Graph Computation

Published: 20 December 2024

Abstract

Multi-instance graph algorithms interleave the evaluation of multiple instances of the same algorithm, each with a different input, over the same graph. By sharing computation across instances, they have been shown to be significantly faster than traditional serial and batch evaluation. However, writing correct multi-instance algorithms is challenging. In this work, we describe AutoMI, a framework for automatically converting vertex-centric graph algorithms into their vectorized multi-instance versions. We also develop an algebraic characterization of the algorithms that benefit most from multi-instance computation, for which a simpler and faster streamlined vectorization applies. This characterization allows users to decide when to use such optimizations and to instruct AutoMI to make the best use of SIMD vectorization. Using six real-life graphs, we show that AutoMI-converted multi-instance algorithms are 9.6 to 29.5 times faster than serial evaluation, 7.1 to 26.4 times faster than batch evaluation, and even 2.6 to 4.6 times faster than existing highly optimized handcrafted multi-instance algorithms without vectorization.
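To make the idea of sharing computation across instances concrete, the following is a minimal sketch of a multi-instance breadth-first search, in the spirit of bit-parallel multi-source BFS. It is not AutoMI's generated code or the paper's algorithm: the function name `multi_instance_bfs`, the adjacency-dict input format, and the use of Python integers as bitmasks (standing in for SIMD registers) are all assumptions made for illustration.

```python
from collections import defaultdict


def multi_instance_bfs(adj, sources):
    """Run one BFS per source in a single interleaved traversal.

    adj: dict mapping vertex -> list of neighbours (hypothetical input format).
    sources: list of start vertices; instance i is encoded as bit i of a mask.
    Returns a list where dist[i][v] is the hop distance from sources[i] to v.
    """
    seen = defaultdict(int)      # vertex -> bitmask of instances that reached it
    frontier = defaultdict(int)  # vertex -> bitmask of instances active at it
    dist = [dict() for _ in sources]
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        frontier[s] |= 1 << i
        dist[i][s] = 0

    level = 0
    while frontier:
        level += 1
        nxt = defaultdict(int)
        for v, mask in frontier.items():
            for w in adj.get(v, ()):
                # A single bitwise operation advances every instance that is
                # active at v and has not yet visited w.
                new = mask & ~seen[w]
                if new:
                    nxt[w] |= new
        # Commit the level: mark vertices as seen and record distances.
        for w, mask in nxt.items():
            seen[w] |= mask
            for i in range(len(sources)):
                if (mask >> i) & 1:
                    dist[i][w] = level
        frontier = nxt
    return dist


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(multi_instance_bfs(graph, [0, 1]))
```

Each edge is scanned once per level for all instances active at its source vertex, rather than once per instance; a vectorized implementation would presumably apply the same pattern with fixed-width SIMD lanes instead of arbitrary-precision integers.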

Published In

Proceedings of the ACM on Management of Data  Volume 2, Issue 6
SIGMOD
December 2024
792 pages
EISSN:2836-6573
DOI:10.1145/3709598
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2024
Published in PACMMOD Volume 2, Issue 6

Author Tags

  1. algebraic characterization
  2. auto vectorization
  3. graph computation
  4. simd vectorization

Qualifiers

  • Research-article
