Core Graph: Exploiting Edge Centrality to Speedup the Evaluation of Iterative Graph Queries

Authors:

Nael Abu-Ghazaleh,

Rajiv Gupta

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems

Pages 18 - 32

https://doi.org/10.1145/3627703.3629571

Published: 22 April 2024

Abstract

When evaluating an iterative graph query over a large graph, systems incur significant overheads due to repeated graph transfer across the memory hierarchy, coupled with repeated (redundant) propagation of values over the edges in the graph. One approach to reducing these overheads combines a small proxy graph with the large original graph in a two-phase query evaluation. The first phase evaluates the query on the proxy graph, incurring low overheads and producing mostly precise results. The second phase uses these mostly precise results to bootstrap query evaluation on the larger original graph, producing fully precise results. The effectiveness of this approach depends on the quality of the proxy graph. Prior methods find proxy graphs that are either large or produce highly imprecise results.
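The two-phase scheme described above can be illustrated with a minimal sketch for single-source shortest paths (SSSP) with non-negative edge weights. The choice of SSSP, the function names (`build_adj`, `relax_to_fixpoint`, `two_phase_sssp`), and the toy graph are illustrative assumptions and are not taken from the paper. Because the proxy is a subgraph of the original graph, the phase-one distances are upper bounds on the true distances, so phase two only has to lower the few values that the proxy got wrong.

```python
import heapq
from math import inf


def build_adj(edges):
    """Turn (u, v, weight) triples into an adjacency dict u -> [(v, weight)]."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    return adj


def relax_to_fixpoint(adj, dist, seeds):
    """Relax edges until no distance can be lowered; updates dist in place.

    Every seed with a finite distance starts on the worklist, so values
    carried over from an earlier phase are propagated as well.
    """
    heap = [(dist[v], v) for v in seeds if dist[v] < inf]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:          # stale worklist entry
            continue
        for v, w in adj.get(u, ()):
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))


def two_phase_sssp(n, full_edges, proxy_edges, source):
    """Hypothetical two-phase evaluation: proxy graph first, then full graph."""
    dist = [inf] * n
    dist[source] = 0
    # Phase 1: cheap evaluation on the small proxy graph.
    relax_to_fixpoint(build_adj(proxy_edges), dist, [source])
    proxy_dist = list(dist)
    # Phase 2: bootstrap from the proxy results on the original graph.
    relax_to_fixpoint(build_adj(full_edges), dist, range(n))
    return proxy_dist, dist


# Toy graph (assumed for illustration): the proxy keeps 4 of the 7 edges.
FULL = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 5), (3, 4, 1), (0, 4, 9), (1, 3, 4)]
PROXY = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 4, 9)]

proxy_dist, final_dist = two_phase_sssp(5, FULL, PROXY, 0)
print(proxy_dist)  # [0, 1, 2, 3, 9]  (vertex 4 is imprecise)
print(final_dist)  # [0, 1, 2, 3, 4]  (fully precise after phase 2)
```

In this toy run the proxy already yields the final value for four of the five vertices, and the second phase performs a single useful update.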

We present a new form of proxy graph, named the Core Graph (CG), that is not only small but also produces highly precise results. A CG is a subgraph of the larger input graph that contains all vertices but, on average, only 10.7% of the edges, and yet it produces precise results for 94.5-99.9% of the vertices in the graph for different queries. The discovery of such an effective CG is based on our key new insight: a small subset of the non-zero centrality edges is responsible for determining the converged results of nearly all the vertices across different queries. We develop techniques to identify a CG that produces precise results for most vertices, and optimizations to efficiently compute precise results for the remaining vertices. Across six kinds of graph queries and four input graphs, CGs improved the performance of the GPU-based Subway system by up to 4.48×, of the out-of-core, disk-based GridGraph system by up to 13.62×, and of the in-memory Ligra graph processing system by up to 9.31×.
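To build intuition for the claim that a small set of edges determines most converged results, the following sketch keeps only the edges that appear in shortest-path trees rooted at a few sampled vertices, then measures what fraction of vertices get exact results on that subgraph for a different source. This is an assumed, simplified stand-in and not the paper's CG identification procedure; the helper names (`proxy_from_roots`, `precise_fraction`), the sampled roots, and the toy graph are all hypothetical.

```python
import heapq
from math import inf


def build_adj(n, edges):
    """Adjacency lists for a directed graph given as (u, v, weight) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    return adj


def sssp_with_parents(adj, n, source):
    """Dijkstra that also records the edge that set each vertex's final value."""
    dist = [inf] * n
    parent_edge = [None] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                parent_edge[v] = (u, v, w)
                heapq.heappush(heap, (dist[v], v))
    return dist, parent_edge


def proxy_from_roots(n, edges, roots):
    """Keep only edges lying on a shortest-path tree from some sampled root."""
    adj = build_adj(n, edges)
    kept = set()
    for r in roots:
        _, parent_edge = sssp_with_parents(adj, n, r)
        kept.update(e for e in parent_edge if e is not None)
    return sorted(kept)


def precise_fraction(n, edges, proxy_edges, source):
    """Fraction of vertices whose proxy result equals the full-graph result."""
    exact, _ = sssp_with_parents(build_adj(n, edges), n, source)
    approx, _ = sssp_with_parents(build_adj(n, proxy_edges), n, source)
    return sum(a == b for a, b in zip(exact, approx)) / n


# Toy graph (assumed): a cheap ring 0->1->...->5->0 plus expensive shortcuts.
EDGES = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1), (5, 0, 1),
         (0, 2, 5), (0, 3, 7), (1, 4, 9), (2, 5, 8)]

one_root = proxy_from_roots(6, EDGES, roots=[0])
two_roots = proxy_from_roots(6, EDGES, roots=[0, 2])
print(len(one_root), precise_fraction(6, EDGES, one_root, source=2))
print(len(two_roots), precise_fraction(6, EDGES, two_roots, source=4))
```

On this toy input, sampling one root keeps 5 of the 10 edges and is exact for 4 of 6 vertices from source 2, while sampling two roots keeps 6 edges and is exact for every vertex from source 4. Real graphs and other query types would of course behave differently.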

Index Terms

Core Graph: Exploiting Edge Centrality to Speedup the Evaluation of Iterative Graph Queries
1. Computing methodologies
  1. Parallel computing methodologies
2. Information systems
  1. Information systems applications
    1. Computing platforms

Published In

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems

April 2024

1245 pages

ISBN: 9798400704376

DOI: 10.1145/3627703

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Science Foundation

Conference

EuroSys '24

Sponsor:

SIGOPS

EuroSys '24: Nineteenth European Conference on Computer Systems

April 22 - 25, 2024

Athens, Greece

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%
