research-article

Chaos: scale-out graph processing from secondary storage

Authors: Laurent Bindschaedler, Jasmina Malicevic, Willy Zwaenepoel

SOSP '15: Proceedings of the 25th Symposium on Operating Systems Principles

Pages 410 - 424

https://doi.org/10.1145/2815400.2815408

Published: 04 October 2015

Abstract

Chaos scales graph processing from secondary storage to multiple machines in a cluster. Earlier systems that process graphs from secondary storage are restricted to a single machine, and therefore limited by the bandwidth and capacity of the storage system on a single machine. Chaos is limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster.

Chaos builds on the streaming partitions introduced by X-Stream in order to achieve sequential access to storage, but parallelizes the execution of streaming partitions. Chaos is novel in three ways. First, Chaos partitions for sequential storage access, rather than for locality and load balance, resulting in much lower pre-processing times. Second, Chaos distributes graph data uniformly randomly across the cluster and does not attempt to achieve locality, based on the observation that in a small cluster network bandwidth far outstrips storage bandwidth. Third, Chaos uses work stealing to allow multiple machines to work on a single partition, thereby achieving load balance at runtime.
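The second and third design points can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the Chaos implementation: the abstract does not describe the actual placement or stealing protocol, and the names `place_chunks` and `run_with_stealing`, the one-chunk-per-step cost model, and the "steal from the fullest partition" policy are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def place_chunks(num_chunks, num_machines, seed=0):
    """Assign each chunk to a storage device chosen uniformly at random.

    Hypothetical helper: no attempt is made to keep a chunk close to
    the machine that will later process it.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_machines) for _ in range(num_chunks)]


def run_with_stealing(partition_sizes):
    """Simulate one worker per partition, one chunk processed per step.

    A worker whose own partition is exhausted takes a chunk from the
    partition with the most chunks remaining (an assumed policy).
    Returns (steps, chunks_processed_per_worker).
    """
    queues = [deque(range(n)) for n in partition_sizes]
    done = [0] * len(queues)
    steps = 0
    while any(queues):
        steps += 1
        for worker, own in enumerate(queues):
            if own:
                own.popleft()          # work on the worker's own partition
            else:
                victim = max(queues, key=len)
                if not victim:         # nothing left anywhere
                    continue
                victim.pop()           # steal from the fullest partition
            done[worker] += 1
    return steps, done


if __name__ == "__main__":
    placement = place_chunks(num_chunks=32000, num_machines=32)
    print("chunks on machine 0:", placement.count(0))
    print(run_with_stealing([8, 2, 2]))
```

With partition sizes `[8, 2, 2]`, the simulation finishes in 4 steps with each worker processing 4 chunks. Without stealing, the worker that owns the largest partition would need 8 steps on its own.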

In terms of performance scaling, on 32 machines Chaos takes on average only 1.61 times longer to process a graph 32 times larger than on a single machine. In terms of capacity scaling, Chaos is capable of handling a graph with 1 trillion edges representing 16 TB of input data, a new milestone for graph processing capacity on a small commodity cluster.
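The scaling figures above can be turned into a few derived quantities. The short calculation below is a sketch only: the function names are hypothetical, "efficiency" is taken to mean the usual weak-scaling ratio (single-machine time divided by cluster time when the problem grows with the cluster), and 1 TB is assumed to be 10^12 bytes. The 1.61 slowdown is an average reported in the abstract, so the derived values are averages too.

```python
def weak_scaling_efficiency(slowdown):
    """Efficiency when the problem size grows in proportion to the cluster."""
    return 1.0 / slowdown


def aggregate_throughput_gain(scale_factor, slowdown):
    """How much more data per unit time the cluster processes than one machine."""
    return scale_factor / slowdown


def bytes_per_edge(total_bytes, num_edges):
    """Average input size of one edge."""
    return total_bytes / num_edges


# Figures taken from the abstract: 32 machines, a 32x larger graph, 1.61x the time.
print(f"weak-scaling efficiency: {weak_scaling_efficiency(1.61):.1%}")
print(f"aggregate throughput gain: {aggregate_throughput_gain(32, 1.61):.1f}x")
# 16 TB of input for 1 trillion edges, assuming 1 TB = 10**12 bytes.
print(f"bytes per edge: {bytes_per_edge(16e12, 1e12):.0f}")
```

Under these assumptions the reported numbers correspond to roughly 62% weak-scaling efficiency, close to a 20-fold gain in aggregate throughput on 32 machines, and about 16 bytes of input per edge.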

Supplementary Material

MP4 File (p410.mp4), 2296.21 MB

Published In

SOSP '15: Proceedings of the 25th Symposium on Operating Systems Principles

October 2015

499 pages

ISBN:9781450338349

DOI:10.1145/2815400

General Chair: Ethan Miller (UC Santa Cruz)

Program Chair: Steven Hand (Google)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SSRC: Storage Systems Research Center, UC Santa Cruz
SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SOSP '15: ACM SIGOPS 25th Symposium on Operating Systems Principles

October 4 - 7, 2015

Monterey, California

Acceptance Rates

SOSP '15 Paper Acceptance Rate 30 of 181 submissions, 17%;

Overall Acceptance Rate 174 of 961 submissions, 18%

Article Metrics

Total Citations: 160
Total Downloads: 2,422
Downloads (Last 12 months): 143
Downloads (Last 6 weeks): 2

Reflects downloads up to 30 Jan 2025
