Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs

Published: 22 January 2019 Publication History

Abstract

Natural graphs with skewed distributions raise unique challenges to distributed graph computation and partitioning. Existing graph-parallel systems usually use a “one-size-fits-all” design that uniformly processes all vertices, which either suffer from notable load imbalance and high contention for high-degree vertices (e.g., Pregel and GraphLab) or incur high communication cost and memory consumption even for low-degree vertices (e.g., PowerGraph and GraphX). In this article, we argue that skewed distributions in natural graphs also necessitate differentiated processing on high-degree and low-degree vertices. We then introduce PowerLyra, a new distributed graph processing system that embraces the best of both worlds of existing graph-parallel systems. Specifically, PowerLyra uses centralized computation for low-degree vertices to avoid frequent communications and distributes the computation for high-degree vertices to balance workloads. PowerLyra further provides an efficient hybrid graph partitioning algorithm (i.e., hybrid-cut) that combines edge-cut (for low-degree vertices) and vertex-cut (for high-degree vertices) with heuristics. To improve cache locality of inter-node graph accesses, PowerLyra further provides a locality-conscious data layout optimization. PowerLyra is implemented based on the latest GraphLab and can seamlessly support various graph algorithms running in both synchronous and asynchronous execution modes. A detailed evaluation on three clusters using various graph-analytics and MLDM (Machine Learning and Data Mining) applications shows that PowerLyra outperforms PowerGraph by up to 5.53X (from 1.24X) and 3.26X (from 1.49X) for real-world and synthetic graphs, respectively, and is much faster than other systems like GraphX and Giraph, yet with much less memory consumption. A porting of hybrid-cut to GraphX further confirms the efficiency and generality of PowerLyra.

References

[1]
Amine Abou-Rjeili and George Karypis. 2006. Multilevel algorithms for partitioning power-law graphs. In Proceedings of the 20th International Conference on Parallel and Distributed Processing (IPDPS’06). IEEE Computer Society, Washington, DC, USA, 124--124. http://dl.acm.org/citation.cfm?id=1898953.1899055.
[2]
Lada A. Adamic and Bernardo A. Huberman. 2002. Zipf’s law and the internet. Glottometrics 3, 1 (2002), 143--150.
[3]
Maciej Besta and Torsten Hoefler. 2015. Accelerating irregular computations with hardware transactional memory and active messages. In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC’15). ACM, New York, 161--172.
[4]
Paolo Boldi, Bruno Codenotti, Massimo Santini, and Sebastiano Vigna. 2004. UbiCrawler: A scalable fully distributed web crawler. Software Practice and Experience 34, 8 (July 2004), 711--726.
[5]
Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30, 1--7 (April 1998), 107--117.
[6]
Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, et al. 2013. Tao: Facebook’s distributed data store for the social graph. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC’13). 49--60.
[7]
Aydin Buluc and John R Gilbert. 2011. The combinatorial BLAS: Design, implementation, and applications. Int. J. High Performance Computer Applications 25, 4 (Nov. 2011), 496--509.
[8]
Umit Catalyurek and Cevdet Aykanat. 1996. Decomposing irregularly sparse matrices for parallel matrix-vector multiplication. In Proceedings of the 3rd International Workshop on Parallel Algorithms for Irregularly Structured Problems (IRREGULAR’96). Springer-Verlag, London, UK, 75--86. http://dl.acm.org/citation.cfm?id=646010.676990.
[9]
Ümit V. Çatalyürek, Cevdet Aykanat, and Bora Uçar. 2010. On two-dimensional sparse matrix partitioning: Models, methods, and a recipe. SIAM Journal of Science and Computing 32, 2 (Feb. 2010), 656--683.
[10]
Haibo Chen, Heng Zhang, Mingkai Dong, Zhaoguo Wang, Yubin Xia, Haibing Guan, and Binyu Zang. 2017. Efficient and available in-memory KV-store with hybrid erasure coding and replication. ACM Transactions on Storage 13, 3, Article 25 (Sept. 2017), 30 pages.
[11]
Rong Chen, Xin Ding, Peng Wang, Haibo Chen, Binyu Zang, and Haibing Guan. 2014. Computation and communication efficient graph processing with distributed immutable view. In Proceedings of the 23rd International Symposium on High-performance Parallel and Distributed Computing (HPDC’14). ACM, New York, 215--226.
[12]
Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. PowerLyra: Differentiated graph computation and partitioning on skewed graphs. In Proceedings of the 10th European Conference on Computer Systems (EuroSys’15). ACM, New York, Article 1, 15 pages.
[13]
Rong Chen, Jiaxin Shi, Binyu Zang, and Haibing Guan. 2014. Bipartite-oriented distributed graph partitioning for big learning. In Proceedings of 5th Asia-Pacific Workshop on Systems (APSys’14). ACM, New York, Article 14, 7 pages.
[14]
Rong Chen, Jia-Xin Shi, Hai-Bo Chen, and Bin-Yu Zang. 2015. Bipartite-oriented distributed graph partitioning for big learning. Journal of Computer Science and Technology 30, 1 (2015), 20--29.
[15]
Rishan Chen, Mao Yang, Xuetian Weng, Byron Choi, Bingsheng He, and Xiaoming Li. 2012. Improving large graph processing on partitioned graphs in the cloud. In Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC’12). ACM, New York, Article 3, 13 pages.
[16]
Rong Chen, Youyang Yao, Peng Wang, Kaiyuan Zhang, Zhaoguo Wang, Haibing Guan, Binyu Zang, and Haibo Chen. 2018. Replication-based fault-tolerance for large-scale graph processing. IEEE Transactions on Parallel and Distributed Systems 29, 7 (2018), 1621--1635.
[17]
Raymond Cheng, Ji Hong, Aapo Kyrola, Youshan Miao, Xuetian Weng, Ming Wu, Fan Yang, Lidong Zhou, Feng Zhao, and Enhong Chen. 2012. Kineograph: Taking the pulse of a fast-changing and connected world. In Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys’12). ACM, New York, 85--98.
[18]
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, and Prabhakar Raghavan. 2009. On compressing social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09). ACM, New York,219--228.
[19]
Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. 2015. One trillion edges: Graph processing at facebook-scale. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1804--1815.
[20]
DIMACS. 2006. The 9th DIMACS Implementation Challenge - Shortest Paths. http://www.dis.uniroma1.it/challenge9/.
[21]
Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. 1999. On power-law relationships of the internet topology. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM’99). ACM, New York, 251--262.
[22]
Ionel Gog, Malte Schwarzkopf, Natacha Crooks, Matthew P. Grosvenor, Allen Clement, and Steven Hand. 2015. Musketeer: All for one, one for all in data processing systems. In Proceedings of the 10th European Conference on Computer Systems (EuroSys’15). ACM, New York, Article 2, 16 pages.
[23]
Joseph Gonzalez, Yucheng Low, Arthur Gretton, and Carlos Guestrin. 2011. Parallel gibbs sampling: From colored fields to thin junction trees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS’11). PMLR, 15:324--332. http://proceedings.mlr.press/v15/gonzalez11a.html.
[24]
Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA,17--30. http://dl.acm.org/citation.cfm?id=2387880.2387883.
[25]
Joseph E. Gonzalez, Yucheng Low, Carlos Guestrin, and David O’Hallaron. 2009. Distributed parallel inference on large factor graphs. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI’09). AUAI Press, Arlington, VA, 203--212. http://dl.acm.org/citation.cfm?id=1795114.1795139.
[26]
Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. 2014. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI’ 14). USENIX Association, Berkeley, CA, 599--613. http://dl.acm.org/citation.cfm?id=2685048.2685096.
[27]
Wentao Han, Youshan Miao, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Wenguang Chen, and Enhong Chen. 2014. Chronos: A graph engine for temporal graph analysis. In Proceedings of the Ninth European Conference on Computer Systems (EuroSys’14). ACM, New York, Article 1, 14 pages.
[28]
Henry Haselgrove. 2010. Wikipedia page-to-page link database. http://haselgrove.id.au/wikipedia.htm.
[29]
Imranul Hoque and Indranil Gupta. 2013. LFGraph: Simple and fast distributed graph analytics. In Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems (TRIOS’13). ACM, New York, Article 9, 17 pages.
[30]
Andy Huang and Wei Wu. 2014. Mining Ecommerce Graph Data with Apache Spark at Alibaba Taobao. Retrieved April 1, 2018, from https://databricks.com/blog/2014/08/14/mining-graph-data-with-spark-at-alibaba-taobao.html.
[31]
Nilesh Jain, Guangdeng Liao, and Theodore L. Willke. 2013. GraphBuilder: Scalable graph ETL framework. In Proceedings of the 1st International Workshop on Graph Data Management Experiences and Systems (GRADES’13). ACM, New York, Article 4, 6 pages.
[32]
Saehan Jo, Jaemin Yoo, and U Kang. 2018. Fast and scalable distributed loopy belief propagation on real-world graphs. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM’18). ACM, New York, 297--305.
[33]
Xiaoen Ju, Hani Jamjoom, and Kang G. Shin. 2017. Hieroglyph: Locally-sufficient graph processing via compute-sync-merge. Proceedings of the ACM Measurement Analysis Computer Systems 1, 1, Article 9 (June 2017), 25 pages.
[34]
Tim Kaler, William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson. 2014. Executing dynamic data-graph computations deterministically using chromatic scheduling. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’14). ACM, New York, 154--165.
[35]
U. Kang, Charalampos E. Tsourakakis, Ana Paula Appel, Christos Faloutsos, and Jure Leskovec. 2011. HADI: Mining radii of large graphs. ACM Transactions on Knowledge Discovery Data 5, 2, Article 8 (Feb. 2011), 24 pages.
[36]
George Karypis and Vipin Kumar. 1999. Parallel multilevel series k-way partitioning scheme for irregular graphs. SIAM Review 41, 2 (1999), 278--300.
[37]
Zuhair Khayyat, Karim Awara, Amani Alonazi, Hani Jamjoom, Dan Williams, and Panos Kalnis. 2013. Mizan: A system for dynamic load balancing in large-scale graph processing. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13). ACM, New York, 169--182.
[38]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is twitter, a social network or a news media?. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, 591--600.
[39]
Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. 2012. GraphChi: Large-scale graph computation on just a PC. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, 31--46. http://dl.acm.org/citation.cfm?id=2387880.2387884.
[40]
Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, and Lizy K. John. 2015. Data partitioning strategies for graph workloads on heterogeneous clusters. In Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis (SC’15). IEEE, 1--12.
[41]
Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2007. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery Data 1, 1, Article 2 (March 2007).
[42]
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics 6, 1 (2009), 29--123.
[43]
Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment 5, 8 (April 2012), 716--727.
[44]
Andrew Lumsdaine, Douglas Gregor, Bruce Hendrickson, and Jonathan Berry. 2007. Challenges in parallel graph processing. Parallel Processing Letters 17, 01 (2007), 5--20.
[45]
Lingxiao Ma, Zhi Yang, Han Chen, Jilong Xue, and Yafei Dai. 2017. Garaph: Efficient GPU-accelerated graph processing on a single machine with balanced replication. In Proceedings of 2017 USENIX Annual Technical Conference (USENIX ATC’17). USENIX Association, 195--207.
[46]
Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, and Taesoo Kim. 2017. Mosaic: Processing a trillion-edge graph on a single machine. In Proceedings of the 12th European Conference on Computer Systems (EuroSys’17). ACM, New York, 527--543.
[47]
Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD’10). ACM, New York, 135--146.
[48]
Jasmina Malicevic, Baptiste Lepers, and Willy Zwaenepoel. 2017. Everything you always wanted to know about multicore graph processing but were afraid to ask. In Proceedings of 2017 USENIX Annual Technical Conference (USENIX ATC’17). USENIX Association, 631--643.
[49]
Daniel Margo and Margo Seltzer. 2015. A scalable distributed graph partitioner. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1478--1489.
[50]
Kameshwar Munagala and Abhiram Ranade. 1999. I/O-complexity of graph algorithms. In Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’99). Society for Industrial and Applied Mathematics, Philadelphia, PA, 687--694. http://dl.acm.org/citation.cfm?id=314500.314891.
[51]
Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martín Abadi. 2013. Naiad: A timely dataflow system. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 439--455.
[52]
Mark E. J. Newman. 2005. Power laws, pareto distributions and Zipf’s law. Contemporary Physics 46, 5 (2005), 323--351.
[53]
Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 456--471.
[54]
Biswanath Panda, Joshua S. Herbach, Sugato Basu, and Roberto J. Bayardo. 2009. PLANET: Massively parallel learning of tree ensembles with MapReduce. Proceedins ofthe VLDB Endowment 2, 2 (Aug. 2009), 1426--1437.
[55]
Zhengping Qian, Yong He, Chunzhi Su, Zhuojie Wu, Hongyu Zhu, Taizhi Zhang, Lidong Zhou, Yuan Yu, and Zheng Zhang. 2013. TimeStream: Reliable stream computation in the cloud. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys’13). ACM, New York,1--14.
[56]
Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, and Willy Zwaenepoel. 2015. Chaos: Scale-out graph processing from secondary storage. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP’15). ACM, New York, 410--424.
[57]
Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-Stream: Edge-centric graph processing using streaming partitions. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, 472--488.
[58]
Alessandra Sala, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, and Ben Y. Zhao. 2010. Measurement-calibrated graph models for social network experiments. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, 861--870.
[59]
Alessandra Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, and Ben Y. Zhao. 2011. Sharing graphs using differentially private graph models. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (IMC’11). ACM, New York, 81--98.
[60]
Semih Salihoglu and Jennifer Widom. 2013. GPS: A graph processing system. In Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM’13). ACM, New York, Article 22, 12 pages.
[61]
Kirk Schloegel, George Karypis, and Vipin Kumar. 2000. Parallel multilevel algorithms for multi-constraint graph partitioning (distinguished paper). In Proceedings of the 6th International Euro-Par Conference on Parallel Processing (Euro-Par’00). Springer-Verlag, London, UK, 296--310. http://dl.acm.org/citation.cfm?id=646665.698944.
[62]
Jiwon Seo, Jongsoo Park, Jaeho Shin, and Monica S. Lam. 2013. Distributed socialite: A datalog-based language for large-scale graph analysis. Proceedings of the VLDB Endowment 6, 14 (Sept. 2013), 1906--1917.
[63]
Bin Shao, Haixun Wang, and Yatao Li. 2013. Trinity: A distributed graph engine on a memory cloud. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD’13). ACM, New York, 505--516.
[64]
Jiaxin Shi, Youyang Yao, Rong Chen, Haibo Chen, and Feifei Li. 2016. Fast and concurrent RDF queries with RDMA-based distributed graph exploration. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 317--332. http://dl.acm.org/citation.cfm?id=3026877.3026902.
[65]
Julian Shun and Guy E. Blelloch. 2013. Ligra: A lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’13). ACM, New York,135--146.
[66]
Julian Shun, Laxman Dhulipala, and Guy E Blelloch. 2015. Smaller and faster: Parallel processing of compressed graphs with Ligra+. In 2015 Data Compression Conference. IEEE, 403--412.
[67]
Alexander Smola and Shravan Narayanamurthy. 2010. An architecture for parallel topic models. Proceedings of the VLDB Endowment 3, 1-2 (Sept. 2010), 703--710.
[68]
Isabelle Stanton and Gabriel Kliot. 2012. Streaming graph partitioning for large distributed graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’12). ACM, New York, 1222--1230.
[69]
Gábor Takács, István Pilászy, Bottyán Németh, and Domonkos Tikk. 2009. Scalable collaborative filtering approaches for large recommender systems. Journal of Machine Learning Research 10 (June 2009), 623--656. http://dl.acm.org/citation.cfm?id=1577069.1577091.
[70]
Tencent. 2018. Design and Practice of the Anomaly Detection Framework for Billions of User in WeChat (in Chinese). https://cloud.tencent.com/developer/article/1028442.
[71]
Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, and John McPherson. 2013. From “Think Like a Vertex” to “Think Like a Graph.”Proceedings of the VLDB Endowment 7, 3 (Nov. 2013), 193--204. http://dl.acm.org/citation.cfm?id=2732232.2732238.
[72]
Charalampos Tsourakakis, Christos Gkantsidis, Bozidar Radunovic, and Milan Vojnovic. 2014. FENNEL: Streaming graph partitioning for massive scale graphs. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM’14). ACM, New York, 333--342.
[73]
Alexander Ulanov, Manish Marwah, Mijung Kim, Roshan Dathathri, Carlos Zubieta, and Jun Li. 2017. Sandpiper: Scaling probabilistic inferencing to large scale graphical models. In Proceedings of 2017 IEEE International Conference on Big Data. IEEE, 383--388.
[74]
Leslie G. Valiant. 1990. A bridging model for parallel computation. Communications of the ACM 33, 8 (Aug. 1990), 103--111.
[75]
Shiv Verma, Luke M. Leslie, Yosub Shin, and Indranil Gupta. 2017. An experimental comparison of partitioning strategies in distributed graph processing. Proceedings of the VLDB Endowment 10, 5 (Jan. 2017), 493--504.
[76]
Huaimin Wang, Peichang Shi, and Yiming Zhang. 2017. Jointcloud: A cross-cloud cooperation architecture for integrated internet service customization. In Proceedings of the 37th International Conference on Distributed Computing Systems (ICDCS’17). IEEE, 1846--1855.
[77]
Hao Wang, Jing Zhang, Da Zhang, Sarunya Pumma, and Wu-chun Feng. 2017. PaPar: A parallel data partitioning framework for big data applications. In Proceedings of 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS’17). IEEE, 605--614.
[78]
Lei Wang, Liangji Zhuang, Junhang Chen, Huimin Cui, Fang Lv, Ying Liu, and Xiaobing Feng. 2018. Lazygraph: Lazy data coherency for replicas in distributed graph-parallel computation. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’18). ACM, New York, 276--289.
[79]
Peng Wang, Kaiyuan Zhang, Rong Chen, Haibo Chen, and Haibing Guan. 2014. Replication-based fault-tolerance for large-scale graph processing. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’14). IEEE Computer Society, Washington, DC,562--573.
[80]
Siyuan Wang, Chang Lou, Rong Chen, and Haibo Chen. 2018. Fast and concurrent RDF queries using RDMA-assisted GPU graph exploration. In Proceedings of 2018 USENIX Annual Technical Conference (USENIX ATC’18). 651--664.
[81]
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens. 2016. Gunrock: A high-performance graph processing library on the GPU. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’16). ACM, New York, Article 11, 12 pages.
[82]
Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Y. Zhao. 2009. User interactions in social networks and their implications. In Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys’09). ACM, New York, 205--218.
[83]
Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, and Lidong Zhou. 2015. GraM: Scaling graph computation to the trillions. In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC’15). ACM, New York, 408--421.
[84]
Wencong Xiao, Jilong Xue, Youshan Miao, Zhen Li, Cheng Chen, Ming Wu, Wei Li, and Lidong Zhou. 2017. Tux2: Distributed graph computation for machine learning. In Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation (NSDI’17). 669--682.
[85]
Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, and Haibo Chen. 2015. SYNC or ASYNC: Time to fuse for distributed graph-parallel computation. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’15). ACM, New York, 194--204.
[86]
Cong Xie, Ling Yan, Wu-Jun Li, and Zhihua Zhang. 2014. Distributed power-law graph computing: Theoretical and empirical analysis. In Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS’14). 1673--1681.
[87]
Da Yan, James Cheng, Yi Lu, and Wilfred Ng. 2014. Blogel: A block-centric framework for distributed computation on real-world graphs. Proceedings of the VLDB Endowment 7, 14 (2014), 1981--1992.
[88]
Jerry Ye, Jyh-Herng Chow, Jiang Chen, and Zhaohui Zheng. 2009. Stochastic gradient boosted distributed decision trees. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM’09). ACM, New York, 2061--2064.
[89]
Andy Yoo, Edmond Chow, Keith Henderson, William McLendon, Bruce Hendrickson, and Umit Catalyurek. 2005. A scalable distributed parallel breadth-first search algorithm on BlueGene/L. In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (SC’05). IEEE Computer Society, Washington, DC, 25.
[90]
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI’12). USENIX Association, Berkeley, CA, 2--2. http://dl.acm.org/citation.cfm?id=2228298.2228301.
[91]
Heng Zhang, Mingkai Dong, and Haibo Chen. 2016. Efficient and available in-memory kv-store with hybrid erasure coding and replication. In Proceedings of 14th USENIX Conference on File and Storage Technologies (FAST’16). USENIX Association, Santa Clara, CA, 167--180. https://www.usenix.org/conference/fast16/technical-sessions/presentation/zhang-heng.
[92]
Kaiyuan Zhang, Rong Chen, and Haibo Chen. 2015. NUMA-aware graph-structured analytics. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’15). ACM, New York, 183--193.
[93]
Mingxing Zhang, Yongwei Wu, Kang Chen, Teng Ma, and Weimin Zheng. 2016. Measuring and optimizing distributed array programs. Proceedings of the VLDB Endowment 9, 12 (Aug. 2016), 912--923.
[94]
Mingxing Zhang, Yongwei Wu, Kang Chen, Xuehai Qian, Xue Li, and Weimin Zheng. 2016. Exploring the hidden dimension in graph processing. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 285--300.
[95]
Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, and Xuehai Qian. 2018. GraphP: Reducing communication for PIM-based graph processing with efficient data partition. In Proceedings of 2018 IEEE International Symposium onHigh Performance Computer Architecture (HPCA’18). IEEE, 544--557.
[96]
Yunhao Zhang, Rong Chen, and Haibo Chen. 2017. Sub-millisecond stateful stream querying over fast-evolving linked data. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP’17). ACM, New York, 614--630.
[97]
Yu Zhang, Xiaofei Liao, Hai Jin, Lin Gu, Guang Tan, and Bing Bing Zhou. 2017. HotGraph: Efficient asynchronous processing for real-world graphs. IEEE Transactions on Computing 66, 5 (May 2017), 799--809.
[98]
Xiaohan Zhao, Adelbert Chang, Atish Das Sarma, Haitao Zheng, and Ben Y. Zhao. 2013. On the embeddability of random walk distances. Proceedings of the VLDB Endowment 6, 14 (Sept. 2013), 1690--1701.
[99]
Yunhong Zhou, Dennis Wilkinson, Robert Schreiber, and Rong Pan. 2008. Large-scale parallel collaborative filtering for the netflix prize. In Proceedings of the 4th International Conference on Algorithmic Aspects in Information and Management (AAIM’08). Springer-Verlag, Berlin, Heidelberg, 337--348.
[100]
Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A computation-centric distributed graph processing system. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 301--316.
[101]
Xiaowei Zhu, Wentao Han, and Wenguang Chen. 2015. GridGraph: Large scale graph processing on a single machine using 2-level hierarchical partitioning. In Proceedings of the 2015 USENIX Conference on Annual Technical Conference (USENIX ATC’15). USENIX Association. https://www.usenix.org/conference/atc15/technical-session/presentation/zhu.

Cited By

View all
  • (2025)Two-Dimensional Balanced Partitioning and Efficient Caching for Distributed Graph AnalysisIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.350129236:2(133-149)Online publication date: 1-Feb-2025
  • (2025)ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge PartitioningIEEE Transactions on Computers10.1109/TC.2024.347556874:1(116-130)Online publication date: 1-Jan-2025
  • (2025)A multi-objective partitioning algorithm for large-scale graph based on NSGA-IIExpert Systems with Applications10.1016/j.eswa.2024.125756263(125756)Online publication date: Mar-2025
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Parallel Computing
ACM Transactions on Parallel Computing  Volume 5, Issue 3
September 2018
89 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3305217
  • Editor:
  • David Bader
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 January 2019
Accepted: 01 March 2018
Revised: 01 October 2016
Received: 01 August 2015
Published in TOPC Volume 5, Issue 3

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Graph computation
  2. graph partitioning
  3. locality-conscious data layout
  4. power-law degree distribution
  5. skewed graph

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Natural Science Foundation of China
  • National Key Research 8 Development Program of China

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)193
  • Downloads (Last 6 weeks)16
Reflects downloads up to 30 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Two-Dimensional Balanced Partitioning and Efficient Caching for Distributed Graph AnalysisIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.350129236:2(133-149)Online publication date: 1-Feb-2025
  • (2025)ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge PartitioningIEEE Transactions on Computers10.1109/TC.2024.347556874:1(116-130)Online publication date: 1-Jan-2025
  • (2025)A multi-objective partitioning algorithm for large-scale graph based on NSGA-IIExpert Systems with Applications10.1016/j.eswa.2024.125756263(125756)Online publication date: Mar-2025
  • (2025)From Sancus to Sancus $$^q$$: staleness and quantization-aware full-graph decentralized training in graph neural networksThe VLDB Journal10.1007/s00778-024-00897-234:2Online publication date: 31-Jan-2025
  • (2024)Trust Management Scheme of IoV Based on Dynamic Sharding BlockchainElectronics10.3390/electronics1306101613:6(1016)Online publication date: 7-Mar-2024
  • (2024)Enabling Window-Based Monotonic Graph Analytics with Reusable Transitional Results for Pattern-Consistent QueriesProceedings of the VLDB Endowment10.14778/3681954.368197917:11(3003-3016)Online publication date: 30-Aug-2024
  • (2024)Improving Graph Compression for Efficient Resource-Constrained Graph AnalyticsProceedings of the VLDB Endowment10.14778/3665844.366585217:9(2212-2226)Online publication date: 1-May-2024
  • (2024)LM-SRPQ: Efficiently Answering Regular Path Query in Streaming GraphsProceedings of the VLDB Endowment10.14778/3641204.364121417:5(1047-1059)Online publication date: 1-Jan-2024
  • (2024)Automating Vectorized Distributed Graph ComputationProceedings of the ACM on Management of Data10.1145/36988332:6(1-27)Online publication date: 20-Dec-2024
  • (2024)Play like a Vertex: A Stackelberg Game Approach for Streaming Graph PartitioningProceedings of the ACM on Management of Data10.1145/36549652:3(1-27)Online publication date: 30-May-2024
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media