DOI: 10.1145/3173162.3173180

Tigr: Transforming Irregular Graphs for GPU-Friendly Graph Processing

Published: 19 March 2018

Abstract

Graph analytics delivers deep knowledge by processing large volumes of highly connected data. In real-world graphs, the degree distribution tends to follow a power law: a small portion of nodes own a large number of neighbors. This high irregularity in degree distribution is a major barrier to efficient processing on GPU architectures, which are primarily designed to accelerate computations on regular data with SIMD execution. Existing solutions to the inefficiency of GPU-based graph analytics either modify the graph programming abstraction or rely on changes to the low-level thread execution model. The former requires more programming effort to design and maintain graph analytics, while the latter is coupled to the underlying architecture, making it difficult to adapt as architectures quickly evolve. Unlike prior efforts, this work proposes to address the problem at its origin: the irregular graph data itself. It raises a critical question in irregular graph processing: Is it possible to transform irregular graphs into more regular ones such that the graphs can be processed more efficiently on GPU-like architectures while still producing the same results? Inspired by this question, this work introduces Tigr, a graph transformation framework that can effectively reduce the irregularity of real-world graphs with correctness guarantees for a wide range of graph analytics. To make the transformations practical, Tigr features a lightweight virtual transformation scheme, which can substantially reduce the cost of graph transformations while preserving the benefits of reduced irregularity. Evaluation of Tigr-based GPU graph processing shows significant and consistent speedups over state-of-the-art GPU graph processing frameworks across a spectrum of irregular graphs.
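
The abstract does not spell out how a virtual transformation works, so the following is only an illustrative sketch and not Tigr's actual implementation. It assumes the graph is stored in CSR form and that a degree bound `k` is chosen by the user. The names `virtual_split` and `out_weight_sums` are hypothetical. Each high-degree node is covered by several "virtual" nodes that each own at most `k` consecutive edges, while the edge array itself is left untouched.

```python
from typing import List, Tuple

# (physical_node, edge_start, edge_end): one virtual node over a CSR edge range.
VirtualNode = Tuple[int, int, int]


def virtual_split(row_offsets: List[int], k: int) -> List[VirtualNode]:
    """Cover every node's CSR edge range with chunks of at most k edges.

    Assumption for illustration: edges are not copied or reordered; only
    this index of chunks is built. Nodes with no out-edges get no chunk.
    """
    if k < 1:
        raise ValueError("k must be positive")
    virtual_nodes: List[VirtualNode] = []
    for node in range(len(row_offsets) - 1):
        start, end = row_offsets[node], row_offsets[node + 1]
        for chunk_start in range(start, end, k):
            virtual_nodes.append((node, chunk_start, min(chunk_start + k, end)))
    return virtual_nodes


def out_weight_sums(virtual_nodes: List[VirtualNode],
                    weights: List[int], num_nodes: int) -> List[int]:
    """Per-node sum of out-edge weights, computed one virtual node at a time.

    Each chunk contributes to its physical node, so the result matches a
    direct per-node reduction over the original graph.
    """
    totals = [0] * num_nodes
    for node, start, end in virtual_nodes:
        totals[node] += sum(weights[start:end])
    return totals


# Toy graph: node 0 is a hub with 7 out-edges; nodes 1..3 have 1, 0 and 2.
row_offsets = [0, 7, 8, 8, 10]
weights = list(range(1, 11))  # one weight per edge, in CSR order

vnodes = virtual_split(row_offsets, k=3)
max_virtual_degree = max(end - start for _, start, end in vnodes)
totals = out_weight_sums(vnodes, weights, num_nodes=4)
```

On this toy input the hub's seven edges are spread across three virtual nodes, so no unit of work exceeds three edges, yet the per-node totals are the same as those computed directly on the original graph. On a GPU, each virtual node would typically be assigned to one thread, which is where the reduced skew would help. That mapping is likewise an assumption here.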

    Published In

    ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2018
    827 pages
    ISBN:9781450349116
    DOI:10.1145/3173162
    • ACM SIGPLAN Notices, Volume 53, Issue 2
      ASPLOS '18
      February 2018
      809 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/3296957
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. GPU
    2. SIMD
    3. graph transformation
    4. irregularity
    5. power-law graph
    6. vertex-centric graph processing

    Qualifiers

    • Research-article

    Conference

    ASPLOS '18

    Acceptance Rates

    ASPLOS '18 Paper Acceptance Rate 56 of 319 submissions, 18%;
    Overall Acceptance Rate 535 of 2,713 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 343
    • Downloads (Last 6 weeks): 61
    Reflects downloads up to 18 Aug 2024

    Citations

    Cited By

    • (2024) SpeedCore: Space-efficient and Dependency-aware GPU Parallel Framework for Core Decomposition. Proceedings of the 53rd International Conference on Parallel Processing, pp. 555-564. DOI: 10.1145/3673038.3673111. Online publication date: 12-Aug-2024
    • (2024) ScalaBFS2: A High-performance BFS Accelerator on an HBM-enhanced FPGA Chip. ACM Transactions on Reconfigurable Technology and Systems 17:2, pp. 1-39. DOI: 10.1145/3650037. Online publication date: 29-Feb-2024
    • (2024) WER: Maximizing Parallelism of Irregular Graph Applications through GPU Warp EqualizeR. Proceedings of the 29th Asia and South Pacific Design Automation Conference, pp. 201-206. DOI: 10.1109/ASP-DAC58780.2024.10473955. Online publication date: 22-Jan-2024
    • (2024) Parallelization of butterfly counting on hierarchical memory. The VLDB Journal 33:5, pp. 1453-1484. DOI: 10.1007/s00778-024-00856-x. Online publication date: 7-Jun-2024
    • (2023) A Bucket-aware Asynchronous Single-Source Shortest Path Algorithm on GPU. Proceedings of the 52nd International Conference on Parallel Processing Workshops, pp. 50-60. DOI: 10.1145/3605731.3605746. Online publication date: 7-Aug-2023
    • (2023) I/O-Efficient Butterfly Counting at Scale. Proceedings of the ACM on Management of Data 1:1, pp. 1-27. DOI: 10.1145/3588714. Online publication date: 30-May-2023
    • (2023) AdaptGear. Proceedings of the 20th ACM International Conference on Computing Frontiers, pp. 52-62. DOI: 10.1145/3587135.3592199. Online publication date: 9-May-2023
    • (2023) High-Level Synthesis of Irregular Applications. Proceedings of the 20th ACM International Conference on Computing Frontiers, pp. 12-22. DOI: 10.1145/3587135.3592196. Online publication date: 9-May-2023
    • (2023) GraphMedia: Communication-balanced Graph Searching for Billion-scale Social Media Access. Proceedings of the 31st ACM International Conference on Multimedia, pp. 8984-8993. DOI: 10.1145/3581783.3613828. Online publication date: 26-Oct-2023
    • (2023) FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph Processing. Proceedings of the 37th International Conference on Supercomputing, pp. 240-250. DOI: 10.1145/3577193.3593729. Online publication date: 21-Jun-2023
