DOI: 10.1109/SC.2010.34

Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory

Published: 13 November 2010

Abstract

Processing large graphs is becoming increasingly important in domains such as social networks and bioinformatics. Unfortunately, many algorithms and implementations do not scale with increasing graph sizes. As a result, researchers have attempted to meet the growing data demands using parallel and external memory techniques. We present a novel asynchronous approach to compute Breadth-First Search (BFS), Single-Source Shortest Paths, and Connected Components for large graphs in shared memory. Our highly parallel asynchronous approach hides data latency due to both poor locality and delays in the underlying graph data storage. We present an experimental study applying our technique to both In-Memory and Semi-External Memory graphs utilizing multi-core processors and solid-state memory devices. Our experiments using synthetic and real-world datasets show that our asynchronous approach is able to overcome data latencies and provide significant speedup over alternative approaches. For example, on billion-vertex graphs our asynchronous BFS scales up to 14x on 16 cores.
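
To make the idea of an asynchronous traversal concrete, the following is a minimal sketch of a label-correcting BFS in which worker threads relax vertices as soon as they become available, with no barrier between levels. It is an illustration under assumptions, not the paper's implementation: the abstract does not describe the data structures used, so the function name `async_bfs`, the adjacency-dictionary input, the single shared work queue, and the single lock are all hypothetical choices made here for brevity.

```python
import queue
import threading


def async_bfs(adj, source, num_threads=4):
    """Label-correcting BFS without level-by-level synchronization.

    adj is assumed to be a dict mapping each vertex to a list of
    neighbours (a hypothetical input format chosen for this sketch).
    Returns a dict of hop distances for vertices reachable from source.
    """
    dist = {}                 # best known level per visited vertex
    lock = threading.Lock()   # guards updates to dist
    work = queue.Queue()      # pending (vertex, tentative_level) pairs

    def worker():
        while True:
            item = work.get()
            if item is None:          # sentinel: no more work
                work.task_done()
                return
            v, level = item
            with lock:
                improved = level < dist.get(v, float("inf"))
                if improved:
                    dist[v] = level
            if improved:
                # A vertex may be relaxed more than once if a shorter
                # path arrives late; labels only ever decrease.
                for w in adj.get(v, ()):
                    work.put((w, level + 1))
            # Mark done only after children are queued, so join()
            # cannot return while work is still being generated.
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    work.put((source, 0))
    work.join()                       # wait until the queue drains
    for _ in threads:
        work.put(None)                # release every worker
    for t in threads:
        t.join()
    return dist
```

Because threads never wait for a whole level to finish, a thread stalled on slow data does not hold up the others, which is the latency-hiding property the abstract refers to. In CPython the global interpreter lock prevents this sketch from showing real speedup; it is meant only to show the control structure.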

Published In

SC '10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
November 2010
634 pages
ISBN: 9781424475599

Publisher

IEEE Computer Society, United States

Qualifiers

Article

Conference

SC '10

Acceptance Rates

SC '10 paper acceptance rate: 51 of 253 submissions (20%)
Overall acceptance rate: 1,516 of 6,373 submissions (24%)
