DOI: 10.1145/2247596.2247651

I/O cost minimization: reachability queries processing over massive graphs

Published: 27 March 2012

Abstract

Given a directed graph G, a reachability query (u, v) asks whether there exists a path from node u to node v in G. Existing studies support reachability queries using indexing techniques that require both the graph and the index to reside in main memory. However, they cannot handle reachability queries on massive graphs: when the graph and the index cannot be held entirely in memory, the I/O cost becomes too high. In this paper, we focus on how to minimize the I/O cost when answering reachability queries on massive graphs that cannot reside entirely in memory. First, we propose a new Yes-Label scheme, as a complement to the No-Label used in GRAIL [23], to reduce the number of intermediate results generated. Second, we show how to minimize the number of I/Os using a heap-on-disk data structure when traversing a graph. We also propose new methods to partition the heap-on-disk in order to ensure that only sequential I/Os are performed. Third, we analyze our approaches and show how to extend them to answer multiple reachability queries effectively. Finally, we conduct extensive performance studies on both large synthetic and large real graphs, and confirm the efficiency of our approaches.
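
As a rough illustration of the No-Label idea the abstract attributes to GRAIL [23], the sketch below assigns each node of a DAG one interval from a post-order traversal and uses interval containment as a negative filter: if the interval of v is not contained in the interval of u, then u cannot reach v. This is a minimal in-memory sketch, not the paper's method. It assumes the graph is already acyclic, uses a single traversal, and does not model the Yes-Label scheme or the heap-on-disk structure; the names build_no_labels, contains, and reachable are hypothetical.

```python
def build_no_labels(graph, nodes):
    """Assign each node an interval (low, rank) from one post-order DFS.

    Assumes `graph` is a DAG given as {node: [successors]}.
    rank is the node's post-order number; low is the smallest
    post-order number among the node and all of its descendants.
    """
    label = {}
    counter = 0

    def visit(u):
        nonlocal counter
        low = None
        for w in graph.get(u, ()):
            if w not in label:
                visit(w)
            low = label[w][0] if low is None else min(low, label[w][0])
        counter += 1
        label[u] = (counter if low is None else min(low, counter), counter)

    for u in nodes:
        if u not in label:
            visit(u)
    return label


def contains(outer, inner):
    """True if interval `inner` lies inside interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def reachable(graph, label, u, v):
    """Answer the query (u, v) with label-based pruning plus a DFS fallback."""
    if u == v:
        return True
    # Negative filter: non-containment proves that u cannot reach v.
    if not contains(label[u], label[v]):
        return False
    # Containment can be a false positive, so the graph is still traversed,
    # skipping every successor whose interval rules out reaching v.
    stack, seen = [u], {u}
    while stack:
        x = stack.pop()
        for w in graph.get(x, ()):
            if w == v:
                return True
            if w not in seen and contains(label[w], label[v]):
                seen.add(w)
                stack.append(w)
    return False


# Hypothetical example graph: a -> b -> d, a -> c -> d, e -> c
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "e": ["c"]}
labels = build_no_labels(graph, ["a", "b", "c", "d", "e"])
print(reachable(graph, labels, "a", "d"))
print(reachable(graph, labels, "c", "b"))
```

In this example the query (c, b) passes the containment test even though c cannot reach b, so the traversal is what settles it. Such false positives are the intermediate work that, according to the abstract, a complementary positive label is meant to reduce.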

References

[1]
R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In Proc. of SIGMOD'89, 1989.
[2]
K. Anyanwu and A. Sheth. ρ-queries: enabling querying for semantic associations on the semantic web. In Proc. of WWW'03, 2003.
[3]
R. Bramandia, B. Choi, and W. K. Ng. On incremental maintenance of 2-hop labeling of graphs. In Proc. of WWW'08, 2008.
[4]
L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on dags. In Proc. of VLDB'05, 2005.
[5]
Y. Chen and Y. Chen. An efficient algorithm for answering graph reachability queries. In Proc. of ICDE'08, 2008.
[6]
J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In Proc. of EDBT'06, 2006.
[7]
J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In Proc. of EDBT'08, 2008.
[8]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In Proc. of SODA'02, 2002.
[9]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT Press, 2001.
[10]
H. He, H. Wang, J. Yang, and P. S. Yu. Compact reachability labeling for graph-structured data. In Proc. of CIKM'05, 2005.
[11]
H. V. Jagadish. A compression technique to materialize transitive closure. ACM Trans. Database Syst., 15(4):558–598, 1990.
[12]
R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-HOP: A high-compression indexing scheme for reachability query. In Proc. of SIGMOD'09, 2009.
[13]
R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficiently answering reachability queries on very large directed graphs. In Proc. of SIGMOD'08, 2008.
[14]
L. Roditty and U. Zwick. A fully dynamic reachability algorithm for directed graphs with an almost linear update time. In Proc. of STOC'04, 2004.
[15]
R. Schenkel, A. Theobald, and G. Weikum. Hopi: An efficient connection index for complex XML document collections. In Proc. of EDBT'04, 2004.
[16]
R. Schenkel, A. Theobald, and G. Weikum. Efficient creation and incremental maintenance of the HOPI index for complex XML document collections. In Proc. of ICDE'05, 2005.
[17]
K. Simon. An improved algorithm for transitive closure on acyclic digraphs. Theor. Comput. Sci., 58(1–3):325–346, 1988.
[18]
S. Trißl and U. Leser. Fast and practical indexing and querying of very large graphs. In Proc. of SIGMOD'07, 2007.
[19]
J. van Helden, A. Naim, R. Mancuso, M. Eldridge, L. Wernisch, D. Gilbert, and S. Wodak. Representing and analysing molecular and cellular function using the computer. Journal of Biological Chemistry, 381(9–10), 2000.
[20]
S. J. van Schaik and O. de Moor. A memory efficient reachability data structure through bit vector compression. In Proc. of SIGMOD'11, 2011.
[21]
J. S. Vitter. Algorithms and data structures for external memory. Found. Trends Theor. Comput. Sci., 2:305–474, January 2008.
[22]
H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In Proc. of ICDE'06, 2006.
[23]
H. Yildirim, V. Chaoji, and M. J. Zaki. Grail: Scalable reachability index for large graphs. PVLDB, 3(1), 2010.


Published In

EDBT '12: Proceedings of the 15th International Conference on Extending Database Technology
March 2012
643 pages
ISBN: 9781450307901
DOI: 10.1145/2247596
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

EDBT '12

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%

Article Metrics

  • Downloads (last 12 months): 6
  • Downloads (last 6 weeks): 1
Reflects downloads up to 13 Jan 2025

Cited By

  • (2024) Path Querying in Graph Databases: A Systematic Mapping Study. IEEE Access 12, 33154-33172. DOI: 10.1109/ACCESS.2024.3371976. Online publication date: 2024
  • (2021) Fast Reachability Queries Answering based on RCN Reduction. IEEE Transactions on Knowledge and Data Engineering, 1-1. DOI: 10.1109/TKDE.2021.3108433. Online publication date: 2021
  • (2020) Graph Reachability on Parallel Many-Core Architectures. Computation 8:4, 103. DOI: 10.3390/computation8040103. Online publication date: 2-Dec-2020
  • (2020) Answering billion-scale label-constrained reachability queries within microsecond. Proceedings of the VLDB Endowment 13:6, 812-825. DOI: 10.14778/3380750.3380753. Online publication date: 11-Mar-2020
  • (2020) One Edge at a Time: A Novel Approach Towards Efficient Transitive Reduction Computation on DAGs. IEEE Access 8, 38010-38022. DOI: 10.1109/ACCESS.2020.2975650. Online publication date: 2020
  • (2018) Reachability querying. The VLDB Journal: The International Journal on Very Large Data Bases 27:1, 1-26. DOI: 10.1007/s00778-017-0468-3. Online publication date: 1-Feb-2018
  • (2017) Landmark Indexing for Evaluation of Label-Constrained Reachability Queries. Proceedings of the 2017 ACM International Conference on Management of Data, 345-358. DOI: 10.1145/3035918.3035955. Online publication date: 9-May-2017
  • (2017) Reachability Querying. IEEE Transactions on Knowledge and Data Engineering 29:3, 683-697. DOI: 10.1109/TKDE.2016.2631160. Online publication date: 1-Mar-2017
  • (2016) An Algorithm for All-Pairs Regular Path Problem on External Memory Graphs. IEICE Transactions on Information and Systems E99.D:4, 944-958. DOI: 10.1587/transinf.2015DAP0018. Online publication date: 2016
  • (2015) AILabel: A Fast Interval Labeling Approach for Reachability Query on Very Large Graphs. Web Technologies and Applications, 560-572. DOI: 10.1007/978-3-319-25255-1_46. Online publication date: 13-Nov-2015
