DOI: 10.1145/3299869.3319882
research-article

Efficiently Answering Regular Simple Path Queries on Large Labeled Networks

Published: 25 June 2019

    Abstract

    A fundamental query in labeled graphs is to determine whether there exists a path between given source and target vertices such that the path satisfies a given label constraint. One powerful way of specifying label constraints is through regular expressions, and the resulting problem of reachability queries under regular simple paths (RSP) forms the core of many practical graph query languages such as SPARQL from W3C, Cypher of Neo4J, Oracle's PGQL, and LDBC's G-CORE. Despite the importance of this problem, answering RSP queries is known to be NP-hard, and there are no scalable and practical solutions for answering reachability with the full range of regular expressions as constraints. In this paper, we circumvent this computational bottleneck by designing a random-walk-based sampling algorithm called ARRIVAL, which is backed by theoretical guarantees on its expected quality. Extensive experiments on billion-sized real graph datasets with thousands of labels show ARRIVAL to be 100 times faster than baseline strategies, with an average accuracy of 95%.
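
    The query type described above can be illustrated with a minimal Python sketch. This is not the ARRIVAL algorithm, whose details are not given on this page; it is a naive random-walk sampler written only to show what a regular simple path query asks. The names `random_walk_rsp`, `graph`, and `delta`, the toy graph, and the hand-written automaton for the expression `knows* worksAt` are all assumptions made for illustration.

```python
import random


def random_walk_rsp(graph, delta, start, accepting, source, target,
                    num_walks=100, max_len=20, seed=0):
    """Sample random simple walks from `source`, looking for one that ends at
    `target` and whose edge labels are accepted by the automaton.

    graph:     dict mapping node -> list of (label, neighbor) edges
    delta:     dict mapping (state, label) -> set of next automaton states
    start:     initial automaton state
    accepting: set of accepting automaton states

    Returns a list of nodes (a witness path) or None. A None result is not a
    proof of unreachability: the sampler can miss paths that exist.
    """
    rng = random.Random(seed)
    for _ in range(num_walks):
        node, states = source, {start}
        path, visited = [source], {source}
        while True:
            # Success: we stand on the target and the labels read so far match.
            if node == target and states & accepting:
                return path
            if len(path) - 1 >= max_len:
                break
            # Keep only edges that preserve simplicity (no revisits) and
            # that the automaton can still follow.
            moves = []
            for label, nxt in graph.get(node, []):
                if nxt in visited:
                    continue
                nxt_states = set()
                for q in states:
                    nxt_states |= delta.get((q, label), set())
                if nxt_states:
                    moves.append((nxt, nxt_states))
            if not moves:
                break  # dead end; start a fresh walk
            node, states = rng.choice(moves)
            path.append(node)
            visited.add(node)
    return None


# Hypothetical toy data: a tiny labeled graph and an automaton for
# the regular expression `knows* worksAt`.
graph = {
    "a": [("knows", "b"), ("likes", "e")],
    "b": [("knows", "c"), ("knows", "a")],
    "c": [("worksAt", "d")],
}
delta = {(0, "knows"): {0}, (0, "worksAt"): {1}}

print(random_walk_rsp(graph, delta, 0, {1}, "a", "d"))
```

    The sketch has one-sided error: any path it returns is a genuine simple path matching the expression, but a `None` result only means that no sampled walk succeeded.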



    Published In

    cover image ACM Conferences
    SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data
    June 2019
    2106 pages
    ISBN:9781450356435
    DOI:10.1145/3299869

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. knowledge graphs
    2. random walks
    3. reachability query
    4. regular expression
    5. regular path query

    Conference

    SIGMOD/PODS '19
    SIGMOD/PODS '19: International Conference on Management of Data
    June 30 - July 5, 2019
    Amsterdam, Netherlands

    Acceptance Rates

    SIGMOD '19 Paper Acceptance Rate 88 of 430 submissions, 20%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Cited By

    • (2024) Efficient Regular Simple Path Queries under Transitive Restricted Expressions. Proceedings of the VLDB Endowment 17:7 (1710-1722). DOI: 10.14778/3654621.3654636. Online publication date: 1-Mar-2024
    • (2024) LM-SRPQ: Efficiently Answering Regular Path Query in Streaming Graphs. Proceedings of the VLDB Endowment 17:5 (1047-1059). DOI: 10.14778/3641204.3641214. Online publication date: 1-Jan-2024
    • (2024) Materialized View Selection & View-Based Query Planning for Regular Path Queries. Proceedings of the ACM on Management of Data 2:3 (1-26). DOI: 10.1145/3654955. Online publication date: 30-May-2024
    • (2024) MWP: Multi-Window Parallel Evaluation of Regular Path Queries on Streaming Graphs. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639260. Online publication date: 26-Mar-2024
    • (2024) Answering Property Path Queries over Federated RDF Systems. Web and Big Data (16-31). DOI: 10.1007/978-981-97-2387-4_2. Online publication date: 28-Apr-2024
    • (2023) Integrating Connection Search in Graph Queries. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (2607-2620). DOI: 10.1109/ICDE55515.2023.00200. Online publication date: Apr-2023
    • (2023) A Reachability Index for Recursive Label-Concatenated Graph Queries. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (67-81). DOI: 10.1109/ICDE55515.2023.00013. Online publication date: Apr-2023
    • (2023) Colorful path detection in vertex-colored temporal. Network Science (1-17). DOI: 10.1017/nws.2023.17. Online publication date: 18-Aug-2023
    • (2023) PO-GNN: Position-observant inductive graph neural networks for position-based prediction. Information Processing & Management 60:3 (103333). DOI: 10.1016/j.ipm.2023.103333. Online publication date: May-2023
    • (2023) Answering reachability queries with ordered label constraints over labeled graphs. Frontiers of Computer Science 18:1. DOI: 10.1007/s11704-022-2368-y. Online publication date: 12-Aug-2023