DOI: 10.1145/3318464.3389733
research-article

Regular Path Query Evaluation on Streaming Graphs

Published: 31 May 2020

Abstract

We study persistent query evaluation over streaming graphs, which is becoming increasingly important. We focus on navigational queries that determine if there exists a path between two entities that satisfies a user-specified constraint. We adopt the Regular Path Query (RPQ) model that specifies navigational patterns with labeled constraints. We propose deterministic algorithms to efficiently evaluate persistent RPQs under both arbitrary and simple path semantics in a uniform manner. Experimental analysis on real and synthetic streaming graphs shows that the proposed algorithms can process up to tens of thousands of edges per second and efficiently answer RPQs that are commonly used in real-world workloads.
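To make the RPQ model concrete, the following is a minimal sketch of evaluating a single RPQ on a static snapshot of a labeled graph under arbitrary path semantics. It is not the streaming algorithm proposed in the paper; it only illustrates the standard idea of traversing the graph and a deterministic finite automaton (DFA) for the query in lockstep. The function name `evaluate_rpq`, the edge-triple layout, the dictionary encoding of the DFA, and the example labels are all assumptions made for illustration.

```python
from collections import deque


def evaluate_rpq(edges, transitions, start_state, accepting, source):
    """Return every vertex reachable from `source` by a path whose label
    sequence is accepted by the DFA (arbitrary path semantics).

    edges       -- iterable of (u, label, v) triples (hypothetical layout)
    transitions -- dict mapping (state, label) -> next state
    accepting   -- set of accepting DFA states
    """
    # Index outgoing edges by source vertex.
    adjacency = {}
    for u, label, v in edges:
        adjacency.setdefault(u, []).append((label, v))

    # Breadth-first search over (vertex, DFA state) pairs. Each pair is
    # visited at most once, so the search terminates on cyclic graphs.
    start = (source, start_state)
    seen = {start}
    queue = deque([start])
    results = set()
    while queue:
        vertex, state = queue.popleft()
        if state in accepting:
            results.add(vertex)
        for label, target in adjacency.get(vertex, []):
            next_state = transitions.get((state, label))
            if next_state is None:
                continue  # the query has no transition on this label
            pair = (target, next_state)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return results


# Hypothetical example: the query `follows+ mentions` as a three-state DFA.
EXAMPLE_EDGES = [
    ("x", "follows", "y"),
    ("y", "follows", "z"),
    ("z", "follows", "x"),  # cycle, to show that the search still terminates
    ("z", "mentions", "w"),
    ("y", "likes", "q"),
]
EXAMPLE_DFA = {
    (0, "follows"): 1,
    (1, "follows"): 1,
    (1, "mentions"): 2,
}

print(evaluate_rpq(EXAMPLE_EDGES, EXAMPLE_DFA, 0, {2}, "x"))  # → {'w'}
```

Because visited (vertex, state) pairs are never revisited, a vertex may appear on a matching path more than once, which is what arbitrary path semantics permits. Simple path semantics, which forbids repeated vertices, needs additional bookkeeping that this sketch does not attempt.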

Supplementary Material

MP4 File (3318464.3389733.mp4)
Presentation Video

Published In

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
June 2020
2925 pages
ISBN:9781450367356
DOI:10.1145/3318464
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. persistent query evaluation
  2. regular path queries
  3. streaming graphs

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '20
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 94
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 09 Nov 2024

Cited By

  • (2024) LSQ 2.0: A linked dataset of SPARQL query logs. Semantic Web 15:1 (167-189). DOI: 10.3233/SW-223015. Online publication date: 12-Jan-2024
  • (2024) D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks. Proceedings of the VLDB Endowment 17:11 (2764-2777). DOI: 10.14778/3681954.3681961. Online publication date: 1-Jul-2024
  • (2024) Incremental Sliding Window Connectivity over Streaming Graphs. Proceedings of the VLDB Endowment 17:10 (2473-2486). DOI: 10.14778/3675034.3675040. Online publication date: 1-Jun-2024
  • (2024) Truss-Based Community Search over Streaming Directed Graphs. Proceedings of the VLDB Endowment 17:8 (1816-1829). DOI: 10.14778/3659437.3659440. Online publication date: 1-Apr-2024
  • (2024) Efficient Regular Simple Path Queries under Transitive Restricted Expressions. Proceedings of the VLDB Endowment 17:7 (1710-1722). DOI: 10.14778/3654621.3654636. Online publication date: 1-Mar-2024
  • (2024) LM-SRPQ: Efficiently Answering Regular Path Query in Streaming Graphs. Proceedings of the VLDB Endowment 17:5 (1047-1059). DOI: 10.14778/3641204.3641214. Online publication date: 1-Jan-2024
  • (2024) Querying Structural Diversity in Streaming Graphs. Proceedings of the VLDB Endowment 17:5 (1034-1046). DOI: 10.14778/3641204.3641213. Online publication date: 2-May-2024
  • (2024) MWP: Multi-Window Parallel Evaluation of Regular Path Queries on Streaming Graphs. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639260. Online publication date: 26-Mar-2024
  • (2024) An Overview of Continuous Querying in (Modern) Data Systems. Companion of the 2024 International Conference on Management of Data (605-612). DOI: 10.1145/3626246.3654679. Online publication date: 9-Jun-2024
  • (2024) CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (4030-4043). DOI: 10.1109/ICDE60146.2024.00309. Online publication date: 13-May-2024
