Implementation Strategies for Views over Property Graphs

Published: 30 May 2024

    Abstract

    The need to query complex interactions and relationships has motivated interest in property graph database platforms. Some graph applications require graph views to abstract the data, e.g., to capture individual-level vs. organization-level relationships, or to show single computational steps vs. composite workflows. Emerging efforts to standardize graph query languages have developed semantics and language constructs for graph views.
    This paper considers the task of implementing such views using rewriting techniques, both on existing property graph DBMSs and by converting to relational DBMSs. We consider both virtual and materialized views, ways of rewriting queries, and structures for indexing data. We also note a common use case of graph views, which involves preserving a graph except for minor local transformations; we develop novel extensions and semantics for this use case. We evaluate and compare the performance of our techniques under a variety of workloads, and we compare existing graph and relational DBMS platforms.
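
To make the distinction between virtual and materialized graph views concrete, here is a minimal sketch in Python. It is not the paper's method: the toy graph, the `org` property, the `EMAILS` edge label, and all function names are hypothetical, chosen only to mirror the individual-level vs. organization-level example in the abstract. A virtual view re-evaluates its definition on every query, whereas a materialized view stores the result and must be maintained when the base graph changes.

```python
from collections import defaultdict

# Hypothetical toy property graph (not from the paper): nodes carry
# properties, edges are (source, label, target) triples.
nodes = {
    "alice": {"label": "Person", "org": "acme"},
    "bob": {"label": "Person", "org": "acme"},
    "carol": {"label": "Person", "org": "globex"},
}
edges = [
    ("alice", "EMAILS", "carol"),
    ("bob", "EMAILS", "carol"),
    ("alice", "EMAILS", "bob"),
]


def org_view_edges(nodes, edges):
    """View definition: one organization-level edge per ordered pair of
    distinct organizations, annotated with the number of underlying
    individual-level edges."""
    counts = defaultdict(int)
    for src, _label, dst in edges:
        a, b = nodes[src]["org"], nodes[dst]["org"]
        if a != b:
            counts[(a, b)] += 1
    return dict(counts)


def query_virtual(src_org, dst_org):
    """Virtual view: nothing is stored; each query re-evaluates the
    view definition against the current base graph."""
    return org_view_edges(nodes, edges).get((src_org, dst_org), 0)


# Materialized view: the view result is computed once and stored.
materialized = org_view_edges(nodes, edges)


def add_edge(src, label, dst):
    """Insert a base edge and incrementally maintain the stored view."""
    edges.append((src, label, dst))
    a, b = nodes[src]["org"], nodes[dst]["org"]
    if a != b:
        materialized[(a, b)] = materialized.get((a, b), 0) + 1


add_edge("carol", "EMAILS", "bob")
print(query_virtual("acme", "globex"))
print(materialized[("globex", "acme")])
```

In this sketch the virtual view pays the full evaluation cost on every query, while the materialized view pays a small maintenance cost on every update. This is the general trade-off between the two kinds of view, shown here only on a toy in-memory graph.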

    Published In

    Proceedings of the ACM on Management of Data  Volume 2, Issue 3
    SIGMOD
    June 2024
    1953 pages
    EISSN:2836-6573
    DOI:10.1145/3670010
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Best Paper

    Author Tags

    1. graph databases
    2. indexing
    3. storage
    4. views

    Qualifiers

    • Research-article
