research-article
Open access

PG-Schema: Schemas for Property Graphs

Published: 20 June 2023
  • Abstract

    Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It is anticipated that the second version of the GQL Standard will include a rich DDL. Aiming to inspire the development of GQL and enhance the capabilities of graph database systems, we propose PG-Schema, a simple yet powerful formalism for specifying property graph schemas. It features flexible type definitions supporting multi-inheritance, as well as expressive constraints based on the recently proposed PG-Keys formalism. We provide the formal syntax and semantics of PG-Schema, which meet principled design requirements grounded in contemporary property graph management scenarios, and offer a detailed comparison of its features with those of existing schema languages and graph database systems.

    Supplemental Material

    MP4 File
    Presentation video for SIGMOD 2023 Industrial Track.


    Published In

    Proceedings of the ACM on Management of Data, Volume 1, Issue 2
    PACMMOD
    June 2023
    2310 pages
    EISSN: 2836-6573
    DOI: 10.1145/3605748
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Best Industry Paper

    Author Tags

    1. graph databases
    2. property graphs
    3. schemas

    Qualifiers

    • Research-article

    Data Availability

    Presentation video for SIGMOD 2023 Industrial Track. https://dl.acm.org/doi/10.1145/3589778#PACMMOD-V1mod198.mp4

    Article Metrics

    • Downloads (last 12 months): 1,120
    • Downloads (last 6 weeks): 161

    Cited By

    • (2024) AeonG: An Efficient Built-in Temporal Support in Graph Databases. Proceedings of the VLDB Endowment 17:6 (1515-1527). DOI: 10.14778/3648160.3648187. Online publication date: 3-May-2024
    • (2024) Implementation Strategies for Views over Property Graphs. Proceedings of the ACM on Management of Data 2:3 (1-26). DOI: 10.1145/3654949. Online publication date: 30-May-2024
    • (2024) Querying Graph Databases at Scale. Companion of the 2024 International Conference on Management of Data (585-589). DOI: 10.1145/3626246.3654695. Online publication date: 9-Jun-2024
    • (2024) GraphScope Flex: LEGO-like Graph Computing Stack. Companion of the 2024 International Conference on Management of Data (386-399). DOI: 10.1145/3626246.3653383. Online publication date: 9-Jun-2024
    • (2024) Knowledge Graphs in Practice: Characterizing their Users, Challenges, and Visualization Opportunities. IEEE Transactions on Visualization and Computer Graphics 30:1 (584-594). DOI: 10.1109/TVCG.2023.3326904. Online publication date: 1-Jan-2024
    • (2024) Implementing Object-Centric Event Data Models in Event Knowledge Graphs. Process Mining Workshops (431-443). DOI: 10.1007/978-3-031-56107-8_33. Online publication date: 13-Apr-2024
    • (2023) Normalizing Property Graphs. Proceedings of the VLDB Endowment 16:11 (3031-3043). DOI: 10.14778/3611479.3611506. Online publication date: 1-Jul-2023
    • (2023) A Conceptual Modeling Approach for Risk Assessment and Mitigation in Collision-Free UAV Routing Planning for Beyond-the-Visual-Line-of-Sight Flights. Conceptual Modeling (394-411). DOI: 10.1007/978-3-031-47262-6_21. Online publication date: 6-Nov-2023
    • (2023) Knowledge Engineering in the Era of Artificial Intelligence. Advances in Databases and Information Systems (3-15). DOI: 10.1007/978-3-031-42914-9_1. Online publication date: 4-Sep-2023
