research-article

TypeQL: A Type-Theoretic & Polymorphic Query Language

Published: 14 May 2024

Abstract

Relational data modeling can often be restrictive as it provides no direct facility for modeling polymorphic types, reified relations, multi-valued attributes, and other common high-level structures in data. This creates many challenges in data modeling and engineering tasks, and has led to the rise of more flexible NoSQL databases, such as graph and document databases. In the absence of structured schemas, however, we can neither express nor validate the intention of data models, making long-term maintenance of databases substantially more difficult. To resolve this dilemma, we argue that, parallel to the role of classical predicate logic for relational algebra, contemporary foundations of mathematics rooted in type theory can guide us in the development of powerful new high-level data models and query languages. To this end, we introduce a new polymorphic entity-relation-attribute (PERA) data model, grounded in type-theoretic principles and accessible through classical conceptual modeling, with a near-natural query language: TypeQL. We illustrate the syntax of TypeQL as well as its denotation in the PERA model, formalize our model as an algebraic theory with dependent types, and describe its stratified semantics.
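As a rough illustration of the three structures the abstract names (polymorphic types, reified relations, and multi-valued attributes), the following Python sketch may help. It is not TypeQL and not the PERA formalization; every class and function name here (`Entity`, `Person`, `Company`, `Employment`, `names_of`) is a hypothetical stand-in chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical base entity type; subclasses form the type hierarchy."""
    name: str
    # Multi-valued attribute: one entity may own several email values.
    emails: set[str] = field(default_factory=set)


@dataclass
class Person(Entity):
    pass


@dataclass
class Company(Entity):
    pass


@dataclass
class Employment:
    """Reified relation: a first-class object with role players and its own attribute."""
    employee: Person
    employer: Company
    start_year: int


def names_of(kind: type, entities: list[Entity]) -> list[str]:
    """Polymorphic lookup: matches instances of `kind` and of all its subtypes."""
    return sorted(e.name for e in entities if isinstance(e, kind))


alice = Person("Alice", {"alice@example.org", "a@example.org"})
acme = Company("Acme")
job = Employment(alice, acme, 2020)
entities = [alice, acme]
```

Calling `names_of(Entity, entities)` returns both names, while `names_of(Person, entities)` returns only `Alice`. A flat relational table has no direct counterpart to this subtype-aware lookup, which is the kind of gap the abstract describes.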


Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 2
PODS
May 2024
852 pages
EISSN: 2836-6573
DOI: 10.1145/3665155
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. data models
  2. ER modeling
  3. formal semantics
  4. logic programming
  5. polymorphism
  6. query languages
  7. type theory
  8. type-theoretic databases
