UnQL: a query language and algebra for semistructured data based on structural recursion

Published: 01 March 2000

Abstract

This paper presents structural recursion as the basis of the syntax and semantics of query languages for semistructured data and XML. We describe a simple and powerful query language based on pattern matching and show that it can be expressed using structural recursion, which is introduced as a top-down, recursive function, similar to the way XSL is defined on XML trees. On cyclic data, structural recursion can be defined in two equivalent ways: as a recursive function which evaluates the data top-down and remembers all its calls to avoid infinite loops, or as a bulk evaluation which processes the entire data in parallel using only traditional relational algebra operators. The latter makes it possible for optimization techniques in relational queries to be applied to structural recursion. We show that the composition of two structural recursion queries can be expressed as a single such query, and this is used as the basis of an optimization method for mediator systems. Several other formal properties are established: structural recursion can be expressed in first-order logic extended with transitive closure; its data complexity is PTIME; and over relational data it is a conservative extension of the relational calculus. The underlying data model is based on value equality, formally defined with bisimulation. Structural recursion is shown to be invariant with respect to value equality.
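
The two evaluation strategies mentioned in the abstract can be illustrated with a small Python sketch. It is not the paper's formulation. The graph encoding (a dict from each node to a list of (label, target) pairs), the function names top_down and bulk, and the restriction to a pure edge relabeling are all assumptions made for illustration. The bulk version is a plain pass over the edge set, standing in for the relational algebra evaluation the paper describes.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Node = Hashable
Graph = Dict[Node, List[Tuple[str, Node]]]


def top_down(graph: Graph, root: Node,
             relabel: Callable[[str], str]) -> Tuple[Graph, Node]:
    """Recursive evaluation that remembers its calls (hypothetical helper).

    Each input node n is mapped to an output node ("f", n). The mapping is
    recorded before recursing, so a cycle in the input leads back to an
    already known output node instead of an infinite loop.
    """
    memo: Dict[Node, Node] = {}
    out: Graph = {}

    def visit(node: Node) -> Node:
        if node in memo:               # call already made: reuse its result
            return memo[node]
        new_node = ("f", node)
        memo[node] = new_node          # remember the call before descending
        out[new_node] = []
        for label, target in graph.get(node, []):
            out[new_node].append((relabel(label), visit(target)))
        return new_node

    return out, visit(root)


def bulk(graph: Graph, root: Node,
         relabel: Callable[[str], str]) -> Tuple[Graph, Node]:
    """Bulk evaluation: transform every edge independently (hypothetical helper).

    No traversal order is needed; each edge (u, label, v) becomes
    (("f", u), relabel(label), ("f", v)), much like a projection over an
    edge relation.
    """
    nodes = {root} | set(graph) | {t for edges in graph.values() for _, t in edges}
    out: Graph = {("f", n): [] for n in nodes}
    for source, edges in graph.items():
        for label, target in edges:
            out[("f", source)].append((relabel(label), ("f", target)))
    return out, ("f", root)


# A two-node cycle: root --a--> n1 --b--> root
cyclic: Graph = {"root": [("a", "n1")], "n1": [("b", "root")]}

result_td, root_td = top_down(cyclic, "root", str.upper)
result_bulk, root_bulk = bulk(cyclic, "root", str.upper)
```

On this input every node is reachable from the root, so both functions build the same output graph. In general the bulk pass also produces output nodes for unreachable input nodes, which can be discarded afterwards.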

Published In

The VLDB Journal — The International Journal on Very Large Data Bases, Volume 9, Issue 1
March 2000
110 pages

Publisher

Springer-Verlag
Berlin, Heidelberg

Author Tags

1. Optimization
2. Query language
3. Semistructured data
4. Structural recursion
5. XML
6. XSL

Qualifiers

• Article
