Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/1514894.1514916acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article
Free access

Efficient asymmetric inclusion between regular expression types

Published: 23 March 2009 Publication History

Abstract

The inclusion of Regular Expressions (REs) is the kernel of any subtype checking algorithm for XML schema languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In [9] we introduced a notion of "conflict-free REs", which are extended REs with excellent complexity behaviour, including a cubic inclusion algorithm [9] and linear membership [10]. Conflict-free REs have interleaving and counting, but the complexity is tamed by the "conflict-free" limitations, which have been found to be satisfied by the vast majority of the content models published on the Web.
However, the most important use of subtype checking is in the context of type-cheching of XML manipulation languges. A type checker works by testing the inclusion of inferred subtypes in declared supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the inferred subtype, whose shape is difficult to constrain.
We show here that the PTIME inclusion algorithm can be actually extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free. This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [5]). The result is extremely surprising, since we had previously found that asymmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here.

References

[1]
D. Barbosa, G. Leighton, and A. Smith. Efficient incremental validation of XML documents after composite updates. In XSym, volume 4156 of LNCS, pages 107--121. Springer, 2006.
[2]
M. A. Bender and M. Farach-Colton. The LCA problem revisited. In G. H. Gonnet, D. Panario, and A. Viola, editors, LATIN, volume 1776 of Lecture Notes in Computer Science, pages 88--94. Springer, 2000.
[3]
G. J. Bex, F. Neven, T. Schwentick, and K. Tuyls. Inference of concise dtds from xml data. In U. Dayal, K.-Y. Whang, D. B. Lomet, G. Alonso, G. M. Lohman, M. L. Kersten, S. K. Cha, and Y.-K. Kim, editors, Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 12--15, 2006, pages 115--126. ACM, 2006.
[4]
B. Choi. What are real DTDs like? In WebDB, pages 43--48, 2002.
[5]
D. Colazzo, G. Ghelli, and C. Sartiani. Efficient inclusion for a class of xml types with interleaving and counting. Information Systems, 2008. To Appear.
[6]
D. Colazzo and C. Sartiani. Efficient subtyping for unordered XML types. Technical report, Dipartimento di Informatica - Università di Pisa, 2007.
[7]
J. N. Foster, B. C. Pierce, and A. Schmitt. A logic your typechecker can count on: Unordered tree types in practice. In Workshop on Programming Language Technologies for XML (PLAN-X), informal proceedings, Jan. 2007.
[8]
W. Gelade, W. Martens, and F. Neven. Optimizing schema languages for xml: Numerical constraints and interleaving. In T. Schwentick and D. Suciu, editors, Proceedings of the 11th International Conference on Database Theory - ICDT 2007, Barcelona, Spain, January 10--12, 2007, volume 4353 of Lecture Notes in Computer Science, pages 269--283. Springer, 2007.
[9]
G. Ghelli, D. Colazzo, and C. Sartiani. Efficient inclusion for a class of XML types with interleaving and counting. In M. Arenas and M. I. Schwartzbach, editors, Proceedings of the 11th International Symposium on Database Programming Languages, DBPL 2007, Vienna, Austria, September 23--24, 2007, Revised Selected Papers, volume 4797 of Lecture Notes in Computer Science, pages 231--245. Springer, 2007.
[10]
G. Ghelli, D. Colazzo, and C. Sartiani. Linear time membership for a class of XML types with interleaving and counting. In ACM Conference on Information and Knowledge Management (CIKM), 2008.
[11]
W. Martens, F. Neven, T. Schwentick, and G. J. Bex. Expressiveness and complexity of xml schema. ACM Trans. Database Syst., 31(3):770--813, 2006.
[12]
A. J. Mayer and L. J. Stockmeyer. Word problems -- this time with interleaving. Inf. Comput., 115(2):293--311, 1994.

Cited By

View all
  • (2020)Inclusion algorithms for one-unambiguous regular expressions and their applicationsScience of Computer Programming10.1016/j.scico.2020.102436(102436)Online publication date: Mar-2020
  • (2017)Linear Time Membership in a Class of Regular Expressions with Counting, Interleaving, and Unordered ConcatenationACM Transactions on Database Systems10.1145/313270142:4(1-44)Online publication date: 13-Nov-2017
  • (2013)Almost-linear inclusion for XML regular expression typesACM Transactions on Database Systems (TODS)10.1145/2508020.250802238:3(1-45)Online publication date: 5-Sep-2013
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICDT '09: Proceedings of the 12th International Conference on Database Theory
March 2009
334 pages
ISBN:9781605584232
DOI:10.1145/1514894
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 March 2009

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. XML
  2. language inclusion
  3. regular expressions

Qualifiers

  • Research-article

Conference

EDBT/ICDT '09
EDBT/ICDT '09: EDBT/ICDT '09 joint conference
March 23 - 25, 2009
St. Petersburg, Russia

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)72
  • Downloads (Last 6 weeks)10
Reflects downloads up to 23 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2020)Inclusion algorithms for one-unambiguous regular expressions and their applicationsScience of Computer Programming10.1016/j.scico.2020.102436(102436)Online publication date: Mar-2020
  • (2017)Linear Time Membership in a Class of Regular Expressions with Counting, Interleaving, and Unordered ConcatenationACM Transactions on Database Systems10.1145/313270142:4(1-44)Online publication date: 13-Nov-2017
  • (2013)Almost-linear inclusion for XML regular expression typesACM Transactions on Database Systems (TODS)10.1145/2508020.250802238:3(1-45)Online publication date: 5-Sep-2013
  • (2013)The complexity of regular expressions and property paths in SPARQLACM Transactions on Database Systems10.1145/249452938:4(1-39)Online publication date: 4-Dec-2013
  • (2012)The complexity of evaluating path expressions in SPARQLProceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems10.1145/2213556.2213573(101-112)Online publication date: 21-May-2012
  • (2012)Weak inclusion for recursive XML typesProceedings of the 17th international conference on Implementation and Application of Automata10.1007/978-3-642-31606-7_7(78-89)Online publication date: 17-Jul-2012
  • (2011)Weak inclusion for XML typesProceedings of the 16th international conference on Implementation and application of automata10.5555/2032366.2032372(30-41)Online publication date: 13-Jul-2011
  • (2011)Precision and complexity of XQuery type inferenceProceedings of the 13th international ACM SIGPLAN symposium on Principles and practices of declarative programming10.1145/2003476.2003490(89-100)Online publication date: 20-Jul-2011
  • (2011)Schemas for safe and efficient XML processingProceedings of the 2011 IEEE 27th International Conference on Data Engineering10.1109/ICDE.2011.5767960(1378-1379)Online publication date: 11-Apr-2011
  • (2011)Weak Inclusion for XML TypesImplementation and Application of Automata10.1007/978-3-642-22256-6_4(30-41)Online publication date: 2011
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media