DOI: 10.5555/1287369.1287455

XMark: a benchmark for XML data management

Published: 20 August 2002

Abstract

While standardization efforts for XML query languages have been progressing, researchers and users increasingly focus on the database technology that has to deliver on the new challenges that the abundance of XML documents poses to data management: validation, performance evaluation and optimization of XML query processors are the upcoming issues. Following a long tradition in database research, we provide a framework to assess the abilities of an XML database to cope with a broad range of different query types typically encountered in real-world scenarios. The benchmark can help both implementors and users to compare XML databases in a standardized application scenario. To this end, we offer a set of queries where each query is intended to challenge a particular aspect of the query processor. The overall workload we propose consists of a scalable document database and a concise, yet comprehensive set of queries which covers the major aspects of XML query processing ranging from textual features to data analysis queries and ad hoc queries. We complement our research with results we obtained from running the benchmark on several XML database platforms. These results are intended to give a first baseline and illustrate the state of the art.


Published In

VLDB '02: Proceedings of the 28th international conference on Very Large Data Bases
August 2002
1110 pages

Publisher

VLDB Endowment
