Abstract
A key problem that arises in the context of storing XML documents in relational databases is that of finding an optimal relational decomposition for a given set of XML documents and a given set of XML queries over those documents. While there have been a number of ad hoc solutions proposed for this problem, to our knowledge this paper represents a first step toward formalizing the problem and studying its complexity. It turns out that to even define what one means by an optimal decomposition, one first needs to specify an algorithm to translate XML queries to relational queries, and a cost model to evaluate the quality of the resulting relational queries. By examining an interesting problem embedded in choosing a relational decomposition, we show that choices of different translation algorithms and cost models result in very different complexities for the resulting optimization problems. Our results suggest that, contrary to the trend in previous work, the eventual development of practical algorithms for finding relational decompositions for XML workloads will require judicious choices of cost models and translation algorithms, rather than an exclusive focus on the decomposition problem in isolation.
Research supported in part by NSF grants CSA-9623632, ITR-0086002, CCR- 9634665 and CCR-0208013
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
P. Alimonti and V. Kann. Hardness of approximating problems on cubic graphs. In Proc. 3rd Italian Conf. on Algorithms and Complexity, Lecture Notes in Computer Science, 1203, pages 288–298. Springer-Verlag, 1997.
R. Bar-Yehuda and S. Even. A local-ratio theorem for approximating the weighted vertex cover problem. Annals of Discrete Mathematics, 25:27–46, 1985.
P. Bohannon, J. Freire, P. Roy, and J. Simeon. From xml schema to relations: A cost-based approach to xml storage. In ICDE, 2002.
A. Deutsch, M. Fernandez, and D. Suciu. Storing semistructured data with stored. In SIGMOD, pages 431–442, 1999.
I. Dinur and S. Safra. The importance of being biased. In Proceedings of the thiryfourth annual ACM symposium on Theory of computing, pages 33–42. ACM Press, 2002.
D. Florescu and D. Kossman. Storing and querying xml data using an rdbms. In Data Engineering Bulletin, volume 22, 1999.
M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, 1979.
C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
Y. Sagiv and M. Yannakakis. Equivalences among relational expressions with the union and difference operators. Journal of the ACM (JACM), 27(4):633–655, 1980.
A. Schmidt, M. Kersten, M. Windhouwer, and F. Waas. Efficient relational storage and retrieval of xml documents. Lecture Notes in Computer Science, 1997, 2001.
A. R. Schmidt, F. Waas, M. L. Kersten, D. Florescu, I. Manolescu, M. J. Carey, and R. Busse. The XML Benchmark Project. Technical Report INS-R0103, CWI, Amsterdam, The Netherlands, April 2001.
J. Shanmugasundaram, K. Tufte, G. He, C. Zhang, D. DeWitt, and J. Naughton. Relational databases for querying xml documents: Limitations and opportunities. In Proceedings of the VLDB Conference, 1999.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krishnamurthy, R., Chakaravarthy, V.T., Naughton, J.F. (2003). On the Difficulty of Finding Optimal Relational Decompositions for XML Workloads: A Complexity Theoretic Perspective. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds) Database Theory — ICDT 2003. ICDT 2003. Lecture Notes in Computer Science, vol 2572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36285-1_18
Download citation
DOI: https://doi.org/10.1007/3-540-36285-1_18
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00323-6
Online ISBN: 978-3-540-36285-2
eBook Packages: Springer Book Archive