Storing and Querying of XML Documents Without Redundant Path Information

  • Conference paper
Computational Science and Its Applications - ICCSA 2006 (ICCSA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3981)

Abstract

We propose an improved approach for storing and querying a large volume of XML documents in a relational database. The approach removes redundant path information and builds an inverted index on the reduced path information. To store an XML document in a relational database, the document is decomposed into nodes according to its tree structure, and each node is stored in relational tables together with the path from the root node to that node. Existing XML storage methods based on the relational data model usually store path information for every node, which increases storage overhead and degrades query processing performance as the data volume grows. Our approach stores path information only for the leaf nodes of the XML tree and derives the path information of internal nodes from the leaf node paths. In this manner, it reduces the data volume for large collections of XML documents to some degree, and it also reduces the size of the inverted index on the path information, because the keywords require a smaller number of posting lists. We show the effectiveness of the approach through several experiments that compare its XPath query performance with that of existing methods.
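The idea of keeping only leaf node paths can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the relational schema or the derivation procedure, so the sketch assumes that an internal node path is simply a proper prefix of some stored leaf path, and that the inverted index maps element names (the keywords) to the identifiers of the stored paths containing them. The function names `leaf_paths`, `internal_paths`, and `build_inverted_index` are hypothetical, and text stored directly in internal nodes (mixed content) is ignored.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def leaf_paths(xml_text):
    """Return the root-to-leaf label paths of a document, in document order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:  # leaf element: the only kind of path that is stored
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def internal_paths(stored_leaf_paths):
    """Derive internal node paths as the proper prefixes of the stored leaf paths.

    Assumption: this prefix rule stands in for the paper's derivation step.
    """
    derived = set()
    for path in stored_leaf_paths:
        steps = path.strip("/").split("/")
        for i in range(1, len(steps)):
            derived.add("/" + "/".join(steps[:i]))
    return derived


def build_inverted_index(stored_leaf_paths):
    """Map each element name to the ids of the stored paths that contain it."""
    index = defaultdict(set)
    for path_id, path in enumerate(stored_leaf_paths):
        for name in path.strip("/").split("/"):
            index[name].add(path_id)
    return index


doc = (
    "<book><title>XML</title>"
    "<author><name>Lee</name><email>a@b.c</email></author></book>"
)
stored = sorted(set(leaf_paths(doc)))   # distinct leaf paths only
print(stored)
print(sorted(internal_paths(stored)))   # recovered without being stored
print(sorted(build_inverted_index(stored)["author"]))
```

In this toy document only three distinct leaf paths are stored, while the internal paths `/book` and `/book/author` are recovered on demand. Because the index is built over fewer stored paths, each element name has fewer postings to maintain, which is the kind of saving the abstract describes.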

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeong, BS., Lee, YK. (2006). Storing and Querying of XML Documents Without Redundant Path Information. In: Gavrilova, M.L., et al. Computational Science and Its Applications - ICCSA 2006. ICCSA 2006. Lecture Notes in Computer Science, vol 3981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751588_53

  • DOI: https://doi.org/10.1007/11751588_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34072-0

  • Online ISBN: 978-3-540-34074-4

  • eBook Packages: Computer Science, Computer Science (R0)
