Efficient Storage and Temporal Query Evaluation in Hierarchical Data Archiving Systems

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6809))

Abstract

Data archiving is widely used in many fields for data backup and analysis. Although comprehensive application software, new computing and storage technologies, and the Internet have made it easier to create, collect, and store all types of data, storing, accessing, and managing database archives meaningfully and cost-effectively remains extremely challenging. In this paper, we focus on hierarchical data archiving, which is widely used in scientific fields and in web data management. First, we propose a novel compaction scheme for archiving hierarchical data. By compacting both data and timestamps, our scheme substantially reduces not only the storage required but also the incremental archiving time. Second, we design a temporal query language to support data retrieval from the compact data archives. Third, because compacting data and timestamps may add significant overhead to query evaluation, we investigate how to reduce this overhead by exploiting the characteristics of the queries and of the archived hierarchical data. Finally, we conduct extensive experiments to demonstrate the effectiveness and efficiency of both our storage and query optimization techniques.
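To make the idea of compacting timestamps in a hierarchical archive concrete, the following is a minimal Python sketch. It is not the scheme proposed in the paper, whose details are not given in the abstract. It assumes a generic merged-archive model in which each element stores the versions where it occurs, collapsed into inclusive intervals, and in which a snapshot query retrieves the tree as it existed at a given version. The names `compact_versions`, `alive_at`, and `ArchiveNode` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


def compact_versions(versions):
    """Collapse version numbers into sorted, inclusive (start, end) intervals."""
    intervals = []
    for v in sorted(set(versions)):
        if intervals and v == intervals[-1][1] + 1:
            # v extends the most recent interval by one version.
            intervals[-1] = (intervals[-1][0], v)
        else:
            intervals.append((v, v))
    return intervals


def alive_at(intervals, version):
    """Return True if `version` falls inside any of the intervals."""
    return any(start <= version <= end for start, end in intervals)


@dataclass
class ArchiveNode:
    """Hypothetical archive element: one node shared by all versions."""
    label: str
    versions: list = field(default_factory=list)
    children: dict = field(default_factory=dict)

    def merge(self, path, version):
        """Record that the element reached via `path` exists in `version`."""
        self.versions.append(version)
        if path:
            child = self.children.setdefault(path[0], ArchiveNode(path[0]))
            child.merge(path[1:], version)

    def snapshot(self, version):
        """Return the subtree present at `version` as nested dicts, or None."""
        if not alive_at(compact_versions(self.versions), version):
            return None
        result = {}
        for label, child in self.children.items():
            sub = child.snapshot(version)
            if sub is not None:
                result[label] = sub
        return result
```

In this sketch, an element that persists across many consecutive versions costs a single interval rather than one timestamp per version, which is the kind of saving the abstract attributes to timestamp compaction. The snapshot query also shows where evaluation overhead can arise: every visited node must have its intervals checked.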

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, H., Liu, R., Theodoratos, D., Wu, X. (2011). Efficient Storage and Temporal Query Evaluation in Hierarchical Data Archiving Systems. In: Bayard Cushing, J., French, J., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science, vol 6809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22351-8_7

  • DOI: https://doi.org/10.1007/978-3-642-22351-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22350-1

  • Online ISBN: 978-3-642-22351-8

  • eBook Packages: Computer Science, Computer Science (R0)
