Coding-based Join Algorithms for Structural Queries on Graph-Structured XML Document

World Wide Web

Abstract

In many applications, XML documents need to be modelled as graphs. Query processing over graph-structured XML documents brings new challenges. In this paper, we design a method based on labelling schemes for processing structural queries on graph-structured XML documents. We assign each node a set of labels drawn from a reachability labelling scheme. By extending the interval-based reachability labelling scheme for DAGs by Agrawal et al., we design labelling schemes that support reachability tests on general graphs. Based on these labelling schemes, we design graph structural join algorithms that efficiently answer structural queries containing only ancestor-descendant relationships. For subgraph queries, we design a subgraph join algorithm that, with efficient data structures, processes subgraph queries of various structures efficiently. Experimental results show that our algorithms have good performance and scalability.
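
To make the idea of interval-based reachability labelling concrete, the following is a minimal Python sketch for the DAG case only. It is not the authors' implementation: the function names (`label_dag`, `merge`, `reaches`) and the example graph are assumptions made for illustration, and cyclic graphs, which the paper's extended schemes target, are outside this sketch. Each node receives a postorder number and a set of intervals; a node u reaches a node v when v's postorder number falls inside one of u's intervals.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def merge(ivs: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent integer intervals into a minimal sorted list."""
    merged: List[Interval] = []
    for lo, hi in sorted(ivs):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def label_dag(graph: Dict[str, List[str]]):
    """Return (post, intervals) for a DAG given as an adjacency list.

    Assumption: the input is acyclic. Recursion is used for brevity,
    so very deep graphs would need an iterative traversal instead.
    """
    post: Dict[str, int] = {}
    intervals: Dict[str, List[Interval]] = {}
    counter = 0

    def visit(u: str) -> None:
        nonlocal counter
        low = counter + 1  # first postorder number issued inside u's DFS subtree
        inherited: List[Interval] = []
        for v in graph.get(u, []):
            if v not in post:
                visit(v)  # tree edge: v is numbered inside u's subtree
            inherited.extend(intervals[v])  # non-tree edges contribute here too
        counter += 1
        post[u] = counter
        intervals[u] = merge(inherited + [(low, counter)])

    for u in graph:
        if u not in post:
            visit(u)
    return post, intervals


def reaches(u: str, v: str, post, intervals) -> bool:
    """True if v is reachable from u (every node reaches itself)."""
    return any(lo <= post[v] <= hi for lo, hi in intervals[u])


# Hypothetical example: a -> b -> d, a -> c -> d, e -> c
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["c"]}
post, intervals = label_dag(graph)
```

With these labels, an ancestor-descendant test between two nodes becomes a lookup in a short interval list rather than a graph traversal, which is the property that label-based join algorithms rely on.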

References

  1. Al-Khalifa, S., Jagadish, H.V., Patel, J.M., Wu, Y., Koudas, N., Srivastava, D.: Structural joins: a primitive for efficient XML query pattern matching. In: Proceedings of the 18th International Conference on Data Engineering (ICDE 2002), pp. 141–152 (2002)

  2. Braga, D., Campi, A.: XQBE: a graphical environment to query xml data. World Wide Web 8(3), 287–316 (2005)

  3. Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: optimal XML pattern matching. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD 2002), pp. 310–321 (2002)

  4. Chamberlin, D.D., Florescu, D., Robie, J.: XQuery: a query language for XML. In: W3C Working Draft. http://www.w3.org/TR/xquery (2001)

  5. Chen, L., Gupta, A., Kurul, M.E.: Stack-based algorithms for pattern matching on dags. In: VLDB, pp. 493–504 (2005)

  6. Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S.: Fast computation of reachability labeling for large graphs. In: Ioannidis, Y.E., Scholl, M.H., Schmidt, J.W., Matthes, F., Hatzopoulos, M., Böhm, K., Kemper, A., Grust, T., Böhm, C. (eds.) EDBT. Lecture Notes in Computer Science, vol. 3896, pp. 961–979. Springer (2006)

  7. Chien, S.-Y., Vagena, Z., Zhang, D., Tsotras, V.J., Zaniolo, C.: Efficient structural joins on indexed XML documents. In: Proceedings of 28th International Conference on Very Large Data Bases (VLDB 2002), pp. 263–274 (2002)

  8. Clark, J., DeRose, S.: XML path language (XPath). In: W3C Recommendation, 16 November 1999. http://www.w3.org/TR/xpath (1999)

  9. Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. In: SODA, pp. 937–946 (2002)

  10. Grust, T.: Accelerating XPath location steps. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD 2002), pp. 109–120. Hong Kong, China August (2002)

  11. He, H., Wang, H., Yang, J., Yu, P.S.: Compact reachability labeling for graph-structured data. In: Herzog, O., Schek, H.-J., Fuhr, N., Chowdhury, A., Teiken, W. (eds.) Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management (CIKM2005), Bremen, Germany, October 31–November 5, 2005, pp. 594–601. ACM (2005)

  12. Jiang, H., Lu, H., Wang, W., Ooi, B.C.: XR-Tree: indexing XML data for efficient structural join. In: Proceedings of the 19th International Conference on Data Engineering (ICDE 2003), pp. 253–263 (2003)

  13. Kameda, T.: On the vector representation of the reachability in planar directed graphs. Information Processing Letters 3(3), 78–80 (1975)

  14. Kaushik, R., Bohannon, P., Naughton, J.F., Korth, H.F.: Covering indexes for branching path queries. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD 2002), pp. 133–144 (2002)

  15. Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proceedings of 27th International Conference on Very Large Data Base (VLDB 2001), pp. 361–370 (2001)

  16. Milo, T., Suciu, D.: Index structures for path expressions. In: Proceedings of the 7th International Conference on Database Theory (ICDT 1999), pp. 277–295 (1999)

  17. Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data (SIGMOD 1989), pp. 253–262. Portland, Oregon, May (1989)

  18. Schenkel, R., Theobald, A., Weikum, G.: HOPI: an efficient connection index for complex XML document collections. In: Advances in Database Technology - EDBT 2004, 9th International Conference on Extending Database Technology (EDBT04), pp. 237–255, Heraklion, Crete, Greece, March 14–18 (2004)

  19. Schmidt, A., Waas, F., Kersten, M.L., Carey, M.J., Manolescu, I., Busse, R.: XMark: a benchmark for XML data management. In: Proceedings of 28th International Conference on Very Large Data Bases (VLDB 2002), pp. 974–985 (2002)

  20. Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT Press, Cambridge MA (1990)

  21. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible markup language (XML) 1.0 (third edition). In: W3C Recommendation 04 February 2004. http://www.w3.org/TR/REC-xml/ (2004)

  22. Trißl, S., Leser, U.: Fast and practical indexing and querying of very large graphs. In: SIGMOD Conference, pp. 845–856 (2007)

  23. Christophides, V., Plexousakis, D., Scholl, M., Tourtounis, S.: On labeling schemes for the semantic web. In: Proceedings of the Twelfth International World Wide Web Conference (WWW2003), pp. 544–555. Budapest, Hungary, May (2003)

  24. Wang, H., He, H., Yang, J., Yu, P.S., Yu, J.X.: Dual labeling: answering graph reachability queries in constant time. In: Liu, L., Reuter, A., Whang, K.-Y., Zhang, J. (eds.) Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3–8 April 2006, Atlanta, GA, USA, p. 75. IEEE Computer Society (2006)

  25. Wang, H., Li, J., Wang, H.: Clustered chain path index for xml document: efficiently processing branch queries. World Wide Web 11(1), 153–168 (2008)

  26. Wang, W., Jiang, H., Lu, H., Yu, J.X.: PBiTree coding and efficient processing of containment joins. In: Proceedings of the 19th International Conference on Data Engineering (ICDE 2003), pp. 391–402 (2003)

  27. Wong, K.-F., Yu, J.X., Tang, N.: Answering xml queries using path-based indexes: a survey. World Wide Web 9(3), 277–299 (2006)

  28. Zhang, C., Naughton, J.F., DeWitt, D.J., Luo, Q., Lohman, G.M.: On supporting containment queries in relational database management systems. In: Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data (SIGMOD 2001), pp. 425–436 (2001)

  29. Vagena, Z., Moro, M.M., Tsotras, V.J.: Twig query processing over graph-structured XML data. In: Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004), pp. 43–48 (2004)

Author information

Corresponding author

Correspondence to Hongzhi Wang.

Additional information

Supported by the Key Program of the National Natural Science Foundation of China under Grant No. 60533110; the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000; and the National Natural Science Foundation of China under Grants No. 60773068 and No. 60773063.

About this article

Cite this article

Wang, H., Li, J., Wang, W. et al. Coding-based Join Algorithms for Structural Queries on Graph-Structured XML Document. World Wide Web 11, 485–510 (2008). https://doi.org/10.1007/s11280-008-0050-4

  • DOI: https://doi.org/10.1007/s11280-008-0050-4
