Abstract
Modeling XML documents as graphs raises many challenging research issues. In particular, query processing over graph-structured XML data is difficult because traditional structural join methods cannot be applied directly. In this paper, we propose a labeling scheme for graph-structured XML data. With this labeling scheme, the reachability relationship between two nodes can be determined efficiently without accessing other nodes. Based on this labeling scheme, we design efficient structural join algorithms to evaluate reachability queries. Experiments show that our algorithms are efficient and scale well.
This work was partially supported by ARC Discovery Grant – DP0346004.
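The abstract does not spell out the labeling scheme itself, so the following is only an illustrative sketch of the general idea of answering reachability from labels alone. It uses a generic interval-based labeling for directed acyclic graphs, not necessarily the scheme proposed in the paper. The names `build_labels`, `merge_intervals`, and `reaches`, as well as the sample graph, are hypothetical, and the sketch assumes the input has no cycles (a cyclic graph would first need its strongly connected components collapsed).

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent integer intervals into a minimal list."""
    merged: List[Interval] = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def build_labels(graph: Dict[str, List[str]]):
    """Assign each node a postorder number and a list of intervals that
    covers exactly the postorder numbers of the nodes reachable from it.

    Assumption: `graph` is an adjacency dict describing a DAG.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    post: Dict[str, int] = {}
    labels: Dict[str, List[Interval]] = {}

    def visit(u: str) -> None:
        # In a DAG every successor finishes before u, so its label is ready.
        for v in graph.get(u, []):
            if v not in post:
                visit(v)
        number = len(post)
        intervals = [(number, number)]
        for v in graph.get(u, []):
            intervals.extend(labels[v])
        post[u] = number
        labels[u] = merge_intervals(intervals)

    for node in sorted(nodes):
        if node not in post:
            visit(node)
    return post, labels


def reaches(post, labels, u: str, v: str) -> bool:
    """Decide whether v is reachable from u using only the two nodes' labels."""
    target = post[v]
    return any(lo <= target <= hi for lo, hi in labels[u])


# Hypothetical document graph: "auction" refers to "person" via a
# non-tree (IDREF-style) edge, so "person" has two parents.
graph = {
    "site": ["people", "auctions"],
    "people": ["person"],
    "auctions": ["auction"],
    "auction": ["person"],
    "person": [],
}
post, labels = build_labels(graph)
print(reaches(post, labels, "auction", "person"))
print(reaches(post, labels, "people", "auction"))
```

Once the labels are built, `reaches` inspects only the postorder number of the target and the interval list of the source, which mirrors the property stated in the abstract that no other nodes need to be accessed. The cost is that a node with many non-tree descendants may carry several intervals.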
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Wang, W., Lin, X., Li, J. (2005). Labeling Scheme and Structural Joins for Graph-Structured XML Data. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_28
DOI: https://doi.org/10.1007/978-3-540-31849-1_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science, Computer Science (R0)