Abstract
We present an algorithm to approximate edit distance between two ordered and rooted trees of bounded degree. In this algorithm, each input tree is transformed into a string by computing the Euler string, where labels of some edges in the input trees are modified so that structures of small subtrees are reflected to the labels. We show that the edit distance between trees is at least 1/6 and at most O(n 3/4) of the edit distance between the transformed strings, where n is the maximum size of two input trees and we assume unit cost edit operations for both trees and strings. The algorithm works in O(n 2) time since computation of edit distance and reconstruction of tree mapping from string alignment takes O(n 2) time though transformation itself can be done in O(n) time.
Similar content being viewed by others
References
Akutsu, T.: A relation between edit distance for ordered trees and edit distance for Euler strings. Inf. Process. Lett. 100, 105–109 (2006)
Bar-Yossef, Z., Jayram, T., Krauthgamer, R., Kumar, R.: Approximating edit distance efficiently. In: Proc. 45th IEEE Symp. Foundations on Computer Science, pp. 550–559 (2004)
Batu, T., Ergun, F., Sahinalp, S.C.: Oblivious string embeddings and edit distance approximations. In: Proc. 17th ACM-SIAM Symp. Discrete Algorithms, pp. 792–801 (2006)
Bille, P.: A survey on tree edit distance and related problem. Theor. Comput. Sci. 337, 217–239 (2005)
Chen, W.: New algorithm for ordered tree-to-tree correction problem. J. Algorithms 40, 135–158 (2001)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proc. 13th ACM-SIAM Symp. Discrete Algorithms, pp. 667–676 (2002)
Cormode, G., Paterson, M., Sahinalp, S.C., Vishkin, U.: Communication complexity of document exchange. In: Proc. 11st ACM-SIAM Symp. Discrete Algorithms, pp. 197–206 (2000)
Czumaj, A., Gasieniec, L.: On the complexity of determining the period of a string. In: Proc. 11th Symp. Combinatorial Pattern Matching, pp. 412–422 (2000)
Demaine, E., Mozes, S., Rossman, B., Weimann, O.: An optimal decomposition algorithm for tree edit distance. In: Proc. 34th International Colloquium on Automata, Languages and Programming, pp. 146–157 (2007)
Garofalakis, M., Kumar, A.: XML stream processing using tree-edit distance embedding. ACM Trans. Database Syst. 30, 279–332 (2005)
Grossi, R.: On finding common subtrees. Theor. Comput. Sci. 108, 345–256 (1993)
Guha, S., Jagadish, H.V., Koudas, N., Srivastava, D., Yu, T.: Approximate XML joins. In: Proc. ACM SIGMOD, pp. 287–298 (2002)
Khot, S., Naor, A.: Nonembeddability theorems via Fourier analysis. In: Proc. 46th IEEE Symp. Foundations on Computer Science, pp. 101–110 (2005)
Klein, P.N.: Computing the edit-distance between unrooted ordered trees. In: Proc. 6th European Symp. Algorithms, pp. 91–102 (1998)
Krauthgamer, R., Rabani, Y.: Improved lower bounds for embeddings into L 1. In: Proc. 17th ACM-SIAM Symp. Discrete Algorithms, pp. 1010–1017 (2006)
Muthukrishnan, S., Sahinalp, S.C.: Approximate nearest neighbors and sequence comparison with block operations. In: Proc. 32nd ACM Symp. Theory of Computing, pp. 412–416 (2000)
Ostrovsky, R., Rabani, Y.: Low distortion embeddings for edit distance. In: Proc. 37th ACM Symp. Theory of Computing, pp. 218–224 (2005)
Tai, K.-C.: The tree-to-tree correction problem. J. ACM 26, 422–433 (1979)
Valiente, G.: Algorithms on Trees and Graphs. Springer, Berlin (2002)
Yang, R., Kalnis, P., Tung, A.K.H.: Similarity evaluation on tree-structured data. In: Proc. ACM SIGMOD, pp. 754–765 (2005)
Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM J. Comput. 18, 1245–1262 (1989)
Author information
Authors and Affiliations
Corresponding author
Additional information
A preliminary version of this paper appeared in the Proceedings of the 17th Annual International Symposium on Algorithms and Computation (ISAAC’06), Lecture Notes in Computer Science, 4288, pp. 90–99, 2006. This work is partially supported by Grants-in-Aid on Scientific Research on Priority Areas “Cyber Infrastructure for the Information-explosion Era” and “Systems Genomics”, and Grant-in-Aid #16300092 from MEXT, Japan.
Rights and permissions
About this article
Cite this article
Akutsu, T., Fukagawa, D. & Takasu, A. Approximating Tree Edit Distance through String Edit Distance. Algorithmica 57, 325–348 (2010). https://doi.org/10.1007/s00453-008-9213-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-008-9213-z