Abstract
Tree comparison metrics have proven to be an invaluable aid in the reconstruction and analysis of phylogenetic (evolutionary) trees. The path-length distance between trees is a particularly attractive measure as it reflects differences in tree shape as well as differences between branch lengths. The distance equals the sum, over all pairs of taxa, of the squared differences between the lengths of the unique paths connecting them in each tree. We describe an \(O(n \log n)\) time algorithm for computing this distance, making extensive use of tree decomposition techniques introduced by Brodal et al. (Algorithmica 38(2):377–395, 2004).
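To make the definition concrete, the following is a minimal sketch of a direct, quadratic-time evaluation of the distance as stated above. It is not the \(O(n \log n)\) algorithm of the paper, only a naive baseline. The edge-list representation and the function names `path_lengths` and `path_length_distance` are assumptions introduced for illustration.

```python
def path_lengths(edges, taxa):
    """Return {(a, b): path length} for every pair a < b of taxa.

    `edges` is a list of (node, node, branch_length) triples describing a
    tree (hypothetical representation chosen for this sketch).
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    dist = {}
    for src in taxa:
        # Iterative traversal from src; in a tree every node is reached
        # along its unique path, so the accumulated length is the path length.
        seen = {src: 0.0}
        stack = [src]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + w
                    stack.append(v)
        for dst in taxa:
            if src < dst:
                dist[(src, dst)] = seen[dst]
    return dist


def path_length_distance(edges1, edges2, taxa):
    """Sum over all taxon pairs of squared path-length differences."""
    d1 = path_lengths(edges1, taxa)
    d2 = path_lengths(edges2, taxa)
    return sum((d1[pair] - d2[pair]) ** 2 for pair in d1)


# Two four-taxon trees with unit branch lengths: ab|cd and ac|bd.
t1 = [("a", "u", 1), ("b", "u", 1), ("u", "v", 1), ("c", "v", 1), ("d", "v", 1)]
t2 = [("a", "u", 1), ("c", "u", 1), ("u", "v", 1), ("b", "v", 1), ("d", "v", 1)]
print(path_length_distance(t1, t2, ["a", "b", "c", "d"]))
```

In this example the pairs ab, cd, ac and bd each differ by one unit of path length between the two trees, while ad and bc agree, so the sum of squared differences is 4. The sketch runs one traversal per taxon, which is why it takes quadratic time.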
Notes
A bipartition A|B with \(A\cup B=X\) is in a phylogenetic tree \(T=(V,E)\) if there exists an edge \(e\in E\) such that its removal creates two trees with taxon sets A and B.
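As an illustration of this definition, the sketch below enumerates the bipartitions of a small tree by removing each edge in turn and collecting the taxa on either side. The unweighted edge-list representation and the function name `bipartitions` are assumptions made for this example, and the per-edge traversal is chosen for clarity rather than efficiency.

```python
def bipartitions(edges, taxa):
    """Return the set of bipartitions A|B induced by the edges of a tree.

    `edges` is a list of (node, node) pairs; each bipartition is returned
    as a frozenset containing the two frozensets A and B.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    taxa = frozenset(taxa)
    splits = set()
    for u, v in edges:
        # Collect every node reachable from u without crossing edge (u, v).
        seen = {u}
        stack = [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen and not (x == u and y == v):
                    seen.add(y)
                    stack.append(y)
        a = taxa & seen   # taxa on the u side of the removed edge
        b = taxa - a      # taxa on the v side
        splits.add(frozenset((a, b)))
    return splits


# Four-taxon tree ab|cd with internal nodes u and v.
tree = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
for split in bipartitions(tree, "abcd"):
    print(sorted(sorted(side) for side in split))
```

For this tree the five edges give five bipartitions: the four that separate a single taxon from the rest, plus ab|cd from the internal edge.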
References
Brent, R.P.: The parallel evaluation of general arithmetic expressions. J. ACM (JACM) 21(2), 201–206 (1974)
Brodal, G.S., Fagerberg, R., Pedersen, C.N.: Computing the quartet distance between evolutionary trees in time \(O (n \log n)\). Algorithmica 38(2), 377–395 (2004)
Bryant, D.: A classification of consensus methods for phylogenetics. DIMACS Ser. Discrete Math. Theor. Comput. Sci. 61, 163–184 (2003)
Bryant, D., Waddell, P.: Rapid evaluation of least squares and minimum evolution criteria on phylogenetic trees. Mol. Biol. Evol. 15(10), 1346–1359 (1997)
Cohen, R.F., Tamassia, R.: Dynamic expression trees. Algorithmica 13(3), 245–265 (1995)
Farris, J.S.: A successive approximations approach to character weighting. Syst. Biol. 18(4), 374–385 (1969)
Hartigan, J.A.: Representation of similarity matrices by trees. J. Am. Stat. Assoc. 62(320), 1140–1158 (1967)
Hillis, D.M., Heath, T.A., John, K.S.: Analysis and visualization of tree space. Syst. Biol. 54(3), 471–482 (2005)
Holmes, S.: Statistical approach to tests involving phylogenies. In: Gascuel, O. (ed.) Mathematics of Phylogeny and Evolution, chap. 4, pp. 91–117. Oxford University Press, New York (2005)
Lapointe, F.J., Cucumel, G.: The average consensus procedure: combination of weighted trees containing identical or overlapping sets of taxa. Syst. Biol. 46(2), 306–312 (1997)
Penny, D., Watson, E.E., Steel, M.A.: Trees from languages and genes are very similar. Syst. Biol. 42(3), 382–384 (1993)
Robinson, D., Foulds, L.: Comparison of phylogenetic trees. Math. Biosci. 53, 131–147 (1981)
Susko, E.: Improved least squares topology testing and estimation. Syst. Biol. 60(5), 668–675 (2011)
Swofford, D.L.: When are phylogeny estimates from molecular and morphological data incongruent? In: Miyamoto, M.M., Cracraft, J. (eds.) Phylogenetic Analysis of DNA Sequences, pp. 295–333. Oxford University Press, Oxford (1991)
Williams, W.T., Clifford, H.T.: On the comparison of two classifications of the same set of elements. Taxon 20(4), 519–522 (1971)
Acknowledgements
This research was made possible by travel funds from a Marsden Grant to DB. Both authors thank David Swofford for help finding an error in an earlier version of Proposition 1.
Cite this article
Bryant, D., Scornavacca, C. An \(O(n \log n)\) Time Algorithm for Computing the Path-Length Distance Between Trees. Algorithmica 81, 3692–3706 (2019). https://doi.org/10.1007/s00453-019-00594-5
DOI: https://doi.org/10.1007/s00453-019-00594-5