Research article. DOI: 10.1145/3388440.3412479

A Generalized Robinson-Foulds Distance for Clonal Trees, Mutation Trees, and Phylogenetic Trees and Networks

Published: 10 November 2020

Abstract

Cancer evolution is often modeled by clonal trees (whose nodes are labeled by multiple somatic mutations) or mutation trees (where nodes are labeled by single somatic mutations). Clonal trees are generated from sequence data with different computational methods that may produce different clone phylogenies, rendering their analysis and comparison necessary to infer mutation order and clone origin during tumor progression. In this paper, we present a distance metric for multi-labeled trees that generalizes the Robinson-Foulds distance for phylogenetic trees, allows for a similarity assessment at much higher resolution, and can be applied to trees and networks with different sets of node labels. The generalized Robinson-Foulds distance can be computed in time quadratic in the size of the input multisets of multisets of node labels, and is a metric for clonal trees, mutation trees, phylogenetic trees, and several classes of phylogenetic networks.
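
For orientation, the classical Robinson-Foulds distance that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the rooted, cluster-based variant for leaf-labeled trees only; it is not the generalized distance presented in the paper, whose definition is not given in the abstract. The tree encoding (a dict from node name to list of children) and the function names `clusters` and `rf_distance` are assumptions made for this sketch.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical encoding for this sketch: node name -> list of child names.
# Nodes without an entry (or with an empty list) are treated as leaves.
Tree = Dict[str, List[str]]


def clusters(tree: Tree, root: str) -> Set[FrozenSet[str]]:
    """Collect the cluster (set of leaf names below it) of every node."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: str) -> FrozenSet[str]:
        children = tree.get(node, [])
        if not children:
            cluster = frozenset([node])
        else:
            # A node's cluster is the union of its children's clusters.
            cluster = frozenset().union(*(visit(c) for c in children))
        found.add(cluster)
        return cluster

    visit(root)
    return found


def rf_distance(t1: Tree, root1: str, t2: Tree, root2: str) -> int:
    """Classical rooted RF distance: size of the symmetric difference of the cluster sets."""
    return len(clusters(t1, root1) ^ clusters(t2, root2))


# Two rooted trees on leaves a, b, c that group the leaves differently.
tree_ab = {"r": ["x", "c"], "x": ["a", "b"]}   # ((a, b), c)
tree_bc = {"r": ["a", "y"], "y": ["b", "c"]}   # (a, (b, c))
print(rf_distance(tree_ab, "r", tree_bc, "r"))
```

Because this sketch compares sets of leaf names, it only applies to trees over the same leaf set with labels on the leaves; the abstract states that the generalized distance removes these restrictions by working with multisets of node labels.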

Cited By

  • (2024) New generalized metric based on branch length distance to compare B cell lineage trees. Algorithms for Molecular Biology 19:1. DOI: 10.1186/s13015-024-00267-1. Online publication date: 5-Oct-2024
  • (2023) The K-Robinson Foulds Measures for Labeled Trees. Comparative Genomics, 146-161. DOI: 10.1007/978-3-031-36911-7_10. Online publication date: 13-Jul-2023
  • (2020) Untangling mechanized proofs. Proceedings of the 13th ACM SIGPLAN International Conference on Software Language Engineering, 155-174. DOI: 10.1145/3426425.3426940. Online publication date: 16-Nov-2020

      Published In

      BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
      September 2020
      193 pages
      ISBN:9781450379649
      DOI:10.1145/3388440
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Cancer genomics phylogenetics
      2. Robinson-Foulds distance, metrics
      3. clonal tree
      4. multi-labeled tree
      5. mutation tree
      6. phylogenetic network
      7. phylogenetic tree

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      BCB '20
      Acceptance Rates

      Overall Acceptance Rate 254 of 885 submissions, 29%

      Article Metrics

      • Downloads (Last 12 months)10
      • Downloads (Last 6 weeks)1
      Reflects downloads up to 14 Oct 2024
