research-article
Open access

Robinson-Foulds Reticulation Networks

Published: 04 September 2019

Abstract

Phylogenetic (hybridization) networks allow investigation of evolutionary species histories that involve complex phylogenetic events other than speciation, such as reassortment in virus evolution or introgressive hybridization in invertebrates and mammals. Reticulation networks can be inferred by solving the reticulation network problem, typically known as the hybridization network problem. Given a collection of phylogenetic input trees, this problem seeks a reticulation network with the smallest number of reticulation vertices into which the input trees can be embedded exactly. Unfortunately, this problem is of limited use in practice, since minimum reticulation networks can be easily obfuscated by even small topological errors that typically occur in input trees inferred from biological data. We adapt the reticulation network problem to address erroneous input trees using the classic Robinson-Foulds (RF) distance. The RF embedding cost allows trees to be embedded into reticulation networks inexactly, but up to a measurable error. We show that the adapted problem, called the Robinson-Foulds reticulation network (RF-Network) problem, is NP-hard, like many other problems applied in molecular biology. To address this, we employ local search strategies that have been successfully applied to other NP-hard phylogenetic problems. Our local search method benefits from recent theoretical advancements in this area. Further, we introduce algorithms that are effective in practice for the computational challenges involved in our local search approach. Using simulations, we experimentally validate the ability of our method, RF-Net, to reconstruct correct phylogenetic networks in the presence of error in the input data. Finally, we demonstrate how RF-networks can help identify reassortment in influenza A viruses and provide insight into the evolutionary history of these viruses. RF-Net was able to estimate a large and credible reassortment network with 164 taxa.
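
The abstract relies on the classic Robinson-Foulds distance without stating it. As a rough illustration only, the sketch below computes that distance between two rooted trees as the number of clusters (leaf sets below an internal vertex) found in exactly one of the two trees. The nested-tuple tree encoding and the names `clusters` and `rf_distance` are assumptions made for this example; it is not the RF-Net implementation, and it does not cover the RF embedding cost of a tree in a network.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a string, an internal vertex is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial clusters of a rooted tree.

    A cluster is the set of leaf labels below an internal vertex. The root
    cluster (all leaves) is dropped because every tree on the same leaf set
    shares it; single-leaf clusters are never recorded.
    """
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    root_leaves = visit(tree)
    found.discard(root_leaves)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Size of the symmetric difference of the two cluster sets."""
    return len(clusters(a) ^ clusters(b))


# Two trees on leaves A-D that disagree on one cherry: (A,B) versus (A,C).
t1: Tree = ((("A", "B"), "C"), "D")
t2: Tree = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

Some authors halve this count or work with splits of unrooted trees instead of clusters; the choice made here is an assumption of the sketch, not something the abstract specifies.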

Published In

BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
September 2019
716 pages
ISBN:9781450366663
DOI:10.1145/3307339

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. error-correction
  2. hybridization
  3. iav
  4. phylogenetic networks
  5. reassortment
  6. robinson-foulds
  7. snpr

Qualifiers

  • Research-article

Acceptance Rates

BCB '19 Paper Acceptance Rate 42 of 157 submissions, 27%;
Overall Acceptance Rate 254 of 885 submissions, 29%
