Abstract
Current knowledge of RNA virus biodiversity is both biased and fragmentary, reflecting a focus on culturable or disease-causing agents. Here we profile the transcriptomes of over 220 invertebrate species sampled across nine animal phyla and report the discovery of 1,445 RNA viruses, including some that are sufficiently divergent to comprise new families. The identified viruses fill major gaps in the RNA virus phylogeny and reveal an evolutionary history that is characterized by both host switching and co-divergence. The invertebrate virome also reveals remarkable genomic flexibility that includes frequent recombination, lateral gene transfer among viruses and hosts, gene gain and loss, and complex genomic rearrangements. Together, these data present a view of the RNA virosphere that is more phylogenetically and genomically diverse than that depicted in current classification schemes and provide a more solid foundation for studies in virus ecology and evolution.
Acknowledgements
This study was supported by the National Natural Science Foundation of China (Grants 81290343, 81273014, 81672057), the Special National Project on Research and Development of Key Biosafety Technologies (Grants 2016YFC1201900, 2016YFC1200101), the 12th Five-Year Major National Science and Technology Projects of China (2014ZX10004001-005), and an NHMRC Australia Fellowship (GNT1037231).
Author information
Contributions
Conceptualization: M.S. and Y.-Z.Z. Methodology: M.S., L.-J.C., C.-X.L., J.L., J.-S.E., J.B., E.C.H. and Y.-Z.Z. Investigation: M.S., X.-D.L., J.-H.T., L.-J.C., X.C., C.-X.L. and X.-C.Q. Writing (original draft): M.S., E.C.H. and Y.-Z.Z. Writing (review and editing): M.S., X.-D.L., J.-H.T., L.-J.C., X.C., C.-X.L., J.-S.E., J.X., E.C.H. and Y.-Z.Z. Funding acquisition: J.X., E.C.H. and Y.-Z.Z. Resources (sampling): M.S., X.-D.L., J.-H.T., L.-J.C., X.C., C.-X.L., J.-P.C., W.W. and Y.-Z.Z. Resources (computational): M.S., J.L., J.B. and E.C.H. Supervision: E.C.H. and Y.-Z.Z.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
Reviewer Information Nature thanks E. Ghedin, D. Obbard and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
Extended Data Figure 1 The contribution of major viral clades to the total virome of each host phylum/order.
a, b, These analyses are based on viruses at all frequency levels (a), and viruses in which the frequency exceeds 0.1% of the total number of non-rRNA reads (b).
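The 0.1% cut-off used for panel b can be illustrated with a minimal sketch. The function name, the input dictionary and the example counts below are hypothetical and are not taken from the study; the only detail drawn from the caption is the threshold itself, expressed as a fraction of the total number of non-rRNA reads.

```python
def abundant_viruses(read_counts, total_non_rrna_reads, threshold=0.001):
    """Return {virus: frequency} for viruses above the frequency threshold.

    read_counts: hypothetical mapping of virus name -> reads assigned to it.
    total_non_rrna_reads: total reads in the library after rRNA removal.
    threshold: 0.001 corresponds to the 0.1% cut-off in the caption.
    """
    if total_non_rrna_reads <= 0:
        raise ValueError("total_non_rrna_reads must be positive")
    frequencies = {
        name: count / total_non_rrna_reads
        for name, count in read_counts.items()
    }
    # Keep only viruses whose frequency strictly exceeds the threshold.
    return {name: f for name, f in frequencies.items() if f > threshold}


# Illustrative numbers only.
example = abundant_viruses({"virus_A": 5000, "virus_B": 50}, 1_000_000)
```

In this example only `virus_A` (0.5% of reads) passes the filter; `virus_B` (0.005%) would appear in panel a but not in panel b.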
Extended Data Figure 2 Phylogenetic incongruence between the RdRp and structural proteins.
a, Match between the phylogenies of the RdRp and coat proteins (S-domain like) for non-segmented members of the Tombus–Noda clade. The relationship between the two phylogenies is displayed to maximize topological congruence. b, The degree of phylogenetic incongruence for different pairs of structural and non-structural phylogenies. The comparisons were based on patristic distance matrices derived from the phylogenies.
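For readers unfamiliar with patristic distances, the following sketch shows one way two phylogenies could be compared through them. A patristic distance is the sum of branch lengths along the path joining two tips. The tree encoding (a child-to-parent dictionary), the toy trees and the use of a Pearson correlation as the congruence score are all assumptions made for illustration; the caption does not specify which statistic the authors applied to the matrices.

```python
from itertools import combinations
from math import sqrt


def path_to_root(tree, tip):
    """Map each node on the path from tip to root to its distance from tip."""
    distances, node, dist = {}, tip, 0.0
    while node is not None:
        distances[node] = dist
        parent, length = tree[node]
        dist += length
        node = parent
    return distances


def patristic(tree, a, b):
    """Sum of branch lengths on the path between tips a and b."""
    up_a = path_to_root(tree, a)
    node, dist = b, 0.0
    while node not in up_a:  # climb from b until the common ancestor
        parent, length = tree[node]
        dist += length
        node = parent
    return dist + up_a[node]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / sqrt(vx * vy)


def distance_correlation(tree1, tree2, tips):
    """Correlate the pairwise patristic distances of two trees."""
    pairs = list(combinations(sorted(tips), 2))
    d1 = [patristic(tree1, a, b) for a, b in pairs]
    d2 = [patristic(tree2, a, b) for a, b in pairs]
    return pearson(d1, d2)


# Toy trees: node -> (parent, branch length); the root has parent None.
rdrp_tree = {"root": (None, 0.0), "n1": ("root", 1.0),
             "A": ("n1", 1.0), "B": ("n1", 1.0), "C": ("root", 2.0)}
coat_tree = {"root": (None, 0.0), "n1": ("root", 1.0),
             "A": ("n1", 1.0), "C": ("n1", 1.0), "B": ("root", 2.0)}
score = distance_correlation(rdrp_tree, coat_tree, ["A", "B", "C"])
```

Under this scoring, identical trees give a correlation of 1, and lower values indicate that the two proteins place the same viruses differently, which is the kind of incongruence panel b summarizes.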
Extended Data Figure 3 The gain and loss of RNA virus structural proteins.
a, The parallel acquisition of multiple copies of structural proteins by viruses within the Hepe–Virga clade. Left panel shows an outline of the structural part of their genomes, with homologous structural genes marked in yellow and multiple copies of these proteins within the same genome labelled as ‘I’, ‘II’, and ‘III’. Right panel shows a maximum-likelihood phylogeny depicting the evolutionary history of the corresponding structural proteins of these viruses. b, Acquisition of a glycoprotein in the genome of Hubei Lepidoptera virus 2 from the Mono–Chu clade. Its genome is compared against that of a closely related virus (Hubei dimarhabdovirus-like virus 2). Homologous proteins are connected with dotted lines, and the target glycoprotein is shown in red. c, Three examples of glycoprotein loss in the Mono–Chu clade. Homologous proteins are connected with dotted lines, and the target glycoproteins are shown in blue.
Extended Data Figure 4 Lateral gene transfer between RNA viruses and cellular organisms.
a, Evolutionary origin of two exoribonucleases (cd06133) in two sea-slater-associated viruses (Beihai hepe-like virus 2 and Beihai sea slater virus 4). Top, alignment of viral and (human) cellular exoribonucleases. The solid triangles indicate the key catalytic sites. Lower left panel shows the phylogenetic positions of the two viruses (marked with solid red circles) whose genomes contain these exoribonucleases. The host information for each virus is shown in parentheses. Lower right panel shows the phylogenetic position of the virus exoribonucleases (solid red circle) in the context of cellular exoribonucleases. b, Evolutionary origin of viral serine proteases (cd00190). The phylogeny contains serine proteases from RNA viruses (solid red circles), DNA viruses (solid blue circles) and cellular organisms. Serine proteases from RNA viruses are either highly divergent or group within the diversity of cellular proteins. c, Relative positions of different protein domains in the replicase of selected Hepe–Virga viruses. The domains are shown as ovals and marked with different colours, and comprise: RdRp (cd01699), Helicase (pfam01443), FstJ (pfam01728), OTU (OTU-like cysteine protease, pfam02338), Macro (cl00019), NADAR (cd15457), and viral methyltransferase (pfam01660). More detailed depictions of lateral gene transfer can be found in Supplementary Data 22–36.
Supplementary information
Supplementary Data
This file contains Supplementary Data 1-36, phylogenies and genome structures of each major virus clade. The phylogenies (SI data 1-21) contain detailed information on evolutionary relationships, the name of the viruses, the frequency of viral RNA, and the presence and location of endogenous virus elements (EVEs). The genome structures (SI data 22-36) contain information on the genome organization and the structural domains of representative viruses. (PDF 1870 kb)
Supplementary Table 1
This table contains detailed information on each pool/library. (PDF 217 kb)
Supplementary Table 2
This table contains the detailed information on each virus discovered in this study. (XLSX 231 kb)
About this article
Cite this article
Shi, M., Lin, XD., Tian, JH. et al. Redefining the invertebrate RNA virosphere. Nature 540, 539–543 (2016). https://doi.org/10.1038/nature20167