Additional file 11. Phylogenetic tree of the polA and dnaG genes in Flanders-like phages. Branches composed of GenBank phages are colored in orange and branches of gut metagenomic phages in blue. Branches with sequences labelled as bacteria in the GenBank database, likely representing cryptic prophages, are colored in grey.
Additional file 9. Alignment of the template and variable repeats from the Quimbyvirus DGRs. The template repeat (TR) from Quimbyvirus is the first listed sequence and is followed by the variable repeats from either ORF80 (VR1) or ORF47 (VR2) encoded in phage genomes that are nearly identical to Quimbyvirus (>95% average nucleotide identity). A total of 21 adenine residues (green) in the template repeat exhibit at least one substitution in a corresponding variable repeat.
Additional file 7. Coverage heatmap of a "Quimbyviridae" genome across human gut viromes. The coverage of the most abundant "Quimbyviridae" phage genome (accession OMAC01000147.1) is plotted as a heatmap, scaled from 0 to 100x per 100 bp window.
Additional file 4. Host ranges inferred from CRISPR-spacer matches. The nucleotide coordinates of the protospacer are provided in columns 2 and 3. The sequences of the CRISPR spacer and protospacer are provided in columns 4 and 5. The taxonomic information of the host is listed in the subsequent columns.
Additional file 3. Distribution of the marker profiles identified on each phage genome and pie chart representation of the gut phage taxonomy. (A) Histogram of marker proteins detected on each phage genome recovered from human gut metagenomes. Abbreviations are as follows: M, major capsid protein; P, portal; T, terminase large subunit. (B) Taxonomic assignments of the dereplicated contigs (n = 1,886), with the outermost ring corresponding to ICTV families.
Additional file 10. HHPred alignments of four Quimbyviridae proteins and one Gratiaviridae protein with their top-scoring templates, including a replication initiator protein, a cytosine-specific methyltransferase, an adenine-specific methyltransferase, a MutY nuclease and a HipA kinase.
Chernevaya taiga in West Siberia is a unique environment, with gigantism of grasses and shrubs. Exceptionally high productivity of plants is determined by the synergistic interaction of various factors, with a special role belonging to microorganisms colonizing the plant roots. This research explored whether agricultural plants can recruit specific microorganisms from within virgin Chernevaya Umbrisol and thus increase their productivity. Radish and wheat plants were grown on the Umbrisol (T1) and control Retisol of Scotch pine forest stand (T3) soils in the phytotron, and then a bacterial community analysis of the rhizosphere was performed using high-throughput sequencing of the 16S rRNA genes. In laboratory experiments, the plant physiological parameters were significantly higher when growing on the Umbrisol as compared to the Retisol. Bacterial diversity in T1 soil was considerably higher than in the control sample, and the principal coordinate analysis demonstrated apparent diff...
Background: Bacteriophages play key roles in the dynamics of the human microbiome. By far the most abundant components of the human gut virome are tailed bacteriophages of the realm Duplodnaviria, in particular, crAss-like phages. However, apart from duplodnaviruses, the gut virome has not been dissected in detail. Results: Here we report a comprehensive census of a minor component of the gut virome, the tailless bacteriophages of the realm Varidnaviria. Tailless phages are primarily represented in the gut by prophages of the families Corticoviridae and Autolykiviridae that jointly comprise the order Vinavirales and are mostly integrated as prophages in genomes of Alphaproteobacteria and Verrucomicrobia. Phylogenetic analysis of the major capsid proteins (MCP) and packaging ATPases suggests that at least three new families within Vinavirales should be established to accommodate the diversity of prophages from the human gut virome. Previously, only the MCP and ATPase genes were reporte...
Although the use of long-read sequencing improves the contiguity of assembled viral genomes compared to short-read methods, assembling complex viral communities remains an open problem. We describe the viralFlye tool for identification and analysis of metagenome-assembled viruses in long-read assemblies. We show that it significantly improves viral assemblies and demonstrate that long reads result in a much larger array of predicted virus-host associations as compared to short-read assemblies. We demonstrate that the identification of novel CRISPR arrays in bacterial genomes from a newly assembled metagenomic sample provides information for predicting novel hosts for novel viruses.
Apomictic plants (reproducing via asexual seeds), unlike sexual individuals, avoid meiosis and egg cell fertilization. Consequently, apomixis is very important for fixing maternal genotypes in the next plant generations. Despite progress in the study of apomixis, its molecular and genetic regulation remains poorly understood. So far, the APOLLO gene, encoding an aspartate glutamate aspartate aspartate histidine exonuclease, is one of the very few described genes associated with apomixis in Boechera species. The centromere-specific histone H3 variant encoded by the CENH3 gene is essential for cell division. Mutations in CENH3 disrupt chromosome segregation during mitosis and meiosis because spindle microtubules fail to attach to a mutated form of the CENH3 histone. This paper presents an in silico characterization of the APOLLO and CENH3 genes, which may affect apomixis. Furthermore, we characterize the structure of CENH3 by bioinformatic tools, study expression levels of APOLLO and CE...
CrAssphage is the most abundant human-associated virus and the founding member of a large group of bacteriophages, discovered in animal-associated and environmental metagenomes, that infect bacteria of the phylum Bacteroidetes. We analyze 4907 Circular Metagenome Assembled Genomes (cMAGs) of putative viruses from human gut microbiomes and identify nearly 600 genomes of crAss-like phages that account for nearly 87% of the DNA reads mapped to these cMAGs. Phylogenetic analysis of conserved genes demonstrates the monophyly of crAss-like phages, a putative virus order, and of 5 branches, potential families within that order, two of which have not been identified previously. The phage genomes in one of these families are almost twofold larger than the crAssphage genome (145-192 kilobases), with a high density of self-splicing introns and inteins. Many crAss-like phages encode suppressor tRNAs that enable read-through of UGA or UAG stop codons, mostly in late phage genes. A distinct featur...
Boreal forests are one of the largest stores of carbon on Earth, and two-thirds of them are located in Siberia. Despite the fact that these forests have a significant influence on the global climate, they continue to remain understudied. Chernevaya taiga is a unique example of a highly productive Siberian boreal ecosystem. This type of forest is characterized by a series of unique ecological traits, the most notable of which are the gigantism of the perennial herbaceous plants and bushes, complete lack of moss cover on soil surface, and the type of soil it grows on, notable for its particularly high rate of decomposition of vegetative remains and low humic acid content. Abundant rainfall actively washes out nutrients from the top layers of the soil, but its fertility level remains very high. In fact, based on the existing data, it is twice as high as that of fertilized agricultural lands. In some ways the conditions within this type of forest closely resemble those observed in tropi...
CrAssphage is the most abundant virus identified in the human gut virome and the founding member of a large group of bacteriophages that infect bacteria of the phylum Bacteroidetes and have been discovered by metagenomics of both animal-associated and environmental habitats. By analysis of circular contigs from human gut microbiomes, we identified nearly 600 genomes of crAss-like phages. Phylogenetic analysis of conserved genes demonstrates the monophyly of crAss-like phages, which can be expected to become a new order of viruses, and of 5 distinct branches, likely families within that order. Two of these putative families have not been identified previously. The phages in one of these groups have large genomes (145-192 kilobases) and contain an unprecedentedly high density of self-splicing introns and inteins. Many crAss-like phages encode suppressor tRNAs that enable readthrough of UGA or UAG stop codons, mostly in late phage genes, which could represent a distinct anti-defense st...
Proceedings of the National Academy of Sciences, 2019
The white shark (Carcharodon carcharias; Chondrichthyes, Elasmobranchii) is one of the most publicly recognized marine animals. Here we report the genome sequence of the white shark and comparative evolutionary genomic analyses to the chondrichthyans, whale shark (Elasmobranchii) and elephant shark (Holocephali), as well as various vertebrates. The 4.63-Gbp white shark genome contains 24,520 predicted genes, and has a repeat content of 58.5%. We provide evidence for a history of positive selection and gene-content enrichments regarding important genome stability-related genes and functional categories, particularly so for the two elasmobranchs. We hypothesize that the molecular adaptive emphasis on genome stability in white and whale sharks may reflect the combined selective pressure of large genome sizes, high repeat content, high long-interspersed element retrotransposon representation, large body size, and long lifespans, represented across these two species. Molecular adaptati...
Closely related to the model plant , the genus is known to contain both sexual and apomictic species or accessions. is a diploid sexually reproducing species and is thought to be an ancestral parent species of apomictic species. Here we report the de novo assembly of the genome using short Illumina and Roche reads from 1 paired-end and 3 mate-pair libraries. The distribution of 23-mers from the paired-end library indicated a low level of heterozygosity and the presence of detectable duplications and triplications. The genome size was estimated to be 227 Mb. The N50 of the assembled scaffolds was 2.3 Mb. Using a hybrid approach that combines homology-based and de novo methods, 27,048 protein-coding genes were predicted. Repeats, transfer RNA (tRNA) and ribosomal RNA (rRNA) genes were also annotated. Finally, genes of and 6 other Brassicaceae species were used for phylogenetic tree reconstruction. In addition, we explored the histidine exonuclease locus, related to apomixis in , ...
Papers by Mikhail Rayko