Progress in genome sequencing now enables the large-scale generation of reference genomes. Various international initiatives aim to generate reference genomes representing global biodiversity. These genomes provide unique insights into genomic diversity and architecture, thereby enabling comprehensive analyses of population and functional genomics, and are expected to revolutionize conservation genomics.
Background: Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra-high molecular weight DNA (uHMW DNA) remains a major challenge. Here we present a comparative study of preservation methods for field and laboratory tissue sampling, across vertebrate classes and different tissue types. Results: We find that storage temperature was the strongest predictor of uHMW fragment lengths. While immediate flash-freezing remains the sample preservation gold standard, samples preserved in 95% EtOH or 20-25% DMSO-EDTA showed little fragment length degradation when stored at 4 °C for 6 hours. Samples in 95% EtOH or 20-25% DMSO-EDTA kept at 4 °C for 1 week after dissection still yielded adequate amounts of uHMW DNA for most applications. Tissue type was a significant predictor of total DNA yield but not fragment length. Preservation solution had a smaller but significant influence on both fragment length and DNA yield. Conclusion: We provide sample preservation guidelines that ensure sufficient DNA integrity and amount required for use with long-read and long-range sequencing technologies across vertebrates. Our best practices generated the uHMW DNA needed for the high-quality reference genomes for phase 1 of the Vertebrate Genomes Project, whose ultimate mission is to generate chromosome-level reference genome assemblies of all ~70,000 extant vertebrate species.
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are only available for a few non-microbial species [1-4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling the most accurate and complete reference genomes to date. Here we summarize these developments, introduce a set of quality standards, and present lessons learned from sequencing and assembling 16 species representing major vertebrate lineages (mammals, birds, reptiles, amphibians, teleost fishes and cartilaginous fishes). We confirm that long-read sequencing technologies are essential for maximizing genome quality and that unresolved complex repeats and haplotype heterozygosity are major sources of error in assemblies. Our new assemblies identify and correct substantial errors in some of the best historical reference genomes. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an effort to generate high-quality, complete reference genomes for all ~70,000 extant vertebrate species and help enable a new era of discovery across the life sciences.
New applications of DNA and RNA sequencing are expanding the field of biodiversity discovery and ecological monitoring, yet questions remain regarding precision and efficiency. Due to primer bias, the ability of metabarcoding to accurately depict biomass of different taxa from bulk communities remains unclear, while PCR-free whole mitochondrial genome (mitogenome) sequencing may provide a more reliable alternative. Here, we used a set of documented mock communities comprising 13 species of freshwater macroinvertebrates of estimated individual biomass to compare the detection efficiency of COI metabarcoding (three different amplicons) and shotgun mitogenome sequencing. Additionally, we used individual COI barcoding and de novo mitochondrial genome sequencing to provide reference sequences for OTU assignment and metagenome mapping (mitogenome skimming), respectively. We found that, even though both methods occasionally failed to recover very low abundance species, metabarcoding was less consistent, failing to recover some species with higher abundances, probably due to primer bias. Shotgun sequencing results provided highly significant correlations between read number and biomass in all but one species. Conversely, the read-biomass relationships obtained from metabarcoding varied across amplicons. Specifically, we found significant relationships for eight of 13 (amplicons B1FR-450 bp, FF130R-130 bp) or four of 13 (amplicon FFFR, 658 bp) species. Combining the results of all three COI amplicons (multiamplicon approach) improved the read-biomass correlations for some of the species. Overall, mitogenomic sequencing yielded more informative predictions of biomass content from bulk macroinvertebrate communities than metabarcoding. However, for large-scale ecological studies, metabarcoding currently remains the most commonly used approach for diversity assessment.
Keywords: biodiversity, biomass, genome skimming, invertebrates, metabarcoding, metagenomics
The use of environmental DNA (eDNA) in biodiversity assessments offers a step-change in sensitivity, throughput and simultaneous measures of ecosystem diversity and function. There remains, however, a need to examine eDNA persistence in the wild through simultaneous temporal measures of eDNA and biota. Here, we use metabarcoding of two markers of different lengths, derived from an annual time series of aqueous lake eDNA, to examine temporal shifts in ecosystem biodiversity and in an ecologically important group of macroinvertebrates (Diptera: Chironomidae). The analyses allow different levels of detection and validation of taxon richness and community composition (β-diversity) through time, with shorter eDNA fragments dominating the eDNA community. Comparisons between eDNA, community DNA, taxonomy and UK species abundance data further show significant relationships between diversity estimates derived across the disparate methodologies. Our results reveal the temporal dynamics of eDNA and validate the utility of eDNA metabarcoding for tracking seasonal diversity at the ecosystem scale.
Freshwater biomonitoring using macroinvertebrates as ecological indicators is a well-established method of ecosystem health assessment. Traditional biomonitoring applications employ morphological identification of species, which is very labour intensive and in many cases lacking in accuracy and speed. Recent advances in molecular analysis, however, such as DNA barcoding and next-generation sequencing (NGS), have allowed these techniques to be employed for the identification of species. DNA barcoding of freshwater macroinvertebrates can increase the accuracy and resolution of identification, which in turn can increase the efficiency and accuracy of ecosystem health assessment in applied biomonitoring. At Bangor University, alongside the Environment Agency (EA), we are producing a comprehensive DNA barcoding library for 180 species of Trichoptera, Mollusca and Chironomidae from UK lakes and rivers, aiming to provide more information for the direct application of molecular methods in freshwater biomonitoring. The Chironomidae family of non-biting midges, in particular, represents an excellent group for ecosystem assessment due to their huge capacity to act as indicator species and their extensive species diversity. Furthermore, we are using environmental DNA (eDNA), extracted directly from water samples, to detect the presence and temporal variation of chironomids as well as to reconstruct ecological networks inside a Welsh lake ecosystem. By performing NGS of eDNA and bulk invertebrate samples, alongside sequence information from our barcoding library, we are testing a faster, more accurate and less labour-intensive method of ecosystem health assessment.
The purpose of this study was to determine the phylogeographic structure of the brackish-hypersaline cyprinodont fish Aphanius fasciatus (Valenciennes, 1821), using sequencing and RFLP analysis of a 1,330 bp mitochondrial DNA segment containing part of the 16S rRNA gene as well as the genes for tRNA-Leu, NADH subunit 1 and tRNA-Ile. Individuals were collected from 13 different sites in Greece and Turkey, while seven published A. fasciatus sequences were also included to cover the area of distribution of the species. Pairwise sequence divergence values ranged from 0 to 4.51%. Congruent phylogenies were recovered with maximum likelihood, maximum parsimony and neighbour-joining methods. All analyses revealed two main groups. The first group consists of populations from almost all localities that drain into the Aegean Sea. The second group comprises the remaining population samples, which in some cases seem to consist of population-specific subgroups. Our results show that vicariant events have predominantly affected the evolution of A. fasciatus, with the Messinian salinity crisis having shaped the present genetic structure of its populations. Additionally, the life-history traits of the species, which determine a low potential for dispersal, coupled with the typical fragmentation of brackish-hypersaline water habitats, have led to a high degree of isolation of A. fasciatus populations, even at restricted spatial scales. Analysis of the partitioning of the total amount of polymorphism with analyses of molecular variance (AMOVA) gave a value of F_ST = 84.6%. Potential conservation policies concerning A. fasciatus should also consider the low genetic variability in the majority of its populations and the presence of fixed haplotypes in some of them.
Transactions of the American Fisheries Society, 2009
Mislabeling of North American merlucciid hakes in stock surveys and commercial market samples was detected by employing nuclear 5S ribosomal DNA (rDNA) and mitochondrial cytochrome b variation as molecular markers. Results showed that offshore hake Merluccius albidus is sold in European markets but is labeled as the morphologically similar silver hake M. bilinearis, which is the target species of the fishery. This suggests that offshore hake may be inadvertently included within silver hake landings, as the two species overlap in the southern area of silver hake distribution (approximately 41°-35°N latitude near North American coasts). An inexpensive and technically easy technique based on polymerase chain reaction (PCR) amplification of a fragment of 5S rDNA and visualization of PCR products in agarose gels is recommended for routine species assignation in landings for purposes of exploitation estimates and for authentication of commercial hake species.
Papers by Ibis Ibis