Current methodologies of genome-wide Single Nucleotide Polymorphism (SNP) genotyping produce large amounts of missing data that may affect statistical inference and bias the outcome of experiments. Genotype imputation is routinely used in well-studied species to buffer the impact on downstream analysis, and several algorithms are available to fill in missing genotypes. The lack of reference haplotype panels precludes the use of these methods in genomic studies on non-model organisms. As an alternative, machine learning algorithms are employed to explore the genotype data and to estimate the missing genotypes. Here, we propose an imputation method based on Self-Organizing Maps (SOM), a widely used type of neural network formed by spatially distributed neurons that cluster similar inputs into nearby neurons. We follow a classical approach that explores genotype datasets to select SNP loci for each query missing SNP genotype to build training sets, and that initializes and trains the neural net...
Hybridization and introgression between cork oak (Quercus suber) and holm oak (Q. ilex) have traditionally been regarded as undesirable processes, since hybrid individuals lack the profitable bark characteristics of cork oak. Nevertheless, a systematic and quantitative description of the bark of these hybrids at the microscopic level, based on a significant number of individuals, is not available to date. In this work we provide such a qualitative and quantitative description, identifying the most relevant variables for their classification. Hybrids show certain features intermediate between those of the parent species (such as phellem percentage in the outer bark, which averaged approximately 40% in hybrids, 20% in holm oak and almost 99% in cork oak), as well as other unique features, such as the general suberization of inactive phloem (up to 25% in certain individuals), reported here for the first time. These results suggest a relevant hybridization-induced modificati...
Hybridization and introgression between cork oak (Quercus suber) and holm oak (Q. ilex) have traditionally been regarded as undesirable processes, since hybrid individuals lack the profitable bark characteristics of cork oak. Nevertheless, a systematic and quantitative description of the bark of these hybrids at the microscopic level, based on a significant number of individuals, was not available to date. In this work we provide such a qualitative and quantitative description, identifying the most relevant variables for their classification. Hybrids show certain features intermediate between those of the parent species, as well as other unique features, such as the general suberization of inactive phloem, reported here for the first time. These results suggest a relevant hybridization-induced modification of gene expression patterns. Therefore, hybrid individuals provide valuable material to disentangle the molecular mechanisms underpinning bark development in angiosperms.
Work presented at the II International Conference on Island Evolution, Ecology and Conservation, held in Angra do Heroismo, Azores Islands (Portugal), from 18 to 22 July 2016.
Hybridization and its relevance are hot topics in ecology and evolutionary biology. Interspecific gene flow may play a key role in species adaptation to environmental change, as well as in the survival of endangered populations. Although hybridization is quite common in plants, many hybridizing species, such as Quercus spp., maintain their integrity, while precise determination of genomic boundaries between species remains elusive. Novel high-throughput sequencing techniques have opened up new perspectives in the comparative analysis of genomes and in the study of historical and current interspecific gene flow. In this work, we applied the ddRADseq technique and developed an ad hoc bioinformatics pipeline for the study of ongoing hybridization between two relevant Mediterranean oaks, Q. ilex and Q. suber. We adopted a local-scale approach, analyzing adult hybrids (sensu lato) identified in a mixed stand and their open-pollinated progenies. We have identified up to 9,251 mar...
Background: Bioinformatics software for RNA-seq analysis has high computational requirements in terms of the number of CPUs, RAM size, and processor characteristics. Specifically, de novo transcriptome assembly demands large computational infrastructure due to the massive data size and the complexity of the algorithms employed. Comparative studies on the quality of the transcriptomes yielded by de novo assemblers have been published previously; they lack, however, a hardware efficiency-oriented approach to help select the assembly hardware platform in a cost-efficient way. Objective: We tested the performance of two popular de novo transcriptome assemblers, Trinity and SOAPdenovo-Trans (SDNT), in terms of cost-efficiency and quality to assess limitations, and provided troubleshooting and guidelines to run transcriptome assemblies efficiently. Methods: We built virtual machines with different hardware characteristics (CPU number, RAM size) in the Amazon Elastic Compute Cloud of the Amazon Web ...
A noticeable proportion of lowly transcribed genes involved in wood formation in conifers may have been missed in previous transcriptomic studies. This could be the case for genes related to less abundant cell types, such as axial parenchyma and resin ducts, and genes related to juvenile wood. In this study, two normalized libraries have been obtained from the cambial zone of young individuals of Pinus canariensis C. Sm. ex DC, a species in which such cells are comparatively abundant. These two libraries cover earlywood (EW) and latewood (LW) differentiation, and reads have been de novo meta-assembled into one transcriptome. A large number of previously undescribed genes has been found. The transcriptional profiles during the growing season have been analyzed and several noticeable differences with respect to previous studies have been found. This work contributes to a more complete picture of wood formation in conifers. The genes and their transcription profiles described here provid...
Double‐digested RADseq (ddRADseq) is an NGS methodology that generates reads from thousands of loci targeted by restriction enzyme cut sites, across multiple individuals. To be statistically sound and economically optimal, a ddRADseq experiment has a preliminary design stage that needs to consider issues related to the selection of enzymes, particular features of the genome of the focal species, possible modifications to the library construction protocol, coverage needed to minimize missing data, and the potential sources of error that may affect the coverage. We present ddradseqtools, a software package to help ddRADseq experimental design by (i) the generation of in silico double‐digested fragments; (ii) the construction of modified ddRADseq libraries using adapters with either one or two indexes and degenerate base regions (DBRs) to quantify PCR duplicates; and (iii) the initial steps of the bioinformatics preprocessing of reads. ddradseqtools generates single‐end (SE) or pai...
Summary: European aspen (Populus tremula L.) has traditionally been thought to establish new stands by vegetative propagation through root suckers produced by very few individuals (often only one). Morphological traits and isozyme patterns were studied in five small stands in northern Spain. Both isozyme and morphological approaches showed variation within and between stands. Estimated intrapopulational variation was higher than expected, and clusters of individuals with the same isozyme multilocus patterns within each population have been identified. To check to what extent morphological markers are affected by the genotypes or clones, comparisons between leaf parameters and isozyme patterns were performed by hierarchical ANOVA, and hypothesis tests were constructed from the components of variance. Leaf shape parameters show a good correlation with the isozyme multilocus patterns. On the other hand, leaf size parameters were more influenced by environmental factors. ...
Papers by Unai Lopez de Heredia