John Hancock


    The relationship between the level of repetitiveness in genomic sequences and genome size has been re-investigated making use of the rapidly growing database of complete eubacterial and archaeal genome sequences combined with the fragmentary but now large amount of data from eukaryotic genomes. Relative simplicity factors (RSFs), which measure the repetitiveness of sequences, were calculated and significantly simple motifs (SSMs), which identify the kinds of sequences that are repeated, were identified. A previously reported correlation between genome size and repetitiveness was confirmed, but it was shown that the higher RSFs seen in eukaryotic genomes also reflect a generally higher level of repetitiveness independent of genome size differences. Differences in genome size are responsible for about 10% of the variance in RSF seen between species. The spectrum of SSMs seen within a genome differed markedly within the eubacteria but less so in eukaryotes and, particularly, in archaea. Species with SSM spectra that differ from the norm tend also to have high RSFs for their genome size and to be pathogens that make use of repetitive sequences to avoid host defence responses. Some of the variance in repetitiveness seen in other species may therefore also reflect the action of selection, although other forces, such as variation in the effectiveness of mechanisms for regulating slippage errors of replication, may also be important.
    ABSTRACT Motivation: A major challenge in modern biology is to link genome sequence information to organismal function. In many organisms this is being done by characterising phenotypes resulting from mutations. Efficiently expressing phenotypic information requires combinatorial use of ontologies. However, tools are not currently available to visualise combinations of ontologies. Here we describe CRAVE (Concept Relation Assay Value Explorer), a package allowing storage, active updating and visualization of multiple ontologies. Results: CRAVE is a web-accessible JAVA application that accesses an underlying MySQL database of ontologies via a JAVA persistent middleware layer (Chameleon). This maps the database tables into discrete JAVA classes and creates memory resident, interlinked objects corresponding to the ontology data. These JAVA objects are accessed via calls through the middleware’s API. CRAVE allows simultaneous display and linking of multiple ontologies and searching using Boolean and advanced searches.
    Eukaryotic ribosomal RNA genes contain rapidly evolving regions of unknown function termed expansion segments. We present the comparative analysis of the primary and secondary structure of two expansion segments from the large subunit rRNA gene of ten species of Drosophila and the tsetse fly species Glossina morsitans morsitans. At the primary sequence level, most of the differences observed in the sequences obtained are single base substitutions. This is in marked contrast with observations in vertebrate species in which the insertion or deletion of repetitive motifs, probably generated by a DNA-slippage mechanism, is a major factor in the evolution of these regions. The secondary structure of the two regions, supported by multiple compensatory base changes, is highly conserved between the species examined and supports the existence of a general folding pattern for all eukaryotes. Intriguingly, the evolutionary rate of expansion segments is very slow relative to other genic and non-genic regions of the Drosophila genome. These results suggest that the evolution of expansion segments in the rDNA multigene family is a balance between the homogenization of new mutations by unequal crossing over and a combination of selection against some such mutations per se and selection for subsequent compensatory mutations, in order to maintain a particular RNA secondary structure.
    Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or groups of genes for a certain disease(s). These databases are widely considered as the most reliable information source for a particular gene/protein/disease, but it should also be made clear they may have widely varying contents, infrastructure, and quality. Quality is very important to evaluate because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The HVP quality evaluation criteria that resulted are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and how implementation of the quality scheme can be achieved. Examples are provided for the current status of the quality items in two different databases, BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance.
    Sequence variation in the middle part of the small-subunit rRNA was studied for representatives of the major groups in the family Cicindelidae (Coleoptera). All taxa exhibited a much expanded segment in variable region V4 compared to D. melanogaster. This expanded segment was not found in other groups of beetles, including three taxa in the closely related Carabidae. Secondary structure
    ABSTRACT Introduction Eukaryotic genomes contain a rich complement of repetitive sequences, ranging from transposable elements to simple sequence repeats such as mini- and microsatellites. Tautz et al [1] defined a class of simple sequence repeats which they called cryptically simple sequences (CSS). These correspond to regions of low complexity and contain concentrations of one or a few short motifs at above random expectation. Analyses of CSS at the genome level [2, 3] show that they are present in all organisms but are much more common in eukaryotes than prokaryotes, apparently reflecting different mutational processes in the different types of organism. CSS in ribosomal RNA genes of insects show similar mutational patterns to microsatellites [4] suggesting they also mutate primarily by replication slippage. Tandem repeated sequences are generally rare in genes, probably due to selection against frameshifts due to gain and loss of repeated units. Tandem repeats of codons are less rare and
    Proteins containing heme, iron(protoporphyrin IX) and its variants, continue to be one of the most studied classes of biomolecules due to their diverse range of biological functions. The literature is abundant with ...
    DNA sequences of the first ribosomal internal transcribed spacer (ITS1) were isolated from 10 ladybird beetle species (Coleoptera: Coccinellidae) representing four subfamilies (Coccinellinae, Chilocorinae, Scymninae, and Coccidulinae). The spacers ranged in length from 791 to 2,572 bp, thereby including one of the longest ITS1s and exhibiting one of the most extreme cases of ITS1 size variation in eukaryotes recorded
    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse embryonic stem cell knockout resource provides a basis for the characterization of relationships between genes and phenotypes. The EUMODIC consortium developed and validated robust methodologies for the broad-based phenotyping of knockouts through a pipeline comprising 20 disease-oriented platforms. We developed new statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no previous functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. New phenotypes were uncovered for many genes with previously unknown function, providing a powerful basis...
    Understanding the functions encoded in the mouse genome will be central to an understanding of the genetic basis of human disease. To achieve this it will be essential to be able to characterize the phenotypic consequences of variation and alterations in individual genes. Data on the phenotypes of mouse strains are currently held in a number of different forms (detailed
    ABSTRACT Now that not only the mouse genome sequence but also the ability to carry out high throughput manipulation of mouse ES cells is in place, projects are underway to understand the functions of individual mouse genes in a systematic manner. Central to this will be the systematic analysis of the phenotypes of mutant mouse lines. EUMODIC, the first large-scale project to screen mouse knockout lines for disease-related phenotypes is underway and experience shows the necessity of a well-organised bioinformatics infrastructure for the capture, analysis and dissemination of the data emerging from such projects. Here we discuss the fundamental requirements for such a bioinformatics infrastructure, progress so far and the developments that will be required in future.
    The mouse is a key model organism for the study of mammalian genetics, development, physiology and biochemistry. The determination of the mouse genome sequence was therefore an early priority in the genome project. A draft sequence became available in 2002 and many chromosomes are now close to being finished. Comparative analysis of the mouse genome sequence with that of the human and other genomes has revealed a wealth of information on genome evolution in the mammalian lineage and assisted in the annotation of both genomes. With the availability of a well-annotated mouse genome sequence, mouse geneticists are now poised to undertake the challenge of generating mutations at every gene in the mouse genome. Systematic mutagenesis of the mouse genome will be an important step towards the first comprehensive functional annotation of a mammalian genome and the identification and characterisation of models for the study of human genetic disease.
    The structured description of mutant phenotypes presents a major conceptual and practical problem. A general model for generating mouse phenotype ontologies that involves combining a variety of different ontologies to better link and describe phenotypes is presented. This model is based on the Phenotype and Trait Ontology schema proposal and incorporates practical limitations and designing solutions in an attempt to model a testbed for the first phenotype ontology constructed in this manner, namely the mouse behavior phenotype ontology. We propose the application of such a model could provide curators with a powerful mechanism of annotation, mining and knowledge representation as well as achieving some level of free text disassociation.
    The set of "expansion segments" of any eukaryotic 26S/28S ribosomal RNA (rRNA) gene is responsible for the bulk of the difference in length between the prokaryotic 23S rRNA gene and the eukaryotic 26S/28S rRNA gene. The expansion segments are also responsible for interspecific fluctuations in length during eukaryotic evolution. They show a consistent bias in base composition in any species; for example, they are AT rich in Drosophila melanogaster and GC rich in vertebrate species. Dot-matrix comparisons of sets of expansion segments reveal high similarities between members of a set within any 28S rRNA gene of a species, in contrast to the little or spurious similarity that exists between sets of expansion segments from distantly related species. Similarities among members of a set of expansion segments within any 28S rRNA gene cannot be accounted for by their base-compositional bias alone. In contrast, no significant similarity exists within a set of "core" segme...
    This paper examines the effects of DNA sequence evolution on RNA secondary structures and compensatory mutations. Models of the secondary structures of Drosophila melanogaster 18S ribosomal RNA (rRNA) and of the complex between 2S, 5.8S, and 28S rRNAs have been drawn on the basis of comparative and energetic criteria. The overall AU richness of the D. melanogaster rRNAs allows the resolution of some ambiguities in the structures of both large rRNAs. Comparison of the sequence of expansion segment V2 in D. melanogaster 18S rRNA with the same region in three other Drosophila species and the tsetse fly (Glossina morsitans morsitans) allows us to distinguish between two models for the secondary structure of this region. The secondary structures of the expansion segments of D. melanogaster 28S rRNA conform to a general pattern for all eukaryotes, despite having highly divergent sequences between D. melanogaster and vertebrates. The 70 novel compensatory mutations identified in the 28S rR...
    In this, the first of three papers, we present the sequence of the ribosomal RNA (rRNA) genes of Drosophila melanogaster. The gene regions of D. melanogaster rDNA encode four individual rRNAs: 18S (1,995 nt), 5.8S (123 nt), 2S (30 nt), and 28S (3,945 nt). The ribosomal DNA (rDNA) repeat of D. melanogaster is AT rich (65.9% overall), with the spacers being particularly AT rich. Analysis of DNA simplicity reveals that, in contrast to the intergenic spacer (IGS) and the external transcribed spacer (ETS), most of the rRNA gene regions have been refractory to the action of slippage-like events, with the exception of the 28S rRNA gene expansion segments. It would seem that the 28S rRNA can accommodate the products of slippage-like events without loss of activity. In the following two papers we analyze the effects of sequence divergence on the evolution of (1) the 28S gene "expansion segments" and (2) the 28S and 18S rRNA secondary structures among eukaryotic species, respectivel...
    The mouse is an important model of human genetic disease. Describing phenotypes of mutant mice in a standard, structured manner that will facilitate data mining is a major challenge for bioinformatics. Here we describe a novel, compositional approach to this problem which combines core ontologies from a variety of sources. This produces a framework with greater flexibility, power and economy than previous approaches. We discuss some of the issues this approach raises.
    ... Alessandro DiCara2, Maxime Durot3, John M ... such as the mass action law (Guldberg and Waage 1879), the Michaelis-Menten rate law (Briggs and Haldane 1925; Michaelis and Menten 1913), or more complicated forms to attain some specific kinetics (Cornish-Bowden et al. ...
    Ontologies are becoming increasingly important for the efficient storage, retrieval and mining of biological data. The description of phenotypes using ontologies is a particularly complex problem. We outline a schema that can be used to describe phenotypes by combining orthologous axiomatic ontologies. We also describe tools for storing, browsing and searching such complex ontologies. Central to this approach is that assays (protocols for measuring phenotypic characters) describe what has been measured as well as how this was done, allowing assays to link individual organisms to ontologies describing phenotypes. We have evaluated this approach by automatically annotating data on 600,000 mutant mice phenotypes using the SHIRPA protocol. We believe this approach will enable the flexible, extensible and detailed description of phenotypes from any organism.
    ... standards for microarray data," Nat Genet, vol. 29(4), pp. 365-371, 2001. [15] CF Taylor, "Standards for reporting bioscience data: a forward look," Drug Discov Today, vol. 12, pp. 527-533, 2007. [16] EW Deutsch, CA Ball, GS Bova ...
    ... Christina Chandras, Thomas Weaver, Michael Zouberakis, John M. Hancock, Paul N. Schofield, and Vassilis Aidinis ... Page 5. Name: interleukin 10 Symbol: Il10 Synonym/Old Name: cytokine synthesis inhibitory factor, IL-10 Chromosome: 1 Comment Conditional IL-10 mutation ...
    Integration of biological data of various types and development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases, and platforms to enable both generation of new bioinformatics tools and experimental validation of computational predictions. With the aim of bridging
    This paper describes an approach to providing computer-interpretable logical definitions for the terms of the Human Phenotype Ontology (HPO) using PATO, the ontology of phenotypic qualities, to link terms of the HPO to the anatomic and other entities that are affected by abnormal phenotypic qualities. This approach will allow improved computerized reasoning as well as a facility to compare phenotypes between different species. The PATO mapping will also provide direct links from phenotypic abnormalities and underlying anatomic structures encoded using the Foundational Model of Anatomy, which will be a valuable resource for computational investigations of the links between anatomical components and concepts representing diseases with abnormal phenotypes and associated genes.
    We present an extensible software model for the genotype and phenotype community, XGAP. Readers can download a standard XGAP (http://www.xgap.org) or auto-generate a custom version using MOLGENIS with programming interfaces to R-software and web-services or user interfaces for biologists. XGAP has simple load formats for any type of genotype, epigenotype, transcript, protein, metabolite or other phenotype data. Current functionality includes tools ranging from eQTL analysis in mouse to genome-wide association studies in humans.
