    Inge Jonassen

    ELIXIR, the European life science infrastructure for biological information, is a unique initiative to consolidate Europe's national centres, services, and core bioinformatics resources into a single, coordinated infrastructure. ELIXIR brings together Europe's major life-science data archives and connects these with national bioinformatics infrastructures, the ELIXIR Nodes. This editorial introduces the ELIXIR channel in F1000Research; the aim of the channel is to collect and present ELIXIR's scientific and operational output, engage with the broad life science community and encourage discussion on proposed infrastructure solutions. Submissions will be assessed by the ELIXIR channel Editorial Board to ensure they are relevant to the ELIXIR community, and subjected to the F1000Research open peer review process.
    ... 206 Stanislav Angelov, Sanjeev Khanna, Li Li, and Fernando Pereira Local Search Heuristic for Rigid Protein Docking..... ... 350 Tanya Y. Berger-Wolf Relation of Residues in the Variable Region of 16S rDNA Sequences and Their Relevance to Genus-Specificity ...
    this paper is as follows. This introduction is followed in Section 2 by a brief introduction to some problems in machine learning, and especially to some approaches to learning from strings. In Section 3 we discuss the problem of discovering bio-patterns, on the background provided in Section 2. The major part of the thesis is a portfolio of research papers of the author, with
    ... 2.13. Immunoelectrophoretic techniques. Crossed immunoelectrophoresis and tandem crossed immunoelectrophoresis were carried out as described by AXELSEN et al. (2). ...
    © The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
    Thermophiles, mesophiles, and psychrophiles have different amino acid frequencies in their proteins, probably because of the way the species adapt to very different temperatures in their environment. In this paper, we analyse how contacts between sidechains vary between homologous proteins from species that are adapted to different temperatures but display relatively high sequence similarity. We investigate whether specific contacts between amino acid sidechains are a key factor in thermostabilisation in proteins. The dataset was divided into two subsets with optimal growth temperatures from 0–40 and 35–102°C. Comparison of homologues was made between low-temperature species and high-temperature species within each subset. We found that unspecific interactions, like hydrophobic interactions in the core and solvent interactions and entropic effects at the surface, appear to be more important factors than specific contact types like salt bridges and aromatic clusters.
    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types.
    Results: BioXSD has been developed as a candidate for a standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web.
    Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools, and a repository of feature requests from the community.
    Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org
    Abstract: The invention relates to methods and systems for the determination of alteration of gene expression in M. capsulatus under a variety of conditions. A preferred embodiment of the invention relates to microarrays comprising polynucleotides or oligonucleotides representative of a selected number of the genes of M. capsulatus.
    A method for large scale pattern searching has been developed in order to find patterns for each of the selected families. It builds on the Pratt program that allows automatic discovery of patterns matching at least a minimum number of a given set of unaligned sequences. The search performed by Pratt is governed by user-defined parameters, most importantly; the minimum
    An important problem in sequence analysis is to discover patterns matching subsets of a given set of bio-sequences. When a pattern common to a subset is found, the quality of the match should be evaluated. This paper proposes that an evaluation scheme for measuring the quality of a match between a sequence set and a common pattern should take into account both the strength
    The amount of publicly shared proteomics data has grown exponentially over the last decade as the solutions for sharing and storing the data have improved. However, the use of the data is often limited by the manner in which it is made available. There are two main approaches: download and inspect the proteomics data locally, or interact with the data via one or more web pages. The first is limited by having to download the data and thus requires local computational skills and resources, while the latter is most often limited in terms of interactivity and the analysis options available. A solution is to develop web-based systems supporting distributed and fully interactive visual analysis of proteomics data. The use of a distributed architecture makes it possible to perform the computational analysis at the server, while the results of the analysis can be displayed via a web browser without the need to download the whole dataset. Here the challenges related to developing such systems for omics data will be discussed, especially how this allows for multiple connected interactive visual displays of omics datasets in a web-based setting, and the benefits this provides for computational analysis of proteomics data. The approach detailed for better computational analysis of shared proteomics data via a web-based distributed architecture can greatly improve the ease with which shared proteomics data is utilized. In particular, the support for multiple connected interactive visual displays of the same omics dataset has the potential to transform what is now mainly static information into interactive resources, greatly simplifying the re-analysis of shared proteomics data and the extraction of biological knowledge.
    Microarrays have emerged as the preferred platform for high throughput gene expression analysis. Cross-hybridization among genes with high sequence similarities can be a source of error reducing the reliability of DNA microarray results. We have developed a tool called XHM (cross hybridization on microarrays) for assessment of the reliability of hybridization signals by detecting potential cross-hybridizations on DNA microarrays. This is done by comparing the sequences of the probes against an extensive database representing the transcriptome of the organism in question. XHM is available online at http://www.bioinfo.no/tools/xhm/. Using XHM with its user-adjustable parameters will enable scientists to check their lists of differentially expressed genes from microarray experiments for potential cross-hybridizations. This provides information that may be useful in the validation of the microarray results.
    Methods for extracting useful information from the datasets produced by microarray experiments are at present of much interest. Here we present new methods for finding gene sets that are well suited for distinguishing experiment classes, such as healthy versus diseased tissues. Our methods are based on evaluating genes in pairs and evaluating how well a pair in combination distinguishes two experiment classes. We tested the ability of our pair-based methods to select gene sets that generalize the differences between experiment classes and compared the performance relative to two standard methods. To assess the ability to generalize class differences, we studied how well the gene sets we select are suited for learning a classifier. We show that the gene sets selected by our methods outperform the standard methods, in some cases by a large margin, in terms of cross-validation prediction accuracy of the learned classifier. We show that on two public datasets, accurate diagnoses can be ...
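    The pair-based evaluation summarised in this abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' method: the scoring rule (leave-one-out accuracy of a nearest-centroid classifier on the two genes) and the names `pair_score` and `rank_pairs` are assumptions introduced here. Rows of `expr` are experiments and columns are genes.

```python
from itertools import combinations


def centroid(points):
    """Mean of a non-empty list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pair_score(expr, labels, g1, g2):
    """Assumed scoring rule: leave-one-out accuracy of a nearest-centroid
    classifier that sees only the expression values of genes g1 and g2."""
    points = [(row[g1], row[g2]) for row in expr]
    correct = 0
    for i, (p, y) in enumerate(zip(points, labels)):
        # Hold out experiment i and build class centroids from the rest.
        rest = [(q, l) for j, (q, l) in enumerate(zip(points, labels)) if j != i]
        classes = {l for _, l in rest}
        cents = {c: centroid([q for q, l in rest if l == c]) for c in classes}
        pred = min(cents, key=lambda c: sq_dist(p, cents[c]))
        correct += pred == y
    return correct / len(points)


def rank_pairs(expr, labels):
    """Score every gene pair; best pairs first, ties broken by gene index."""
    n_genes = len(expr[0])
    scored = [(pair_score(expr, labels, a, b), (a, b))
              for a, b in combinations(range(n_genes), 2)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    # Toy data: genes 0 and 1 jointly separate the classes, gene 2 is noise.
    expr = [(1, 5, 3), (2, 6, 9), (3, 7, 1), (4, 8, 7),
            (5, 1, 8), (6, 2, 2), (7, 3, 6), (8, 4, 4)]
    labels = ["healthy"] * 4 + ["diseased"] * 4
    for score, pair in rank_pairs(expr, labels):
        print(pair, round(score, 2))
```

    Exhaustive pair scoring grows quadratically with the number of genes, so on a real microarray dataset some pre-filtering of genes would likely be needed before a loop like this is practical.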
    Motivation: Gene expression is dependent on two main types of signals: one involving transcription factors, which initiate gene transcription, and another which regulates the translation of a nascent mRNA. These post-transcriptional events play an important yet incompletely understood role in regulating gene expression and cellular behavior. Many of the identified cis-acting elements for translational regulation occur within the
    High throughput sequencing technology has great promise for biodiversity studies. However, an underlying assumption is that the primers used in these studies are universal for the prokaryotic or eukaryotic groups of interest. Full primer universality is difficult or impossible to achieve and studies using different primer sets make biodiversity comparisons problematic. The aim of this study was to design and optimize universal eukaryotic primers that could be used as a standard in future biodiversity studies. Using the alignment of all eukaryotic sequences from the publicly available SILVA database, we generated a full characterization of variable versus conserved regions in the 18S rRNA gene. All variable regions within this gene were analyzed and our results suggested that the V2, V4 and V9 regions were best suited for biodiversity assessments. Previously published universal eukaryotic primers as well as a number of self-designed primers were mapped to the alignment. Primer select...
    A method is described for the refinement of rough protein models based on finding a selection of structural fragments that match the model. Unlike most fragment-based methods, these are not necessarily contiguous in the sequence and form a tiling (tessellation) that covers most of the structure. The residue positions of the fragments are then used as a target for the model atoms to generate a revised model which is used as the basis of a subsequent pattern definition and search. The method was shown to improve the recognition of the native fold in a series of decoys largely as a result of improved secondary structure representation.
    We discuss the problem of algorithmic discovery of patterns common to sets of sequences and its applications to computational biology. We formulate a three-step paradigm for pattern discovery, which is based on choosing the hypothesis space, designing the function rating a pattern with respect to the given sequences, and developing an algorithm finding the highest rating patterns. We give some examples of implementing this paradigm, and present experimental results of discovering new patterns in...
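    The three-step paradigm named in this abstract can be illustrated with a deliberately small sketch. The concrete choices below are assumptions made for illustration and are not taken from the paper: the hypothesis space is restricted to exact substrings of a fixed length `k`, the rating function is the number of sequences containing the pattern, and the search is a brute-force enumeration. The function names are hypothetical.

```python
def candidate_patterns(sequences, k):
    """Step 1 (hypothesis space): all length-k substrings seen in any sequence."""
    return {seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)}


def support(pattern, sequences):
    """Step 2 (rating function): how many sequences contain the pattern."""
    return sum(pattern in seq for seq in sequences)


def best_patterns(sequences, k):
    """Step 3 (search): rate every candidate and return the top-rated
    patterns (sorted) together with their rating."""
    rated = {p: support(p, sequences) for p in candidate_patterns(sequences, k)}
    if not rated:
        return [], 0
    top = max(rated.values())
    return sorted(p for p, r in rated.items() if r == top), top


if __name__ == "__main__":
    seqs = ["ACDEFG", "XXDEFY", "DEFKLM", "QQQQQQ"]
    print(best_patterns(seqs, 3))
```

    Richer hypothesis spaces (for example patterns with wildcards or variable-length gaps) and rating functions that weigh pattern specificity would slot into the same three steps, at the cost of a search that can no longer be done by simple enumeration.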
    We used a protein structure prediction method to generate a variety of folds as alpha-carbon models with realistic secondary structures and good hydrophobic packing. The prediction method used only idealized constructs that are not based on known protein structures or fragments of them, producing an unbiased distribution. Model and native fold comparison used a topology-based method as superposition can only be relied on in similar structures. When all the models were compared to a nonredundant set of all known structures, only one-in-ten were found to have a match. This large excess of novel folds was associated with each protein probe and if true in general, implies that the space of possible folds is larger than the space of realized folds, in much the same way that sequence-space is larger than fold-space. The large excess of novel folds exhibited no unusual properties and has been likened to cosmological dark matter.
    The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.embraceregistry.net) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.
    Atlantic cod (Gadus morhua) is a large, cold-adapted teleost that sustains long-standing commercial fisheries and incipient aquaculture. Here we present the genome sequence of Atlantic cod, showing evidence for complex thermal adaptations in its haemoglobin gene cluster and an unusual immune architecture compared to other sequenced vertebrates. The genome assembly was obtained exclusively by 454 sequencing of shotgun and paired-end libraries, and automated annotation identified 22,154 genes. The major histocompatibility complex (MHC) II is a conserved feature of the adaptive immune system of jawed vertebrates, but we show that Atlantic cod has lost the genes for MHC II, CD4 and invariant chain (Ii) that are essential for the function of this pathway. Nevertheless, Atlantic cod is not exceptionally susceptible to disease under natural conditions. We find a highly expanded number of MHC I genes and a unique composition of its Toll-like receptor (TLR) families. This indicates how the Atlantic cod immune system has evolved compensatory mechanisms in both adaptive and innate immunity in the absence of MHC II. These observations affect fundamental assumptions about the evolution of the adaptive immune system and its components in vertebrates.
    This article investigates aspects of pairwise and multiple structure comparison, and the problem of automatically discovering common patterns in a set of structures. Descriptions and representations of structures and patterns are described, as well as scoring and algorithms for comparison and discovery. A framework and nomenclature are developed for classifying different methods, and many of these are reviewed and placed into this framework.