To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared 'Investigation-Study-Assay' framework to support that vision.
Background: A wide variety of ontologies relevant to the biological and medical domains are available through the OBO Foundry portal, and their number is growing rapidly. Integration of these ontologies, while requiring considerable effort, is extremely desirable. However, heterogeneities in format and style pose serious obstacles to such integration. In particular, inconsistencies in naming conventions can impair the readability and navigability of ontology class hierarchies, and hinder their alignment and integration.
The requirements for standard formats for functional genomics experiments have been recognised for some time, leading to the development of a standard microarray format, MAGE-ML [9], and a proposal for a proteome standard called PEDRo [12]. Furthermore, the Proteomics Standards Initiative (PSI) and the Microarray Gene Expression Data (MGED) society have been established to manage standards and ontologies.
Motivation: Data standards and object models are being developed for a variety of functional genomics domains. Many of these object models include a reference to an ontology concept in order to provide a rich set of terms for annotation. The MGED Ontology was developed to provide terms to be used with the MicroArray and Gene Expression Object Model and has been successfully implemented in production annotation applications.
A proposal for the introduction of the Minimal Information (MI) platform dedicated to the acquisition and annotation of data concerning recombinant proteins (Minimal Information for Protein Functionality Evaluation, MIPFE) was recently published [1] and discussed at the 5th Recombinant Protein Production Conference (Alghero 2008) and the 2009 PEP Talk meeting (San Diego).
This report summarizes the proceedings of the one-day BioSharing meeting held at the Intelligent Systems for Molecular Biology (ISMB) 2010 conference in Boston, MA, USA. This inaugural BioSharing event was hosted by the Genomic Standards Consortium as part of its M3 & BioSharing special interest group (SIG) workshop. The BioSharing event included invited talks from a range of community leaders and a panel discussion at the end of the day.
The aim of the ISPIDER project is to create a proteomics grid; that is, a technical platform that supports bioinformaticians in constructing, executing and evaluating in silico analyses of proteomics data. It will be constructed using a combination of generic e-science and Grid technologies, plus proteomics-specific components and clients that embody knowledge of the proteomics domain and the available resources.
The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005, the members came together for the fifth time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set for the semi-automated re-analysis of MS/MS spectra was elaborated and the subsequent work steps were defined.
Background: As the size and complexity of scientific datasets and the corresponding information stores grow, standards for collecting, describing, formatting, submitting and exchanging information are playing an increasingly active role. Several initiatives occupy strategic positions in the international scenario, both within and across domains.
As researchers involved in the development of the MGED Ontology (MO) and other bio-ontologies, we were pleased to see Nature Biotechnology foster dialogue on the challenges in building robust and optimal ontologies for biomedical research in a commentary by Soldatova and King published in the September issue (Nat. Biotechnol. 23, 1095–1098, 2005).
The ever-increasing volumes of proteomic data now being produced by laboratories across the world have resulted in major issues in data storage and accessibility. The further demands of multilaboratory initiatives have highlighted issues that arise when collaborators cannot import data generated within the same project but produced by different hardware types and processed by laboratory-specific workflows and analysis packages.
In recent years, bioscience communities centered on particular areas of study, or groups of technologies, have generated so-called Minimum Information (MI) checklists specifying the data and metadata that should be captured from the totality of information generated in the course of an investigation. In parallel, ontologies, formats, data capture tools and databases have been developed that can support the collection, validation, archiving and sharing of MI checklist-compliant data sets.
Modern biological science addresses a variety of subjects using an array of analytical techniques. Few relations between subject and technique are exclusive, making for a very large number of potential workflows, combinatorially speaking. While this diversity is to be celebrated, it presents informatics challenges that require resolution if the data-sharing ambitions of many funders are to be realised, and the consequent benefits to science obtained.
Gel electrophoresis is a reliable, well-characterised separation technique for proteins and peptides (inter alia), underpinning a wide variety of specific protocols. The performance of gel electrophoresis (whether one- or two-dimensional, denaturing or 'native', simple or multiplexed, etc.) is typically followed by gel image capture, minimally to provide an (annotated) electronic record of the result.
To facilitate sharing of Omics data, many groups of scientists have been working to establish the relevant data standards. The main components of data sharing standards are experiment description standards, data exchange standards, terminology standards, and experiment execution standards. Here we provide a survey of existing and emerging standards that are intended to assist the free and open exchange of large-format data.
The theme of the third annual Spring workshop of the HUPO-PSI was “proteomics and beyond”, and its underlying goal was to reach beyond the boundaries of the proteomics community to interact with groups working on similar issues of developing interchange standards and minimal reporting requirements.
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources.
A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary.
The rate of production of proteomics, transcriptomics and metabolomics data (inter alia) continues to increase as high-throughput approaches become more robust. The complexity of such data sets is also increasing as workflows and platforms evolve and diversify, and data are frequently fixed in proprietary formats, computational access to which is often contingent on the availability of particular software. This deluge of diverse data mandates sophisticated data handling techniques.
Papers by Chris Taylor