Computers and Biomedical Research, an International Journal, 1994
We aim to develop an open software system to handle human genome data. The system, called Integrated Genomic Database (IGD), will integrate information from many genomic databases and experimental resources into a comprehensive target-end database (IGD TED). Users will access front-end client systems (IGD FRED) to download data of interest to their computers and merge them with their own local data. FREDs will provide persistent storage of, and instant access to, retrieved data; a friendly graphical interface; tools for querying, browsing, analyzing, and editing local data; interface to external analysis; and tools for communicating with the outside world. The TED will be accessible over the network (online and offline) as a read-only resource for multiple clients. It collects data from major databases for nucleotide and protein sequences and structures, genome maps, experimental reagents, phenotypes, and bibliographic data, and sets of raw data produced at genome centers and labora...
This paper describes approaches to achieve distributed access to analysis tools in the life sciences domain. The underlying technologies are the Web, CORBA, and the use of descriptive meta-data. The need for standardisation, extensibility, and portability is underlined. The two separate applications presented here (W2H and AppLab) are both freely available.
Meaningful exchange of microarray data is currently difficult because it is rare that published data provide sufficient information depth or are even in the same format from one publication to another. Only when data can be easily exchanged will the entire biological community be able to derive the full benefit from such microarray studies. To this end we have developed three key ingredients towards standardizing the storage and exchange of microarray data. First, we have created a minimal information for the annotation of a microarray experiment (MIAME)-compliant conceptualization of microarray experiments modeled using the unified modeling language (UML) named MAGE-OM (microarray gene expression object model). Second, we have translated MAGE-OM into an XML-based data format, MAGE-ML, to facilitate the exchange of data. Third, some of us are now using MAGE (or its progenitors) in data production settings. Finally, we have developed a freely available software tool kit (MAGE-STK) th...
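The value of an XML interchange format such as MAGE-ML is that a shared object model can be serialized and parsed with standard tooling, independently of the producing database. As a hedged sketch only — the element names `Experiment`, `BioSample`, and `Description` are simplified placeholders, not the real MAGE-ML schema — such a serialization might look like this in Python:

```python
# Illustrative sketch: serializing a microarray experiment description
# to a MAGE-ML-style XML document. Element and attribute names are
# simplified assumptions, NOT the actual MAGE-ML schema.
import xml.etree.ElementTree as ET

def experiment_to_xml(name, samples):
    """Serialize a minimal experiment description to an XML string."""
    root = ET.Element("Experiment", {"name": name})
    for sample_id, description in samples:
        sample = ET.SubElement(root, "BioSample", {"identifier": sample_id})
        ET.SubElement(sample, "Description").text = description
    return ET.tostring(root, encoding="unicode")

xml_doc = experiment_to_xml("heat-shock", [("S1", "control"), ("S2", "treated")])
```

A real MAGE-ML consumer would validate such a document against the published DTD before importing it, which is what makes the exchange lossless across tools.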
The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system ...
Concurrency and Computation: Practice and Experience, 2000
Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co-ordinated use of analysis programs and information repositories that may be globally distributed. With regards to Grid computing, ...
Motivation: In silico experiments necessitate the virtual organization of people, data, tools, and machines. The scientific process also necessitates an awareness of the experience base, both of personal data as well as the wider context of work. The management of all these data and the co-ordination of resources to manage such virtual organizations and the data surrounding them needs significant computational ...
The availability of World-Wide-Web browsers for many computer systems made it possible to develop platform-independent graphical user interfaces for command-line-driven UNIX applications. We followed this trend with our development of a WWW-interface for the ...
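The pattern this abstract describes — a browser-based, platform-independent front end driving a command-line UNIX tool — can be sketched as a minimal WSGI application. This is a hedged illustration of the general idea only, not W2H itself (which used CGI to wrap an existing sequence-analysis package); here `echo` stands in for the wrapped tool:

```python
# Hedged sketch of a web front end for a command-line tool: a WSGI
# handler reads a form parameter, runs the tool, and returns its
# stdout. "echo" is a placeholder for the real wrapped application.
import subprocess
from urllib.parse import parse_qs

def application(environ, start_response):
    """Run a command-line tool with a user-supplied argument and return its output."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    user_input = params.get("seq", [""])[0]
    # Pass the parameter as an argv element (never through a shell)
    # so user input cannot inject extra commands.
    result = subprocess.run(["echo", user_input], capture_output=True, text=True)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [result.stdout.encode()]
```

Served under any WSGI container (e.g. `wsgiref.simple_server`), this gives every browser-equipped platform access to a tool that otherwise runs only on the UNIX host — which is the portability argument the paper makes.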
The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to...
Papers by Martin Senger