Abstract
The emergence of Grid computing technology has opened up an unprecedented opportunity for biologists to share and access data, resources and tools in an integrated environment, increasing the chance of knowledge discovery. GeneGrid is a Grid computing framework that seamlessly integrates a myriad of heterogeneous resources spanning multiple administrative domains and locations. It provides scientists with an integrated environment for streamlined access to a number of bioinformatics programs and databases through a simple and intuitive interface. It acts as a virtual bioinformatics laboratory by allowing scientists to create, execute and manage workflows that represent bioinformatics experiments. A number of cooperating Grid services interact in an orchestrated manner to provide this functionality. This paper describes the architecture, components and implementation of GeneGrid.
Abbreviations
- OGSA: open Grid services architecture
- SOA: service oriented architecture
- OGSI: open Grid services infrastructure
- SOAP: simple object access protocol
- GAMSF: GeneGrid application manager service factory
- GAMS: GeneGrid application manager service
- OGSA-DAI: open Grid services architecture-database access and integration
- GDMSF: GeneGrid data manager service factory
- GDMS: GeneGrid data manager service
- GWDD: GeneGrid workflow definition database
- GSTRIP: GeneGrid status tracking, results & input parameters database
- GWMSF: GeneGrid workflow manager service factory
- GWMS: GeneGrid workflow manager service
- GNM: GeneGrid node monitor
- GARR: GeneGrid application and resources registry
Cite this article
Jithesh, P.V., Donachy, P., Harmer, T. et al. GeneGrid: Architecture, Implementation and Application. J Grid Computing 4, 209–222 (2006). https://doi.org/10.1007/s10723-006-9045-5
DOI: https://doi.org/10.1007/s10723-006-9045-5