Abstract
Research in Systems Biology involves integrating data and knowledge about the dynamic processes in biological systems in order to understand and model them. Semantic web technologies should be ideal for exploring the complex networks of interacting genes, proteins and metabolites, but much of this data is not natively available to the semantic web. Data is typically collected and stored with free-text annotations in spreadsheets, many of which do not conform to existing metadata standards and are often not publicly released.
Along with initiatives to promote more data sharing, one of the main challenges is therefore to semantically annotate and extract this data so that it is available to the research community. Data annotation and curation are expensive and undervalued tasks that have enormous benefits to the discipline as a whole, but fewer benefits to the individual data producers.
By embedding semantic annotation into spreadsheets, however, and automatically extracting this data into RDF at the time of repository submission, standards-compliant data that is available for semantic web querying can be produced without adding overheads to laboratory data management. This paper describes these strategies in the context of semantic data management in the SEEK. The SEEK is a web-based resource for sharing and exchanging Systems Biology data and models that is underpinned by the JERM ontology (Just Enough Results Model), which describes the relationships between data, models, protocols and experiments. The SEEK was originally developed for SysMO, a large European Systems Biology consortium studying micro-organisms, but it has since been widely adopted across European Systems Biology.
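The extraction step described above can be illustrated with a minimal sketch. It is not the SEEK implementation: the namespaces, property names, term mappings and sample sheet below are all hypothetical, and a CSV string stands in for a real spreadsheet. Under those assumptions, it shows how cells that carry an ontology annotation could be emitted as RDF resources in N-Triples, while unannotated free text falls back to plain literals.

```python
import csv
import io

# Hypothetical namespaces, for illustration only; not real JERM or SEEK URIs.
EX = "http://example.org/jerm-sketch#"
DATA = "http://example.org/datafile/42#"

# Assumed annotation layer: column headers map to ontology properties,
# and recognised cell values map to ontology terms.
COLUMN_PROPERTIES = {
    "organism": EX + "hasOrganism",
    "assay_type": EX + "hasAssayType",
}
TERM_URIS = {
    "Bacillus subtilis": EX + "Bacillus_subtilis",
    "transcriptomics": EX + "Transcriptomics",
}

# A CSV string standing in for an annotated spreadsheet.
SHEET = """sample,organism,assay_type
s1,Bacillus subtilis,transcriptomics
s2,Bacillus subtilis,free text note
"""


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def sheet_to_ntriples(text):
    """Return one N-Triples line per annotated column of each row."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{DATA}{row['sample']}>"
        for column, prop in COLUMN_PROPERTIES.items():
            value = row[column]
            if value in TERM_URIS:
                # Annotated cell: link to the ontology term.
                obj = f"<{TERM_URIS[value]}>"
            else:
                # Unannotated cell: keep the free text as a literal.
                obj = f'"{escape_literal(value)}"'
            triples.append(f"{subject} <{prop}> {obj} .")
    return triples


if __name__ == "__main__":
    for line in sheet_to_ntriples(SHEET):
        print(line)
```

In this sketch only the annotated values become linkable resources, which is why annotation at the point of data entry matters: the free-text cell survives as a literal but cannot be joined against other datasets by term.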
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wolstencroft, K. et al. (2013). Semantic Data and Models Sharing in Systems Biology: The Just Enough Results Model and the SEEK Platform. In: Alani, H., et al. The Semantic Web – ISWC 2013. ISWC 2013. Lecture Notes in Computer Science, vol 8219. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41338-4_14
DOI: https://doi.org/10.1007/978-3-642-41338-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41337-7
Online ISBN: 978-3-642-41338-4
eBook Packages: Computer Science, Computer Science (R0)