A Bayesian Approach to High-Throughput Biological Model Generation

Shi, Xinghua; Stevens, Rick

doi:10.1007/978-3-642-00727-9_35

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5462)

Included in the following conference series:

International Conference on Bioinformatics and Computational Biology

Abstract

With hundreds, and soon thousands, of complete genomes available, the construction of genome-scale metabolic models for these organisms has attracted much attention. Manual work still dominates model generation, however, which has left a large gap between the number of complete genomes and the number of genome-scale metabolic models. The challenge in constructing genome-scale models from existing databases is that a directly extracted model is usually incomplete and contains network holes. Network holes occur when a network is disconnected and certain metabolites cannot be produced or consumed. To construct a valid metabolic model, network holes must be filled by introducing candidate reactions into the network. As a step toward the high-throughput generation of biological models, we propose a Bayesian approach to improving draft genome-scale metabolic models. A collection of 23 types of biological and topological evidence is extracted from the SEED [1], KEGG [2], and BiGG [3] databases. Based on this evidence, we create 23 individual predictors using Bayesian approaches. To combine these individual predictors and unify their predictions, we build ensembles based on majority vote and four classifiers: a naive Bayes classifier, a Bayesian network, a multilayer perceptron network, and AdaBoost. A set of experiments is performed to train and test the individual predictors and the mechanisms that integrate them, and to evaluate the performance of our approach.
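The two ideas in the abstract that lend themselves to a concrete illustration are network holes and majority voting over individual predictors. The following is a minimal Python sketch, not the authors' implementation. The toy reactions, the boundary metabolites, the candidate reactions, the per-predictor probabilities, and the function names `find_network_holes` and `majority_vote` are all hypothetical, and reactions are assumed to be irreversible for simplicity.

```python
# Toy network (hypothetical): reaction id -> (substrates, products).
# All reactions are assumed irreversible in this sketch.
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"D"}, {"E"}),
}
# Metabolites assumed to be exchanged with the environment (hypothetical).
boundary = {"A", "E"}


def find_network_holes(reactions, boundary):
    """Return internal metabolites that are never produced or never consumed."""
    produced, consumed = set(), set()
    for substrates, products in reactions.values():
        consumed |= substrates
        produced |= products
    internal = (produced | consumed) - boundary
    never_produced = {m for m in internal if m not in produced}
    never_consumed = {m for m in internal if m not in consumed}
    return never_produced, never_consumed


def majority_vote(probabilities, threshold=0.5):
    """Accept a candidate if more than half of the predictors exceed threshold."""
    votes = sum(p > threshold for p in probabilities)
    return 2 * votes > len(probabilities)


never_produced, never_consumed = find_network_holes(reactions, boundary)

# Hypothetical candidate reactions with made-up outputs from three predictors.
candidates = {
    "R4: C -> D": [0.9, 0.7, 0.4],
    "R5: C -> F": [0.2, 0.6, 0.3],
}
accepted = [name for name, probs in candidates.items() if majority_vote(probs)]

print("never produced:", never_produced)
print("never consumed:", never_consumed)
print("accepted candidates:", accepted)
```

In this toy network, C is never consumed and D is never produced, so a candidate reaction converting C to D would close both holes. The majority vote is the simplest of the integration mechanisms listed in the abstract; the four classifier-based ensembles would replace `majority_vote` with a trained model.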

References

1. Overbeek, R., Begley, T., Butler, R.M., Choudhuri, J.V., Chuang, H.Y., Cohoon, M., de Crécy-Lagard, V., Diaz, N., Disz, T., Edwards, R., Fonstein, M., Frank, E.D., Gerdes, S., Glass, E.M., Goesmann, A., Hanson, A., Iwata-Reuyl, D., Jensen, R., Jamshidi, N., Krause, L., Kubal, M., Larsen, N., Linke, B., McHardy, A.C., Meyer, F., Neuweger, H., Olsen, G., Olson, R., Osterman, A., Portnoy, V., Pusch, G.D., Rodionov, D.A., Rückert, C., Steiner, J., Stevens, R., Thiele, I., Vassieva, O., Ye, Y., Zagnitko, O., Vonstein, V.: The subsystems approach to genome annotation and its use in the Project to Annotate 1000 Genomes. Nucleic Acids Res. 33(17), 5691–5702 (2005)
2. Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., Yamanishi, Y.: KEGG for linking genomes to life and the environment. Nucleic Acids Res. 36, 480–484 (2008)
3. BiGG: A Biochemical Genetic and Genomic Database of Large Scale Metabolic Reconstructions, http://bigg.ucsd.edu/
4. CellDesigner, http://www.systems-biology.org/cd/
5. Weka: Data mining software in Java, http://www.cs.waikato.ac.nz/~ml/weka/
6. Feist, A.M., Herrgard, M.J., Thiele, I., Reed, J.L., Palsson, B.O.: Reconstruction of biochemical networks in microbial organisms. Nat. Rev. Microbiol. (2008)
7. Palsson, B.: Systems biology: properties of reconstructed networks. Cambridge University Press, Cambridge (2006)
8. Reed, J.L., Palsson, B.O.: Minireview: thirteen years of building constraint-based in silico models of Escherichia coli. Journal of Bacteriology, 2692–2699 (2003)
9. Reed, J.L., Vo, T.D., Schilling, C.H., Palsson, B.O.: An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 4(9), R54 (2003)
10. Edwards, J.S., Palsson, B.: Robustness analysis of the Escherichia coli metabolic network. Biotechnol. Prog. 16, 927–939 (2000)
11. Kharchenko, P., Chen, L., Freund, Y., Vitkup, D., Church, G.M.: Identifying metabolic enzymes with multiple types of association evidence. BMC Bioinformatics 7(1), 177 (2006)
12. Chen, L., Vitkup, D.: Predicting genes for orphan metabolic activities using phylogenetic profiles. Genome Biol. 7, R17 (2006)
13. DeJongh, M., Formsma, K., Boillot, P., Gould, J., Rycenga, M., Best, A.: Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics 8(139) (2007)
14. Green, M.L., Karp, P.D.: A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. BMC Bioinformatics 5(76) (2004)
15. Kharchenko, P., Vitkup, D., Church, G.M.: Filling gaps in a metabolic network using expression information. Bioinformatics 20(suppl. 1), I178–I185 (2004)
16. Gil, R., Silva, F.J., Pereto, J., Moya, A.: Determination of the core of a minimal bacterial gene set. Microbiology and Molecular Biology Reviews 68(3), 518–537 (2004)
17. Overbeek, R., Begley, T., et al.: The subsystems approach to genome annotation and its use in the Project to Annotate 1000 Genomes. Nucleic Acids Res. 33(17), 5691–5702 (2005)
18. Aziz, R.K., Bartels, D., et al.: The RAST server: Rapid Annotations using Subsystems Technology. BMC Genomics 9(75) (2008)
19. Becker, S.A., Palsson, B.O.: Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol. 5(8) (2005)
20. Shi, X., Stevens, R.: SWARM: a scientific workflow for supporting Bayesian approaches to improve metabolic models. In: Proceedings of the 6th International Workshop on Challenges of Large Applications in Distributed Environments (CLADE) (2008)

Author information

Authors and Affiliations

Department of Computer Science, University of Chicago, Chicago, IL 60637, USA
Xinghua Shi & Rick Stevens
The Computing, Environment and Life Science, Argonne National Laboratory, Argonne, IL 60439, USA
Rick Stevens

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Connecticut, 257 ITE Building, 371 Fairfield Way, Storrs, CT 06269-2155, USA
Sanguthevar Rajasekaran

About this paper

Cite this paper

Shi, X., Stevens, R. (2009). A Bayesian Approach to High-Throughput Biological Model Generation. In: Rajasekaran, S. (ed) Bioinformatics and Computational Biology. BICoB 2009. Lecture Notes in Computer Science, vol 5462. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00727-9_35

DOI: https://doi.org/10.1007/978-3-642-00727-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00726-2
Online ISBN: 978-3-642-00727-9
eBook Packages: Computer Science, Computer Science (R0)
