Abstract
Gene expression in prokaryotes begins when the RNA polymerase enzyme interacts with DNA regions called promoters, where the main regulatory elements of the transcription process are located. Despite improvements in in vitro techniques for molecular biology analysis, characterizing and identifying promoters remains a complex task. In silico approaches are used to recognize these regions. Nevertheless, they are hampered by the lack of a large set of known promoters from which to identify patterns conserved among species. Hence, a methodology able to predict promoters in any genome remains a challenge. This work proposes a methodology based on Hidden Markov Models (HMMs), Decision Threshold Estimation and Discrimination Analysis. For the three prokaryotic species investigated, the main results are: a 44.96% reduction in the recognition error rate compared with previous works on Escherichia coli, and an accuracy of 95% in recognition and 78% in prediction for Bacillus subtilis. However, a large number of false positives was found for Helicobacter pylori.
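The combination of HMM scoring and a decision threshold named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-state toy model, its probabilities, the uniform background, and the helper names (`log_forward`, `log_odds`, `best_threshold`) are all assumptions made for illustration only.

```python
import math

def _logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_forward(seq, start, trans, emit):
    """Log-likelihood of seq under an HMM, computed with the forward algorithm."""
    states = range(len(start))
    # Initialise with the first symbol.
    alpha = [math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states]
    # Propagate through the remaining symbols.
    for sym in seq[1:]:
        alpha = [
            _logsumexp([alpha[p] + math.log(trans[p][s]) for p in states])
            + math.log(emit[s][sym])
            for s in states
        ]
    return _logsumexp(alpha)

def log_odds(seq, hmm, background=0.25):
    """Score of seq under the HMM relative to a uniform background (assumed)."""
    return log_forward(seq, *hmm) - len(seq) * math.log(background)

def best_threshold(pos_scores, neg_scores):
    """Pick the observed score that minimises misclassifications as the cut-off."""
    def errors(t):
        missed = sum(s < t for s in pos_scores)         # promoters rejected
        false_pos = sum(s >= t for s in neg_scores)     # non-promoters accepted
        return missed + false_pos
    return min(sorted(pos_scores + neg_scores), key=errors)

# Hypothetical toy model: state 0 is AT-rich, state 1 is neutral.
TOY_HMM = (
    [0.5, 0.5],                                   # start probabilities
    [[0.9, 0.1], [0.1, 0.9]],                     # transition probabilities
    [
        {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1}, # AT-rich emissions
        {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25},
    ],
)
```

A sequence would then be called a promoter when `log_odds(seq, TOY_HMM)` is at or above the value returned by `best_threshold` on labelled training scores. In a realistic setting the model parameters would be trained on known promoters, and the threshold would be estimated on data held out from that training.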
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
dos Reis, A.N., Lemke, N. (2005). An Improved Hidden Markov Model Methodology to Discover Prokaryotic Promoters. In: Setubal, J.C., Verjovski-Almeida, S. (eds) Advances in Bioinformatics and Computational Biology. BSB 2005. Lecture Notes in Computer Science, vol 3594. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532323_10
DOI: https://doi.org/10.1007/11532323_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28008-8
Online ISBN: 978-3-540-31861-3
eBook Packages: Computer Science, Computer Science (R0)