DOI: 10.1145/369133.369171

Predicting the β-helix fold from protein sequence data

Published: 22 April 2001

    Abstract

    A method is presented that uses β-strand interactions to predict the right-handed β-helix super-secondary structural motif in protein sequences. A program called BetaWrap implements this method, and is shown to score known β-helices above non-β-helices in the Protein Data Bank in cross-validation. It is demonstrated that BetaWrap learns each of the seven known SCOP β-helix families when trained on the known β-helices from outside the family. BetaWrap also predicts many bacterial proteins of unknown structure that play a role in human infectious disease to be β-helices; in particular, these proteins serve as virulence factors, adhesins and toxins in bacterial pathogenesis, and include cell surface proteins from Chlamydia and the intestinal bacterium Helicobacter pylori. The computational method used here may generalize to other β structures for which strand topology and profiles of residue accessibility are well conserved.
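
The family-level cross-validation described in the abstract (train on β-helices from outside a family, then test on that family) can be illustrated with a minimal sketch. This is not BetaWrap's code: the function name `leave_one_family_out`, the family labels, and the protein identifiers are hypothetical placeholders, and the sketch only builds the train/test splits, not the scoring method.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_family_out(
    families: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_family, training_ids, test_ids), one split per family."""
    for held_out in sorted(families):
        # Train only on proteins that belong to the other families.
        train = [
            protein_id
            for name in sorted(families)
            if name != held_out
            for protein_id in families[name]
        ]
        # Test on every protein in the held-out family.
        test = list(families[held_out])
        yield held_out, train, test


# Hypothetical placeholder data: labels and IDs are invented for illustration
# and do not correspond to real SCOP families or PDB entries.
example = {
    "family_A": ["a1", "a2"],
    "family_B": ["b1"],
    "family_C": ["c1", "c2"],
}

splits = list(leave_one_family_out(example))
for held_out, train, test in splits:
    print(held_out, train, test)
```

With seven families, this scheme would produce seven splits, and in each one no protein from the held-out family appears in the training set.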

    Published In

    RECOMB '01: Proceedings of the fifth annual international conference on Computational biology
    April 2001
    316 pages
    ISBN: 1581133537
    DOI: 10.1145/369133
    Chairman: Thomas Lengauer

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    RECOMB '01

    Acceptance Rates

    RECOMB '01 Paper Acceptance Rate 35 of 128 submissions, 27%;
    Overall Acceptance Rate 148 of 538 submissions, 28%

    Cited By

    • (2008) Dynactin Function in Mitotic Spindle Positioning. Traffic, 9:4 (510-527). DOI: 10.1111/j.1600-0854.2008.00710.x. Online publication date: 22-Jan-2008
    • (2006) Protein Fold Recognition Using Segmentation Conditional Random Fields (SCRFs). Journal of Computational Biology, 13:2 (394-406). DOI: 10.1089/cmb.2006.13.394. Online publication date: Mar-2006
    • (2005) Predicting protein folds with structural repeats using a chain graph model. Proceedings of the 22nd international conference on Machine learning (513-520). DOI: 10.1145/1102351.1102416. Online publication date: 7-Aug-2005
    • (2005) The Genome of S-PM2, a “Photosynthetic” T4-Type Bacteriophage That Infects Marine Synechococcus Strains. Journal of Bacteriology, 187:9 (3188-3200). DOI: 10.1128/JB.187.9.3188-3200.2005. Online publication date: 1-May-2005
    • (2005) Segmentation conditional random fields (SCRFs). Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology (408-422). DOI: 10.1007/11415770_31. Online publication date: 14-May-2005
    • (2004) Wrap-and-pack. Proceedings of the eighth annual international conference on Research in computational molecular biology (298-307). DOI: 10.1145/974614.974654. Online publication date: 27-Mar-2004
    • (2003) Chlamydia trachomatis and Chlamydia pneumoniae Vaccines. New Bacterial Vaccines (93-109). DOI: 10.1007/978-1-4615-0053-7_7. Online publication date: 2003
