Abstract
We present a new knowledge-based, residue-level Model Quality Assessment Program (MQAP) that evaluates single protein structure models. We use a tree representation of the Cα trace to train a novel Neural Network Pairwise Interaction Field (NN-PIF) to predict the global quality of a model. We also attempt to extract local quality from global quality. The model allows fast evaluation of multiple structure models for a single sequence. In our tests on a large set of structures, our model outperforms most other methods, which are based on different and more complex protein structure representations, in both local and global quality prediction. The method is available upon request from the authors. Method-specific rankers may also be built by the authors upon request.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martin, A.J.M., Vullo, A., Pollastri, G. (2009). Neural Network Pairwise Interaction Fields for Protein Model Quality Assessment. In: Stützle, T. (eds) Learning and Intelligent Optimization. LION 2009. Lecture Notes in Computer Science, vol 5851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11169-3_17
DOI: https://doi.org/10.1007/978-3-642-11169-3_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11168-6
Online ISBN: 978-3-642-11169-3
eBook Packages: Computer Science, Computer Science (R0)