Abstract
Targeting protein–protein interactions is a challenging and crucial task in the drug discovery process. A good starting point for rational drug design is the identification of hot spots (HS) at protein–protein interfaces, typically conserved residues that contribute most significantly to binding. In this chapter, we describe step by step an in-house pipeline for HS prediction that uses only sequence-based features from the well-known SpotOn dataset of soluble proteins (Moreira et al., Sci Rep 7:8007, 2017) and a deep neural network. The pipeline is divided into three steps: (1) feature extraction, (2) deep learning classification, and (3) model evaluation. We present all the available resources, including code snippets, the main dataset, and the free and open-source modules/packages necessary for full replication of the protocol. Users should be able to develop an HS prediction model with accuracy, precision, recall, and AUROC of 0.96, 0.93, 0.91, and 0.86, respectively.
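As a rough illustration of what a sequence-based feature for step (1) could look like, the sketch below one-hot encodes a small window of residues around a position of interest. The window size, the encoding scheme, the function name `one_hot_window`, and the example sequence are all assumptions made for illustration; they are not the feature set described in the chapter.

```python
# Illustrative sketch only: the window size and one-hot scheme are assumptions,
# not the features used in the chapter's pipeline.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_window(sequence, position, half_window=2):
    """Encode the residues around `position` (0-based) as a flat 0/1 list.

    Positions that fall outside the sequence, and non-standard residues,
    are encoded as all-zero blocks.
    """
    features = []
    for i in range(position - half_window, position + half_window + 1):
        block = [0] * len(AMINO_ACIDS)
        if 0 <= i < len(sequence):
            idx = AMINO_ACIDS.find(sequence[i])
            if idx >= 0:
                block[idx] = 1
        features.extend(block)
    return features


# Hypothetical sequence; the first residue has no left-hand neighbours,
# so the first two blocks of the vector stay all-zero.
vector = one_hot_window("MKTAYIAK", position=0)
print(len(vector), sum(vector))
```

A fixed-length vector of this kind can be fed to any classifier, which is the main reason for padding out-of-range positions with zeros rather than shortening the window.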
Abbreviations
- FN: False negatives
- FP: False positives
- TN: True negatives
- TP: True positives
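The four counts abbreviated above are the inputs to three of the metrics quoted in the abstract. The sketch below uses the standard textbook definitions of accuracy, precision, and recall; the function name `classification_metrics` and the example counts are hypothetical and are not results from the chapter. AUROC is left out because it depends on ranked prediction scores rather than on a single set of counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision and recall from confusion-matrix counts.

    A ratio whose denominator is zero is reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up counts for illustration only (not taken from the chapter).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

On an imbalanced dataset, where one class is much rarer than the other, accuracy alone can look high even when many of the rarer cases are missed, which is why precision and recall are usually reported alongside it.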
References
Kotlyar M, Pastrello C, Malik Z et al (2019) IID 2018 update: context-specific physical protein–protein interactions in human, model organisms and domesticated species. Nucleic Acids Res 47:D581–D589
Lage K (2014) Protein–protein interactions and genetic diseases: the interactome. Biochim Biophys Acta Mol Basis Dis 1842:1971–1980
Ran X, Gestwicki JE (2018) Inhibitors of protein–protein interactions (PPIs): an analysis of scaffold choices and buried surface area. Curr Opin Chem Biol 44:75–86
Fry DC (2015) Targeting protein-protein interactions for drug discovery. Protein-protein interactions. Methods Mol Biol 1278:93–106
Moreira IS, Koukos PI, Melo R et al (2017) SpotOn: high accuracy identification of protein-protein interface hot-spots. Sci Rep 7:8007
Moreira IS, Fernandes PA, Ramos MJ (2007) Hot spots-a review of the protein-protein interface determinant amino-acid residues. Proteins 68:803–812
Melo R, Fieldhouse R, Melo A et al (2016) A machine learning approach for hot-spot detection at protein-protein interfaces. Int J Mol Sci 17:1215
Sommer C, Gerlich DW (2013) Machine learning in cell biology—teaching computers to recognize phenotypes. J Cell Sci 126:5529–5539
Libbrecht MW, Noble WS (2015) Machine learning applications in genetics and genomics. Nat Rev Genet 16:321–332
Lise S, Buchan D, Pontil M et al (2011) Predictions of hot spot residues at protein-protein interfaces using support vector machines. PLoS One 6:e16774
Ofran Y, Rost B (2007) ISIS: interaction sites identified from sequence. Bioinformatics 23:e13–e16
Wang H, Liu C, Deng L (2018) Enhanced prediction of hot spots at protein-protein interfaces using extreme gradient boosting. Sci Rep 8:14285
Jain AK, Jianchang M, Mohiuddin KM (1996) Artificial neural networks: a tutorial. Computer (Long Beach Calif) 29:31–44
Gonzalez RC (2018) Deep convolutional neural networks [lecture notes]. IEEE Signal Process Mag 35:79–87
Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444
Cock PJA, Antao T, Chang JT et al (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25:1422–1423
van der Walt S, Colbert SC, Varoquaux G (2011) The NumPy Array: a structure for efficient numerical computation. Comput Sci Eng 13:22–30
McKinney W (2010) Data structures for statistical computing in Python. In: Proceedings of the 9th Python in Science Conference (SciPy 2010), Austin, Texas
van Rossum G, de Boer J (1991) Linking a stub generator (AIL) to a prototyping language (Python). In: EurOpen Conference Proceedings, Tromso, Norway
Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Abadi M, Agarwal A, Barham P et al (2015) TensorFlow: large-scale machine learning on heterogeneous distributed systems. Preprint available at arXiv:1603.04467
Buckman J, Roy A, Raffel C et al (2018) Thermometer encoding: one hot way to resist adversarial examples. In: 6th International Conference on Learning Representations (ICLR 2018), Vancouver, Canada
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. Preprint available at arXiv:1412.6980
Crowther PS, Cox RJ (2005) A method for optimal division of data sets for use in neural networks. In: Knowledge-based intelligent information and engineering systems, KES 2005. Lecture notes in computer science, vol 3684. Springer, Berlin, Heidelberg
Acknowledgments
This work was supported by the European Regional Development Fund (ERDF), through the Centro 2020 Regional Operational Programme under project CENTRO-01-0145-FEDER-000008: BrainHealth 2020, and through COMPETE 2020 (Operational Programme for Competitiveness and Internationalisation) and Portuguese national funds via FCT (Fundação para a Ciência e a Tecnologia), under projects POCI-01-0145-FEDER-031356, PTDC/QUI-OUT/32243/2017, and UIDB/04539/2020. A. J. Preto was also supported by FCT through PhD scholarship SFRH/BD/144966/2019. I. S. Moreira was funded by the FCT Investigator Programme, IF/00578/2014 (co-financed by European Social Fund and Programa Operacional Potencial Humano). The authors would also like to acknowledge ERNEST (European Research Network on Signal Transduction, CA18133) and STRATAGEM (New diagnostic and therapeutic tools against multidrug-resistant tumors, CA17104).
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Preto, A.J., Matos-Filipe, P., de Almeida, J.G., Mourão, J., Moreira, I.S. (2021). Predicting Hot Spots Using a Deep Neural Network Approach. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_13
DOI: https://doi.org/10.1007/978-1-0716-0826-5_13
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0825-8
Online ISBN: 978-1-0716-0826-5
eBook Packages: Springer Protocols