Abstract
Human leukocyte antigen (HLA) is a molecule found on the surface of most human cells that can recognize and bind foreign peptides, triggering an immune response. Predicting peptide-HLA (pHLA) binding is crucial for screening effective antigen targets for immunotherapy. However, little attention has been paid to the relationship and comparative information between positive and negative samples. In this paper, we propose ACLPHLA, an attention-based contrastive learning model for inferring pHLA binding specificity. We use a Transformer encoder to convert peptides into latent representations, and then mask a portion of the amino acids according to their attention weights to generate contrastive views. We demonstrate that, compared to a fully supervised baseline model, large-scale contrastive pre-training on peptide sequences significantly improves both the sequence representation and downstream prediction performance. Among the masking strategies we explore, masking a certain percentage of the amino acids with lower attention weights performs best. Comparative experiments on two independent datasets show that our method outperforms existing algorithms. In addition, our statistical analysis of attention weights reveals important amino acids and their position preferences in pHLA binding, demonstrating the potential interpretability of the proposed model.
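The attention-guided masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_low_attention`, the mask token `"X"`, the 25% mask ratio, and the toy attention weights are all assumptions made for illustration. In the model itself, the weights would come from the Transformer encoder.

```python
import math


def mask_low_attention(peptide, attn_weights, mask_ratio=0.25, mask_token="X"):
    """Build a contrastive view by masking the least-attended residues.

    Hypothetical helper: `attn_weights` holds one score per residue and is
    assumed to come from an encoder's attention layer.
    """
    if len(peptide) != len(attn_weights):
        raise ValueError("need exactly one attention weight per residue")
    # Number of residues to mask, rounded down.
    n_mask = math.floor(len(peptide) * mask_ratio)
    # Rank positions by ascending attention weight; ties go to the earlier position.
    order = sorted(range(len(peptide)), key=lambda i: (attn_weights[i], i))
    masked = set(order[:n_mask])
    return "".join(
        mask_token if i in masked else residue
        for i, residue in enumerate(peptide)
    )


# Toy 8-mer with made-up attention weights (illustrative values only).
peptide = "SIINFEKL"
weights = [0.05, 0.30, 0.10, 0.02, 0.08, 0.15, 0.10, 0.20]
view = mask_low_attention(peptide, weights, mask_ratio=0.25)
print(view)  # XIIXFEKL
```

With a ratio of 0.25 on an 8-residue peptide, the two positions with the smallest weights (indices 3 and 0) are replaced. The original peptide and its masked view would then form a positive pair for a contrastive objective, whose exact form is not given in the abstract.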
Acknowledgements
This work is supported by the National Natural Science Foundation of China (grant nos. 62072384, 61872309, 62072385, 61772441), the Zhejiang Lab (2022RD0AB02), and the National Key R&D Program of China (2017YFE0130600).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Luo, P. et al. (2023). Attention-Aware Contrastive Learning for Predicting Peptide-HLA Binding Specificity. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_46
DOI: https://doi.org/10.1007/978-981-99-4749-2_46
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4748-5
Online ISBN: 978-981-99-4749-2
eBook Packages: Computer Science (R0)