Abstract
Molecular machine learning bears promise for efficient molecular property prediction and drug discovery. However, labelled molecular data can be expensive and time-consuming to acquire, and with such limited data it is challenging for supervised machine learning models to generalize across the vast chemical space. Here we present MolCLR (Molecular Contrastive Learning of Representations via Graph Neural Networks), a self-supervised learning framework that leverages large unlabelled datasets (~10 million unique molecules). In MolCLR pre-training, we build molecule graphs and develop graph-neural-network encoders to learn differentiable representations. Three molecule graph augmentations are proposed: atom masking, bond deletion and subgraph removal. A contrastive estimator maximizes the agreement between augmentations of the same molecule while minimizing the agreement between different molecules. Experiments show that our contrastive learning framework significantly improves the performance of graph-neural-network encoders on various molecular property benchmarks, including both classification and regression tasks. Benefiting from pre-training on the large unlabelled database, MolCLR even achieves state-of-the-art performance on several challenging benchmarks after fine-tuning. Further investigations demonstrate that MolCLR learns to embed molecules into representations that distinguish chemically reasonable molecular similarities.
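To make the pre-training objective concrete, the following is a minimal PyTorch sketch of a SimCLR-style NT-Xent contrastive loss together with a toy atom-masking augmentation. This is an illustration of the technique, not the authors' released implementation: the function names, tensor shapes and the use of a zero vector as the mask token are assumptions made for this example.

```python
# Illustrative sketch of SimCLR-style NT-Xent contrastive pre-training as
# described in the abstract. NOT the authors' released code: function names,
# shapes and the zero-vector "mask token" are assumptions for this example.
import torch
import torch.nn.functional as F


def nt_xent_loss(z_i: torch.Tensor, z_j: torch.Tensor,
                 temperature: float = 0.1) -> torch.Tensor:
    """NT-Xent loss for a batch of N molecules embedded twice.

    z_i, z_j: (N, D) embeddings of two augmented views of the same molecules.
    (z_i[k], z_j[k]) form positive pairs; the remaining 2N - 2 embeddings in
    the batch serve as negatives.
    """
    n = z_i.size(0)
    z = F.normalize(torch.cat([z_i, z_j], dim=0), dim=1)  # (2N, D), unit norm
    sim = z @ z.t() / temperature                         # scaled cosine similarity
    sim.fill_diagonal_(float("-inf"))                     # exclude self-similarity
    # The positive for row k is row k + N (and vice versa).
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def mask_atoms(x: torch.Tensor, mask_rate: float = 0.25) -> torch.Tensor:
    """Atom-masking augmentation on a (num_atoms, num_features) node matrix.

    Zeroes the features of a random subset of atoms as a stand-in for a
    learned mask token.
    """
    x = x.clone()
    num_mask = max(1, int(mask_rate * x.size(0)))
    idx = torch.randperm(x.size(0))[:num_mask]
    x[idx] = 0.0
    return x


# Toy usage with random embeddings standing in for GNN outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print("NT-Xent loss:", nt_xent_loss(z1, z2).item())
```

Bond deletion and subgraph removal would act analogously on the edge list rather than the node features; in a full pipeline, the two views of each molecule are produced by independently sampled augmentations before being passed through the graph-neural-network encoder.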
Data availability
The pre-training data and molecular property prediction benchmarks used in this work are available in both the CodeOcean capsule at https://doi.org/10.24433/CO.8582800.v1 (ref. 49) and the GitHub repository at https://github.com/yuyangw/MolCLR.
Code availability
The code accompanying this work is available in both the CodeOcean capsule at https://doi.org/10.24433/CO.8582800.v1 (ref. 49) and the GitHub repository at https://github.com/yuyangw/MolCLR.
References
Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments. Phys. Rev. B 87, 184115 (2013).
Huang, B. & Von Lilienfeld, O. A. Communication: Understanding molecular representations in machine learning: the role of uniqueness and target similarity. J. Chem. Phys. 145, 161102 (2016).
David, L., Thakkar, A., Mercado, R. & Engkvist, O. Molecular representations in AI-driven drug discovery: a review and practical guide. J. Cheminform. 12, 56 (2020).
Oprea, T. I. & Gottfries, J. Chemography: the art of navigating in chemical space. J. Comb. Chem. 3, 157–166 (2001).
Rogers, D. & Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).
Duvenaud, D. et al. Convolutional networks on graphs for learning molecular fingerprints. In Proc. 28th International Conference on Neural Information Processing Systems 2224–2232 (MIT Press, 2015).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning 1263–1272 (PMLR, 2017).
Karamad, M. et al. Orbital graph convolutional neural network for material property prediction. Phys. Rev. Mater. 4, 093801 (2020).
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).
Deringer, V. L. et al. Realistic atomistic structure of amorphous silicon from machine-learning-driven molecular dynamics. J. Phys. Chem. Lett. 9, 2879–2885 (2018).
Wang, W. & Gómez-Bombarelli, R. Coarse-graining auto-encoders for molecular dynamics. npj Comput. Mater. 5, 125 (2019).
Altae-Tran, H., Ramsundar, B., Pappu, A. S. & Pande, V. Low data drug discovery with one-shot learning. ACS Cent. Sci. 3, 283–293 (2017).
Magar, R., Yadav, P. & Farimani, A. B. Potential neutralizing antibodies discovered for novel corona virus using machine learning. Sci. Rep. 11, 5261 (2021).
Wang, Y., Cao, Z. & Farimani, A. B. Efficient water desalination with graphene nanopores obtained using artificial intelligence. npj 2D Mater. Appl. 5, 66 (2021).
Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comp. Sci. 28, 31–36 (1988).
Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (selfies): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1, 045024 (2020).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (2017).
Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? In International Conference on Learning Representations (2019).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet – a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).
Kirkpatrick, P. & Ellis, C. Chemical space. Nature 432, 823 (2004).
Bohacek, R. S., McMartin, C. & Guida, W. C. The art and practice of structure-based drug design: a molecular modeling perspective. Med. Res. Rev. 16, 3–50 (1996).
Brown, N., Fiscato, M., Segler, M. H. & Vaucher, A. C. GuacaMol: benchmarking models for de novo molecular design. J. Chem. Inf. Model. 59, 1096–1108 (2019).
Wu, Z. et al. MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9, 513–530 (2018).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Vamathevan, J. et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18, 463–477 (2019).
Unterthiner, T. et al. Deep learning as an opportunity in virtual screening. In Proc. Deep Learning Workshop at NIPS Vol. 27 (2014).
Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E. & Svetnik, V. Deep neural nets as a method for quantitative structure–activity relationships. J. Chem. Inf. Model. 55, 263–274 (2015).
Ramsundar, B. et al. Massively multitask networks for drug discovery. Preprint at https://arxiv.org/abs/1502.02072 (2015).
Kusner, M. J., Paige, B. & Hernández-Lobato, J. M. Grammar variational autoencoder. In International Conference on Machine Learning 1945–1954 (PMLR, 2017).
Gupta, A. et al. Generative recurrent networks for de novo drug design. Mol. Inf. 37, 1700111 (2018).
Xu, Z., Wang, S., Zhu, F. & Huang, J. Seq2seq fingerprint: an unsupervised deep molecular embedding for drug discovery. In Proc. 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics 285–294 (ACM, 2017).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Schwaller, P. et al. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent. Sci. 5, 1572–1583 (2019).
Maziarka, Ł. et al. Molecule attention transformer. Preprint at https://arxiv.org/abs/2002.08264 (2020).
Feinberg, E. N. et al. PotentialNet for molecular property prediction. ACS Cent. Sci. 4, 1520–1530 (2018).
Klicpera, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. Preprint at https://arxiv.org/abs/2003.03123 (2020).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
Sterling, T. & Irwin, J. J. ZINC 15 – ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Kim, S. et al. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47, D1102–D1109 (2019).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. Preprint at https://arxiv.org/abs/1810.04805 (2018).
Chithrananda, S., Grand, G. & Ramsundar, B. ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. Preprint at https://arxiv.org/abs/2010.09885 (2020).
Wang, S., Guo, Y., Wang, Y., Sun, H. & Huang, J. SMILES-BERT: large scale unsupervised pre-training for molecular property prediction. In Proc. 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics 429–436 (ACM, 2019).
Liu, S., Demirel, M. F. & Liang, Y. N-gram graph: simple unsupervised representation for graphs, with applications to molecules. In Thirty-third Conference on Neural Information Processing Systems (NeurIPS, 2019).
Hu, W. et al. Strategies for pre-training graph neural networks. In International Conference on Learning Representations (2020).
You, Y. et al. Graph contrastive learning with augmentations. Adv. Neural Inf. Process. Syst. 33, 5812–5823 (2020).
van den Oord, A., Li, Y. & Vinyals, O. Representation learning with contrastive predictive coding. Preprint at https://arxiv.org/abs/1807.03748 (2018).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning 1597–1607 (PMLR, 2020).
Wang, Y., Wang, J., Cao, Z. & Farimani, A. B. MolCLR: molecular contrastive learning of representations via graph neural networks. CodeOcean https://doi.org/10.24433/CO.8582800.v1 (2021).
Chen, T., Kornblith, S., Swersky, K., Norouzi, M. & Hinton, G. Big self-supervised models are strong semi-supervised learners. Preprint at https://arxiv.org/abs/2006.10029 (2020).
Do, K., Tran, T. & Venkatesh, S. Graph transformation policy network for chemical reaction prediction. In Proc. 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 750–760 (ACM, 2019).
Jin, W., Barzilay, R. & Jaakkola, T. Hierarchical generation of molecular graphs using structural motifs. In International Conference on Machine Learning 4839–4848 (PMLR, 2020).
Lu, C. et al. Molecular property prediction: a multilevel quantum interactions modeling perspective. In Proc. AAAI Conference on Artificial Intelligence Vol. 33, 1052–1060 (AAAI, 2019).
Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Yun, S., Jeong, M., Kim, R., Kang, J. & Kim, H. J. Graph transformer networks. In Advances in Neural Information Processing Systems Vol. 32 (eds. Wallach, H. et al.) (Curran Associates, 2019).
Pope, P. E., Kolouri, S., Rostami, M., Martin, C. E. & Hoffmann, H. Explainability methods for graph convolutional neural networks. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 10772–10781 (IEEE, 2019).
Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A. & Vandergheynst, P. Geometric deep learning: going beyond Euclidean data. IEEE Signal Process. Mag. 34, 18–42 (2017).
He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (IEEE, 2020).
Gao, T., Yao, X. & Chen, D. SimCSE: simple contrastive learning of sentence embeddings. Preprint at https://arxiv.org/abs/2104.08821 (2021).
Wang, J., Lu, Y. & Zhao, H. CLOUD: contrastive learning of unsupervised dynamics. Preprint at https://arxiv.org/abs/2010.12488 (2020).
Landrum, G. RDKit: open-source cheminformatics (2006); https://www.rdkit.org/
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Fey, M. & Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds (2019).
Ho, T. K. Random decision forests. In Proc. 3rd International Conference on Document Analysis and Recognition Vol. 1, 278–282 (IEEE, 1995).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Acknowledgements
We acknowledge the start-up funding provided by the Department of Mechanical Engineering at Carnegie Mellon University. The work is also funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), US Department of Energy, under award no. DE-AR0001221.
Author information
Contributions
Y.W., J.W. and A.B.F. designed the research study. Y.W., J.W. and Z.C. developed the method, wrote the code and performed the analysis. All authors wrote and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Alán Aspuru-Guzik and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information: Supplementary Figs. 1–3, Discussion and Tables 1–5.
About this article
Cite this article
Wang, Y., Wang, J., Cao, Z. et al. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022). https://doi.org/10.1038/s42256-022-00447-x
This article is cited by
- Multi-channel learning for integrating structural hierarchies into context-dependent molecular representation. Nature Communications (2025)
- Coverage bias in small molecule machine learning. Nature Communications (2025)
- MultiGranDTI: an explainable multi-granularity representation framework for drug-target interaction prediction. Applied Intelligence (2025)
- Enhancing molecular property prediction with auxiliary learning and task-specific adaptation. Journal of Cheminformatics (2024)
- Drug-target interaction prediction with collaborative contrastive learning and adaptive self-paced sampling strategy. BMC Biology (2024)