An equivariant generative framework for molecular graph-structure co-design

Z Zhang, Q Liu, CK Lee, CY Hsieh, E Chen - Chemical Science, 2023 - pubs.rsc.org
Designing molecules with desirable physicochemical properties and functionalities is a long-standing challenge in chemistry, materials science, and drug discovery. Recently, machine learning-based generative models have emerged as promising approaches for de novo molecule design. However, further refinement of the methodology is needed, as most existing methods lack unified modeling of 2D topology and 3D geometry information and fail to effectively learn the structure–property relationship for molecule design. Here we present MolCode, a roto-translation equivariant generative framework for molecular graph-structure Co-design. In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure. Extensive experimental results show that MolCode outperforms previous methods on a series of challenging tasks including de novo molecule design, targeted molecule discovery, and structure-based drug design. In particular, MolCode not only consistently generates valid (99.95% validity) and diverse (98.75% uniqueness) molecular graphs/structures with desirable properties, but also generates drug-like molecules with high affinity to target proteins (61.8% high affinity ratio), which demonstrates MolCode's potential applications in material design and drug discovery. Our extensive investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design, and provides new insights into machine learning-based molecule representation and generation.
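As a rough illustration of what "roto-translation equivariant" means in the abstract above, the following sketch checks two standard properties on toy 3D coordinates: pairwise distances are unchanged by a rotation plus translation (invariance), and the centroid moves along with the points (equivariance). This is not MolCode's implementation; the coordinates, the helper names (`rotate_z`, `translate`, `transform`, `centroid`, `pairwise_distances`), and the choice of a rotation about the z-axis are all assumptions made for the example.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the same 3D vector."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in points]

def transform(points, angle, shift):
    """Apply a rigid motion: rotation about z, then translation."""
    return translate(rotate_z(points, angle), shift)

def centroid(points):
    """Mean position of the points (an equivariant function)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pairwise_distances(points):
    """All inter-point distances (an invariant function)."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

# Hypothetical toy coordinates, not taken from the paper.
atoms = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
angle, shift = 0.7, (1.5, -2.0, 0.3)
moved = transform(atoms, angle, shift)

# Invariance: distances are the same before and after the rigid motion.
dist_ok = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(atoms), pairwise_distances(moved))
)

# Equivariance: centroid(transform(x)) equals transform(centroid(x)).
expected = transform([centroid(atoms)], angle, shift)[0]
cent_ok = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(centroid(moved), expected)
)

print(dist_ok, cent_ok)
```

A generative model with this property produces outputs that transform consistently when the input molecule is rotated or shifted, so it does not need to learn each orientation separately.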
The Royal Society of Chemistry