
3DReact: Geometric deep learning
for chemical reactions

Puck van Gerwen Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Ksenia R. Briling Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Charlotte Bunne National Center for Competence in Research – Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Vignesh Ram Somnath National Center for Competence in Research – Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Ruben Laplaza Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Andreas Krause National Center for Competence in Research – Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland    Clemence Corminboeuf clemence.corminboeuf@epfl.ch Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
Abstract

Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction datasets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different datasets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.

keywords:
machine learning, equivariant neural networks, geometric deep learning, activation energies, chemical reactions
These authors contributed equally to this work. Additional affiliations: National Center for Competence in Research – Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland

1 Introduction

Physics-inspired representations that take as input the three-dimensional structure1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 (as well as, in some cases, electronic structure14, 15, 16, 17) of molecules and transform it into a fixed-length vector, while respecting known physical laws, have a rich history in molecular property prediction.2, 8, 4, 6, 7, 13, 18, 19, 20, 5, 1, 21, 3, 9, 10, 22, 23, 24, 25, 12, 26, 27, 28, 29, 30 Common desiderata31, 32, 33, 34 for high-performing representations are (i) smoothness, (ii) encoding of the appropriate symmetries with respect to permutations, rotations and translations,24, 35 (iii) completeness and (iv) additivity to allow for extrapolation to larger systems. Such fingerprints,2, 4, 3, 24, 6, 7, 8, 5, 11, 12, 13 being rooted in fundamental principles, are designed to be property-independent: a single representation can be constructed for a molecule to predict any quantum-chemical target. This is analogous to the molecular Hamiltonian, which specifies the energy and all other properties of a system as a function of atoms’ types and positions in three-dimensional space (assuming the molecules are charge neutral and singlets). These representations are typically used in combination with kernel models due to their data efficiency, ability to deal with high-dimensional feature vectors, and interpretability of the similarity kernel.2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 31, 32, 33 Early works showed that combining such representations2, 4, 6, 8, 36 with simple feed-forward neural networks instead of kernel models did not necessarily lead to better performance.37, 38

More recently, end-to-end neural networks have been proposed that learn the representation as part of the (supervised) training process,39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 based on similar principles to the aforementioned physics-inspired representations: they take as input a three-dimensional structure, as well as in some cases charge and spin information.46, 51, 52, 53 The network may be invariant or equivariant to rotations and translations of the input molecules. The former is typically achieved by operating on distances between atoms,39, 40, 42 and the latter by operating on relative position vectors and angular information processed by rotationally-equivariant convolutional layers.46, 49, 50, 62, 59, 45, 43, 41, 44, 48, 54, 55, 56, 57, 58 Equivariant models are naturally suited to predict vectorial44, 49, 62, 43, 59, 48, 45 or higher order tensorial59, 52, 54, 55, 63 properties. They have also been demonstrated to exhibit improved data efficiency and generalization capabilities compared to their invariant counterparts on predictions of scalar properties,43 albeit at a higher computational cost. Nevertheless, given an expressive enough architecture (i.e., using higher-order messages41, 62, 64, 65, 56, 66, 67 and/or enough convolutional layers43, 52, 56), invariant models are sufficient for many property prediction tasks.56

Despite these advances for molecular property prediction, the prediction of computed reaction properties (principally, reaction barriers68, 36, 38, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82) is still in its infancy.83 Machine learning approaches range from simple two-dimensional fingerprints of reaction components84, 85 (reactants and products) to physical-organic descriptors,86, 87, 76, 88, 89, 75, 90, 91, 92, 93, 94, 95, 80, 96, 97, 98, 82 electronic structure-inspired features,99 transformer models100, 101 adapted for regression,102 and 2D graph-based approaches.71, 103, 70, 81 The latter, particularly the ChemProp model,71, 103 are often best-in-class in predicting reaction properties.103 It has been shown38 that these models achieve their impressive performance by exploiting atom-mapping information,104, 105, 106, 107 which provides information analogous to the reaction mechanism.

Another category of reaction fingerprints arises from discretization of physically-inspired functions2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13 constructed using a cheap estimate of the transition state (TS) structure73 or instead the structures of the reaction components.36, 69, 74 The SLATMd representation69, 36 in particular has been shown38 to yield accurate predictions of reaction barriers, particularly for datasets108, 69 relying on subtle changes in the geometry of reactants and/or products. End-to-end models based on three-dimensional structures of reactants and products have also recently emerged.99, 72, 109 In a different vein, several works110, 111, 112, 113, 114, 115 aim to directly predict the TS structure, which together with the reactant structure gives the reaction barrier. These approaches lie outside the scope of the property-prediction focus here.

Due to the diversity of challenges posed by different reaction datasets, neither atom-mapping-based models nor 3D-geometry-based models achieve consistently better performance on reaction property prediction tasks.38 To date, no model has been proposed that can incorporate both chemical (atom-maps) and physical (geometry) priors. To address this gap, we introduce 3DReact, a geometric deep learning model that encodes both the three-dimensional structures of reactants and products as well as atom-mapping information or proxies thereof to predict properties of chemical reactions (showcased here for activation energies).

We demonstrate the performance of 3DReact on three datasets of reaction barriers: GDB7-22-TS,116 Cyclo-23-TS,117 and Proparg-21-TS.108, 69 As discussed in previous works,38 these datasets present a myriad of challenges for ML models, from the dependence on chemical information116 to the distinction of subtle changes in configurations.108, 69 We show that, compared to state-of-the-art models for reaction property prediction,103, 36 3DReact offers accurate and reliable performance across different datasets as well as atom-mapping regimes, reduced dependence on the quality of three-dimensional geometries, and stable extrapolation behavior.

2 Architecture

Figure 1: Architecture of 3DReact. Molecules pass through independent symmetry-adapted (invariant or equivariant) channels (green and orange). These are combined to yield a reaction representation (blue) which is used to predict a reaction property, such as the activation energy (red dot).

3DReact is built from $\mathrm{O}(3)$-equivariant convolutional networks over point clouds as implemented in e3nn.118 Specifically, we use the tensor field network architecture47 for molecular components as in Corso et al.119 While the architecture is equivariant by default, it can easily be made invariant (vide infra). The geometries of molecules constituting reactants and products of each reaction are passed through separate channels, detailed in Section 2.1. They are then combined to eventually predict a reaction property, such as the activation energy, as detailed in Section 2.2. The overall architecture is summarized in Figure 1.

2.1 Symmetry-adapted molecular channels

A molecule with $N_{\mathrm{at}}$ atoms is represented as a distance-based graph where nodes describe atoms and edges describe bonds. Instead of explicitly using connectivity information, the “bonds” of atom $a$ are formed with all the neighboring atoms $\operatorname{Neigh}(a)$ within the cutoff $r_{\max}$. Initial scalar bond (edge) features $\{\mathbf{e}^{(0)}_{ab}\}$ between atoms $a$ and $b$, as well as spherical harmonics filters $\{\mathbf{z}_{ab}\}$, are computed from internal coordinates, as detailed in the Supporting Information. The atom (node) features $\{\mathbf{x}^{(0)}_{a}\}$ are initialized with $n_f = 16$ cheminformatics descriptors computed with RDKit.120 These include atomic number, chirality tag (unspecified, tetrahedral, or other, including octahedral, square planar, allene-type), number of directly-bonded neighbors, number of rings, implicit valence, formal charge, number of attached hydrogens, number of unpaired electrons, hybridization, aromaticity, and presence in rings of specified sizes from 3 to 7. This choice is inspired by EquiBind121 and DiffDock119 and is in line with the improved 72 features used for 2D-based methods.
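The distance-based graph construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `radius_graph` is hypothetical, the strict `<` comparison against the cutoff is a choice made here, and the sketch does not reproduce the edge featurization or the spherical harmonics filters of the actual model.

```python
import numpy as np

def radius_graph(coords, r_max):
    """Return directed edges (a, b) and their lengths for all atom pairs closer than r_max.

    Hypothetical helper for illustration; no bond connectivity is used,
    only interatomic distances.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]        # (N, N, 3) relative position vectors
    dist = np.linalg.norm(diff, axis=-1)                  # (N, N) interatomic distances
    mask = (dist < r_max) & ~np.eye(len(coords), dtype=bool)  # drop self-pairs
    src, dst = np.nonzero(mask)
    return np.stack([src, dst], axis=1), dist[src, dst]

# Toy input: three collinear atoms spaced 1.0 apart.
toy_coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
edges, lengths = radius_graph(toy_coords, r_max=1.5)
```

With `r_max=1.5` only nearest neighbors are connected; increasing the cutoff beyond 2.0 would also link the two terminal atoms, which is how the cutoff controls how much of the geometry each atom sees in a single convolution.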

The initial node and edge features pass through embeddings to give $\{\mathbf{x}^{(1)}_{a}\}$ and $\{\mathbf{e}_{ab}\}$, respectively; the former are then updated by $n_{\mathrm{conv}} \in \{2, 3\}$ equivariant convolutional layers. Each layer is a fully-connected weighted tensor product, as defined in e3nn.118 The equivariant operations performed by the network, together with the mathematical details, are described in the Supporting Information. The network with equivariant molecular components as described is referred to as EquiReact, whereas its invariant counterpart InReact uses only the $\ell = 0$ (scalar) spherical harmonics to construct the convolution filters. The output of the molecular channels is the local molecular representation $\mathbf{X} \in \mathbb{R}^{N_{\mathrm{at}} \times D}$, corresponding to $N_{\mathrm{at}}$ atoms associated with $D$ features. Depending on the sum_mode hyperparameter, it is constructed either from the node features (node mode) or both node and edge features (both mode).

Inspired by the ChemProp model,71, 103 we added an option to exclude hydrogen atoms as nodes when constructing the graph. The only information about hydrogens is then contained in the initial edge features of heavy atoms.

2.2 Combining molecules for reactions

Once atom-wise molecular representations $\mathbf{X}$ are learned for reactant and product molecules, they must be combined to form a reaction representation $\mathbf{X}_{\mathrm{rxn}}$.

Figure 2: Scheme illustrating how the reactant (green) and product (orange) representations are combined to form a reaction representation (blue) and eventually predict the target property (red dot) using a multilayer perceptron (MLP). $\sum$ refers to the summation over atom-wise environments. Oblong rectangles and squares represent vectors and scalars, respectively.

For certain datasets, atom-mapping information is available, which correlates individual atoms in reactant molecules to individual atoms in product molecules according to the reaction mechanism. In this setting, the representations $\mathbf{X}_{\mathrm{reactant}}$ and $\mathbf{X}_{\mathrm{product}}$ are re-ordered such that the local representation vectors correspond to the same atom in reactants and products. Depending on the combine_mode hyperparameter, either a difference is taken between products’ and reactants’ atom representations, or they are summed, averaged, or passed through a multilayer perceptron (MLP). Thus, the local reaction representation $\mathbf{X}_{\mathrm{rxn}}$ consists of vectors reflecting how the environment changes in the reaction for each atom. We will refer to this variant of the model, which uses atom-mapping information, as 3DReactM. While the current model is unable to treat unbalanced reactions (where there are additional atoms on the left- or right-hand side of the reaction equation), its modification in the spirit of ChemProp71, 103 is straightforward.
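As a rough illustration of the atom-mapped combination step, the sketch below re-orders the product rows to match the reactant atoms and then applies one of the parameter-free combination rules. The function name `combine_mapped`, the mode strings `"diff"`, `"sum"` and `"mean"`, and the convention that `mapping[i]` holds the product index of reactant atom `i` are all assumptions made for this example; the learnable MLP option is omitted.

```python
import numpy as np

def combine_mapped(x_reactant, x_product, mapping, combine_mode="diff"):
    """Combine atom-wise representations of a balanced, atom-mapped reaction.

    x_reactant, x_product: arrays of shape (N_at, D).
    mapping[i]: index of the product atom that corresponds to reactant atom i
    (hypothetical convention chosen for this sketch).
    """
    x_product = x_product[np.asarray(mapping)]  # re-order product rows to reactant order
    if combine_mode == "diff":
        return x_product - x_reactant           # change of each atom's environment
    if combine_mode == "sum":
        return x_product + x_reactant
    if combine_mode == "mean":
        return 0.5 * (x_product + x_reactant)
    raise ValueError(f"unknown combine_mode: {combine_mode}")

# Toy example: two atoms, two features, atoms swapped between reactant and product.
x_r = np.array([[1.0, 0.0], [0.0, 1.0]])
x_p = np.array([[0.0, 2.0], [3.0, 0.0]])
x_rxn = combine_mapped(x_r, x_p, mapping=[1, 0])  # rows: [2, 0] and [0, 1]
```

The difference mode makes the role of the mapping explicit: an atom whose environment is unchanged by the reaction contributes a zero vector only if its reactant and product rows are correctly paired.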

With the reaction representation at hand, predictions are made in the so-called vector or energy modes. In vector mode, the atomic vectors constituting the reaction representation $\mathbf{X}_{\mathrm{rxn}}$ are initially passed through an MLP to introduce nonlinearity and then summed up to form a global reaction representation vector $\bar{\mathbf{X}}_{\mathrm{rxn}}$. The target is then learned using an MLP. This model pipeline is illustrated in Figure 2a. In energy mode, on the other hand, the local reaction representations are used to learn atomic contributions to the target (Figure 2b). While performing worse in general, in some cases this mode yields the best predictions (see Section 3.2.1).
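The two readout modes can be contrasted with a small NumPy sketch. Everything here is illustrative: the helper `make_mlp` builds an untrained one-hidden-layer network with random weights and a tanh activation (both assumptions), and the function names are hypothetical. The point is only the order of operations: vector mode sums atomic vectors and then predicts, whereas energy mode predicts per-atom scalars and then sums.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(d_in, d_hidden, d_out):
    """Untrained one-hidden-layer MLP with random weights (stand-in for a learned network)."""
    w1 = rng.normal(size=(d_in, d_hidden))
    w2 = rng.normal(size=(d_hidden, d_out))

    def mlp(x):
        return np.tanh(x @ w1) @ w2

    return mlp

def predict_vector_mode(x_rxn, atom_mlp, readout_mlp):
    """Transform each atomic vector, sum over atoms, then map the global vector to the target."""
    x_bar = atom_mlp(x_rxn).sum(axis=0)       # global reaction vector
    return readout_mlp(x_bar).item()

def predict_energy_mode(x_rxn, atom_energy_mlp):
    """Predict one scalar contribution per atom and sum the contributions."""
    return atom_energy_mlp(x_rxn).sum().item()

# Toy reaction representation: 5 atoms with 4 features each.
x_rxn = np.random.default_rng(1).normal(size=(5, 4))
y_vec = predict_vector_mode(x_rxn, make_mlp(4, 8, 6), make_mlp(6, 8, 1))
y_en = predict_energy_mode(x_rxn, make_mlp(4, 8, 1))
```

Because both modes end in a sum over atoms, the prediction does not depend on the order in which the (already paired) atoms are listed.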

Atom-mapping provides static information, analogous to a reaction mechanism, to link atoms in reactants to atoms in products. While highly informative, and thought to be critical to the performance of 2D-graph-based models,71, 103, 72, 70, 81 accurate atom-maps are not available for all reaction datasets.38, 104, 105 To circumvent the need for atom-mapping, but mimic its role in exchanging information between reactants and products, other approaches dynamically (i.e., in a learnable fashion) exchange information between molecular representations. For example, RXNMapper107 is a neural network that learns atom-mappings within the larger self-supervised task of predicting the randomly masked parts in a reaction sequence, using one head of a multi-head transformer architecture. EquiBind,121 a neural network that predicts the rotation and translation of a ligand to a protein, contains a cross-attention module between ligand and receptor. The latter inspires our surrogate for atom-mapping: 3DReactX also uses cross-attention between reactants and products to link their atom indices (see the Supporting Information). The re-ordered representations of reactants and products are combined as for the case of atom-mapped reactions (Figures 2a and 2b). We note that other algorithms could also have been used to exchange information between reactants and products, for example in the form of message passing or equivariant attention.57, 122
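To give a feel for how cross-attention can act as a soft surrogate for an atom map, the sketch below uses plain scaled dot-product attention without any learned projections. This is a generic textbook construction, not the module used in 3DReactX (whose details are given in the Supporting Information); the function names are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(x_reactant, x_product):
    """Softly align product atoms to reactant atoms.

    Returns the product representation expressed in reactant atom order,
    plus the (N_reactant, N_product) attention weights.
    Assumption: no learned query/key/value projections, for brevity.
    """
    d = x_reactant.shape[1]
    scores = x_reactant @ x_product.T / np.sqrt(d)   # pairwise similarity of atomic vectors
    weights = softmax(scores, axis=1)                # each reactant atom distributes weight over product atoms
    return weights @ x_product, weights

rng = np.random.default_rng(0)
x_r = rng.normal(size=(4, 3))
x_p = rng.normal(size=(4, 3))
x_p_aligned, attn = cross_attend(x_r, x_p)
```

A hard atom map corresponds to the limiting case in which every row of the attention matrix is one-hot; the learned version replaces that fixed permutation with weights that are optimized together with the rest of the network.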

3DReact also has a simple “no mapping” variant, called 3DReactS, which does not rely on atom-mapping, nor a surrogate cross-attention module. In vector mode (Figure 2c), the atomic components of molecular representations $\mathbf{X}_{\mathrm{reactant}}$ and $\mathbf{X}_{\mathrm{product}}$ are summed up to obtain global vectors $\bar{\mathbf{X}}_{\mathrm{reactant}}$ and $\bar{\mathbf{X}}_{\mathrm{product}}$, respectively. Then they are combined, according to the combine_mode parameter, to form a reaction vector $\bar{\mathbf{X}}_{\mathrm{rxn}}$ which is used to learn the target with an MLP. In energy mode (Figure 2d) individual atomic representations are used to learn their contributions to the quasi-molecular energies of reactants and products, which are later combined (according to the combine_mode parameter) to predict the target. In most cases, this simpler model outperforms 3DReactX (vide infra).
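The vector mode of the unmapped variant can be sketched as follows: sum over atoms first, combine afterwards. The function name `combine_unmapped` and the mode strings are assumptions for this example, and the final MLP is left out.

```python
import numpy as np

def combine_unmapped(x_reactant, x_product, combine_mode="diff"):
    """Pool each side over atoms, then combine the two global vectors.

    No atom correspondence is needed because the sum over atoms
    is independent of atom ordering.
    """
    xr_bar = x_reactant.sum(axis=0)   # global reactant vector
    xp_bar = x_product.sum(axis=0)    # global product vector
    if combine_mode == "diff":
        return xp_bar - xr_bar
    if combine_mode == "sum":
        return xp_bar + xr_bar
    if combine_mode == "mean":
        return 0.5 * (xp_bar + xr_bar)
    raise ValueError(f"unknown combine_mode: {combine_mode}")

x_r = np.array([[1.0, 0.0], [0.0, 1.0]])
x_p = np.array([[0.0, 2.0], [3.0, 0.0]])
x_bar_rxn = combine_unmapped(x_r, x_p)          # [2, 1]
same = combine_unmapped(x_r, x_p[::-1])         # unchanged when product atoms are shuffled
```

The price of this simplicity is that per-atom changes are no longer resolved: only the difference of the pooled representations enters the prediction.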

3 Results and Discussion

The performance of 3DReact is reported for three diverse datasets (GDB7-22-TS,116 Cyclo-23-TS,117 and Proparg-21-TS108, 69) using both random and extrapolative splits. For details on the datasets, refer to Section 5.1. For details on the extrapolation splits, see Section 5.2.

Models are run in three atom-mapping regimes: (i) with high-quality maps (“True”) derived from the TS structures or heuristic rules;117, 123, 116, 71, 106 (ii) with atom-maps obtained using the open-source RXNMapper107 (“RXNMapper”); and (iii) without any atom-mapping information at all (“None”). As discussed in recent work,124, 38 previously developed graph-based models for reaction property prediction71, 72, 70, 96, 97 including ChemProp71, 103 reported prediction errors only in the “True” atom-mapping regime. The “RXNMapper” regime is important for cases where the reaction mechanism is not known and atom-mapping using heuristic rules is impossible. The “None” regime is critical for all chemistry that falls outside the realm of organic chemistry captured in the patents125 that RXNMapper107 is trained on.

The atom-mapping-based model 3DReactM is used in the “True” and “RXNMapper” regimes. In the “None” regime, 3DReactX and 3DReactS were tested. 3DReactS consistently outperformed 3DReactX, so we include only 3DReactS and refer the reader to the Supporting Information for their comparison.

3.1 Equivariance vs. invariance

Table 1 compares the relative performance of the invariant (InReact) and the equivariant (EquiReact) implementations of 3DReact with the learning curves of the two models presented in Figure 3. Previous studies43, 56 demonstrated that the equivariant models showed superior extrapolation capabilities on predictions of energies and forces, as well as steeper and shifted learning curves in force prediction tasks. Instead, we find that InReact and EquiReact are practically indistinguishable for the present chemical reaction tasks.

| Dataset (property, units) | Atom-mapping regime | InReact | EquiReact |
|---|---|---|---|
| Random splits | | | |
| GDB7-22-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $4.93 \pm 0.18$ | $4.93 \pm 0.15$ |
| | RXNMapper | $6.03 \pm 0.26$ | $6.05 \pm 0.25$ |
| | None | $6.56 \pm 0.26$ | $6.53 \pm 0.28$ |
| Cyclo-23-TS ($\Delta G^{\ddagger}$, kcal/mol) | True | $2.39 \pm 0.08$ | $2.30 \pm 0.09$ |
| | RXNMapper | $2.37 \pm 0.07$ | $2.35 \pm 0.12$ |
| | None | $2.39 \pm 0.05$ | $2.31 \pm 0.09$ |
| Proparg-21-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $0.33 \pm 0.07$ | $0.31 \pm 0.05$ |
| | None | $0.34 \pm 0.06$ | $0.31 \pm 0.06$ |
| Scaffold splits | | | |
| GDB7-22-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $7.8 \pm 0.7$ | $7.8 \pm 0.8$ |
| | RXNMapper | $9.2 \pm 0.8$ | $9.1 \pm 0.8$ |
| | None | $10.1 \pm 0.9$ | $10.0 \pm 0.9$ |
| Cyclo-23-TS ($\Delta G^{\ddagger}$, kcal/mol) | True | $2.79 \pm 0.18$ | $2.72 \pm 0.18$ |
| | RXNMapper | $2.77 \pm 0.22$ | $2.71 \pm 0.23$ |
| | None | $2.76 \pm 0.22$ | $2.72 \pm 0.19$ |
| Proparg-21-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $0.44 \pm 0.11$ | $0.40 \pm 0.08$ |
| | None | $0.45 \pm 0.10$ | $0.41 \pm 0.09$ |
Table 1: Performance as measured in mean absolute errors (MAEs) of predictions of 3DReact (InReact vs. EquiReact). 3DReactM is used for the “True” and “RXNMapper” regimes, and 3DReactS is used for the “None” regime. MAEs are averaged over 10 folds of 80/10/10 splits (training/validation/test) and reported together with standard deviations across folds.
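The reporting protocol (MAE averaged over 10 folds of 80/10/10 splits, with the standard deviation across folds) could be implemented along the following lines. This is a sketch under assumptions: the rotating-chunk scheme for generating folds, the function names, and the use of the population standard deviation (`ddof=0`) are choices made here and are not confirmed by the text.

```python
import numpy as np

def fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, val, test) index arrays for rotating 80/10/10 splits.

    Assumed scheme: shuffle once, cut into n_folds chunks; fold k tests on
    chunk k, validates on chunk k+1, and trains on the remaining chunks.
    """
    perm = np.random.default_rng(seed).permutation(n_samples)
    chunks = np.array_split(perm, n_folds)
    for k in range(n_folds):
        k_val = (k + 1) % n_folds
        train = np.concatenate([c for i, c in enumerate(chunks) if i not in (k, k_val)])
        yield train, chunks[k_val], chunks[k]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def summarize(fold_maes):
    """Mean and standard deviation of per-fold MAEs (ddof=0 is an assumption)."""
    fold_maes = np.asarray(fold_maes, dtype=float)
    return float(fold_maes.mean()), float(fold_maes.std())
```

In use, a model would be trained on each `train` set, tuned on `val`, scored with `mae` on `test`, and the ten scores passed to `summarize` to obtain one "mean ± std" table entry.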
Figure 3: Learning curves for InReact and EquiReact in the “True” atom-mapping regime. Each point shows mean absolute error (MAE), averaged over 10 folds of 80/10/10 splits (for training set fraction $< 0.8$, the corresponding subset of the “full” training set is used), and error bars indicate standard deviations across folds.
Figure 4: Top: Reactant and product of a toy reaction: two homometric structures (a) and (b) with atom labels, atom coordinates (Å), and interatomic distances (Å). “Bonds” of the same length are of the same color. Bottom: output after 5 epochs of the invariant (c,d,e) / equivariant (f,g) molecular channels for each atom with different radial cutoffs $r_{\max}$ and number of convolutional layers $n_{\mathrm{conv}}$. Within each subfigure, atomic representations indistinguishable up to shown digits are marked by the same color.

We find that the datasets studied herein do not benefit from the inclusion of equivariant features for molecules. Yet, Figure 4 illustrates that a hypothetical reaction involving conversion between homometric structures of He$_4$,126 which is mostly characterized by angle changes, clearly benefits from equivariant molecular features. In the reactant (Figure 4a), all atoms are identical and lead to the same learned representation. In the product (Figure 4b), only atoms B2 and B3 have identical environments, different from A1–4. Atoms B2–3 have the same distances $r$ to the three neighbors, as in A1–4. Thus, InReact, which uses only interatomic distances, yields very close representations for these atoms (Figure 4c). Still, in each convolutional layer, atoms B2 and B3 receive information from B1 and B4, and with increasing $n_{\mathrm{conv}}$ the difference in the representations of B2–3 and A1–A4 becomes more apparent (Figure 4d). However, with smaller radial cutoff $r_{\max} = 1.9$, atoms B1–B3 and A1–4 become indistinguishable for any number of layers (Figure 4e). On the other hand, EquiReact, which uses explicit angular information from the spherical harmonics filters, clearly distinguishes all non-equivalent atoms in both cases already for $n_{\mathrm{conv}} = 2$ (Figure 4f,g).

While this is a toy example, it illustrates that transformations consisting of changes in angles rather than in bond lengths are better described using EquiReact. In general, the currently available reaction datasets do not pose sufficient challenge to allow distinguishing InReact and EquiReact. For the datasets studied in this work, InReact is sufficient and is the model variation used throughout as 3DReact.
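The distinction between invariant and equivariant inputs can be checked numerically. In the sketch below (hypothetical function names, random toy coordinates), interatomic distances are unchanged when the molecule is rotated, whereas relative position vectors rotate together with it. It illustrates the symmetry properties only and does not reproduce the homometric He$_4$ example of Figure 4.

```python
import numpy as np

def pairwise_distances(coords):
    """Invariant input: (N, N) matrix of interatomic distances."""
    return np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

def relative_vectors(coords):
    """Equivariant input: (N, N, 3) array of relative position vectors."""
    return coords[:, None, :] - coords[None, :, :]

def random_rotation(seed=0):
    """Random proper rotation matrix obtained from a QR decomposition."""
    q, _ = np.linalg.qr(np.random.default_rng(seed).normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1          # flip one axis to get determinant +1
    return q

coords = np.random.default_rng(1).normal(size=(4, 3))
rot = random_rotation()
rotated = coords @ rot.T       # rotate every atom position

# Distances do not change under rotation (invariance) ...
dist_unchanged = np.allclose(pairwise_distances(rotated), pairwise_distances(coords))
# ... while relative vectors transform with the same rotation (equivariance).
vec_rotates = np.allclose(relative_vectors(rotated), relative_vectors(coords) @ rot.T)
```

A model restricted to the first kind of input has to recover angular information indirectly, through several layers or a large enough cutoff, which is consistent with the behavior described for InReact above.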

3.2 Benchmark studies

3DReact is compared to the previously best-performing baseline models:38 ChemProp,71, 103 a graph neural network that uses atom-mapped SMILES to construct a condensed graph of reaction (CGR), and the 3D-structure-based SLATM8 fingerprint adapted to reactions by taking the difference between product and reactant fingerprints (SLATMd),36 combined with kernel ridge regression (KRR) models (SLATMd+KRR).
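The structure of a difference-fingerprint kernel baseline can be sketched as follows. The fingerprints themselves are treated as given arrays (computing SLATM requires a dedicated library and is not shown), the Gaussian kernel and the hyperparameter names `sigma` and `lam` are assumptions for this example, and all function names are hypothetical.

```python
import numpy as np

def difference_fingerprint(fp_reactants, fp_products):
    """Reaction fingerprint as product minus reactant molecular fingerprints."""
    return np.asarray(fp_products) - np.asarray(fp_reactants)

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and b (kernel choice is an assumption)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2))

def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_train, alpha, x_test, sigma):
    """Predict targets as kernel-weighted sums over training reactions."""
    return gaussian_kernel(x_test, x_train, sigma) @ alpha

# Toy data: three one-dimensional "reaction fingerprints" with made-up barriers.
x = np.array([[0.0], [1.0], [2.0]])
y = np.array([1.0, 2.0, 3.0])
alpha = krr_fit(x, y, sigma=1.0, lam=1e-10)
y_fit = krr_predict(x, alpha, x, sigma=1.0)   # close to y for a tiny regularizer
```

Unlike the neural models, the only trainable quantities here are the coefficients `alpha`, obtained in closed form; the representation is fixed in advance.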

Note that both 3DReact and ChemProp are run without explicit H atoms, for two reasons. First, hydrogen atoms are not always mapped in the “True” and “RXNMapper” regimes, since they are usually implicit in SMILES strings. Second, there is no consistent improvement in including H atoms in the models (see the Supporting Information). SLATMd, built directly from molecular coordinates without using SMILES strings, does however incorporate H atoms by default. For further discussion, refer to the Supporting Information.

3.2.1 Random splits

Performance as measured in mean absolute errors (MAEs) is illustrated in Table 2 for random splits of each dataset, demonstrating the models’ interpolative capabilities. For the equivalent results with root mean squared errors (RMSEs), consult the Supporting Information.

| Dataset (property, units) | Atom-mapping regime | ChemProp | SLATMd+KRR | 3DReact |
|---|---|---|---|---|
| GDB7-22-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $\mathbf{4.35 \pm 0.15}$ | - | $4.93 \pm 0.18$ |
| | RXNMapper | $\mathbf{5.69 \pm 0.17}$ | - | $\mathbf{6.03 \pm 0.26}$ |
| | None | $9.04 \pm 0.21$ | $\mathbf{6.89 \pm 0.20}$ | $\mathbf{6.56 \pm 0.26}$ |
| Cyclo-23-TS ($\Delta G^{\ddagger}$, kcal/mol) | True | $2.69 \pm 0.10$ | - | $\mathbf{2.39 \pm 0.08}$ |
| | RXNMapper | $2.71 \pm 0.07$ | - | $\mathbf{2.37 \pm 0.07}$ |
| | None | $2.71 \pm 0.12$ | $2.65 \pm 0.08$ | $\mathbf{2.39 \pm 0.05}$ |
| Proparg-21-TS ($\Delta E^{\ddagger}$, kcal/mol) | True | $1.53 \pm 0.14$ | - | $\mathbf{0.33 \pm 0.07}$ |
| | None | $1.56 \pm 0.16$ | $\mathbf{0.33 \pm 0.04}$ | $\mathbf{0.34 \pm 0.06}$ |
Table 2: Performance as measured in mean absolute errors (MAEs) of predictions of 3DReact vs. state-of-the-art baselines ChemProp and SLATMd+KRR. All datasets are compared in three atom-mapping regimes: “True”, “RXNMapper” and “None”, except for the Proparg-21-TS set, where RXNMapper cannot map the reaction SMILES. MAEs are averaged over 10 folds of random 80/10/10 splits (training/validation/test) and reported together with standard deviations across folds. The lowest errors for each regime and dataset are highlighted in bold, if statistically relevant.

The GDB7-22-TS dataset is distinct from the other two in that it includes variations in the reaction class (and mechanism), thereby showing a greater dependence on the existence and quality of atom-mapping information in the models. It has already been observed38 for ChemProp that there is a stark hierarchy in the predictions from the “True” to “RXNMapper” to “None” regimes.

In the “True” regime, 3DReact does not improve predictive capabilities over the ChemProp model for the GDB7-22-TS set. This points to the importance of the chemical diversity in this dataset, where knowledge of the reaction mechanism (in the form of atom-maps) is sufficient information to predict the reaction barriers without information about the geometries of reactants and products. However, as previously discussed,38 “True” maps are an unrealistic scenario for most datasets. Moving to the “RXNMapper” regime, 3DReact and ChemProp already agree within standard deviations. This highlights that for practical-quality maps, 3DReact is amongst the best models for this dataset. In the “None” regime, 3DReact outperforms ChemProp by more than 2 kcal/mol.

SLATMd+KRR results in similar performance to 3DReact for the GDB7-22-TS set. The SLATMd representation also constructs features from 3D coordinates of the reactants and products using invariant functions, and is therefore more fundamentally similar to 3DReact than ChemProp. Nevertheless, since 3DReact allows for the inclusion of atom-mapping information, predictions are improved in the mapped regimes compared to SLATMd+KRR, which operates in the “None” regime only.

In summary, for the chemically diverse GDB7-22-TS set, SLATMd allows for good performance in the “None” regime, and ChemProp in the “True” and “RXNMapper” regimes. Since 3DReact can incorporate both atom-mapping and 3D structure information, it achieves robust performance in all three regimes, with MAEs ranging from 4.93 to 6.56 kcal/mol.

The Cyclo-23-TS 117 dataset contains a single reaction class and has been previously illustrated38 to show less dependence on the quality of atom-mapping than the GDB7-22-TS. For this set, 3DReact outperforms or matches the other models in all three regimes. This illustrates that a model based purely on geometry information of reactants and products, without any chemical information in the form of atom-mapping or surrogates thereof, can allow for accurate reaction property prediction. It is worth noting that atom-mapping does not improve predictions at all, i.e. there is no improvement from “None” to “RXNMapper” to “True”, even for the ChemProp model. This points to the different nature of this dataset compared to the GDB7-22-TS.

The best model is obtained with 3DReactS in the energy mode (Figure 2d). As outlined in Section 2.2, in energy mode an energy contribution is learned for reactants’ and products’ atoms separately. In the original publication,117 Stuyver et al. illustrate that the activation barriers ($\Delta G^{\ddagger}$) correlate linearly with the reaction energy ($\Delta G$). Since $\Delta G$ is the difference between products’ and reactants’ energies, the energy mode is the best choice for a model learning the reaction energy, and in the case of this dataset, for $\Delta G^{\ddagger}$ too, due to its linear correlation with $\Delta G$.
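
The argument above can be written out schematically. The symbols $G_{\mathrm{P}}$, $G_{\mathrm{R}}$, the fit coefficients $a$ and $b$, and the per-atom contributions $\varepsilon_i$ are notation introduced here for illustration only, and the sum-then-subtract form of the prediction $\hat{y}$ is an assumption about how atomic contributions might be combined, not a statement of the exact architecture.

```latex
\begin{align}
  \Delta G &= G_{\mathrm{P}} - G_{\mathrm{R}}, \\
  \Delta G^{\ddagger} &\approx a\,\Delta G + b, \\
  \hat{y} &= \sum_{i \in \mathrm{P}} \varepsilon_i - \sum_{j \in \mathrm{R}} \varepsilon_j .
\end{align}
```

If $\hat{y}$ can represent $\Delta G$ through such a difference of summed contributions, then a linear rescaling of the same quantity can represent $\Delta G^{\ddagger}$ whenever the second relation holds.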

Compared to SLATMd+KRR, 3DReact in the “None” regime results in lower prediction errors for this set, illustrating that despite both models using similar information, an end-to-end model can allow for improved predictions.

The Proparg-21-TS 108, 69 set is a small dataset by neural network standards (753 points) and therefore constitutes a challenge for the data efficiency of our model. Like the Cyclo-23-TS set, it consists of a single reaction class, i.e. enantioselective propargylation of benzaldehyde. Since the enantioselectivity is related to the barrier through an exponential relationship, it is critical to predict the barrier accurately ($\leq$ 1 kcal/mol).69 The “RXNMapper” regime is not available since RXNMapper cannot atom-map the reaction SMILES of this set.
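
To make the exponential sensitivity concrete, the sketch below converts a difference between two competing barriers into an enantiomeric excess. It assumes the textbook transition-state-theory picture in which the ratio of enantiomers equals $\exp(\Delta\Delta G^{\ddagger}/RT)$, so that the enantiomeric excess is $\tanh(\Delta\Delta G^{\ddagger}/2RT)$. The function name, the default temperature of 298.15 K and the example values are choices made here, not taken from the dataset.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def enantiomeric_excess(ddg_kcal, temperature=298.15):
    """Enantiomeric excess (fraction) from a barrier difference in kcal/mol.

    Assumes the enantiomer ratio is exp(ddG/RT), which gives ee = tanh(ddG / 2RT).
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature))


for ddg in (0.0, 0.5, 1.0, 2.0):
    print(f"ddG = {ddg:.1f} kcal/mol -> ee = {100 * enantiomeric_excess(ddg):.0f}%")
```

Within this model, a shift of 1 kcal/mol in the barrier difference is enough to move the predicted selectivity from racemic to a clear preference for one enantiomer, which is why errors of that size matter.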

In the other regimes, 3D-structure-based models lead to the best results, outperforming ChemProp by a large margin. Proparg-21-TS is particularly hard for 2D-based models38 since it contains molecules of different stereochemistry but the same SMILES strings. Again trained on a single-reaction-class dataset, models do not benefit from being provided the “obvious” chemical information: including true atom-maps does not decrease the error. Competing only in the “None” regime, 3DReact does not allow for a performance improvement compared to SLATMd+KRR. Given the small size of the dataset, it is already a demonstration of data efficiency that the deep-learning model matches the prediction errors of the kernel model. Unlike for NequIP,43 however, the data efficiency here is not due to the equivariant molecular components (Section 3.1).


The three datasets illustrate the benefits of the flexibility of 3DReact: depending on the datasets’ particular challenges, the model exploits the available information to yield the best-performing model in almost all cases. Since the model settings (such as vector or energy mode choice) are specified as hyperparameters, the optimized version of 3DReact can emerge with minimal user intervention.

3.2.2 Extrapolative splits

Figure 5 illustrates model performance for extrapolative splits (based on scaffolds, molecular size of reactants/products, and barrier magnitude, detailed in Section 5.2). These different types of extrapolative splits are necessarily more difficult than random splits, as demonstrated by higher MAEs in Figure 5. The relative performance of the models is largely maintained in the three different extrapolation regimes compared to the interpolation regime presented in Table 2.

Figure 5: Mean absolute errors (MAEs) of predictions using three different extrapolation splits: scaffold, size-, and property-based. All datasets are compared in three atom-mapping regimes: “True”, “RXNMapper” (RXNM), and “None”, except for the Proparg-21-TS set, where RXNMapper cannot map the reaction SMILES. MAEs are averaged over 10 folds of 80/10/10 splits (training/validation/test), and error bars indicate standard deviations across folds, where applicable.

Bemis–Murcko scaffold127 splitting clusters molecules (reactants for GDB7-22-TS and Proparg-21-TS, products for Cyclo-23-TS) based on ring systems. Test molecules may therefore appear “novel” from the point of view of the reaction graph, but will still feature distances and angles close to what the model has seen during training. Similarly for size-based splits, since there is no correlation between reactant/product size and reaction barriers, using distance information allows for stable predictions on extrapolation. Property-based splits are more challenging than the other two. For the Cyclo-23-TS and Proparg-21-TS sets, 3DReact still offers respectable errors, lower than those of the other models. For the GDB7-22-TS set however, all models result in unreasonable MAEs over 20 kcal/mol. This points to the particular challenges of the GDB7-22-TS set and suggests an avenue for further developments of ML models for extrapolative tasks.81

Again in contrast to previous works that suggested equivariant models might be better at extrapolation tasks,43, 56 here we find that 3DReact offers stable extrapolation performance (particularly for size- and scaffold-based splits), but not necessarily improved extrapolation behavior compared to 2D-graph based models. This points to the different challenges in reaction property prediction. Nevertheless, Figure 5 illustrates that 3DReact is a consistently robust model for the three datasets when moving from interpolation to extrapolation regimes.

3.3 Model behavior

Since the GDB7-22-TS set has the largest chemical diversity amongst the datasets explored, studying 3DReact and baseline models SLATMd and ChemProp on this dataset best captures the different chemical interpretation provided by these models.

Figure 6: t-SNE maps (perplexity $= 64$) of the latent representations of 3DReact and ChemProp models in the “True” regime and the SLATMd representation of the GDB7-22-TS dataset, colored by the target $\Delta E^{\ddagger}$ (upper panels) and reaction types (lower panels).

Figure 6 compares the (latent) representations of 3DReact “True”, ChemProp “True” and SLATMd using t-SNE128 maps. In the upper panel, we find that the quality of the correlation between the representations and the target property is aligned with the relative performance of the models (Table 2). ChemProp and 3DReact show a smooth transition of the target property, whereas the map of SLATMd does not have a clear structure. The lower panel shows the correlation of the representations with the five most common reaction types defined by bond breaking and formation (see Section 5.1). ChemProp, as a chemically-inspired model, shows clear clusters in the reaction type. While SLATMd is a geometry-based model, the binning structure used to create the representation8, 36 results in a clear correlation with the reaction types, since e.g. the pairwise bins naturally cluster features such as C–H bond formation or breaking. 3DReact shows the least distinct “chemical” clustering, due to the interplay of geometry and mapping information exploited in the representation.

Figure 7: Box plots illustrating how 3DReact “True” performs for the most common reaction types in the GDB7-22-TS set. 3DReact is constructed without explicit H nodes in the graphs. The boxes range from the first to the third quartile of the datapoints. The whiskers limit 90% of the datapoints and the individual points illustrate outliers. The points correspond to the test set of the first random split. The errors are given in the target standard deviation (stdev) units (21.8 kcal/mol).
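
The quantities drawn in such a box plot can be computed as in the following sketch. It assumes that "whiskers limit 90% of the datapoints" means whiskers at the 5th and 95th percentiles, and it uses the inclusive quantile definition of the Python standard library; both are assumptions, as is the function name `box_stats`.

```python
from statistics import quantiles


def box_stats(errors, target_std):
    """Box-plot summary of errors expressed in units of the target standard deviation."""
    scaled = [e / target_std for e in errors]
    q1, median, q3 = quantiles(scaled, n=4, method="inclusive")
    percentiles = quantiles(scaled, n=20, method="inclusive")  # 5%, 10%, ..., 95%
    low, high = percentiles[0], percentiles[-1]
    outliers = [x for x in scaled if x < low or x > high]
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": low, "whisker_high": high, "outliers": outliers}


# Toy errors (not from the paper), scaled by a toy target standard deviation of 2.0
toy = box_stats([2.0 * k for k in range(101)], target_std=2.0)
print(toy["q1"], toy["q3"], toy["whisker_low"], toy["whisker_high"])
```

For the GDB7-22-TS set, `target_std` would be the 21.8 kcal/mol quoted in the caption.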

Figure 7 shows the error distribution of predictions belonging to the same reaction classes for 3DReact “True”. 3DReact performs well across the different reaction types, with consistently low errors and relatively small error spread. The reactions for which the model has higher mean errors and spread (+H–H,−C–H,−C–H; green) correspond to those involving C–H and H–H features. Since the model is trained without explicit H nodes in the graph, features associated with X–H bonds are included implicitly in the model. Capturing H–H bond changes will be the most challenging as these will be the least explicitly described, occurring only as initial features for neighboring nodes. Since C is the most frequently occurring element and appears in many different configurations, capturing all the C–H features is more challenging than, for example, the O–H features, which will be more similar to one another. The equivalent plot for the model trained with explicit H nodes is shown in the Supplementary Information, illustrating that the error spread reduces for the reaction types involving C–H and H–H features. Note that 3DReact without explicit Hs still leads to performance comparable to the variant with explicit Hs (see the Supplementary Information).

3.4 Geometry quality

In order to illustrate that 3DReact does not require high-quality molecular structures to be used in an out-of-sample scenario, we train and test a model using lower-quality GFN2-xTB129 (xTB) geometries to predict higher-level barriers (CCSD(T)-F12a/cc-pVDZ-F12//$\omega$B97X-D3/def2-TZVP for GDB7-22-TS, B3LYP-D3(BJ)/def2-TZVP//B3LYP-D3(BJ)/def2-SVP for Cyclo-23-TS and B97D/TZV(2p,2d) for Proparg-21-TS). The results are illustrated in Figure 8 for the three datasets with DFT and xTB geometries, and compared to the SLATMd+KRR model in the same settings. 3DReact benefits from a lower sensitivity to the geometry quality compared to the pre-designed representation SLATMd combined with KRR, across the three datasets.

Figure 8: Mean absolute errors (MAEs) for predictions using either the provided geometries ($\omega$B97X-D3/def2-TZVP for GDB7-22-TS, B3LYP-D3(BJ)/def2-SVP for Cyclo-23-TS, B97D/TZV(2p,2d) for Proparg-21-TS) (DFT) or lower-quality GFN2-xTB (xTB) geometries. MAEs are averaged over 10 folds of random 80/10/10 splits (training/validation/test), error bars showing standard deviations across folds. Note that for GDB7-22-TS and Cyclo-23-TS datasets the DFT results are different from those presented in Section 3.2.1 because here they are obtained on the same subset as the xTB results (see Section 5.4).

For the GDB7-22-TS set, there is a negligible difference in model performance moving from DFT to xTB geometries. The xTB geometries are a good proxy for the DFT ones here, since this set consists of small, charge-neutral organic molecules, which are largely well-described by semi-empirical methods. For the Cyclo-23-TS set, while the molecules are still organic, they are larger than those in the GDB7-22-TS set, and there is a greater divergence between the GFN2-xTB and DFT geometries, resulting in a larger deterioration with these structures. The geometry sensitivity analysis in the Supplementary Information demonstrates that when using the model trained on xTB geometries, barrier predictions for molecules with poorer geometries (i.e., higher RMSD of xTB vs. DFT geometries) are not necessarily worse than those on molecules with better geometries. Instead, there is a consistent decline in model performance when training with xTB geometries and predicting DFT barriers.

The Proparg-21-TS set is the most complex of the three for GFN2-xTB, since these systems with charged organosilicon compounds differ considerably from those used to parameterize semi-empirical methods or force fields. As described in Section 5.4, unlike for the other datasets, where we generate an initial structure from SMILES using force fields, this is impossible for this set and we instead generate xTB geometries from the DFT ones. While this is not a feasible geometry generation pipeline for out-of-sample predictions, it still demonstrates how different methods perform with high- and low-quality geometries. Here, we see that 3DReact is less sensitive than SLATMd+KRR and the variant trained with lower-quality geometries still offers competitive errors ($0.48 \pm 0.05$ kcal/mol for the “None” model).

4 Conclusions

The accurate and reliable prediction of reaction barriers across diverse sets of chemical reactions remains an open challenge in computational chemistry. We contribute to this domain by introducing 3DReact, a geometric deep learning model constructed from the 3D coordinates of reactants and products. We show that the invariant model (vs. the equivariant version) is already sufficient for currently available reaction datasets. Existing models ChemProp and SLATMd+KRR exhibit impressive performance for atom-mapped, chemically diverse datasets and stereochemistry-sensitive datasets, respectively. 3DReact offers a hybrid model that can optionally incorporate mapping information alongside geometries, enabling robust performance across different dataset types and atom-mapping regimes. 3DReact also allows for a reduced sensitivity to the training geometry quality (i.e., xTB vs. DFT level) compared to SLATMd+KRR. Predictions remain stable when moving to both molecular size- and scaffold-based splits. Altogether, 3DReact presents a flexible framework for accurate prediction of activation barriers across chemical reaction datasets. Despite the proposed developments, challenges remain for ML predictions of energy barriers, particularly in integrating them within experimental settings. This work is a step toward their reliable application.

5 Methods

5.1 Datasets

We test 3DReact on three datasets of reaction barriers previously used to benchmark reaction representations.38 The term “reaction barrier”, used interchangeably with “activation energy” and “activation barrier”, refers to the difference between the energies of the optimized TS and the optimized reactants. Depending on the dataset, the barriers are either purely electronic energies (labelled $\Delta E^{\ddagger}$) or Gibbs free energies (labelled $\Delta G^{\ddagger}$). In all datasets, optimized three-dimensional structures of reactants and products are provided, which are used to train models and make predictions. The activation barrier is not a direct function of these structures, but using the TS structure to make predictions removes the utility of the ML models vs. direct computation of the TS. Thus we use an implicit interpolation of reactants’ and products’ structures as a proxy for the TS as in previous works.36, 69, 38

The GDB7-22-TS 116 dataset consists of close to 12 000 diverse organic reactions automatically constructed from the GDB7 dataset130, 131, 132 using the growing string method133 along with corresponding energy barriers ($\Delta E^{\ddagger}$) computed at the CCSD(T)-F12a/cc-pVDZ-F12//$\omega$B97X-D3/def2-TZVP level. The dataset provides atom-mapped SMILES, with “True” maps derived from the transition state. For 43 reactions out of 11 926, one of the products’ SMILES represents a molecule different from the xyz structure. These reactions were therefore excluded from the dataset, leading to a modified GDB7-22-TS set used here.

While there are no pre-defined classes for all the reactions in the GDB7-20-TS123 or GDB7-22-TS 116 sets, Grambow et al. 70 split the dataset into reactions undergoing certain bond changes: for example, the most common type was breaking of a C–H bond (−C–H) and a C–C bond (−C–C) in the reactants and formation of a C–H bond (+C–H) in the products, giving the reaction type signature +C–H,−C–C,−C–H. Here, we extract similar reaction types by comparing the connectivity matrices from atom-mapped reaction SMILES of reactants and products (ignoring bond orders). The most abundant reaction types in the dataset are +C–H,−C–C,−C–H (1667 reactions), +H–N,−C–H (633), +C–H,−C–H (619), +H–O,−C–H,−C–O (599) and +H–H,−C–H,−C–H (517).
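
The comparison of reactant and product connectivity can be sketched as a set difference over atom-mapped bonds. This is a minimal illustration, not the code used in this work: the function names, the representation of bonds as pairs of atom-map indices, the alphabetical ordering of elements and bonds, and the toy reaction are all assumptions. Plain ASCII hyphens replace the typographic signs in the signature.

```python
def bond_label(elem_a, elem_b):
    """Element pair of a bond, sorted alphabetically (e.g. 'C-H', 'H-N')."""
    first, second = sorted((elem_a, elem_b))
    return f"{first}-{second}"


def reaction_type(elements, reactant_bonds, product_bonds):
    """Signature of formed (+) and broken (-) bonds, ignoring bond orders.

    elements: dict mapping atom-map index -> element symbol
    reactant_bonds, product_bonds: iterables of (i, j) atom-map index pairs
    """
    reactant_set = {frozenset(bond) for bond in reactant_bonds}
    product_set = {frozenset(bond) for bond in product_bonds}
    formed = sorted(bond_label(*(elements[i] for i in bond))
                    for bond in product_set - reactant_set)
    broken = sorted(bond_label(*(elements[i] for i in bond))
                    for bond in reactant_set - product_set)
    return ",".join([f"+{b}" for b in formed] + [f"-{b}" for b in broken])


# Toy connectivity (hypothetical): H4 moves from C3 to C1 while the C1-C2 bond breaks
elements = {1: "C", 2: "C", 3: "C", 4: "H"}
reactant_bonds = [(1, 2), (2, 3), (3, 4)]
product_bonds = [(2, 3), (1, 4)]
signature = reaction_type(elements, reactant_bonds, product_bonds)
print(signature)
```

In practice the bond lists would come from the connectivity of the atom-mapped SMILES; here they are written by hand to keep the sketch self-contained.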

The original Cyclo-23-TS 117 dataset encompasses 5 269 profiles for [3+2] cycloaddition reactions with activation free energies ($\Delta G^{\ddagger}$) computed at the B3LYP-D3(BJ)/def2-TZVP//B3LYP-D3(BJ)/def2-SVP level in water using the SMD continuum solvation model. The dataset provides atom-mapped SMILES with “True” maps for heavy atoms derived from either the transition state structure or heuristic rules. For the regime with explicit hydrogen atoms, we atom-mapped the xyz files by matching the reactants, given in two separate files, to the provided transition state structure, which closely resembles the two reactants and has the same atom order as in the product. This was done with a labelled graph matching algorithm as implemented in NetworkX.134, 135 The algorithm is unaware of chirality, double-bond stereochemistry or conformations, and thus may yield atom-mappings that are not entirely correct. We also found that in four reactions, the product SMILES and xyz files depict different species; the set was therefore reduced to 5 265 reactions.

The Proparg-21-TS dataset108, 69 contains 753 structures of intermediates before and after the enantioselective transition state of benzaldehyde propargylation, with activation energies ($\Delta E^{\ddagger}$) computed at the B97D/TZV(2p,2d) level. SMILES strings (“fragment-based” SMILES) and “True” atom-maps are not provided with the original dataset; these are taken from Ref. 38.

RXNMapper107-mapped versions of GDB7-22-TS and Cyclo-23-TS were obtained with the Python package rxnmapper (version 0.3.0), using the default settings. The Proparg-21-TS set cannot be mapped, because the underlying libraries cannot process its SMILES strings.38 Since RXNMapper sorts molecules in case of multiple reactants and/or products, which would complicate SMILES–xyz matching (see Section 5.3 below), we used a locally modified version that does not change the molecule order (the patch file is provided in the project repository at https://github.com/lcmd-epfl/EquiReact/tree/9d78892fe/data-curation/rxnmapper).

5.2 Data splits

For each dataset and splitting type, identical data splits were used for all the models compared. In each case, ten different splits are constructed with different random seeds.

Three different types of extrapolation split were used: scaffold-, molecular size- and property-based. Scaffold splitting136, 137 clusters molecules based on their 2D backbones (such as Bemis–Murcko scaffolds127) and ensures that the clusters (scaffolds) belonging to the training, validation, and test sets do not overlap. Size-based splitting organizes the splits such that the reactions of the smallest molecules are in the training set and the reactions of the largest molecules are in validation and test. With property-based splits, one trains on reactions with higher barriers and predicts on reactions with lower barriers. This choice of splits reflects the relevant out-of-sample cases: larger molecules are more expensive to compute, and reactions with smaller barriers are desirable. Size- and property-based splits can also be organized in reverse order, where larger molecules are in the train set and smaller in test, or reactions with lower barrier in train and higher barrier in test.

For molecular size- and scaffold-based splits, the initial data shuffling affects the composition of the datasets. The non-zero standard deviations for property-based splits with neural networks arise from different organization of the datapoints into batches.
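
The size- and property-based splits described above can be sketched as a shuffle followed by a stable sort. This is an illustrative reading rather than the repository code: the function names and the handling of ties through the initial shuffle are assumptions. With this construction, the seed only influences the result when several reactions share the same sort key.

```python
import random


def cut(order, fractions=(0.8, 0.1, 0.1)):
    """Cut an ordered list of indices into training/validation/test parts."""
    n_train = int(fractions[0] * len(order))
    n_val = int(fractions[1] * len(order))
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


def size_split(sizes, seed=0):
    """Smallest molecules go to training, largest to validation and test."""
    idx = list(range(len(sizes)))
    random.Random(seed).shuffle(idx)       # the shuffle decides the order of ties
    idx.sort(key=lambda i: sizes[i])       # stable sort, ascending size
    return cut(idx)


def property_split(barriers, seed=0):
    """Highest barriers go to training, lowest to validation and test."""
    idx = list(range(len(barriers)))
    random.Random(seed).shuffle(idx)
    idx.sort(key=lambda i: -barriers[i])   # stable sort, descending barrier
    return cut(idx)


sizes = [5, 3, 9, 1, 7, 2, 8, 4, 6, 10]                              # toy atom counts
barriers = [12.5, 40.1, 33.3, 8.9, 55.0, 21.7, 70.2, 64.4, 18.0, 47.6]  # toy barriers
train, val, test = size_split(sizes)
print(sorted(sizes[i] for i in train), [sizes[i] for i in val], [sizes[i] for i in test])
```

Because the toy barriers are all distinct, `property_split` returns the same partition for any seed, which is in line with the remark that the spread across property-based folds comes from batching rather than from the split composition.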

5.3 Matching SMILES strings to xyz geometries

3DReact makes use of both the graph structure of a molecule (as provided in the SMILES string) and the three-dimensional structure (in the xyz). The atoms in the graph are associated with the atomic coordinates provided in the xyz file. Thanks to the way the GDB7-22-TS dataset116 was generated, the atomic coordinates can be easily matched to SMILES, which in turn allows reactants to be atom-mapped to products. However, we also tested RXNMapper-mapped SMILES, which do not respect the same constraints. Therefore, for consistency, we use a SMILES–xyz matching procedure detailed below.

We construct molecular graphs from xyz using covalent radii and match them to RDKit120 molecular graphs obtained from SMILES with a labelled graph matching algorithm as implemented in NetworkX.134, 135 This procedure is however unaware of chirality and double-bond stereochemistry, thus some of the matches might be incorrect. Still, it provides a flexible method that can be applied to any dataset consisting of SMILES strings and xyz files.
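
The first step, building a molecular graph from Cartesian coordinates, can be sketched with a simple distance criterion. The covalent radii below are commonly tabulated single-bond values in Å; the tolerance factor of 1.3, the function name and the water geometry are assumptions chosen for illustration and may differ from the settings used in this work.

```python
import math

# Commonly tabulated single-bond covalent radii in Angstrom (subset of elements)
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bonds_from_xyz(symbols, coords, scale=1.3):
    """Return bonds (i, j) with i < j whose distance is below scale * (r_i + r_j)."""
    bonds = set()
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            cutoff = scale * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
            if math.dist(coords[i], coords[j]) <= cutoff:
                bonds.add((i, j))
    return bonds


# Approximate water geometry in Angstrom: two O-H bonds, no H-H bond
symbols = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.2400, 0.9266, 0.0)]
water_bonds = bonds_from_xyz(symbols, coords)
print(sorted(water_bonds))
```

The resulting edge set, together with the element labels, is the kind of labelled graph that can then be passed to a graph matching routine and compared against the graph derived from SMILES.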

The same procedure was applied to the Cyclo-23-TS dataset in the few cases when the canonical SMILES have a different atom ordering than xyz.

5.4 xTB geometry generation

For the GDB7-22-TS and Cyclo-23-TS datasets, the starting structures were generated from SMILES using the distance-geometry embedding implemented in RDKit120 with the srETKDGv3 settings.138 Ten conformations were produced per molecule, which were then energy-ranked with the MMFF94 implementation139 in RDKit, defaulting to UFF in case of missing parameters. The lowest-energy conformer was retained. For the Proparg-21-TS set, the original B97D/TZV(2p,2d) geometries were used as a starting point, because the stereochemical and conformational diversity of this set cannot be completely encoded with SMILES, so that this SMILES-based procedure cannot provide suitable initial geometries.

For all the sets, the starting structures were optimized at the GFN2-xTB semi-empirical level of theory129 at the “loose” convergence level for a maximum of 1000 iterations using xTB140 version 6.2 RC2. For 969 reactions of the GDB7-22-TS set and 491 reactions of the Cyclo-23-TS set, at least one of the participating molecules either could not converge to any reasonable configuration or converged to a structure not matching the SMILES. These reactions were excluded from the geometry quality tests (Section 3.4).

5.5 Model training

3DReact was trained using the Adam optimizer 141 with initial learning rate and weight decay parameters as hyperparameters. The learning rate was reduced by 40% after 60 epochs of no improvement in the validation MAE, as in Ref. 121. Models were trained for a maximum of 512 epochs, using early stopping after 150 epochs of no improvement. The model with the best validation score was then used to make predictions on the test set.
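
The interplay between the learning-rate reduction, early stopping and the epoch limit can be sketched without any deep learning framework. The class below is a hypothetical illustration: its name, the exact counting of "epochs of no improvement" and the reset of the learning-rate counter after each reduction are assumptions, and library schedulers may count patience slightly differently.

```python
from dataclasses import dataclass


@dataclass
class PlateauSchedule:
    lr: float
    factor: float = 0.6        # "reduced by 40%"
    lr_patience: int = 60      # epochs without improvement before reducing the rate
    stop_patience: int = 150   # epochs without improvement before stopping
    max_epochs: int = 512
    best_mae: float = float("inf")
    best_epoch: int = -1
    since_best: int = 0
    since_lr_drop: int = 0

    def step(self, epoch, val_mae):
        """Update the state after one epoch; return True if training should stop."""
        if val_mae < self.best_mae:
            self.best_mae, self.best_epoch = val_mae, epoch
            self.since_best = 0
            self.since_lr_drop = 0
        else:
            self.since_best += 1
            self.since_lr_drop += 1
            if self.since_lr_drop >= self.lr_patience:
                self.lr *= self.factor
                self.since_lr_drop = 0
        return self.since_best >= self.stop_patience or epoch + 1 >= self.max_epochs


def run(val_maes, lr):
    """Feed a sequence of validation MAEs; return the last epoch and the final state."""
    schedule = PlateauSchedule(lr=lr)
    last_epoch = -1
    for last_epoch, mae in enumerate(val_maes):
        if schedule.step(last_epoch, mae):
            break
    return last_epoch, schedule


# Toy curve: validation MAE improves for 10 epochs, then stays flat
last_epoch, state = run([10.0 - e for e in range(10)] + [1.0] * 600, lr=1e-3)
print(last_epoch, state.best_epoch, state.lr)
```

In this toy run the best checkpoint is the one from epoch 9, the learning rate is reduced twice before early stopping triggers, and the checkpoint at `best_epoch` would be the one used for test predictions.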

The optimal model hyperparameters were searched within the following values: learning rate $\in [5 \cdot 10^{-5}, 10^{-4}, 5 \cdot 10^{-4}, 10^{-3}]$; weight decay parameter $\in [10^{-5}, 10^{-4}, 10^{-3}, 0]$; node and edge features embedding size $n_s \in [16, 32, 48, 64]$; $\ell = 1$ hidden space size $n_v \in [16, 32, 48, 64]$; number of edge features $n_g \in [16, 32, 48, 64]$; number of convolutional layers $n_{\mathrm{conv}} \in [2, 3]$; radial cutoff $r_{\max} \in [2.5, 5.0, 10.0]$; maximum number of atom neighbors $n_{\mathrm{neigh}} \in [10, 25, 50]$; dropout probability $p_d \in [0.0, 0.05, 0.1]$; sum_mode $\in$ [node, both]; combine_mode $\in$ [mlp, diff, mean, sum]; graph_mode $\in$ [energy, vector].

The hyperparameter search was done for the equivariant model EquiReactS (without attention or mapping) using Bayesian search as implemented in Weights & Biases.142 Hydrogen atoms were excluded from the graphs. Sweeps were run for 128 epochs for the GDB7-22-TS and Proparg-21-TS sets, and for 256 epochs for the Cyclo-23-TS set on the first random split. The parameters resulting in the best validation error, summarized in the Supplementary Information, were used for all the other model settings.
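
The search space listed above can be written in the dictionary layout used for Weights & Biases sweep configurations. The sketch below is a possible encoding rather than the configuration file used in this work: apart from sum_mode, combine_mode and graph_mode, the parameter keys and the metric name are hypothetical. Counting the combinations shows why a Bayesian search is preferable to an exhaustive grid.

```python
from math import prod

sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_mae", "goal": "minimize"},  # metric name is hypothetical
    "parameters": {
        # key names below are hypothetical, except the three *_mode options
        "lr": {"values": [5e-5, 1e-4, 5e-4, 1e-3]},
        "weight_decay": {"values": [1e-5, 1e-4, 1e-3, 0.0]},
        "n_s": {"values": [16, 32, 48, 64]},
        "n_v": {"values": [16, 32, 48, 64]},
        "n_g": {"values": [16, 32, 48, 64]},
        "n_conv": {"values": [2, 3]},
        "r_max": {"values": [2.5, 5.0, 10.0]},
        "n_neigh": {"values": [10, 25, 50]},
        "dropout_p": {"values": [0.0, 0.05, 0.1]},
        "sum_mode": {"values": ["node", "both"]},
        "combine_mode": {"values": ["mlp", "diff", "mean", "sum"]},
        "graph_mode": {"values": ["energy", "vector"]},
    },
}

# Size of the full grid spanned by the listed values
n_combinations = prod(len(p["values"]) for p in sweep_config["parameters"].values())
print(n_combinations)
```

With close to a million possible combinations, training each candidate for 128 or 256 epochs on a full grid would be impractical, whereas a guided search samples only a small fraction of the space.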

5.6 Baseline models

The ChemProp model103 is based on a CGR built from atom-mapped SMILES strings of reactants and products, which is then passed through the directed message-passing neural network chemprop137, 71, 103 (version 1.5.0). The hyperparameters are taken from Ref. 38.

Molecular SLATM vectors were generated using the qml python package143 before being combined to form the reaction version SLATMd. SLATMd is used with kernel ridge regression (KRR) models. The kernel functions and widths, and regularization parameters, were optimized on the first of the ten random splits, in line with how the hyperparameters were optimized for 3DReact. Unlike 3DReact, the hyperparameters for DFT and xTB geometries were optimized separately.

Data and Software Availability statement

The code is available as a GitHub repository at https://github.com/lcmd-epfl/EquiReact. The versions of the datasets used, as well as any processing applied to them, can be found in the same repository. The unprocessed results are available in the same repository as well as at https://wandb.ai/equireact.

Supplementary Information

Supplementary Information is provided in the freely available file equireact_si.pdf, detailing the architecture of the molecular channels, the 3DReact hyperparameters, the RMSE analogue of Table 2, the discussion of the model with a cross-attention surrogate for atom-mapping, extrapolation studies, some illustrative correlation plots for the GDB7-22-TS set, the model performance with and without explicit hydrogen atoms, and the geometry sensitivity analysis for the Cyclo-23-TS set.

Author Information

Author contributions

P.v.G., K.R.B., and C.B. conceptualized the project. 3DReact and support codes were written and run by K.R.B. and P.v.G., with design suggestions from C.B., V.R.S., and R.L. Results were analyzed by P.v.G., K.R.B., V.R.S., R.L., and C.B. xTB computations were run by R.L. The original draft was written by P.v.G. and K.R.B. with reviews and edits from all authors. C.C. and A.K. provided supervision and acquired funding.

Conflict of interest

The authors have no conflicts to disclose.

Acknowledgement

The authors thank Liam Marsh and Yannick Calvino Alonso for helpful discussion and comments on the text. P.v.G., C.B., V.R.S., R.L., A.K., and C.C. acknowledge the National Centre of Competence in Research (NCCR) “Sustainable chemical process through catalysis (Catalysis)”, grant number 180544, of the Swiss National Science Foundation (SNSF) for financial support. K.R.B. and C.C. were supported by the European Research Council (grant number 817977) and by the National Centre of Competence in Research (NCCR) “Materials’ Revolution: Computational Design and Discovery of Novel Materials (MARVEL)”, grant number 205602, of the Swiss National Science Foundation.

References

  • Behler and Parrinello 2007 Behler, J.; Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 2007, 98, 146401.
  • Rupp et al. 2012 Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 2012, 108, 058301.
  • Bartók et al. 2013 Bartók, A. P.; Kondor, R.; Csányi, G. On representing chemical environments. Phys. Rev. B 2013, 87, 184115.
  • Hansen et al. 2015 Hansen, K.; Biegler, F.; Ramakrishnan, R.; Pronobis, W.; von Lilienfeld, O. A.; Müller, K.-R.; Tkatchenko, A. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space. J. Phys. Chem. Lett. 2015, 6, 2326–2331.
  • Huo and Rupp 2017 Huo, H.; Rupp, M. Unified representation for machine learning of molecules and crystals. arXiv preprint 2017, arXiv:1704.06439.
  • Faber et al. 2018 Faber, F. A.; Christensen, A. S.; Huang, B.; von Lilienfeld, O. A. Alchemical and structural distribution based representation for universal quantum machine learning. J. Chem. Phys. 2018, 148, 241717.
  • Christensen et al. 2020 Christensen, A. S.; Bratholm, L. A.; Faber, F. A.; von Lilienfeld, O. A. FCHL revisited: Faster and more accurate quantum machine learning. J. Chem. Phys. 2020, 152, 044107.
  • Huang and von Lilienfeld 2020 Huang, B.; von Lilienfeld, O. A. Quantum machine learning using atom-in-molecule-based fragments selected on the fly. Nat. Chem. 2020, 12, 945–951.
  • Drautz 2019 Drautz, R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 2019, 99, 014104.
  • Dusson et al. 2022 Dusson, G.; Bachmayr, M.; Csányi, G.; Drautz, R.; Etter, S.; van der Oord, C.; Ortner, C. Atomic cluster expansion: Completeness, efficiency and stability. J. Comput. Phys. 2022, 454, 110946.
  • Grisafi and Ceriotti 2019 Grisafi, A.; Ceriotti, M. Incorporating long-range physics in atomic-scale machine learning. J. Chem. Phys. 2019, 151, 204105.
  • Grisafi et al. 2021 Grisafi, A.; Nigam, J.; Ceriotti, M. Multi-scale approach for the prediction of atomic scale properties. Chem. Sci. 2021, 12, 2078–2090.
  • Nigam et al. 2020 Nigam, J.; Pozdnyakov, S.; Ceriotti, M. Recursive evaluation and iterative contraction of N-body equivariant features. J. Chem. Phys. 2020, 153, 121101.
  • Fabrizio et al. 2022 Fabrizio, A.; Briling, K. R.; Corminboeuf, C. SPAHM: the Spectrum of Approximated Hamiltonian Matrices representations. Digital Discovery 2022, 1, 286–294.
  • Briling et al. 2024 Briling, K. R.; Calvino Alonso, Y.; Fabrizio, A.; Corminboeuf, C. SPAHM(a,b): Encoding the density information from guess Hamiltonian in quantum machine learning representations. J. Chem. Theory Comput. 2024, 20, 1108–1117.
  • Karandashev and von Lilienfeld 2022 Karandashev, K.; von Lilienfeld, O. A. An orbital-based representation for accurate quantum machine learning. J. Chem. Phys. 2022, 156, 114101.
  • Llenga and Gryn’ova 2023 Llenga, S.; Gryn’ova, G. Matrix of orthogonalized atomic orbital coefficients representation for radicals and ions. J. Chem. Phys. 2023, 158, 214116.
  • Li et al. 2015 Li, Z.; Kermode, J. R.; De Vita, A. Molecular dynamics with on-the-fly machine learning of quantum-mechanical forces. Phys. Rev. Lett. 2015, 114, 096405.
  • Chmiela et al. 2017 Chmiela, S.; Tkatchenko, A.; Sauceda, H. E.; Poltavsky, I.; Schütt, K. T.; Müller, K.-R. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 2017, 3, e1603015.
  • Chmiela et al. 2018 Chmiela, S.; Sauceda, H. E.; Müller, K.-R.; Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 2018, 9, 3887.
  • Behler 2017 Behler, J. First principles neural network potentials for reactive simulations of large molecular and condensed systems. Angew. Chem. Int. Ed. 2017, 56, 12828–12840.
  • Smith et al. 2018 Smith, J. S.; Nebgen, B.; Lubbers, N.; Isayev, O.; Roitberg, A. E. Less is more: Sampling chemical space with active learning. J. Chem. Phys. 2018, 148, 241733.
  • Bereau et al. 2015 Bereau, T.; Andrienko, D.; von Lilienfeld, O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theory Comput. 2015, 11, 3225–3233.
  • Grisafi et al. 2018 Grisafi, A.; Wilkins, D. M.; Csányi, G.; Ceriotti, M. Symmetry-adapted machine learning for tensorial properties of atomistic systems. Phys. Rev. Lett. 2018, 120, 036002.
  • Wilkins et al. 2019 Wilkins, D. M.; Grisafi, A.; Yang, Y.; Lao, K. U.; DiStasio Jr, R. A.; Ceriotti, M. Accurate molecular polarizabilities with coupled cluster theory and machine learning. Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 3401–3406.
  • Montavon et al. 2013 Montavon, G.; Rupp, M.; Gobre, V.; Vazquez-Mayagoitia, A.; Hansen, K.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Machine learning of molecular electronic properties in chemical compound space. New J. Phys. 2013, 15, 095003.
  • Mazouin et al. 2022 Mazouin, B.; Schöpfer, A. A.; von Lilienfeld, O. A. Selected machine learning of HOMO–LUMO gaps with improved data-efficiency. Mater. Adv. 2022, 3, 8306–8316.
  • Brockherde et al. 2017 Brockherde, F.; Vogt, L.; Li, L.; Tuckerman, M. E.; Burke, K.; Müller, K.-R. Bypassing the Kohn-Sham equations with machine learning. Nat. Commun. 2017, 8, 872.
  • Grisafi et al. 2019 Grisafi, A.; Fabrizio, A.; Meyer, B.; Wilkins, D. M.; Corminboeuf, C.; Ceriotti, M. Transferable machine-learning model of the electron density. ACS Cent. Sci. 2019, 5, 57–64.
  • Fabrizio et al. 2019 Fabrizio, A.; Grisafi, A.; Meyer, B.; Ceriotti, M.; Corminboeuf, C. Electron density learning of non-covalent systems. Chem. Sci. 2019, 10, 9424–9432.
  • Musil et al. 2021 Musil, F.; Grisafi, A.; Bartók, A. P.; Ortner, C.; Csányi, G.; Ceriotti, M. Physics-inspired structural representations for molecules and materials. Chem. Rev. 2021, 121, 9759–9815.
  • Langer et al. 2022 Langer, M. F.; Goessmann, A.; Rupp, M. Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning. npj Comput. Mater. 2022, 8, 41.
  • Huang and von Lilienfeld 2021 Huang, B.; von Lilienfeld, O. A. Ab initio machine learning in chemical compound space. Chem. Rev. 2021, 121, 10001–10036.
  • Kulik et al. 2022 Kulik, H. J.; Hammerschmidt, T.; Schmidt, J.; Botti, S.; Marques, M. A. L.; Boley, M.; Scheffler, M.; Todorović, M.; Rinke, P.; Oses, C. et al. Roadmap on machine learning in electronic structure. Electron. Struct. 2022, 4, 023004.
  • Glielmo et al. 2017 Glielmo, A.; Sollich, P.; De Vita, A. Accurate interatomic force fields via machine learning with covariant kernels. Phys. Rev. B 2017, 95, 214302.
  • van Gerwen et al. 2022 van Gerwen, P.; Fabrizio, A.; Wodrich, M. D.; Corminboeuf, C. Physics-based representations for machine learning properties of chemical reactions. Mach. Learn.: Sci. Technol. 2022, 3, 045005.
  • Faber et al. 2017 Faber, F. A.; Hutchison, L.; Huang, B.; Gilmer, J.; Schoenholz, S. S.; Dahl, G. E.; Vinyals, O.; Kearnes, S.; Riley, P. F.; von Lilienfeld, O. A. Prediction errors of molecular machine learning models lower than hybrid DFT error. J. Chem. Theory Comput. 2017, 13, 5255–5264.
  • van Gerwen et al. 2024 van Gerwen, P.; Briling, K. R.; Calvino Alonso, Y.; Franke, M.; Corminboeuf, C. Benchmarking machine-readable vectors of chemical reactions on computed activation barriers. Digital Discovery 2024, 3, 932–943.
  • Schütt et al. 2017 Schütt, K.; Kindermans, P.-J.; Sauceda Felix, H. E.; Chmiela, S.; Tkatchenko, A.; Müller, K.-R. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Process. Syst. 2017, 30, 991–1001.
  • Unke and Meuwly 2019 Unke, O. T.; Meuwly, M. PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 2019, 15, 3678–3693.
  • Gasteiger et al. 2020 Gasteiger, J.; Groß, J.; Günnemann, S. Directional message passing for molecular graphs. arXiv preprint 2020, arXiv:2003.03123.
  • Gilmer et al. 2017 Gilmer, J.; Schoenholz, S. S.; Riley, P. F.; Vinyals, O.; Dahl, G. E. Neural message passing for quantum chemistry. Proceedings of the 34th International Conference on Machine Learning. 2017; pp 1263–1272.
  • Batzner et al. 2022 Batzner, S.; Musaelian, A.; Sun, L.; Geiger, M.; Mailoa, J. P.; Kornbluth, M.; Molinari, N.; Smidt, T. E.; Kozinsky, B. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 2022, 13, 2453.
  • Gasteiger et al. 2021 Gasteiger, J.; Becker, F.; Günnemann, S. GemNet: Universal directional graph neural networks for molecules. Adv. Neural Inf. Process. Syst. 2021, 34, 6790–6802.
  • Haghighatlari et al. 2022 Haghighatlari, M.; Li, J.; Guan, X.; Zhang, O.; Das, A.; Stein, C. J.; Heidar-Zadeh, F.; Liu, M.; Head-Gordon, M.; Bertels, L. et al. NewtonNet: A Newtonian message passing network for deep learning of interatomic potentials and forces. Digital Discovery 2022, 1, 333–343.
  • Qiao et al. 2020 Qiao, Z.; Welborn, M.; Anandkumar, A.; Manby, F. R.; Miller, T. F. OrbNet: Deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 2020, 153, 124111.
  • Thomas et al. 2018 Thomas, N.; Smidt, T.; Kearnes, S.; Yang, L.; Li, L.; Kohlhoff, K.; Riley, P. Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds. arXiv preprint 2018, arXiv:1802.08219.
  • Townshend et al. 2020 Townshend, R. J.; Townshend, B.; Eismann, S.; Dror, R. O. Geometric prediction: Moving beyond scalars. arXiv preprint 2020, arXiv:2006.14163.
  • Anderson et al. 2019 Anderson, B.; Hy, T. S.; Kondor, R. Cormorant: Covariant molecular neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 14537–14546.
  • Satorras et al. 2021 Satorras, V. G.; Hoogeboom, E.; Welling, M. E(n) equivariant graph neural networks. Proceedings of the 38th International Conference on Machine Learning. 2021; pp 9323–9332.
  • Christensen et al. 2021 Christensen, A. S.; Sirumalla, S. K.; Qiao, Z.; O’Connor, M. B.; Smith, D. G.; Ding, F.; Bygrave, P. J.; Anandkumar, A.; Welborn, M.; Manby, F. R. et al. OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy. J. Chem. Phys. 2021, 155, 204103.
  • Schütt et al. 2021 Schütt, K.; Unke, O.; Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. Proceedings of the 38th International Conference on Machine Learning. 2021; pp 9377–9388.
  • Unke et al. 2021 Unke, O. T.; Chmiela, S.; Gastegger, M.; Schütt, K. T.; Sauceda, H. E.; Müller, K.-R. SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 2021, 12, 7273.
  • Zhang et al. 2020 Zhang, Y.; Ye, S.; Zhang, J.; Hu, C.; Jiang, J.; Jiang, B. Efficient and accurate simulations of vibrational and electronic spectra with symmetry-preserving neural network models for tensorial properties. J. Phys. Chem. B 2020, 124, 7284–7290.
  • Nguyen and Lunghi 2022 Nguyen, V. H. A.; Lunghi, A. Predicting tensorial molecular properties with equivariant machine learning models. Phys. Rev. B 2022, 105, 165131.
  • Batatia et al. 2022 Batatia, I.; Kovács, D. P.; Simm, G.; Ortner, C.; Csányi, G. MACE: Higher order equivariant message passing neural networks for fast and accurate force fields. Adv. Neural Inf. Process. Syst. 2022, 35, 11423–11436.
  • Liao and Smidt 2022 Liao, Y.-L.; Smidt, T. Equiformer: Equivariant graph attention transformer for 3D atomistic graphs. arXiv preprint 2022, arXiv:2206.11990.
  • Fuchs et al. 2020 Fuchs, F.; Worrall, D.; Fischer, V.; Welling, M. SE(3)-Transformers: 3D roto-translation equivariant attention networks. Adv. Neural Inf. Process. Syst. 2020, 33, 1970–1981.
  • Simeon and De Fabritiis 2023 Simeon, G.; De Fabritiis, G. TensorNet: Cartesian tensor representations for efficient learning of molecular potentials. Adv. Neural Inf. Process. Syst. 2023, 36, 37334–37353.
  • Corso et al. 2024 Corso, G.; Stärk, H.; Jegelka, S.; Jaakkola, T.; Barzilay, R. Graph neural networks. Nat. Rev. Methods Primers 2024, 4, 17.
  • Duval et al. 2023 Duval, A.; Mathis, S. V.; Joshi, C. K.; Schmidt, V.; Miret, S.; Malliaros, F. D.; Cohen, T.; Liò, P.; Bengio, Y.; Bronstein, M. A hitchhiker’s guide to geometric GNNs for 3D atomic systems. arXiv preprint 2023, arXiv:2312.07511.
  • Musaelian et al. 2023 Musaelian, A.; Batzner, S.; Johansson, A.; Sun, L.; Owen, C. J.; Kornbluth, M.; Kozinsky, B. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 2023, 14, 579.
  • Wen et al. 2024 Wen, M.; Horton, M. K.; Munro, J. M.; Huck, P.; Persson, K. A. An equivariant graph neural network for the elasticity tensors of all seven crystal systems. Digital Discovery 2024, 3, 869–882.
  • Batatia et al. 2022 Batatia, I.; Batzner, S.; Kovács, D. P.; Musaelian, A.; Simm, G. N. C.; Drautz, R.; Ortner, C.; Kozinsky, B.; Csányi, G. The design space of E(3)-equivariant atom-centered interatomic potentials. arXiv preprint 2022, arXiv:2205.06643.
  • Liu et al. 2022 Liu, Y.; Wang, L.; Liu, M.; Zhang, X.; Oztekin, B.; Ji, S. Spherical message passing for 3D graph networks. arXiv preprint 2022, arXiv:2102.05013.
  • Kondor 2018 Kondor, R. N-body networks: A covariant hierarchical neural network architecture for learning atomic potentials. arXiv preprint 2018, arXiv:1803.01588.
  • Bochkarev et al. 2022 Bochkarev, A.; Lysogorskiy, Y.; Ortner, C.; Csányi, G.; Drautz, R. Multilayer atomic cluster expansion for semilocal interactions. Phys. Rev. Res. 2022, 4, L042019.
  • Lewis-Atwell et al. 2022 Lewis-Atwell, T.; Townsend, P. A.; Grayson, M. N. Machine learning activation energies of chemical reactions. WIREs Comput. Mol. Sci. 2022, 12, e1593.
  • Gallarati et al. 2021 Gallarati, S.; Fabregat, R.; Laplaza, R.; Bhattacharjee, S.; Wodrich, M. D.; Corminboeuf, C. Reaction-based machine learning representations for predicting the enantioselectivity of organocatalysts. Chem. Sci. 2021, 12, 6879–6889.
  • Grambow et al. 2020 Grambow, C. A.; Pattanaik, L.; Green, W. H. Deep learning of activation energies. J. Phys. Chem. Lett. 2020, 11, 2992–2997.
  • Heid and Green 2022 Heid, E.; Green, W. H. Machine learning of reaction properties via learned representations of the condensed graph of reaction. J. Chem. Inf. Model. 2022, 62, 2101–2110.
  • Spiekermann et al. 2022 Spiekermann, K. A.; Pattanaik, L.; Green, W. H. Fast predictions of reaction barrier heights: Toward coupled-cluster accuracy. J. Phys. Chem. A 2022, 126, 3976–3986.
  • Zhao et al. 2023 Zhao, Q.; Anstine, D. M.; Isayev, O.; Savoie, B. M. Δ² machine learning for reaction property prediction. Chem. Sci. 2023, 14, 13392–13401.
  • Heinen et al. 2021 Heinen, S.; von Rudorff, G. F.; von Lilienfeld, O. A. Toward the design of chemical reactions: Machine learning barriers of competing mechanisms in reactant space. J. Chem. Phys. 2021, 155, 064105.
  • Singh et al. 2019 Singh, A. R.; Rohr, B. A.; Gauthier, J. A.; Nørskov, J. K. Predicting chemical reaction barriers with a machine learning model. Catal. Lett. 2019, 149, 2347–2354.
  • Choi et al. 2018 Choi, S.; Kim, Y.; Kim, J. W.; Kim, Z.; Kim, W. Y. Feasibility of activation energy prediction of gas-phase reactions by machine learning. Chem. Eur. J. 2018, 24, 12354–12358.
  • Farrar and Grayson 2022 Farrar, E. H. E.; Grayson, M. N. Machine learning and semi-empirical calculations: A synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction. Chem. Sci. 2022, 13, 7594–7603.
  • Friederich et al. 2020 Friederich, P.; dos Passos Gomes, G.; Bin, R. D.; Aspuru-Guzik, A.; Balcells, D. Machine learning dihydrogen activation in the chemical space surrounding Vaska’s complex. Chem. Sci. 2020, 11, 4584–4601.
  • Migliaro and Cundari 2020 Migliaro, I.; Cundari, T. R. Density functional study of methane activation by frustrated Lewis pairs with group 13 trihalides and group 15 pentahalides and a machine learning analysis of their barrier heights. J. Chem. Inf. Model. 2020, 60, 4958–4966.
  • Lewis-Atwell et al. 2023 Lewis-Atwell, T.; Beechey, D.; Şimşek, Ö.; Grayson, M. N. Reformulating reactivity design for data-efficient machine learning. ACS Catal. 2023, 13, 13506–13515.
  • Vadaddi et al. 2024 Vadaddi, S. M.; Zhao, Q.; Savoie, B. M. Graph to activation energy models easily reach irreducible errors but show limited transferability. J. Phys. Chem. A 2024, 128, 2543–2555.
  • Ramos et al. 2024 Ramos, J. E. A.; Neeser, R. M. M.; Stuyver, T. Repurposing quantum chemical descriptor datasets for on-the-fly generation of informative reaction representations: Application to hydrogen atom transfer reactions. Digital Discovery 2024, 3, 919–931.
  • Schwaller et al. 2022 Schwaller, P.; Vaucher, A. C.; Laplaza, R.; Bunne, C.; Krause, A.; Corminboeuf, C.; Laino, T. Machine intelligence for chemical reaction space. WIREs Comput. Mol. Sci. 2022, 12, e1604.
  • Rogers and Hahn 2010 Rogers, D.; Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 2010, 50, 742–754.
  • Probst et al. 2022 Probst, D.; Schwaller, P.; Reymond, J.-L. Reaction classification and yield prediction using the differential reaction fingerprint DRFP. Digital Discovery 2022, 1, 91–97.
  • Ahneman et al. 2018 Ahneman, D. T.; Estrada, J. G.; Lin, S.; Dreher, S. D.; Doyle, A. G. Predicting reaction performance in C–N cross-coupling using machine learning. Science 2018, 360, 186–190.
  • Żurański et al. 2021 Żurański, A. M.; Martinez Alvarado, J. I.; Shields, B. J.; Doyle, A. G. Predicting reaction yields via supervised learning. Acc. Chem. Res. 2021, 54, 1856–1865.
  • Zahrt et al. 2019 Zahrt, A. F.; Henle, J. J.; Rose, B. T.; Wang, Y.; Darrow, W. T.; Denmark, S. E. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 2019, 363, eaau5631.
  • Jorner et al. 2021 Jorner, K.; Brinck, T.; Norrby, P.-O.; Buttar, D. Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies. Chem. Sci. 2021, 12, 1163–1175.
  • Reid and Sigman 2019 Reid, J. P.; Sigman, M. S. Holistic prediction of enantioselectivity in asymmetric catalysis. Nature 2019, 571, 343–348.
  • Gensch et al. 2022 Gensch, T.; dos Passos Gomes, G.; Friederich, P.; Peters, E.; Gaudin, T.; Pollice, R.; Jorner, K.; Nigam, A.; Lindner-D’Addario, M.; Sigman, M. S. et al. A comprehensive discovery platform for organophosphorus ligands for catalysis. J. Am. Chem. Soc. 2022, 144, 1205–1217.
  • Santiago et al. 2018 Santiago, C. B.; Guo, J.-Y.; Sigman, M. S. Predictive and mechanistic multivariate linear regression models for reaction development. Chem. Sci. 2018, 9, 2398–2412.
  • Jorner 2023 Jorner, K. Putting chemical knowledge to work in machine learning for reactivity. Chimia 2023, 77, 22.
  • Gallegos et al. 2021 Gallegos, L. C.; Luchini, G.; St. John, P. C.; Kim, S.; Paton, R. S. Importance of engineered and learned molecular representations in predicting organic reactivity, selectivity, and chemical properties. Acc. Chem. Res. 2021, 54, 827–836.
  • Williams et al. 2021 Williams, W. L.; Zeng, L.; Gensch, T.; Sigman, M. S.; Doyle, A. G.; Anslyn, E. V. The evolution of data-driven modeling in organic chemistry. ACS Cent. Sci. 2021, 7, 1622–1637.
  • Stuyver and Coley 2023 Stuyver, T.; Coley, C. W. Machine learning-guided computational screening of new candidate reactions with high bioorthogonal click potential. Chem. Eur. J. 2023, 29, e202300387.
  • Stuyver and Coley 2022 Stuyver, T.; Coley, C. W. Quantum chemistry-augmented neural networks for reactivity prediction: Performance, generalizability, and explainability. J. Chem. Phys. 2022, 156, 084104.
  • Vargas et al. 2024 Vargas, S.; Gee, W.; Alexandrova, A. High-throughput quantum theory of atoms in molecules (QTAIM) for geometric deep learning of molecular and reaction properties. Digital Discovery 2024, 3, 987–998.
  • Vijay et al. 2024 Vijay, S.; Venetos, M. C.; Spotte-Smith, E. W. C.; Kaplan, A. D.; Wen, M.; Persson, K. A. CoeffNet: Predicting activation barriers through a chemically-interpretable, equivariant and physically constrained graph neural network. Chem. Sci. 2024, 15, 2923–2936.
  • Devlin et al. 2018 Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint 2018, arXiv:1810.04805.
  • Schwaller et al. 2019 Schwaller, P.; Laino, T.; Gaudin, T.; Bolgar, P.; Hunter, C. A.; Bekas, C.; Lee, A. A. Molecular transformer: A model for uncertainty-calibrated chemical reaction prediction. ACS Cent. Sci. 2019, 5, 1572–1583.
  • Schwaller et al. 2021 Schwaller, P.; Vaucher, A. C.; Laino, T.; Reymond, J.-L. Prediction of chemical reaction yields using deep learning. Mach. Learn.: Sci. Technol. 2021, 2, 015016.
  • Heid et al. 2024 Heid, E.; Greenman, K. P.; Chung, Y.; Li, S.-C.; Graff, D. E.; Vermeire, F. H.; Wu, H.; Green, W. H.; McGill, C. J. Chemprop: A machine learning package for chemical property prediction. J. Chem. Inf. Model. 2024, 64, 9–17.
  • Chen et al. 2013 Chen, W. L.; Chen, D. Z.; Taylor, K. T. Automatic reaction mapping and reaction center detection. WIREs Comput. Mol. Sci. 2013, 3, 560–593.
  • Preciat Gonzalez et al. 2017 Preciat Gonzalez, G. A.; El Assal, L. R.; Noronha, A.; Thiele, I.; Haraldsdóttir, H. S.; Fleming, R. M. Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: Application to Recon 3D. J. Cheminform. 2017, 9, 39.
  • Jaworski et al. 2019 Jaworski, W.; Szymkuć, S.; Mikulak-Klucznik, B.; Piecuch, K.; Klucznik, T.; Kaźmierowski, M.; Rydzewski, J.; Gambin, A.; Grzybowski, B. A. Automatic mapping of atoms across both simple and complex chemical reactions. Nat. Commun. 2019, 10, 1434.
  • Schwaller et al. 2021 Schwaller, P.; Hoover, B.; Reymond, J.-L.; Strobelt, H.; Laino, T. Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Sci. Adv. 2021, 7, eabe4166.
  • Doney et al. 2016 Doney, A. C.; Rooks, B. J.; Lu, T.; Wheeler, S. E. Design of organocatalysts for asymmetric propargylations through computational screening. ACS Catal. 2016, 6, 7948–7955.
  • Nehil-Puleo et al. 2024 Nehil-Puleo, K.; Quach, C. D.; Craven, N. C.; McCabe, C.; Cummings, P. T. E(n) equivariant graph neural network for learning interactional properties of molecules. J. Phys. Chem. B 2024, 128, 1108–1117.
  • Duan et al. 2023 Duan, C.; Du, Y.; Jia, H.; Kulik, H. J. Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model. arXiv preprint 2023, arXiv:2304.06174.
  • Zhang et al. 2021 Zhang, J.; Lei, Y.-K.; Zhang, Z.; Han, X.; Li, M.; Yang, L.; Yang, Y. I.; Gao, Y. Q. Deep reinforcement learning of transition states. Phys. Chem. Chem. Phys. 2021, 23, 6888–6895.
  • Pattanaik et al. 2020 Pattanaik, L.; Ingraham, J. B.; Grambow, C. A.; Green, W. H. Generating transition states of isomerization reactions with deep learning. Phys. Chem. Chem. Phys. 2020, 22, 23618–23626.
  • Makoś et al. 2021 Makoś, M. Z.; Verma, N.; Larson, E. C.; Freindorf, M.; Kraka, E. Generative adversarial networks for transition state geometry prediction. J. Chem. Phys. 2021, 155, 024116.
  • Kim et al. 2024 Kim, S.; Woo, J.; Kim, W. Y. Diffusion-based generative AI for exploring transition states from 2D molecular graphs. Nat. Commun. 2024, 15, 341.
  • Choi 2023 Choi, S. Prediction of transition state structures of gas-phase chemical reactions via machine learning. Nat. Commun. 2023, 14, 1168.
  • Spiekermann et al. 2022 Spiekermann, K.; Pattanaik, L.; Green, W. H. High accuracy barrier heights, enthalpies, and rate coefficients for chemical reactions. Sci. Data 2022, 9, 417.
  • Stuyver et al. 2023 Stuyver, T.; Jorner, K.; Coley, C. W. Reaction profiles for quantum chemistry-computed [3+2] cycloaddition reactions. Sci. Data 2023, 10, 66.
  • Geiger et al. 2022 Geiger, M.; Smidt, T.; M., A.; Miller, B. K.; Boomsma, W.; Dice, B.; Lapchevskyi, K.; Weiler, M.; Tyszkiewicz, M.; Uhrin, M. et al. e3nn/e3nn: 2022-12-12. 2022; https://zenodo.org/records/7430260.
  • Corso et al. 2023 Corso, G.; Stärk, H.; Jing, B.; Barzilay, R.; Jaakkola, T. DiffDock: Diffusion steps, twists, and turns for molecular docking. arXiv preprint 2023, arXiv:2210.01776.
  • Landrum et al. 2023 Landrum, G.; Tosco, P.; Kelley, B.; Ric; Sriniker; Cosgrove, D.; Gedeck; Vianello, R.; NadineSchneider; Kawashima, E. et al. rdkit/rdkit: 2023_03_1 (Q1 2023) release. 2023; https://zenodo.org/record/7880616.
  • Stärk et al. 2022 Stärk, H.; Ganea, O.; Pattanaik, L.; Barzilay, R.; Jaakkola, T. EquiBind: Geometric deep learning for drug binding structure prediction. Proceedings of the 39th International Conference on Machine Learning. 2022; pp 20503–20521.
  • Ganea et al. 2022 Ganea, O.-E.; Huang, X.; Bunne, C.; Bian, Y.; Barzilay, R.; Jaakkola, T.; Krause, A. Independent SE(3)-equivariant models for end-to-end rigid protein docking. arXiv preprint 2022, arXiv:2111.07786.
  • Grambow et al. 2020 Grambow, C.; Pattanaik, L.; Green, W. Reactants, products, and transition states of elementary chemical reactions based on quantum chemistry. Sci. Data 2020, 7, 137.
  • van Gerwen et al. 2023 van Gerwen, P.; Wodrich, M. D.; Laplaza, R.; Corminboeuf, C. Reply to Comment on ‘Physics-based representations for machine learning properties of chemical reactions’. Mach. Learn.: Sci. Technol. 2023, 4, 048002.
  • Lowe 2012 Lowe, D. M. Extraction of chemical structures and reactions from the literature. Ph.D. thesis, University of Cambridge, 2012.
  • von Lilienfeld 2013 von Lilienfeld, O. A. First principles view on chemical compound space: Gaining rigorous atomistic control of molecular properties. Int. J. Quantum Chem. 2013, 113, 1676–1689.
  • Bemis and Murcko 1996 Bemis, G. W.; Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887–2893.
  • van der Maaten and Hinton 2008 van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
  • Bannwarth et al. 2019 Bannwarth, C.; Ehlert, S.; Grimme, S. GFN2-xTB–An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions. J. Chem. Theory Comput. 2019, 15, 1652–1671.
  • Blum and Reymond 2009 Blum, L. C.; Reymond, J.-L. 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 2009, 131, 8732–8733.
  • Reymond 2015 Reymond, J.-L. The chemical space project. Acc. Chem. Res. 2015, 48, 722–730.
  • Ramakrishnan et al. 2014 Ramakrishnan, R.; Dral, P. O.; Rupp, M.; von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 2014, 1, 140022.
  • Zimmerman 2015 Zimmerman, P. M. Single-ended transition state finding with the growing string method. J. Comput. Chem. 2015, 36, 601–611.
  • Cordella et al. 2001 Cordella, L. P.; Foggia, P.; Sansone, C.; Vento, M. An improved algorithm for matching large graphs. 3rd IAPR-TC15 workshop on graph-based representations in pattern recognition. 2001; pp 149–159.
  • Hagberg et al. 2008 Hagberg, A. A.; Schult, D. A.; Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. Proceedings of the 7th Python in Science Conference. 2008; pp 11–15.
  • Wu et al. 2018 Wu, Z.; Ramsundar, B.; Feinberg, E. N.; Gomes, J.; Geniesse, C.; Pappu, A. S.; Leswing, K.; Pande, V. MoleculeNet: A benchmark for molecular machine learning. Chem. Sci. 2018, 9, 513–530.
  • Yang et al. 2019 Yang, K.; Swanson, K.; Jin, W.; Coley, C.; Eiden, P.; Gao, H.; Guzman-Perez, A.; Hopper, T.; Kelley, B.; Mathea, M. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 2019, 59, 3370–3388.
  • Riniker and Landrum 2015 Riniker, S.; Landrum, G. A. Better informed distance geometry: Using what we know to improve conformation generation. J. Chem. Inf. Model. 2015, 55, 2562–2574.
  • Tosco et al. 2014 Tosco, P.; Stiefl, N.; Landrum, G. Bringing the MMFF force field to the RDKit: implementation and validation. J. Cheminform. 2014, 6, 37.
  • Atkinson et al. 2019 Atkinson, P.; Bannwarth, C.; Bohle, F.; Brandenburg, G.; Caldeweyher, E.; Checinski, M.; Dohm, S.; Ehlert, S.; Ehrlich, S.; Gerasimov, I. et al. Semiempirical Extended Tight-Binding Program Package. https://github.com/grimme-lab/xtb, 2019.
  • Kingma and Ba 2014 Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv preprint 2014, arXiv:1412.6980.
  • Biewald 2020 Biewald, L. Experiment Tracking with Weights and Biases. 2020; https://www.wandb.com/, Software available from wandb.com.
  • Christensen et al. 2017 Christensen, A. S.; Faber, F.; Huang, B.; Bratholm, L.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. QML: A Python toolkit for quantum machine learning. https://github.com/qmlcode/qml, 2017.