Search | arXiv e-print repository

arXiv:2406.19591 [pdf, other]

Mathematical modelling and uncertainty quantification for analysis of biphasic coral reef recovery patterns

Authors: David J. Warne, Kerryn Crossman, Grace E. M. Heron, Jesse A. Sharp, Wang Jin, Paul Pao-Yen Wu, Matthew J. Simpson, Kerrie Mengersen, Juan-Carlos Ortiz

Abstract: Coral reefs are increasingly subjected to major disturbances threatening the health of marine ecosystems. Substantial research is underway to develop intervention strategies that assist reefs in recovery from, and resistance to, inevitable future climate and weather extremes. To assess potential benefits of interventions, mechanistic understanding of coral reef recovery and resistance patterns is essential. Recent evidence suggests that more than half of the reefs surveyed across the Great Barrier Reef (GBR) exhibit deviations from standard recovery modelling assumptions when the initial coral cover is low (≤ 10%). New modelling is necessary to account for these observed patterns to better inform management strategies. We consider a new model for reef recovery at the coral cover scale that accounts for biphasic recovery patterns. The model is based on a multispecies Richards' growth model that includes a change point in the recovery patterns. Bayesian inference is applied for uncertainty quantification of key parameters for assessing reef health and recovery patterns. This analysis is applied to benthic survey data from the Australian Institute of Marine Science (AIMS). We demonstrate agreement between model predictions and data across every recorded recovery trajectory with at least two years of observations following disturbance events occurring between 1992 and 2020.
This new approach will enable new insights into the biological, ecological and environmental factors that contribute to the duration and severity of biphasic coral recovery patterns across the GBR. These new insights will help to inform management and monitoring practice to mitigate the impacts of climate change on coral reefs.

Submitted 27 June, 2024; originally announced June 2024.

MSC Class: 62P12 (Primary)

arXiv:2210.03561 [pdf, other]

Empowering Graph Representation Learning with Test-Time Graph Transformation

Authors: Wei Jin, Tong Zhao, Jiayuan Ding, Yozen Liu, Jiliang Tang, Neil Shah

Abstract: As powerful tools for representation learning on graphs, graph neural networks (GNNs) have facilitated various applications from drug discovery to recommender systems. Nevertheless, the effectiveness of GNNs is immensely challenged by issues related to data quality, such as distribution shift, abnormal features and adversarial attacks. Recent efforts have been made on tackling these issues from a modeling perspective which requires additional cost of changing model architectures or re-training model parameters. In this work, we provide a data-centric view to tackle these issues and propose a graph transformation framework named GTrans which adapts and refines graph data at test time to achieve better performance. We provide theoretical analysis on the design of the framework and discuss why adapting graph data works better than adapting the model. Extensive experiments have demonstrated the effectiveness of GTrans on three distinct scenarios for eight benchmark datasets where suboptimal data is presented. Remarkably, GTrans performs the best in most cases with improvements up to 2.8%, 8.2% and 3.8% over the best baselines on three experimental settings. Code is released at https://github.com/ChandlerBang/GTrans.

Submitted 26 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: ICLR 2023

arXiv:2110.03753 [pdf, other]

From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness

Authors: Lingxiao Zhao, Wei Jin, Leman Akoglu, Neil Shah

Abstract: Message Passing Neural Networks (MPNNs) are a common type of Graph Neural Network (GNN), in which each node's representation is computed recursively by aggregating representations (messages) from its immediate neighbors akin to a star-shaped pattern. MPNNs are appealing for being efficient and scalable, however their expressiveness is upper-bounded by the 1st-order Weisfeiler-Lehman isomorphism test (1-WL). In response, prior works propose highly expressive models at the cost of scalability and sometimes generalization performance. Our work stands between these two regimes: we introduce a general framework to uplift any MPNN to be more expressive, with limited scalability overhead and greatly improved practical performance. We achieve this by extending local aggregation in MPNNs from star patterns to general subgraph patterns (e.g., k-egonets): in our framework, each node representation is computed as the encoding of a surrounding induced subgraph rather than encoding of immediate neighbors only (i.e. a star). We choose the subgraph encoder to be a GNN (mainly MPNNs, considering scalability) to design a general framework that serves as a wrapper to uplift any GNN. We call our proposed method GNN-AK (GNN As Kernel), as the framework resembles a convolutional neural network by replacing the kernel with GNNs. Theoretically, we show that our framework is strictly more powerful than 1&2-WL, and is not less powerful than 3-WL. We also design subgraph sampling strategies which greatly reduce memory footprint and improve speed while maintaining performance.
Our method sets new state-of-the-art performance by large margins for several well-known graph ML tasks; specifically, 0.08 MAE on ZINC, 74.79% and 86.887% accuracy on CIFAR10 and PATTERN respectively.

Submitted 20 April, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: Expressive GNN framework

arXiv:2006.10141 [pdf, other]

Self-supervised Learning on Graphs: Deep Insights and New Direction

Authors: Wei Jin, Tyler Derr, Haochen Liu, Yiqi Wang, Suhang Wang, Zitao Liu, Jiliang Tang

Abstract: The success of deep learning notoriously requires larger amounts of costly annotated data. This has led to the development of self-supervised learning (SSL) that aims to alleviate this limitation by creating domain specific pretext tasks on unlabeled data. Simultaneously, there is increasing interest in generalizing deep learning to the graph domain in the form of graph neural networks (GNNs). GNNs can naturally utilize unlabeled nodes through simple neighborhood aggregation, but this alone is unable to make thorough use of them. Thus, we seek to harness SSL for GNNs to fully exploit the unlabeled data. Different from data instances in the image and text domains, nodes in graphs present unique structure information and they are inherently linked, so they are not independent and identically distributed (i.i.d.). Such complexity is a double-edged sword for SSL on graphs. On the one hand, it makes it challenging to adopt solutions from the image and text domains to graphs, so dedicated efforts are desired. On the other hand, it provides rich information that enables us to build SSL from a variety of perspectives. Thus, in this paper, we first deepen our understanding of when, why, and which strategies of SSL work with GNNs by empirically studying numerous basic SSL pretext tasks on graphs. Inspired by deep insights from the empirical studies, we propose a new direction SelfTask to build advanced pretext tasks that are able to achieve state-of-the-art performance on various real-world datasets.
The specific experimental settings to reproduce our results can be found in https://github.com/ChandlerBang/SelfTask-GNN.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2006.03908 [pdf, other]

Enforcing Predictive Invariance across Structured Biomedical Domains

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: Many biochemical applications such as molecular property prediction require models to generalize beyond their training domains (environments). Moreover, natural environments in these tasks are structured, defined by complex descriptors such as molecular scaffolds or protein families. Therefore, most environments are either never seen during training, or contain only a single training example. To address these challenges, we propose a new regret minimization (RGM) algorithm and its extension for structured environments. RGM builds from invariant risk minimization (IRM) by recasting the simultaneous optimality condition in terms of predictive regret, finding a representation that enables the predictor to compete against an oracle with hindsight access to held-out environments. The structured extension adaptively highlights variation due to complex environments via specialized domain perturbations. We evaluate our method on multiple applications (molecular property prediction, protein homology and stability prediction) and show that RGM significantly outperforms previous state-of-the-art baselines.

Submitted 7 October, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

arXiv:2005.12386 [pdf, other]

Customized Graph Neural Networks

Authors: Yiqi Wang, Yao Ma, Wei Jin, Chaozhuo Li, Charu Aggarwal, Jiliang Tang

Abstract: Recently, Graph Neural Networks (GNNs) have greatly advanced the task of graph classification. Typically, we first build a unified GNN model with graphs in a given training set and then use this unified model to predict labels of all the unseen graphs in the test set. However, graphs in the same dataset often have dramatically distinct structures, which indicates that a unified model may be sub-optimal given an individual graph. Therefore, in this paper, we aim to develop customized graph neural networks for graph classification. Specifically, we propose a novel customized graph neural network framework, i.e., Customized-GNN. Given a graph sample, Customized-GNN can generate a sample-specific model for this graph based on its structure. Meanwhile, the proposed framework is general enough to be applied to numerous existing graph neural network models. Comprehensive experiments on various graph classification benchmarks demonstrate the effectiveness of the proposed framework.

Submitted 14 December, 2021; v1 submitted 22 May, 2020; originally announced May 2020.

arXiv:2005.10203 [pdf, other]

Graph Structure Learning for Robust Graph Neural Networks

Authors: Wei Jin, Yao Ma, Xiaorui Liu, Xianfeng Tang, Suhang Wang, Jiliang Tang

Abstract: Graph Neural Networks (GNNs) are powerful tools in representation learning for graphs. However, recent studies show that GNNs are vulnerable to carefully-crafted perturbations, called adversarial attacks. Adversarial attacks can easily fool GNNs in making predictions for downstream tasks. The vulnerability to adversarial attacks has raised increasing concerns for applying GNNs in safety-critical applications. Therefore, developing robust algorithms to defend against adversarial attacks is of great significance. A natural idea to defend against adversarial attacks is to clean the perturbed graph. It is evident that real-world graphs share some intrinsic properties. For example, many real-world graphs are low-rank and sparse, and the features of two adjacent nodes tend to be similar. In fact, we find that adversarial attacks are likely to violate these graph properties. Therefore, in this paper, we explore these properties to defend against adversarial attacks on graphs. In particular, we propose a general framework Pro-GNN, which can jointly learn a structural graph and a robust graph neural network model from the perturbed graph guided by these properties. Extensive experiments on real-world graphs demonstrate that the proposed framework achieves significantly better performance compared with the state-of-the-art defense methods, even when the graph is heavily perturbed. We release the implementation of Pro-GNN to our DeepRobust repository for adversarial attacks and defenses (https://github.com/DSE-MSU/DeepRobust).
The specific experimental settings to reproduce our results can be found in https://github.com/ChandlerBang/Pro-GNN.

Submitted 27 June, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

Comments: Accepted by KDD 2020

arXiv:2005.06149 [pdf, other]

DeepRobust: A PyTorch Library for Adversarial Attacks and Defenses

Authors: Yaxin Li, Wei Jin, Han Xu, Jiliang Tang

Abstract: DeepRobust is a PyTorch adversarial learning library which aims to build a comprehensive and easy-to-use platform to foster this research field. It currently contains more than 10 attack algorithms and 8 defense algorithms in the image domain and 9 attack algorithms and 4 defense algorithms in the graph domain, under a variety of deep learning architectures. In this manual, we introduce the main contents of DeepRobust with detailed instructions. The library is kept updated and can be found at https://github.com/DSE-MSU/DeepRobust.

Submitted 13 May, 2020; originally announced May 2020.

Comments: Adversarial attacks and defenses, PyTorch library

arXiv:2005.03004 [pdf, other]

Adaptive Invariance for Molecule Property Prediction

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: Effective property prediction methods can help accelerate the search for COVID-19 antivirals either through accurate in-silico screens or by effectively guiding on-going at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate the scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to SARS-CoV-2 main protease and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by a significant margin. We also report the top 20 predictions of our model on the Broad drug repurposing hub.

Submitted 5 May, 2020; originally announced May 2020.

arXiv:2005.00792 [pdf, other]

ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data

Authors: Woojeong Jin, Rahul Khanna, Suji Kim, Dong-Ho Lee, Fred Morstatter, Aram Galstyan, Xiang Ren

Abstract: Event forecasting is a challenging, yet important task, as humans seek to constantly plan for the future. Existing automated forecasting studies rely mostly on structured data, such as time-series or event-based knowledge graphs, to help predict future events. In this work, we aim to formulate a task, construct a dataset, and provide benchmarks for developing methods for event forecasting with large volumes of unstructured text data. To simulate the forecasting scenario on temporal news documents, we formulate the problem as a restricted-domain, multiple-choice, question-answering (QA) task. Unlike existing QA tasks, our task limits accessible information, and thus a model has to make a forecasting judgement. To showcase the usefulness of this task formulation, we introduce ForecastQA, a question-answering dataset consisting of 10,392 event forecasting questions, which have been collected and verified via crowdsourcing efforts. We present our experiments on ForecastQA using BERT-based models and find that our best model achieves 60.1% accuracy on the dataset, which still lags behind human performance by about 19%. We hope ForecastQA will support future research efforts in bridging this gap.

Submitted 7 June, 2021; v1 submitted 2 May, 2020; originally announced May 2020.

Comments: Accepted to ACL 2021. Project page: https://inklab.usc.edu/ForecastQA/

arXiv:2004.13181 [pdf, ps, other]

EM-GAN: Fast Stress Analysis for Multi-Segment Interconnect Using Generative Adversarial Networks

Authors: Wentian Jin, Sheriff Sadiqbatcha, Jinwei Zhang, Sheldon X. -D. Tan

Abstract: In this paper, we propose a fast transient hydrostatic stress analysis for electromigration (EM) failure assessment for multi-segment interconnects using generative adversarial networks (GANs). Our work leverages the image synthesis feature of GAN-based generative deep neural networks. The stress evaluation of multi-segment interconnects, modeled by partial differential equations, can be viewed as a time-varying 2D-images-to-image problem where the input is the multi-segment interconnect topology with current densities and the output is the EM stress distribution in those wire segments at the given aging time. Based on this observation, we train a conditional GAN model using the images of many self-generated multi-segment wires, with wire current densities and aging time as conditions, against the COMSOL simulation results. Different hyperparameters of the GAN were studied and compared. The proposed algorithm, called EM-GAN, can quickly give an accurate stress distribution of a general multi-segment wire tree for a given aging time, which is important for full-chip fast EM failure assessment. Our experimental results show that EM-GAN has a 6.6% averaged error compared to COMSOL simulation results with orders of magnitude speedup. It also delivers an 8.3X speedup over a state-of-the-art analytic-based EM analysis solver.

Submitted 27 April, 2020; originally announced April 2020.

arXiv:2004.05487 [pdf, other]

A Bayesian Nonparametric Approach for Inferring Drug Combination Effects on Mental Health in People with HIV

Authors: Wei Jin, Yang Ni, Leah H. Rubin, Amanda B. Spence, Yanxun Xu

Abstract: Although combination antiretroviral therapy (ART) is highly effective in suppressing viral load for people with HIV (PWH), many ART agents may exacerbate central nervous system (CNS)-related adverse effects including depression. Therefore, understanding the effects of ART drugs on the CNS function, especially mental health, can help clinicians personalize medicine with fewer adverse effects for PWH and prevent them from discontinuing their ART to avoid undesirable health outcomes and increased likelihood of HIV transmission. The emergence of electronic health records offers researchers unprecedented access to HIV data including individuals' mental health records, drug prescriptions, and clinical information over time. However, modeling such data is very challenging due to the high dimensionality of the drug combination space, the individual heterogeneity, and the sparseness of the observed drug combinations. We develop a Bayesian nonparametric approach to learn drug combination effects on mental health in PWH, adjusting for socio-demographic, behavioral, and clinical factors. The proposed method is built upon the subset-tree kernel method that represents drug combinations in a way that synthesizes known regimen structure into a single mathematical representation. It also utilizes a distance-dependent Chinese restaurant process to cluster a heterogeneous population while taking into account individuals' treatment histories.
We evaluate the proposed approach through simulation studies, and apply the method to a dataset from the Women's Interagency HIV Study, yielding interpretable and promising results. Our method has clinical utility in guiding clinicians to prescribe more informed and effective personalized treatment based on individuals' treatment histories and clinical characteristics.

Submitted 11 April, 2020; originally announced April 2020.

arXiv:2003.03919 [pdf, other]

doi 10.24963/ijcai.2020/386

Temporal Attribute Prediction via Joint Modeling of Multi-Relational Structure Evolution

Authors: Sankalp Garg, Navodita Sharma, Woojeong Jin, Xiang Ren

Abstract: Time series prediction is an important problem in machine learning. Previous methods for time series prediction did not involve additional information. With a lot of dynamic knowledge graphs available, we can use this additional information to predict the time series better. Recently, there has been a focus on the application of deep representation learning on dynamic graphs. These methods predict the structure of the graph by reasoning over the interactions in the graph at previous time steps. In this paper, we propose a new framework to incorporate the information from dynamic knowledge graphs for time series prediction. We show that if the information contained in the graph and the time series data are closely related, then this inter-dependence can be used to predict the time series with improved accuracy. Our framework, DArtNet, learns a static embedding for every node in the graph as well as a dynamic embedding which is dependent on the dynamic attribute value (time-series). Then it captures the information from the neighborhood by taking a relation-specific mean and encodes the history information using an RNN. We jointly train the model on link prediction and attribute prediction. We evaluate our method on five specially curated datasets for this problem and show a consistent improvement in time series prediction results. We release the data and code of model DArtNet for future research at https://github.com/INK-USC/DArtNet .

Submitted 13 July, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

Comments: In Proceedings of IJCAI 2020. Code can be found at https://github.com/INK-USC/DArtNet . The sole copyright holder is IJCAI (International Joint Conferences on Artificial Intelligence), all rights reserved. Original Publication available at https://www.ijcai.org/Proceedings/2020/386

arXiv:2003.00653 [pdf, other]

Adversarial Attacks and Defenses on Graphs: A Review, A Tool and Empirical Studies

Authors: Wei Jin, Yaxin Li, Han Xu, Yiqi Wang, Shuiwang Ji, Charu Aggarwal, Jiliang Tang

Abstract: Deep neural networks (DNNs) have achieved significant performance in various tasks. However, recent studies have shown that DNNs can be easily fooled by small perturbations on the input, called adversarial attacks. As the extensions of DNNs to graphs, Graph Neural Networks (GNNs) have been demonstrated to inherit this vulnerability. An adversary can mislead GNNs to give wrong predictions by modifying the graph structure such as manipulating a few edges. This vulnerability has raised tremendous concerns for adapting GNNs in safety-critical applications and has attracted increasing research attention in recent years. Thus, it is necessary and timely to provide a comprehensive overview of existing graph adversarial attacks and the countermeasures. In this survey, we categorize existing attacks and defenses, and review the corresponding state-of-the-art methods. Furthermore, we have developed a repository with representative algorithms (https://github.com/DSE-MSU/DeepRobust/tree/master/deeprobust/graph). The repository enables us to conduct empirical studies to deepen our understanding of attacks and defenses on graphs.

Submitted 12 December, 2020; v1 submitted 1 March, 2020; originally announced March 2020.

Comments: Accepted by SIGKDD Explorations

arXiv:2002.04720 [pdf, other]

Improving Molecular Design by Stochastic Iterative Target Augmentation

Authors: Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola

Abstract: Generative models in molecular design tend to be richly parameterized, data-hungry neural models, as they must create complex structured objects as outputs. Estimating such models from data may be challenging due to the lack of sufficient training data. In this paper, we propose a surprisingly effective self-training approach for iteratively creating additional molecular targets. We first pre-train the generative model together with a simple property predictor. The property predictor is then used as a likelihood model for filtering candidate structures from the generative model. Additional targets are iteratively produced and used in the course of stochastic EM iterations to maximize the log-likelihood that the candidate structures are accepted. A simple rejection (re-weighting) sampler suffices to draw posterior samples since the generative model is already reasonable after pre-training. We demonstrate significant gains over strong baselines for both unconditional and conditional molecular design. In particular, our approach outperforms the previous state-of-the-art in conditional molecular design by over 10% in absolute gain. Finally, we show that our approach is useful in other domains as well, such as program synthesis.

Submitted 15 August, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

Comments: ICML 2020

Journal ref: PMLR 119:10716-10726, 2020

arXiv:2002.03244 [pdf, other]

Multi-Objective Molecule Generation using Interpretable Substructures

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: Drug discovery aims to find novel compounds with specified chemical property profiles. In terms of generative modeling, the goal is to learn to sample molecules in the intersection of multiple property constraints. This task becomes increasingly challenging when there are many property constraints. We propose to offset this complexity by composing molecules from a vocabulary of substructures that we call molecular rationales. These rationales are identified from molecules as substructures that are likely responsible for each property of interest. We then learn to expand rationales into a full molecule using graph generative models. Our final generative model composes molecules as mixtures of multiple rationale completions, and this mixture is fine-tuned to preserve the properties of interest. We evaluate our model on various drug design tasks and demonstrate significant improvements over state-of-the-art baselines in terms of accuracy, diversity, and novelty of generated compounds.

Submitted 2 July, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

arXiv:2002.03230 [pdf, other]

Hierarchical Generation of Molecular Graphs using Structural Motifs

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: Graph generation techniques are increasingly being adopted for drug discovery. Previous graph generation approaches have utilized relatively small molecular building blocks such as atoms or simple cycles, limiting their effectiveness to smaller molecules. Indeed, as we demonstrate, their performance degrades significantly for larger molecules. In this paper, we propose a new hierarchical graph encoder-decoder that employs significantly larger and more flexible graph motifs as basic building blocks. Our encoder produces a multi-resolution representation for each molecule in a fine-to-coarse fashion, from atoms to connected motifs. Each level integrates the encoding of constituents below with the graph at that level. Our autoregressive coarse-to-fine decoder adds one motif at a time, interleaving the decision of selecting a new motif with the process of resolving its attachments to the emerging molecule. We evaluate our model on multiple molecule generation tasks, including polymers, and show that our model significantly outperforms previous state-of-the-art baselines.

Submitted 18 April, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

arXiv:1907.11223 [pdf, other]

Hierarchical Graph-to-Graph Translation for Molecules

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: The problem of accelerating drug discovery relies heavily on automatic tools to optimize precursor molecules to afford them with better biochemical properties. Our work in this paper substantially extends prior state-of-the-art on graph-to-graph translation methods for molecular optimization. In particular, we realize coherent multi-resolution representations by interweaving the encoding of substructure components with the atom-level encoding of the original molecular graph. Moreover, our graph decoder is fully autoregressive, and interleaves each step of adding a new substructure with the process of resolving its attachment to the emerging molecule. We evaluate our model on multiple molecular optimization tasks and show that our model significantly outperforms previous state-of-the-art baselines.

Submitted 18 October, 2019; v1 submitted 11 June, 2019; originally announced July 2019.

arXiv:1904.05530 [pdf, other]

Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs

Authors: Woojeong Jin, Meng Qu, Xisen Jin, Xiang Ren

Abstract: Knowledge graph reasoning is a critical task in natural language processing. The task becomes more challenging on temporal knowledge graphs, where each fact is associated with a timestamp. Most existing methods focus on reasoning at past timestamps and they are not able to predict facts happening in the future. This paper proposes Recurrent Event Network (RE-NET), a novel autoregressive architecture for predicting future interactions. The occurrence of a fact (event) is modeled as a probability distribution conditioned on temporal sequences of past knowledge graphs. Specifically, our RE-NET employs a recurrent event encoder to encode past facts and uses a neighborhood aggregator to model the connection of facts at the same timestamp. Future facts can then be inferred in a sequential manner based on the two modules. We evaluate our proposed method via link prediction at future times on five public datasets. Through extensive experiments, we demonstrate the strength of RE-NET, especially on multi-step inference over future timestamps, and achieve state-of-the-art performance on all five datasets. Code and data can be found at https://github.com/INK-USC/RE-Net.

Submitted 6 October, 2020; v1 submitted 11 April, 2019; originally announced April 2019.

Comments: 15 pages, 8 figures, accepted as a full paper at EMNLP 2020

arXiv:1904.01561 [pdf, other]

Analyzing Learned Molecular Representations for Property Prediction

Authors: Kevin Yang, Kyle Swanson, Wengong Jin, Connor Coley, Philipp Eiden, Hua Gao, Angel Guzman-Perez, Timothy Hopper, Brian Kelley, Miriam Mathea, Andrew Palmer, Volker Settels, Tommi Jaakkola, Klavs Jensen, Regina Barzilay

Abstract: Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial datasets spanning a wide variety of chemical endpoints. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary datasets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.

Submitted 20 November, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

Journal ref: Journal of Chemical Information and Modeling 59.8 (2019): 3370-3388

arXiv:1902.09737 [pdf, other]

Functional Transparency for Structured Data: a Game-Theoretic Approach

Authors: Guang-He Lee, Wengong Jin, David Alvarez-Melis, Tommi S. Jaakkola

Abstract: We provide a new approach to training neural models to exhibit transparency in a well-defined, functional manner. Our approach naturally operates over structured data and tailors the predictor, functionally, towards a chosen family of (local) witnesses. The estimation problem is set up as a co-operative game between an unrestricted predictor, such as a neural network, and a set of witnesses chosen from the desired transparent family. The goal of the witnesses is to highlight, locally, how well the predictor conforms to the chosen family of functions, while the predictor is trained to minimize the highlighted discrepancy. We emphasize that the predictor remains globally powerful as it is only encouraged to agree locally with locally adapted witnesses. We analyze the effect of the proposed approach, provide example formulations in the context of deep graph and sequence models, and empirically illustrate the idea in chemical property prediction, temporal modeling, and molecule representation learning.

Submitted 26 February, 2019; originally announced February 2019.

arXiv:1812.01070 [pdf, other]

Learning Multimodal Graph-to-Graph Translation for Molecular Optimization

Authors: Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola

Abstract: We view molecular optimization as a graph-to-graph translation problem. The goal is to learn to map from one molecular graph to another with better properties based on an available corpus of paired molecules. Since molecules can be optimized in different ways, there are multiple viable translations for each input graph. A key challenge is therefore to model diverse translation outputs. Our primary contributions include a junction tree encoder-decoder for learning diverse graph translations along with a novel adversarial training method for aligning distributions of molecules. Diverse output distributions in our model are explicitly realized by low-dimensional latent vectors that modulate the translation process. We evaluate our model on multiple molecular optimization tasks and show that our model outperforms previous state-of-the-art baselines.

Submitted 28 January, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

arXiv:1802.04364 [pdf, other]

Junction Tree Variational Autoencoder for Molecular Graph Generation

Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola

Abstract: We seek to automate the design of molecules based on specific chemical properties. In computational terms, this task involves continuous embedding and generation of molecular graphs. Our primary contribution is the direct realization of molecular graphs, a task previously approached by generating linear SMILES strings instead of graphs. Our junction tree variational autoencoder generates molecular graphs in two phases, by first generating a tree-structured scaffold over chemical substructures, and then combining them into a molecule with a graph message passing network. This approach allows us to incrementally expand molecules while maintaining chemical validity at every step. We evaluate our model on multiple tasks ranging from molecular generation to optimization. Across these tasks, our model outperforms previous state-of-the-art baselines by a significant margin.

Submitted 29 March, 2019; v1 submitted 12 February, 2018; originally announced February 2018.

arXiv:1709.04555 [pdf, other]

Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network

Authors: Wengong Jin, Connor W. Coley, Regina Barzilay, Tommi Jaakkola

Abstract: The prediction of organic reaction outcomes is a fundamental problem in computational chemistry. Since a reaction may involve hundreds of atoms, fully exploring the space of possible transformations is intractable. The current solution utilizes reaction templates to limit the space, but it suffers from coverage and efficiency issues. In this paper, we propose a template-free approach to efficiently explore the space of product molecules by first pinpointing the reaction center -- the set of nodes and edges where graph edits occur. Since only a small number of atoms contribute to the reaction center, we can directly enumerate candidate products. The generated candidates are scored by a Weisfeiler-Lehman Difference Network that models high-order interactions between changes occurring at nodes across the molecule. Our framework outperforms the top-performing template-based approach with a 10% margin, while running orders of magnitude faster. Finally, we demonstrate that the model accuracy rivals the performance of domain experts.

Submitted 29 December, 2017; v1 submitted 13 September, 2017; originally announced September 2017.

Comments: accepted by NIPS 2017

Showing 1–24 of 24 results for author: Jin, W