Search | arXiv e-print repository

α-HMM: A Graphical Model for RNA Folding

Authors: Sixiang Zhang, Aaron J. Yang, Liming Cai

Abstract: RNA secondary structure is modeled with the novel arbitrary-order hidden Markov model (α-HMM). The α-HMM extends the traditional HMM with the capability to model stochastic events that may be influenced by historically distant ones, making it suitable to account for long-range canonical base pairings between nucleotides, which constitute the RNA secondary structure. Unlike previous heavy-weight extensions of the HMM, the α-HMM has the flexibility to apply restrictions on how one event may influence another in stochastic processes, enabling efficient prediction of RNA secondary structure including pseudoknots.

Submitted 7 January, 2024; originally announced January 2024.

Comments: 14 pages, 5 figures, 1 table

arXiv:2112.04624 [pdf, other]

Deep Molecular Representation Learning via Fusing Physical and Chemical Information

Authors: Shuwen Yang, Ziyao Li, Guojie Song, Lingsheng Cai

Abstract: Molecular representation learning is the first yet vital step in combining deep learning and molecular science. To push the boundaries of molecular representation learning, we present PhysChem, a novel neural architecture that learns molecular representations via fusing physical and chemical information of molecules. PhysChem is composed of a physicist network (PhysNet) and a chemist network (ChemNet). PhysNet is a neural physical engine that learns molecular conformations through simulating molecular dynamics with parameterized forces; ChemNet implements geometry-aware deep message-passing to learn chemical / biomedical properties of molecules. Two networks specialize in their own tasks and cooperate by providing expertise to each other. By fusing physical and chemical information, PhysChem achieved state-of-the-art performances on MoleculeNet, a standard molecular machine learning benchmark. The effectiveness of PhysChem was further corroborated on cutting-edge datasets of SARS-CoV-2.

Submitted 28 November, 2021; originally announced December 2021.

Comments: In NeurIPS-2021, 18 pages, 5 figures, appendix included

arXiv:2104.05024 [pdf, other]

A Kernel-free Boundary Integral Method for the Bidomain Equations

Authors: Xindan Gao, Li Cai, Craig S. Henriquez, Wenjun Ying

Abstract: The bidomain equations have been widely used to mathematically model the electrical activity of cardiac tissue. In this work, we present a potential theory-based Cartesian grid method, referred to as the kernel-free boundary integral (KFBI) method, which works well on complex domains to efficiently simulate the linear diffusion part of the bidomain equations. After a proper temporal discretization, the KFBI method is applied to solve the resulting homogeneous Neumann boundary value problems with second-order accuracy. According to potential theory, the boundary integral equations reformulated from the boundary value problems can be solved iteratively with the simple Richardson iteration or the Krylov subspace iteration method. During the iteration, the boundary and volume integrals are evaluated by limiting the structured grid-based discrete solutions of the equivalent interface problems at quasi-uniform interface nodes without the need to know the analytical expression of Green's functions. In particular, the discrete linear system of the equivalent interface problem obtained from the standard finite difference schemes or the finite element schemes can be efficiently solved by fast elliptic solvers such as the fast Fourier transform based solvers or those based on geometric multigrid iterations after an appropriate modification at the irregular grid nodes. Numerical results for solving the FitzHugh-Nagumo bidomain equations in both two- and three-dimensional spaces are presented to demonstrate the numerical performance of the KFBI method, such as the second-order accuracy and the propagation and scroll wave of the voltage simulated on the real human left ventricle model.
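The Richardson iteration named in this abstract can be illustrated with a minimal sketch. It is not the KFBI method: a small dense matrix stands in for the boundary integral operator, and the function name `richardson`, the relaxation parameter `omega`, and the test system are all assumptions chosen for illustration.

```python
def richardson(matvec, b, omega, tol=1e-12, max_iter=10_000):
    """Solve A x = b with the update x <- x + omega * (b - A x).

    `matvec` applies the operator A to a vector; only this action is
    needed, which is why the iteration suits operators that are never
    formed explicitly. All names here are hypothetical.
    """
    x = [0.0] * len(b)
    for _ in range(max_iter):
        ax = matvec(x)
        residual = [bi - ai for bi, ai in zip(b, ax)]
        # Stop once the largest residual entry is below the tolerance.
        if max(abs(ri) for ri in residual) < tol:
            break
        x = [xi + omega * ri for xi, ri in zip(x, residual)]
    return x


# Stand-in operator: a 2x2 symmetric positive definite matrix (assumed).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def matvec(v):
    """Dense matrix-vector product with the stand-in matrix A."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


# omega must satisfy 0 < omega < 2 / lambda_max(A) for convergence;
# 0.25 is safe for this matrix, whose largest eigenvalue is about 4.62.
x = richardson(matvec, b, omega=0.25)
print(x)
```

For this system the exact solution is x = (1/11, 7/11), so the printed values can be checked by hand.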

Submitted 11 April, 2021; originally announced April 2021.

arXiv:2012.03671 [pdf]

Joint analysis of structural connectivity and cortical surface features: correlates with mild traumatic brain injury

Authors: Cailey I. Kerley, Leon Y. Cai, Chang Yu, Logan M. Crawford, Jason M. Elenberger, Eden S. Singh, Kurt G. Schilling, Katherine S. Aboud, Bennett A. Landman, Tonia S. Rex

Abstract: Mild traumatic brain injury (mTBI) is a complex syndrome that affects up to 600 per 100,000 individuals, with a particular concentration among military personnel. About half of all mTBI patients experience a diverse array of chronic symptoms which persist long after the acute injury. Hence, there is an urgent need for better understanding of the white matter and gray matter pathologies associated with mTBI to map which specific brain systems are impacted and identify courses of intervention. Previous works have linked mTBI to disruptions in white matter pathways and cortical surface abnormalities. Herein, we examine these hypothesized links in an exploratory study of joint structural connectivity and cortical surface changes associated with mTBI and its chronic symptoms. Briefly, we consider a cohort of 12 mTBI and 26 control subjects. A set of 588 cortical surface metrics and 4,753 structural connectivity metrics were extracted from cortical surface regions and diffusion weighted magnetic resonance imaging in each subject. Principal component analysis (PCA) was used to reduce the dimensionality of each metric set. We then applied independent component analysis (ICA) both to each PCA space individually and together in a joint ICA approach. We identified a stable independent component across the connectivity-only and joint ICAs which presented significant group differences in subject loadings (p<0.05, corrected). Additionally, we found that two mTBI symptoms, slowed thinking and forgetfulness, were significantly correlated (p<0.05, corrected) with mTBI subject loadings in a surface-only ICA. These surface-only loadings captured an increase in bilateral cortical thickness.

Submitted 15 December, 2020; v1 submitted 18 November, 2020; originally announced December 2020.

Comments: To be published in Proc SPIE Int Soc Opt Eng. 2021 Feb

arXiv:2012.01981 [pdf, other]

Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery

Authors: Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Qi Qi, Zhuoning Yuan, Tianbao Yang, Shuiwang Ji

Abstract: Properties of molecules are indicative of their functions and thus are useful in many applications. With the advances of deep learning methods, computational approaches for predicting molecular properties are gaining increasing momentum. However, customized and advanced methods and comprehensive tools for this task are currently lacking. Here we develop a suite of comprehensive machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery. Specifically, we represent molecules as both graphs and sequences. Built on these representations, we develop novel deep models for learning from molecular graphs and sequences. In order to learn effectively from highly imbalanced datasets, we develop advanced loss functions that optimize areas under precision-recall curves. Altogether, our work not only serves as a comprehensive tool, but also contributes towards developing novel and advanced graph and sequence learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that our methods achieve consistent improvements over prior methods. In particular, our methods achieve #1 ranking in terms of both ROC-AUC and PRC-AUC on the AI Cures Open Challenge for drug discovery related to COVID-19. Our software is released as part of the MoleculeX library under AdvProp.
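As a rough illustration of the quantity this abstract targets, the sketch below computes average precision, a common estimate of the area under the precision-recall curve. It is an evaluation metric only, not the authors' loss function or anything from the MoleculeX library; the function name `average_precision` and the toy data are assumptions, and tied scores are not handled specially.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision values at each positive hit.

    `labels` are 0/1 ground-truth values and `scores` are model outputs
    (higher means more likely positive). Hypothetical helper, for
    illustration only.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Imbalanced toy data (assumed): 2 positives among 6 examples.
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
ap = average_precision(labels, scores)
print(ap)
```

Here the positives sit at ranks 1 and 3, so the result is (1/1 + 2/3) / 2 = 5/6. Unlike ROC-AUC, this value depends on the ranks of the positives only, so the large pool of negatives at the bottom of the ranking does not inflate the score, which is why it is often preferred on imbalanced datasets.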

Submitted 6 July, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

Comments: Supplementary Material: https://github.com/divelab/MoleculeX/blob/master/AdvProp/AdvProp_supp.pdf

arXiv:2007.09334 [pdf, other]

doi 10.1145/3394486.3403110

Deep Learning of High-Order Interactions for Protein Interface Prediction

Authors: Yi Liu, Hao Yuan, Lei Cai, Shuiwang Ji

Abstract: Protein interactions are important in a broad range of biological processes. Traditionally, computational methods have been developed to automatically predict protein interface from hand-crafted features. Recent approaches employ deep neural networks and predict the interaction of each amino acid pair independently. However, these methods do not incorporate the important sequential information from amino acid chains and the high-order pairwise interactions. Intuitively, the prediction of an amino acid pair should depend on both their features and the information of other amino acid pairs. In this work, we propose to formulate the protein interface prediction as a 2D dense prediction problem. In addition, we propose a novel deep model to incorporate the sequential information and high-order pairwise interactions to perform interface predictions. We represent proteins as graphs and employ graph neural networks to learn node features. Then we propose the sequential modeling method to incorporate the sequential information and reorder the feature matrix. Next, we incorporate high-order pairwise interactions to generate a 3D tensor containing different pairwise interactions. Finally, we employ convolutional neural networks to perform 2D dense predictions. Experimental results on multiple benchmarks demonstrate that our proposed method can consistently improve the protein interface prediction performance.

Submitted 18 July, 2020; originally announced July 2020.

Comments: 10 pages, 3 figures, 4 tables. KDD2020

arXiv:1407.7080 [pdf, other]

Ab initio Prediction of RNA Nucleotide Interactions with Backbone k-Tree Model

Authors: Liang Ding, Xingran Xue, Sal LaMarca, Mohammad Mohebbi, Abdul Samad, Russell L. Malmberg, Liming Cai

Abstract: Given the importance of non-coding RNAs to cellular regulatory functions and the rapid growth of RNA transcripts, computational prediction of RNA tertiary structure remains highly demanded yet significantly challenging. Even for a short RNA sequence, the space of tertiary conformations is immense; existing methods to identify native-like conformations mostly resort to random sampling of conformations to gain computational feasibility. However, native conformations may not be examined and prediction accuracy may be compromised due to sampling. In particular, the state-of-the-art methods have yet to deliver the desired prediction performance for RNAs of length beyond 50. This paper presents the work to tackle a key step in the RNA tertiary structure prediction problem, the prediction of the nucleotide interactions that constitute the desired tertiary structure. The research is established upon a novel graph model, called backbone k-tree, to markedly constrain nucleotide interaction relationships in RNA tertiary structure. It is shown that the new model makes it possible to efficiently predict the optimal set of nucleotide interactions from the query sequence, including the interactions in all recently revealed families. As evidenced by the preliminary results, the new method can predict with high accuracy the nucleotide interactions that constitute the tertiary structure of the query sequence, thus providing a viable solution towards ab initio prediction of RNA tertiary structure.

Submitted 25 July, 2014; originally announced July 2014.

Comments: Accepted by Computational Methods for Structural RNAs (CMSR'14)

Showing 1–7 of 7 results for author: Cai, L