Search | arXiv e-print repository

PhaGO: Protein function annotation for bacteriophages by integrating the genomic context

Authors: Jiaojiao Guan, Yongxin Ji, Cheng Peng, Wei Zou, Xubo Tang, Jiayu Shang, Yanni Sun

Abstract: Bacteriophages are viruses that target bacteria, playing a crucial role in microbial ecology. Phage proteins are important in understanding phage biology, such as virus infection, replication, and evolution. Although a large number of new phages have been identified via metagenomic sequencing, many of them have limited protein function annotation. Accurate function annotation of phage proteins presents several challenges, including their inherent diversity and the scarcity of annotated ones. Existing tools have yet to fully leverage the unique properties of phages in annotating protein functions. In this work, we propose a new protein function annotation tool for phages by leveraging the modular genomic structure of phage genomes. By employing embeddings from the latest protein foundation models and Transformer to capture contextual information between proteins in phage genomes, PhaGO surpasses state-of-the-art methods in annotating diverged proteins and proteins with uncommon functions by 6.78% and 13.05% improvement, respectively. PhaGO can annotate proteins lacking homology search results, which is critical for characterizing the rapidly accumulating phage genomes. We demonstrate the utility of PhaGO by identifying 688 potential holins in phages, which exhibit high structural conservation with known holins. The results show the potential of PhaGO to extend our understanding of newly discovered phages.

Submitted 17 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

Comments: 17 pages, 6 figures

arXiv:2406.02014 [pdf, other]

Understanding Auditory Evoked Brain Signal via Physics-informed Embedding Network with Multi-Task Transformer

Authors: Wanli Ma, Xuegang Tang, Jin Gu, Ying Wang, Yuling Xia

Abstract: In the fields of brain-computer interaction and cognitive neuroscience, effective decoding of auditory signals from task-based functional magnetic resonance imaging (fMRI) is key to understanding how the brain processes complex auditory information. Although existing methods have enhanced decoding capabilities, limitations remain in information utilization and model representation. To overcome these challenges, we propose an innovative multi-task learning model, Physics-informed Embedding Network with Multi-Task Transformer (PEMT-Net), which enhances decoding performance through physics-informed embedding and deep learning techniques. PEMT-Net consists of two principal components: feature augmentation and classification. For feature augmentation, we propose a novel approach by creating neural embedding graphs via node embedding, utilizing random walks to simulate the physical diffusion of neural information. This method captures both local and non-local information overflow and proposes a position encoding based on relative physical coordinates. In the classification segment, we propose adaptive embedding fusion to maximally capture linear and non-linear characteristics. Furthermore, we propose an innovative parameter-sharing mechanism to optimize the retention and learning of extracted features. Experiments on a specific dataset demonstrate PEMT-Net's significant performance in multi-task auditory signal decoding, surpassing existing methods and offering new insights into the brain's mechanisms for processing complex auditory information.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.11735 [pdf, other]

Accurate and efficient protein embedding using multi-teacher distillation learning

Authors: Jiayu Shang, Cheng Peng, Yongxin Ji, Jiaojiao Guan, Dehan Cai, Xubo Tang, Yanni Sun

Abstract: Motivation: Protein embedding, which represents proteins as numerical vectors, is a crucial step in various learning-based protein annotation/classification problems, including gene ontology prediction, protein-protein interaction prediction, and protein structure prediction. However, existing protein embedding methods are often computationally expensive due to their large number of parameters, which can reach millions or even billions. The growing availability of large-scale protein datasets and the need for efficient analysis tools have created a pressing demand for efficient protein embedding methods. Results: We propose a novel protein embedding approach based on multi-teacher distillation learning, which leverages the knowledge of multiple pre-trained protein embedding models to learn a compact and informative representation of proteins. Our method achieves comparable performance to state-of-the-art methods while significantly reducing computational costs and resource requirements. Specifically, our approach reduces computational time by ~70% and maintains almost the same accuracy as the original large models. This makes our method well-suited for large-scale protein analysis and enables the bioinformatics community to perform protein embedding tasks more efficiently.

Submitted 19 May, 2024; originally announced May 2024.

Comments: 3 pages; 1 figure

arXiv:2402.08703 [pdf, other]

A Survey of Generative AI for de novo Drug Design: New Frontiers in Molecule and Protein Generation

Authors: Xiangru Tang, Howard Dai, Elizabeth Knight, Fang Wu, Yunyang Li, Tianxiao Li, Mark Gerstein

Abstract: Artificial intelligence (AI)-driven methods can vastly improve the historically costly drug design process, with various generative models already in widespread use. Generative models for de novo drug design, in particular, focus on the creation of novel biological compounds entirely from scratch, representing a promising future direction. Rapid development in the field, combined with the inherent complexity of the drug design process, creates a difficult landscape for new researchers to enter. In this survey, we organize de novo drug design into two overarching themes: small molecule and protein generation. Within each theme, we identify a variety of subtasks and applications, highlighting important datasets, benchmarks, and model architectures and comparing the performance of top models. We take a broad approach to AI-driven drug design, allowing for both micro-level comparisons of various methods within each subtask and macro-level observations across different fields. We discuss parallel challenges and approaches between the two applications and highlight future directions for AI-driven de novo drug design as a whole. An organized repository of all covered sources is available at https://github.com/gersteinlab/GenAI4Drug.

Submitted 26 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2306.07618 [pdf, other]

Hyperbolic Graph Diffusion Model

Authors: Lingfeng Wen, Xuan Tang, Mingjie Ouyang, Xiangxiang Shen, Jian Yang, Daxin Zhu, Mingsong Chen, Xian Wei

Abstract: Diffusion generative models (DMs) have achieved promising results in image and graph generation. However, real-world graphs, such as social networks, molecular graphs, and traffic graphs, generally share non-Euclidean topologies and hidden hierarchies. For example, the degree distributions of graphs are mostly power-law distributions. The current latent diffusion model embeds the hierarchical data in a Euclidean space, which leads to distortions and interferes with modeling the distribution. Instead, hyperbolic space has been found to be more suitable for capturing complex hierarchical structures due to its exponential growth property. In order to simultaneously utilize the data generation capabilities of diffusion models and the ability of hyperbolic embeddings to extract latent hierarchical distributions, we propose a novel graph generation method called Hyperbolic Graph Diffusion Model (HGDM), which consists of an auto-encoder to encode nodes into successive hyperbolic embeddings, and a DM that operates in the hyperbolic latent space. HGDM captures the crucial graph structure distributions by constructing a hyperbolic potential node space that incorporates edge information. Extensive experiments show that HGDM achieves better performance in generic graph and molecule generation benchmarks, with a 48% improvement in the quality of graph generation with highly hierarchical structures.

Submitted 3 January, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: accepted by AAAI 2024

arXiv:2303.15707 [pdf, other]

PhaBOX: A web server for identifying and characterizing phage contigs in metagenomic data

Authors: Jiayu Shang, Cheng Peng, Herui Liao, Xubo Tang, Yanni Sun

Abstract: Motivation: There is accumulating evidence showing the important roles of bacteriophages (phages) in regulating the structure and functions of the microbiome. However, lacking an easy-to-use and integrated phage analysis software hampers microbiome-related research from incorporating phages in the analysis. Results: In this work, we developed a web server, PhaBOX, which can comprehensively identify and analyze phage contigs in metagenomic data. It supports integrated phage analysis, including phage contig identification from the metagenomic assembly, lifestyle prediction, taxonomic classification, and host prediction. Instead of treating the algorithms as a black box, PhaBOX also supports visualization of the essential features for making predictions. The web server is designed with a user-friendly graphical interface that enables both informatics-trained and non-specialist users to analyze phages in microbiome data with ease. Availability: The web server of PhaBOX is available via: https://phage.ee.cityu.edu.hk. The source code of PhaBOX is available at: https://github.com/KennthShang/PhaBOX Contact: yannisun@cityu.edu.hk

Submitted 27 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: 7 pages, 3 figures, 1 table

Journal ref: published on Bioinformatics Advances 2023

arXiv:2301.12422 [pdf, other]

PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer

Authors: Jiayu Shang, Cheng Peng, Xubo Tang, Yanni Sun

Abstract: Motivation: As viruses that mainly infect bacteria, phages are key players across a wide range of ecosystems. Analyzing phage proteins is indispensable for understanding phages' functions and roles in microbiomes. High-throughput sequencing enables us to obtain phages in different microbiomes with low cost. However, compared to the fast accumulation of newly identified phages, phage protein classification remains difficult. In particular, a fundamental need is to annotate virion proteins, the structural proteins such as major tail, baseplate, etc. Although there are experimental methods for virion protein identification, they are too expensive or time-consuming, leaving a large number of proteins unclassified. Thus, there is a great demand to develop a computational method for fast and accurate phage virion protein classification. Results: In this work, we adapted the state-of-the-art image classification model, Vision Transformer, to conduct virion protein classification. By encoding protein sequences into unique images using chaos game representation, we can leverage Vision Transformer to learn both local and global features from sequence "images". Our method, PhaVIP, has two main functions: classifying PVP and non-PVP sequences and annotating the types of PVP, such as capsid and tail. We tested PhaVIP on several datasets with increasing difficulty and benchmarked it against alternative tools. The experimental results show that PhaVIP has superior performance. After validating the performance of PhaVIP, we investigated two applications that can use the output of PhaVIP: phage taxonomy classification and phage host prediction. The results show the benefit of using classified proteins rather than all proteins.

Submitted 30 January, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: 15 pages, 13 figures

arXiv:2212.13026 [pdf, other]

Network analysis on cortical morphometry in first-episode schizophrenia

Authors: Mowen Yin, Weikai Huang, Zhichao Liang, Quanying Liu, Xiaoying Tang

Abstract: First-episode schizophrenia (FES) results in abnormality of brain connectivity at different levels. Despite some successful findings on functional and structural connectivity of FES, relatively few studies have been focused on morphological connectivity, which may provide a potential biomarker for FES. In this study, we aim to investigate cortical morphological connectivity in FES. T1-weighted magnetic resonance image data from 92 FES patients and 106 healthy controls (HCs) are analyzed. We parcellate the brain into 68 cortical regions, calculate the averaged thickness and surface area of each region, construct undirected networks by correlating cortical thickness or surface area measures across 68 regions for each group, and finally compute a variety of network-related topology characteristics. Our experimental results show that both the cortical thickness network and the surface area network in two groups are small-world networks; that is, those networks have high clustering coefficients and low characteristic path lengths. At certain network sparsity levels, both the cortical thickness network and the surface area network of FES have significantly lower clustering coefficients and local efficiencies than those of HC, indicating FES-related abnormalities in local connectivity and small-worldness. These abnormalities mainly involve the frontal, parietal, and temporal lobes. Further regional analyses confirm significant group differences in the node betweenness of the posterior cingulate gyrus for both the cortical thickness network and the surface area network. Our work supports that cortical morphological connectivity, which is constructed based on correlations across subjects' cortical thickness, may serve as a tool to study topological abnormalities in neurological disorders.

Submitted 26 December, 2022; originally announced December 2022.

arXiv:2208.04196 [pdf, ps, other]

Hopf Bifurcations of Reaction Networks with Zero-One Stoichiometric Coefficients

Authors: Xiaoxian Tang, Kaizhang Wang

Abstract: For the reaction networks with zero-one stoichiometric coefficients (or simply zero-one networks), we prove that if a network admits a Hopf bifurcation, then the rank of the stoichiometric matrix is at least four. As a corollary, we show that if a zero-one network admits a Hopf bifurcation, then it contains at least four species and five reactions. As applications, we show that there exist rank-four subnetworks, which have the capacity for Hopf bifurcations/oscillations, in two biologically significant networks: the MAPK cascades and the ERK network. We provide a computational tool for computing all four-species, five-reaction, zero-one networks that have the capacity for Hopf bifurcations.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 27 pages, 5 figures

arXiv:2206.09693 [pdf, other]

doi 10.1093/bib/bbac487

PhaTYP: Predicting the lifestyle for bacteriophages using BERT

Authors: Jiayu Shang, Xubo Tang, Yanni Sun

Abstract: Bacteriophages (or phages), which infect bacteria, have two distinct lifestyles: virulent and temperate. Predicting the lifestyle of phages helps decipher their interactions with their bacterial hosts, aiding phages' applications in fields such as phage therapy. Because experimental methods for annotating the lifestyle of phages cannot keep pace with the fast accumulation of sequenced phages, computational methods for predicting phages' lifestyles have become an attractive alternative. Despite some promising results, computational lifestyle prediction remains difficult because of the limited known annotations and the sheer amount of sequenced phage contigs assembled from metagenomic data. In particular, most of the existing tools cannot precisely predict phages' lifestyles for short contigs. In this work, we develop PhaTYP (Phage TYPe prediction tool) to improve the accuracy of lifestyle prediction on short contigs. We design two different training tasks, self-supervised and fine-tuning tasks, to overcome lifestyle prediction difficulties. We rigorously tested and compared PhaTYP with four state-of-the-art methods: DeePhage, PHACTS, PhagePred, and BACPHLIP. The experimental results show that PhaTYP outperforms all these methods and achieves more stable performance on short contigs. In addition, we demonstrated the utility of PhaTYP for analyzing the phage lifestyle on human neonates' gut data. This application shows that PhaTYP is a useful means for studying phages in metagenomic data and helps extend our understanding of microbial communities.

Submitted 20 June, 2022; originally announced June 2022.

Comments: 16 pages, 11 figures

Journal ref: Briefings in Bioinformatics, November 2022

arXiv:2201.04778 [pdf, other]

doi 10.1093/bib/bbac258

Accurate identification of bacteriophages from metagenomic data using Transformer

Authors: Jiayu Shang, Xubo Tang, Ruocheng Guo, Yanni Sun

Abstract: Motivation: Bacteriophages are viruses infecting bacteria. Being key players in microbial communities, they can regulate the composition/function of the microbiome by infecting their bacterial hosts and mediating gene transfer. Recently, metagenomic sequencing, which can sequence all genetic materials from various microbiomes, has become a popular means for new phage discovery. However, accurate and comprehensive detection of phages from the metagenomic data remains difficult. High diversity/abundance and limited reference genomes pose major challenges for recruiting phage fragments from metagenomic data. Existing alignment-based or learning-based models have either low recall or precision on metagenomic data. Results: In this work, we adopt the state-of-the-art language model, Transformer, to conduct contextual embedding for phage contigs. By constructing a protein-cluster vocabulary, we can feed both the protein composition and the proteins' positions from each contig into the Transformer. The Transformer can learn the protein organization and associations using the self-attention mechanism and predicts the label for test contigs. We rigorously tested our developed tool named PhaMer on multiple datasets with increasing difficulty, including quality RefSeq genomes, short contigs, simulated metagenomic data, mock metagenomic data, and the public IMG/VR dataset. All the experimental results show that PhaMer outperforms the state-of-the-art tools. In the real metagenomic data experiment, PhaMer improves the F1-score of phage detection by 27%.

Submitted 11 August, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: 15 pages, 11 figures

Journal ref: Briefings in Bioinformatics, Volume 23, Issue 4, July 2022, bbac258

arXiv:2201.04663 [pdf, other]

Novel Symmetry-preserving Neural Network Model for Phylogenetic Inference

Authors: Xudong Tang, Leonardo Zepeda-Nunez, Shengwen Yang, Zelin Zhao, Claudia Solis-Lemus

Abstract: Scientists world-wide are putting together massive efforts to understand how the biodiversity that we see on Earth evolved from single-cell organisms at the origin of life; this diversification process is represented through the Tree of Life. Low sampling rates and high heterogeneity in the rate of evolution across sites and lineages produce a phenomenon denoted "long branch attraction" (LBA) in which long non-sister lineages are estimated to be sisters regardless of their true evolutionary relationship. LBA has been a pervasive problem in phylogenetic inference affecting different types of methodologies from distance-based to likelihood-based. Here, we present a novel neural network model that outperforms standard phylogenetic methods and other neural network implementations under LBA settings. Furthermore, unlike existing neural network models, our model naturally accounts for the tree isomorphisms via permutation-invariant functions, which ultimately results in lower memory use and allows seamless extension to larger trees.

Submitted 20 December, 2023; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: 15 pages, 6 figures

arXiv:2110.05231 [pdf, other]

Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types

Authors: Shentong Mo, Xi Fu, Chenyang Hong, Yizhen Chen, Yuxuan Zheng, Xiangru Tang, Zhiqiang Shen, Eric P Xing, Yanyan Lan

Abstract: In genome biology research, regulatory genome modeling is an important topic for many regulatory downstream tasks, such as promoter classification and transcription factor binding site prediction. The core problem is to model how regulatory elements interact with each other and their variability across different cell types. However, current deep learning methods often focus on modeling genome sequences of a fixed set of cell types and do not account for the interaction between multiple regulatory elements, so they perform well only on the cell types in the training set and lack the generalizability required in biological applications. In this work, we propose a simple yet effective approach for pre-training genome data in a multi-modal and self-supervised manner, which we call GeneBERT. Specifically, we simultaneously take the 1d sequence of genome data and a 2d matrix of (transcription factors x regions) as the input, where three pre-training tasks are proposed to improve the robustness and generalizability of our model. We pre-train our model on the ATAC-seq dataset with 17 million genome sequences. We evaluate our GeneBERT on regulatory downstream tasks across different cell types, including promoter classification, transcription factor binding site prediction, disease risk estimation, and splicing site prediction. Extensive experiments demonstrate the effectiveness of multi-modal and self-supervised pre-training for large-scale regulatory genomics data.

Submitted 3 November, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

arXiv:2109.03038 [pdf]

Characterizing interdisciplinarity in drug research: a translational science perspective

Authors: Xin Li, Xuli Tang

Abstract: Despite the significant advances in life science, it still takes decades to translate a basic drug discovery into a cure for human disease. To accelerate the process from bench to bedside, interdisciplinary research (especially research involving both basic research and clinical research) has been strongly recommended by many previous studies. However, the patterns and the roles of the interdisciplinary characteristics in drug research have not been deeply examined in extant studies. The purpose of this study was to characterize interdisciplinary characteristics in drug research from the perspective of translational science, and to examine the role of different kinds of interdisciplinary characteristics in translational research for drugs.

Submitted 4 September, 2021; originally announced September 2021.

Journal ref: Journal of Informetrics 2021

arXiv:2108.09695 [pdf, ps, other]

Multistationarity of Reaction Networks with One-Dimensional Stoichiometric Subspaces

Authors: Kexin Lin, Xiaoxian Tang, Zhishuo Zhang

Abstract: We study the multistationarity for the reaction networks with one-dimensional stoichiometric subspaces, and we focus on the networks admitting finitely many positive steady states. We prove that if a network admits multistationarity, then the network has an embedded one-species network with arrow diagram (->,<-) and another with arrow diagram (<-,->). The converse is also true if there exist two reactions in the network such that the subnetwork consisting of the two reactions admits at least one and finitely many positive steady states. We also prove that if a network admits at least three positive steady states, then it contains at least three bi-arrow diagrams. More than that, we completely characterize the bi-reaction networks that admit at least three positive steady states.

Submitted 22 August, 2021; originally announced August 2021.

Comments: 31 pages

MSC Class: 37N25

arXiv:2012.15437 [pdf, ps, other]

Multistability of Reaction Networks with One-Dimensional Stoichiometric Subspaces

Authors: Xiaoxian Tang, Zhishuo Zhang

Abstract: For the reaction networks with one-dimensional stoichiometric subspaces, we show the following results. (1) If the maximum number of positive steady states is an even number N, then the maximum number of stable positive steady states is N/2. (2) If the maximum number of positive steady states is an odd number N, then we provide a condition on the network such that the maximum number of stable positive steady states is (N-1)/2 if this condition is satisfied, and this maximum number is (N+1)/2 otherwise.

Submitted 19 February, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

Comments: 26 pages, 1 figure

arXiv:2008.03846 [pdf, ps, other]

Multistability of Small Reaction Networks

Authors: Xiaoxian Tang, Hao Xu

Abstract: For three typical sets of small reaction networks (networks with two reactions, one irreversible and one reversible reaction, or two reversible-reaction pairs), we completely answer the challenging question: what is the smallest subset of all multistable networks such that any multistable network outside of the subset contains either more species or more reactants than any network in this subset?

Submitted 28 January, 2021; v1 submitted 9 August, 2020; originally announced August 2020.

Comments: 23 pages, 5 tables

arXiv:1910.14452 [pdf, other]

Dynamics of ERK regulation in the processive limit

Authors: Carsten Conradi, Nida Obatake, Anne Shiu, Xiaoxian Tang

Abstract: We consider a model of extracellular signal-regulated kinase (ERK) regulation by dual-site phosphorylation and dephosphorylation, which exhibits bistability and oscillations, but loses these properties in the limit in which the mechanisms underlying phosphorylation and dephosphorylation become processive. Our results suggest that anywhere along the way to becoming processive, the model remains bistable and oscillatory. More precisely, in simplified versions of the model, precursors to bistability and oscillations (specifically, multistationarity and Hopf bifurcations, respectively) exist at all "processivity levels". Finally, we investigate whether bistability and oscillations can exist together.

Submitted 3 September, 2020; v1 submitted 31 October, 2019; originally announced October 2019.

Comments: 22 pages, 2 figures, 3 tables, builds on arXiv:1903.02617

arXiv:1910.13632 [pdf]

doi 10.1214/19-AOAS1249

RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

Authors: Gaoxiang Jia, Xinlei Wang, Qiwei Li, Wei Lu, Ximing Tang, Ignacio Wistuba, Yang Xie

Abstract: Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNAs. The NanoString nCounter platform is well suited for profiling of FFPE samples and measures gene expression with high sensitivity, which may greatly facilitate realization of the scientific and clinical value of FFPE samples. However, methodological development for normalization, a critical step when analyzing this type of data, is far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on an overly simplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not to mention that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture main patterns and characteristics observed from NanoString data of FFPE samples and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption on housekeeping genes and offers great interpretability. Furthermore, it is applicable to freshly frozen or similar samples that can be generally viewed as a reduced case of FFPE samples. Simulation and applications showed the superior performance of RCRnorm.

Submitted 28 October, 2019; originally announced October 2019.

MSC Class: 97K80

Journal ref: Ann. Appl. Stat. 13 (2019), no. 3, 1617--1647. https://projecteuclid.org/euclid.aoas/1571277766

arXiv:1903.02617 [pdf, other]

Oscillations and bistability in a model of ERK regulation

Authors: Nida Obatake, Anne Shiu, Xiaoxian Tang, Angelica Torres

Abstract: This work concerns the question of how two important dynamical properties, oscillations and bistability, emerge in an important biological signaling network. Specifically, we consider a model for dual-site phosphorylation and dephosphorylation of extracellular signal-regulated kinase (ERK). We prove that oscillations persist even as the model is greatly simplified (reactions are made irreversible and intermediates are removed). Bistability, however, is much less robust -- this property is lost when intermediates are removed or even when all reactions are made irreversible. Moreover, bistability is characterized by the presence of two reversible, catalytic reactions: as other reactions are made irreversible, bistability persists as long as one or both of the specified reactions is preserved. Finally, we investigate the maximum number of steady states, aided by a network's "mixed volume" (a concept from convex geometry). Taken together, our results shed light on the question of how oscillations and bistability emerge from a limiting network of the ERK network -- namely, the fully processive dual-site network -- which is known to be globally stable and therefore lacks both oscillations and bistability. Our proofs are enabled by a Hopf bifurcation criterion due to Yang, analyses of Newton polytopes arising from Hurwitz determinants, and recent characterizations of multistationarity for networks having a steady-state parametrization.

Submitted 6 March, 2019; originally announced March 2019.

Comments: 33 pages, 4 figures, 4 tables, 3 appendices

arXiv:1810.05574 [pdf, other]

Multistationarity in Structured Reaction Networks

Authors: Alicia Dickenstein, Mercedes Perez Millan, Anne Shiu, Xiaoxian Tang

Abstract: Many dynamical systems arising in biology and other areas exhibit multistationarity (two or more positive steady states with the same conserved quantities). Although deciding multistationarity for a polynomial dynamical system is an effective question in real algebraic geometry, it is in general difficult to determine whether a given network can give rise to a multistationary system, and if so, to identify witnesses to multistationarity, that is, specific parameter values for which the system exhibits multiple steady states. Here we investigate both problems. First, we build on work of Conradi, Feliu, Mincheva, and Wiuf, who showed that for certain reaction networks whose steady states admit a positive parametrization, multistationarity is characterized by whether a certain "critical function" changes sign. Here, we allow for more general parametrizations, which make it much easier to determine the existence of a sign change. This is particularly simple when the steady-state equations are linearly equivalent to binomials; we give necessary conditions for this to happen, which hold for many networks studied in the literature. We also give a sufficient condition for multistationarity of networks whose steady-state equations can be replaced by equivalent triangular-form equations. Finally, we present methods for finding witnesses to multistationarity, which we show work well for certain structured reaction networks, including those common to biological signaling pathways. Our work relies on results from degree theory, on the existence of explicit rational parametrizations of the steady states, and on the specialization of Groebner bases.

Submitted 7 February, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

Comments: 44 pages, 2 figures, 1 table

arXiv:1703.04238 [pdf, other]

Increasing Trends of Guillain-Barré Syndrome (GBS) and Dengue in Hong Kong

Authors: Xiujuan Tang, Shi Zhao, Alice P. Y. Chiu, Xin Wang, Lin Yang, Daihai He

Abstract: Background: Guillain-Barré Syndrome (GBS) is a common type of severe acute paralytic neuropathy and is associated with other virus infections such as dengue fever and Zika. This study investigates the relationship between GBS, dengue, local meteorological factors in Hong Kong and global climatic factors from January 2000 to June 2016. Methods: The correlations between GBS, dengue, the Multivariate El Niño Southern Oscillation Index (MEI) and local meteorological data were explored by the Spearman rank correlations and cross-correlations between these time series. Poisson regression models were fitted to identify nonlinear associations between MEI and dengue. Cross wavelet analysis was applied to infer potential non-stationary oscillating associations among MEI, dengue and GBS. Findings: An increasing trend was found for both GBS cases and imported dengue cases in Hong Kong. We found a weak but statistically significant negative correlation between GBS and local meteorological factors. MEI explained over 12% of dengue's variations from Poisson regression models. Wavelet analyses showed that there is a possible non-stationary oscillating association between dengue and GBS from 2005 to 2015 in Hong Kong. Our study has led to an improved understanding of the timing and relationship between GBS, dengue and MEI.

Submitted 13 March, 2017; originally announced March 2017.

Comments: 11 pages, 6 figures

arXiv:1510.08797 [pdf, ps, other]

doi 10.1137/16M1079841

Convexity in Tree Spaces

Authors: Bo Lin, Bernd Sturmfels, Xiaoxian Tang, Ruriko Yoshida

Abstract: We study the geometry of metrics and convexity structures on the space of phylogenetic trees, which is here realized as the tropical linear space of all ultrametrics. The ${\rm CAT}(0)$-metric of Billera-Holmes-Vogtman arises from the theory of orthant spaces. While its geodesics can be computed by the Owen-Provan algorithm, geodesic triangles are complicated. We show that the dimension of such a triangle can be arbitrarily high. Tropical convexity and the tropical metric behave better. They exhibit properties desirable for geometric statistics, such as geodesics of small depth.

Submitted 14 June, 2016; v1 submitted 29 October, 2015; originally announced October 2015.

Comments: 21 pages, 5 figures; Theorem 13 is now proved in all dimensions

Journal ref: SIAM Journal on Discrete Mathematics 31 (2017) 2015-2038

arXiv:1407.3990 [pdf, other]

doi 10.1007/s00332-015-9252-y

Stability of twisted states in the Kuramoto model on Cayley and random graphs

Authors: Georgi S. Medvedev, Xuezhi Tang

Abstract: The Kuramoto model (KM) of coupled phase oscillators on complete, Paley, and Erdos-Renyi (ER) graphs is analyzed in this work. As quasirandom graphs, the complete, Paley, and ER graphs share many structural properties. For instance, they exhibit the same asymptotics of the edge distributions, homomorphism densities, graph spectra, and have constant graph limits. Nonetheless, we show that the asymptotic behavior of solutions in the KM on these graphs can be qualitatively different. Specifically, we identify twisted states, steady state solutions of the KM on complete and Paley graphs, which are stable for one family of graphs but not for the other. On the other hand, we show that the solutions of the IVPs for the KM on complete and random graphs remain close on finite time intervals, provided they start from close initial conditions and the graphs are sufficiently large. Therefore, the results of this paper elucidate the relation between the network structure and dynamics in coupled nonlinear dynamical systems. Furthermore, we present new results on synchronization and stability of twisted states for the KM on Cayley and random graphs.

Submitted 10 May, 2015; v1 submitted 15 July, 2014; originally announced July 2014.

Comments: Journal of Nonlinear Science, 2015

MSC Class: 34C15; 45J05; 45L05; 05C90

arXiv:q-bio/0506012 [pdf, ps, other]

doi 10.1103/PhysRevE.72.041912

The prion-like folding behavior in aggregated proteins

Authors: Yong-Yun Ji, You-Quan Li, Jun-Wen Mao, Xiao-Wei Tang

Abstract: We investigate the folding behavior of protein sequences by numerically studying all sequences with a maximally compact lattice model through exhaustive enumeration. We obtain prion-like behavior of protein folding. Individual proteins remaining stable in the isolated native state may change their conformations when they aggregate. We observe the folding properties as the interfacial interaction strength changes, and find that the strength must be strong enough before the propagation of the most stable structures happens.

Submitted 9 June, 2005; originally announced June 2005.

Comments: 7 pages, 6 figures

Journal ref: Physical Review E 72, 041912 (2005), Virtual Journal of Biological Physics Research(October 15, 2005)

arXiv:q-bio/0408024 [pdf, ps, other]

doi 10.1103/PhysRevE.72.021904

Medium effects on the selection of sequences folding into stable proteins in a simple model

Authors: You-Quan Li, Yong-Yun Ji, Jun-Wen Mao, Xiao-Wei Tang

Abstract: We study the medium effects on the selection of sequences in protein folding by taking account of the surface potential in the HP model. Our analysis of the proportion of H and P monomers in the sequences gives a direct interpretation that the lowly designable structures possess a small average gap. The numerical calculation by means of our model shows that the surface potential enhances the average gap of highly designable structures. It also shows that a most stable structure may no longer be the most stable one if the medium parameters change.

Submitted 27 August, 2004; originally announced August 2004.

Comments: 4 pages, 4 figures

Journal ref: Phys. Rev. E 72, 021904 (2005)

arXiv:cond-mat/0309674 [pdf, ps, other]

doi 10.1103/PhysRevE.69.051907

Polymer Induced Bundling of F-actin and the Depletion Force

Authors: M. Hosek, J. X. Tang

Abstract: The inert polymer polyethylene glycol (PEG) induces a "bundling" phenomenon in F-actin solutions when its concentration exceeds a critical onset value C_o. Over a limited range of PEG molecular weight and ionic strength, C_o can be expressed as a function of these two variables. The process is reversible, but hysteresis is also observed in the dissolution of the bundles, with ionic strength having a large influence. Additional actin filaments are able to join previously formed bundles. Little, if any, polymer is associated with the bundle structure. Continuum estimates of the Asakura-Oosawa depletion force, Coulomb repulsion, and van der Waals potential are combined for a partial explanation of the bundling effect and hysteresis. Conjectures are presented concerning the apparent limit in bundle size.

Submitted 14 January, 2004; v1 submitted 29 September, 2003; originally announced September 2003.

Report number: iucm03-006

Showing 1–27 of 27 results for author: Tang, X