Search | arXiv e-print repository

Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View

Authors: Jianan Fan, Dongnan Liu, Canran Li, Hang Chang, Heng Huang, Filip Braet, Mei Chen, Weidong Cai

Abstract: Cellular nuclei recognition serves as a fundamental and essential step in the workflow of digital pathology. However, with disparate source organs and staining procedures among histology image clusters, the scanned tiles inherently conform to a non-uniform data distribution, which induces deteriorated promises for general cross-cohort usages. Despite the latest efforts leveraging domain adaptation to mitigate distributional discrepancy, those methods are subjected to modeling the morphological characteristics of each cell individually, disregarding the hierarchical latent structure and intrinsic contextual correspondences across the tumor micro-environment. In this work, we identify the importance of implicit correspondences across biological contexts for exploiting domain-invariant pathological composition and thereby propose to exploit the dependence over various biological structures for domain adaptive cellular recognition. We discover those high-level correspondences via unsupervised contextual modeling and use them as bridges to facilitate adaptation over diverse organs and stains. In addition, to further exploit the rich spatial contexts embedded amongst nuclear communities, we propose self-adaptive dynamic distillation to secure instance-aware trade-offs across different model constituents. The proposed method is extensively evaluated on a broad spectrum of cross-domain settings under miscellaneous data distribution shifts and outperforms the state-of-the-art methods by a substantial margin. Code is available at https://github.com/camwew/CellularRecognition_DA_CC.

Submitted 19 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

Comments: ECCV 2024 main conference

arXiv:2406.01636 [pdf]

COVID-19: post infection implications in different age groups, mechanism, diagnosis, effective prevention, treatment, and recommendations

Authors: Muhammad Akmal Raheem, Muhammad Ajwad Rahim, Ijaz Gul, Md. Reyad-ul-Ferdous, Liyan Le, Junguo Hui, Shuiwei Xia, Minjiang Chen, Dongmei Yu, Vijay Pandey, Peiwu Qin, Jiansong Ji

Abstract: SARS-CoV-2, the highly contagious pathogen responsible for the COVID-19 pandemic, has persistent effects that begin four weeks after initial infection and last for an undetermined duration. These chronic effects are more harmful than acute ones. This review explores the long-term impact of the virus on various human organs, including the pulmonary, cardiovascular, neurological, reproductive, gastrointestinal, musculoskeletal, endocrine, and lymphoid systems, particularly in older adults. Regarding diagnosis, RT-PCR is the gold standard for detecting COVID-19, though it requires specialized equipment, skilled personnel, and considerable time to produce results. To address these limitations, artificial intelligence in imaging and microfluidics technologies offers promising alternatives for diagnosing COVID-19 efficiently. Pharmacological and non-pharmacological strategies are effective in mitigating the persistent impacts of COVID-19. These strategies enhance immunity in post-COVID-19 patients by reducing cytokine release syndrome, improving T cell response, and increasing the circulation of activated natural killer and CD8 T cells in blood and tissues. This, in turn, alleviates symptoms such as fever, nausea, fatigue, muscle weakness, and pain. Vaccines, including inactivated viral, live attenuated viral, protein subunit, viral vectored, mRNA, DNA, and nanoparticle vaccines, significantly reduce the adverse long-term effects of the virus. However, no vaccine has been reported to provide lifetime protection against COVID-19. Consequently, protective measures such as physical distancing, mask usage, and hand hygiene remain essential strategies. This review offers a comprehensive understanding of the persistent effects of COVID-19 on individuals of varying ages, along with insights into diagnosis, treatment, vaccination, and future preventative measures against the spread of SARS-CoV-2.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2310.01768 [pdf, other]

Backdiff: a diffusion model for generalized transferable protein backmapping

Authors: Yikai Liu, Ming Chen, Guang Lin

Abstract: Coarse-grained (CG) models play a crucial role in the study of protein structures, protein thermodynamic properties, and protein conformation dynamics. Due to the information loss in the coarse-graining process, backmapping from CG to all-atom configurations is essential in many protein design and drug discovery applications when detailed atomic representations are needed for in-depth studies. Despite recent progress in data-driven backmapping approaches, devising a backmapping method that can be universally applied across various CG models and proteins remains unresolved. In this work, we propose BackDiff, a new generative model designed to achieve generalization and reliability in the protein backmapping problem. BackDiff leverages the conditional score-based diffusion model with geometric representations. Since different CG models can contain different coarse-grained sites which include selected atoms (CG atoms) and simple CG auxiliary functions of atomistic coordinates (CG auxiliary variables), we design a self-supervised training framework to adapt to different CG atoms, and constrain the diffusion sampling paths with arbitrary CG auxiliary variables as conditions. Our method facilitates end-to-end training and allows efficient sampling across different proteins and diverse CG models without the need for retraining. Comprehensive experiments over multiple popular CG models demonstrate BackDiff's superior performance to existing state-of-the-art approaches, and generalization and flexibility that these approaches cannot achieve. A pretrained BackDiff model can offer a convenient yet reliable plug-and-play solution for protein researchers, enabling them to investigate further from their own CG models.

Submitted 28 November, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 22 pages, 5 figures

arXiv:2306.07618 [pdf, other]

Hyperbolic Graph Diffusion Model

Authors: Lingfeng Wen, Xuan Tang, Mingjie Ouyang, Xiangxiang Shen, Jian Yang, Daxin Zhu, Mingsong Chen, Xian Wei

Abstract: Diffusion generative models (DMs) have achieved promising results in image and graph generation. However, real-world graphs, such as social networks, molecular graphs, and traffic graphs, generally share non-Euclidean topologies and hidden hierarchies. For example, the degree distributions of graphs are mostly power-law distributions. The current latent diffusion model embeds the hierarchical data in a Euclidean space, which leads to distortions and interferes with modeling the distribution. Instead, hyperbolic space has been found to be more suitable for capturing complex hierarchical structures due to its exponential growth property. In order to simultaneously utilize the data generation capabilities of diffusion models and the ability of hyperbolic embeddings to extract latent hierarchical distributions, we propose a novel graph generation method called Hyperbolic Graph Diffusion Model (HGDM), which consists of an auto-encoder to encode nodes into successive hyperbolic embeddings, and a DM that operates in the hyperbolic latent space. HGDM captures the crucial graph structure distributions by constructing a hyperbolic potential node space that incorporates edge information. Extensive experiments show that HGDM achieves better performance in generic graph and molecule generation benchmarks, with a 48% improvement in the quality of graph generation with highly hierarchical structures.

Submitted 3 January, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: accepted by AAAI 2024

arXiv:2305.15590 [pdf]

Deep Representation Learning of Tissue Metabolome and Computed Tomography Images Annotates Non-invasive Classification and Prognosis Prediction of NSCLC

Authors: Marc Boubnovski Martell, Kristofer Linton-Reid, Sumeet Hindocha, Mitchell Chen, OCTAPUS-AI, Paula Moreno, Marina Álvarez-Benito, Ángel Salvatierra, Richard Lee, Joram M. Posma, Marco A Calzado, Eric O Aboagye

Abstract: The rich chemical information from tissue metabolomics provides a powerful means to elaborate tissue physiology or tumor characteristics at cellular and tumor microenvironment levels. However, the process of obtaining such information requires invasive biopsies, is costly, and can delay clinical patient management. Conversely, computed tomography (CT) is a clinical standard of care but does not intuitively harbor histological or prognostic information. Furthermore, the ability to embed metabolome information into CT to subsequently use the learned representation for classification or prognosis has yet to be described. This study develops a deep learning-based framework, tissue-metabolomic-radiomic-CT (TMR-CT), by combining 48 paired CT images and tumor/normal tissue metabolite intensities to generate ten image embeddings to infer metabolite-derived representation from CT alone. In clinical NSCLC settings, we ascertain whether TMR-CT achieves state-of-the-art results in solving histology classification/prognosis tasks in an unseen international CT dataset of 742 patients. TMR-CT non-invasively determines histological classes (adenocarcinoma/squamous cell carcinoma) with an F1-score=0.78 and further asserts patients' prognosis with a c-index=0.72, surpassing the performance of radiomics models and clinical features. Additionally, our work shows the potential to generate informative biology-inspired CT-led features to explore connections between hard-to-obtain tissue metabolic profiles and routine lesion-derived image data.

Submitted 26 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.10373 [pdf, other]

Functional Connectivity: Continuous-Time Latent Factor Models for Neural Spike Trains

Authors: Meixi Chen, Martin Lysy, David Moorman, Reza Ramezan

Abstract: Modelling the dynamics of interactions in a neuronal ensemble is an important problem in functional connectivity research. One popular framework is latent factor models (LFMs), which have achieved notable success in decoding neuronal population dynamics. However, most LFMs are specified in discrete time, where the choice of bin size significantly impacts inference results. In this work, we present what is, to the best of our knowledge, the first continuous-time multivariate spike train LFM for studying neuronal interactions and functional connectivity. We present an efficient parameter inference algorithm for our biologically justifiable model which (1) scales linearly in the number of simultaneously recorded neurons and (2) bypasses time binning and related issues. Simulation studies show that parameter estimation using the proposed model is highly accurate. Applying our LFM to experimental data from a classical conditioning study on the prefrontal cortex in rats, we found that coordinated neuronal activities are affected by (1) the onset of the cue for reward delivery, and (2) the sub-region within the frontal cortex (OFC/mPFC). These findings shed new light on our understanding of cue and outcome value encoding.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2303.18241 [pdf]

Improving Resolution and Resolvability of Single Particle CryoEM using Gaussian Mixture Models

Authors: Muyuan Chen, Michael F. Schmid, Wah Chiu

Abstract: Cryogenic electron microscopy is widely used in structural biology, but its resolution is often limited by the dynamics of the macromolecule. Here, we developed a refinement protocol based on Gaussian mixture models that integrates particle orientation and conformation estimation, and improves the alignment for flexible domains of protein structures. We demonstrated this protocol on multiple datasets, resulting in improved resolution and resolvability, locally and globally, by visual and quantitative measures.

Submitted 29 August, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: 36 pages, 11 figures

arXiv:2302.10907 [pdf]

Diffusion Models in Bioinformatics: A New Wave of Deep Learning Revolution in Action

Authors: Zhiye Guo, Jian Liu, Yanli Wang, Mengrui Chen, Duolin Wang, Dong Xu, Jianlin Cheng

Abstract: Denoising diffusion models have emerged as one of the most powerful generative models in recent years. They have achieved remarkable success in many fields, such as computer vision, natural language processing (NLP), and bioinformatics. Although there are a few excellent reviews on diffusion models and their applications in computer vision and NLP, there is a lack of an overview of their applications in bioinformatics. This review aims to provide a rather thorough overview of the applications of diffusion models in bioinformatics to aid their further development in bioinformatics and computational biology. We start with an introduction of the key concepts and theoretical foundations of three cornerstone diffusion modeling frameworks (denoising diffusion probabilistic models, noise-conditioned scoring networks, and stochastic differential equations), followed by a comprehensive description of diffusion models employed in the different domains of bioinformatics, including cryo-EM data enhancement, single-cell data analysis, protein design and generation, drug and small molecule design, and protein-ligand interaction. The review is concluded with a summary of the potential new development and applications of diffusion models in bioinformatics.

Submitted 13 February, 2023; originally announced February 2023.

Comments: 16 pages, 2 figures, 2 tables, first version of the manuscript

ACM Class: I.2.1; J.3

arXiv:2302.10599 [pdf]

The construction of ceRNAs network reveals the prognostic characteristics of prostate cancer

Authors: Mengjie Chen

Abstract: The dysregulation of transcripts is characterized as one of the main mechanisms in tumor pathogenesis. Recent discoveries have led to a new hypothesis, competitive endogenous RNAs (ceRNAs), which can regulate other RNA transcripts by competing for their shared miRNAs. The interactions of elements in the ceRNA network are involved in a wide range of biological reactions and facilitate cancer progression. In this study, we performed a comprehensive investigation of the regulatory mechanisms and functional roles of ceRNAs in prostate cancer (PCa) and constructed a ceRNA network that could have potential value in patient prognosis and be evaluated as a source of therapeutic targets for PCa.

Submitted 21 February, 2023; originally announced February 2023.

Comments: in Chinese language

arXiv:2212.10784 [pdf, other]

Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

Authors: Jiashu Xu, Mingyu Derek Ma, Muhao Chen

Abstract: Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often result in poor generalization in low-resource settings and cannot make selective predictions on unknown cases, instead guessing from seen relations, which hinders the applicability of those approaches. We present NBR, which converts biomedical RE into a natural language inference formulation through indirect supervision. By converting relations to natural language hypotheses, NBR is capable of exploiting semantic cues to alleviate annotation scarcity. By incorporating a ranking-based loss that implicitly calibrates abstinent instances, NBR learns a clearer decision boundary and is instructed to abstain on uncertain instances. Extensive experiments on three widely-used biomedical RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in both full-set and low-resource regimes. Our analysis demonstrates that indirect supervision benefits biomedical RE even when a domain gap exists, and combining NLI knowledge with biomedical knowledge leads to the best performance gains.

Submitted 19 October, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: 16 pages; ACL 2023; code in https://github.com/luka-group/NLI_as_Indirect_Supervision

arXiv:2211.10926 [pdf, other]

Unraveling implicit human behavioral effects on dynamic characteristics of Covid-19 daily infection rates in Taiwan

Authors: Ting-Li Chen, Elizabeth P. Chou, Min-Yi Chen, Hsieh Fushing

Abstract: We study Covid-19 spreading dynamics underlying 84 curves of daily Covid-19 infection rates pertaining to 84 districts belonging to the largest seven cities in Taiwan during her pristine surge period. Our computational developments begin with selecting and extracting 18 features from each smoothed district-specific curve. This step of computing effort allows unstructured data to be converted into structured data, with which we then demonstrate asymmetric growth and decline dynamics among all involved curves. Specifically, based on Theoretical Information measurements of conditional entropy and mutual information, we compute major factors of order-1 and order-2 that reveal significant effects on the curves' peak value and curvature at peak, which are two essential features characterizing all the curves. Further, we investigate and demonstrate major factors determining the geographic and social-economic induced behavioral effects by encoding each of these 84 districts with two binary characteristics: North-vs-South and Urban-vs-suburban. Furthermore, based on this data-driven knowledge on the district scale, we go on to study fine-scale behavioral effects on infectious disease spreading through similarity among 96 age-group-specific curves of daily infection rate within 12 urban districts of Taipei and 12 suburban districts of New Taipei City, which together account for almost one-quarter of the island nation's total population. We conclude that human living, traveling, and working behaviors do implicitly affect the spreading dynamics of Covid-19 across Taiwan profoundly.

Submitted 20 November, 2022; originally announced November 2022.

arXiv:2211.10518 [pdf]

Integrating molecular models into CryoEM heterogeneity analysis using scalable high-resolution deep Gaussian mixture models

Authors: Muyuan Chen, Bogdan Toader, Roy Lederman

Abstract: Resolving the structural variability of proteins is often key to understanding the structure-function relationship of those macromolecular machines. Single particle analysis using Cryogenic electron microscopy (CryoEM), combined with machine learning algorithms, provides a way to reveal the dynamics within the protein system from noisy micrographs. Here, we introduce an improved computational method that uses Gaussian mixture models for protein structure representation and deep neural networks for conformation space embedding. By integrating information from molecular models into the heterogeneity analysis, we can resolve complex protein conformational changes at near atomic resolution and present the results in a more interpretable form.

Submitted 18 November, 2022; originally announced November 2022.

arXiv:2210.05880 [pdf]

Pathology Steered Stratification Network for Subtype Identification in Alzheimer's Disease

Authors: Enze Xu, Jingwen Zhang, Jiadi Li, Qianqian Song, Defu Yang, Guorong Wu, Minghan Chen

Abstract: Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder characterized by beta-amyloid, pathologic tau, and neurodegeneration. There are no effective treatments for Alzheimer's disease at a late stage, urging for early intervention. However, existing statistical inference approaches of AD subtype identification ignore the pathological domain knowledge, which could lead to ill-posed results that are sometimes inconsistent with the essential neurological principles. Integrating systems biology modeling with machine learning, we propose a novel pathology steered stratification network (PSSN) that incorporates established domain knowledge in AD pathology through a reaction-diffusion model, where we consider non-linear interactions between major biomarkers and diffusion along brain structural network. Trained on longitudinal multimodal neuroimaging data, the biological model predicts long-term trajectories that capture individual progression pattern, filling in the gaps between sparse imaging data available. A deep predictive neural network is then built to exploit spatiotemporal dynamics, link neurological examinations with clinical profiles, and generate subtype assignment probability on an individual basis. We further identify an evolutionary disease graph to quantify subtype transition probabilities through extensive simulations. Our stratification achieves superior performance in both inter-cluster heterogeneity and intra-cluster homogeneity of various clinical scores. Applying our approach to enriched samples of aging populations, we identify six subtypes spanning AD spectrum, where each subtype exhibits a distinctive biomarker pattern that is consistent with its clinical outcome. PSSN provides insights into pre-symptomatic diagnosis and practical guidance on clinical treatments, which may be further generalized to other neurodegenerative diseases.

Submitted 25 August, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2206.12240 [pdf, other]

PSP: Million-level Protein Sequence Dataset for Protein Structure Prediction

Authors: Sirui Liu, Jun Zhang, Haotian Chu, Min Wang, Boxin Xue, Ningxi Ni, Jialiang Yu, Yuhao Xie, Zhenyu Chen, Mengyun Chen, Yuan Liu, Piya Patra, Fan Xu, Jie Chen, Zidong Wang, Lijiang Yang, Fan Yu, Lei Chen, Yi Qin Gao

Abstract: Proteins are essential components of human life, and their structures are important for function and mechanism analysis. Recent work has shown the potential of AI-driven methods for protein structure prediction. However, the development of new models is restricted by the lack of datasets and benchmark training procedures. To the best of our knowledge, the existing open source datasets fall far short of the needs of modern protein sequence-structure related research. To solve this problem, we present the first million-level protein structure prediction dataset with high coverage and diversity, named PSP. This dataset consists of 570k true structure sequences (10TB) and 745k complementary distillation sequences (15TB). We additionally provide the benchmark training procedure for a SOTA protein structure prediction model on this dataset. We validate the utility of this dataset for training by participating in the CAMEO contest, in which our model won first place. We hope our PSP dataset together with the training benchmark can enable a broader community of AI/biology researchers for AI-driven protein related research.

Submitted 24 June, 2022; originally announced June 2022.

arXiv:2206.09903 [pdf, other]

A Multivariate Point Process Model for Simultaneously Recorded Neural Spike Trains

Authors: Reza Ramezan, Meixi Chen, Martin Lysy, Paul Marriott

Abstract: The current state-of-the-art in neurophysiological data collection allows for simultaneous recording of tens to hundreds of neurons, for which point processes are an appropriate statistical modelling framework. However, existing point process models lack multivariate generalizations which are both flexible and computationally tractable. This paper introduces a multivariate generalization of the Skellam process with resetting (SPR), a point process tailored to model individual neural spike trains. The multivariate SPR (MSPR) is biologically justified as it mimics the process of neural integration. Its flexible dependence structure and a fast parameter estimation method make it well-suited for the analysis of simultaneously recorded spike trains from multiple neurons. The strengths and weaknesses of the MSPR are demonstrated through simulation and analysis of experimental data.

Submitted 20 June, 2022; originally announced June 2022.

Comments: 6 pages, 1 figure

arXiv:2204.09893 [pdf, other]

doi 10.3389/fnins.2022.945037

MAP-SNN: Mapping Spike Activities with Multiplicity, Adaptability, and Plasticity into Bio-Plausible Spiking Neural Networks

Authors: Chengting Yu, Yangkai Du, Mufeng Chen, Aili Wang, Gaoang Wang, Erping Li

Abstract: Spiking Neural Network (SNN) is considered more biologically realistic and power-efficient as it imitates the fundamental mechanism of the human brain. Recently, backpropagation (BP) based SNN learning algorithms that utilize deep learning frameworks have achieved good performance. However, bio-interpretability is partially neglected in those BP-based algorithms. Toward bio-plausible BP-based SNNs, we consider three properties in modeling spike activities: Multiplicity, Adaptability, and Plasticity (MAP). In terms of multiplicity, we propose a Multiple-Spike Pattern (MSP) with multiple spike transmission to strengthen model robustness in discrete time-iteration. To realize adaptability, we adopt Spike Frequency Adaption (SFA) under MSP to decrease spike activities for improved efficiency. For plasticity, we propose a trainable convolutional synapse that models spike response current to enhance the diversity of spiking neurons for temporal feature extraction. The proposed SNN model achieves competitive performances on neuromorphic datasets: N-MNIST and SHD. Furthermore, experimental results demonstrate that the proposed three aspects are significant to iterative robustness, spike efficiency, and temporal feature extraction capability of spike activities. In summary, this work proposes a feasible scheme for bio-inspired spike activities with MAP, offering a new neuromorphic perspective to embed biological characteristics into spiking neural networks.

Submitted 21 April, 2022; originally announced April 2022.

Journal ref: Frontiers in neuroscience, 2022, 09

arXiv:2203.00743 [pdf]

Uncovering the dynamic effects of DEX treatment on lung cancer by integrating bioinformatic inference and multiscale modeling of scRNA-seq and proteomics data

Authors: Minghan Chen, Chunrui Xu, Ziang Xu, Wei He, Haorui Zhang, Jing Su, Qianqian Song

Abstract: Motivation: Lung cancer is one of the leading causes for cancer-related death, with a five-year survival rate of 18%. It is a priority for us to understand the underlying mechanisms that affect the implementation and effectiveness of lung cancer therapeutics. In this study, we combine the power of Bioinformatics and Systems Biology to comprehensively uncover functional and signaling pathways of drug treatment using bioinformatics inference and multiscale modeling of both scRNA-seq data and proteomics data. The innovative and cross-disciplinary approach can be further applied to other computational studies in tumorigenesis and oncotherapy. Results: A time series of lung adenocarcinoma-derived A549 cells after DEX treatment were analysed. (1) We first discovered the differentially expressed genes in those lung cancer cells. Then through the interrogation of their regulatory network, we identified key hub genes including TGF-β, MYC, and SMAD3 varied underlie DEX treatment. Further enrichment analysis revealed the TGF-β signaling pathway as the top enriched term. Those genes involved in the TGF-β pathway and their crosstalk with the ERBB pathway presented a strong survival prognosis in clinical lung cancer samples. (2) Based on biological validation and further curation, a multiscale model of tumor regulation centered on both TGF-β-induced and ERBB-amplified signaling pathways was developed to characterize the dynamics effects of DEX therapy on lung cancer cells. Our simulation results were well matched to available data of SMAD2, FOXO3, TGFβ1, and TGFβR1 over the time course. Moreover, we provided predictions of different doses to illustrate the trend and therapeutic potential of DEX treatment.

Submitted 1 March, 2022; originally announced March 2022.

arXiv:2201.08941 [pdf]

Uncovering the System Vulnerability and Criticality of Human Brain under Dynamical Neuropathological Events in Alzheimer's Disease

Authors: Jingwen Zhang, Qing Liu, Haorui Zhang, Michelle Dai, Qianqian Song, Defu Yang, Guorong Wu, Minghan Chen

Abstract: Background: Despite the striking efforts in investigating neurobiological factors behind the acquisition of amyloid-β (A), protein tau (T), and neurodegeneration ([N]) biomarkers, the mechanistic pathways of how AT[N] biomarkers spreading throughout the brain remain elusive. Objectives: To disentangle the massive heterogeneities in AD progressions and identify vulnerable/critical brain regions to AD pathology. Methods: In this work, we characterized the interaction of AT[N] biomarkers and their propagation across brain networks using a novel bistable reaction-diffusion model, which allows us to establish a new systems biology underpinning of Alzheimer's disease (AD) progression. We applied our model to large-scale longitudinal neuroimages from the ADNI database and studied the systematic vulnerability and criticality of brains. Results: Our model yields long term prediction that is statistically significant linear correlated with temporal imaging data, produces clinically consistent risk prediction, and captures the Braak-like spreading pattern of AT[N] biomarkers in AD development. Conclusion: Our major findings include (i) tau is a stronger indicator of regional risk compared to amyloid, (ii) temporal lobe exhibits higher vulnerability to AD-related pathologies, (iii) proposed critical brain regions outperform hub nodes in transmitting disease factors across the brain, and (iv) comparing the spread of neuropathological burdens caused by amyloid-β and tau diffusions, disruption of metabolic balance is the most determinant factor contributing to the initiation and progression of Alzheimer's disease.

Submitted 21 August, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

arXiv:2111.14283 [pdf, other]

Exploration of Dark Chemical Genomics Space via Portal Learning: Applied to Targeting the Undruggable Genome and COVID-19 Anti-Infective Polypharmacology

Authors: Tian Cai, Li Xie, Muge Chen, Yang Liu, Di He, Shuo Zhang, Cameron Mura, Philip E. Bourne, Lei Xie

Abstract: Advances in biomedicine are largely fueled by exploring uncharted territories of human biology. Machine learning can both enable and accelerate discovery, but faces a fundamental hurdle when applied to unseen data with distributions that differ from previously observed ones -- a common dilemma in scientific inquiry. We have developed a new deep learning framework, called Portal Learning, to explore dark chemical and biological space. Three key, novel components of our approach include: (i) end-to-end, step-wise transfer learning, in recognition of biology's sequence-structure-function paradigm, (ii) out-of-cluster meta-learning, and (iii) stress model selection. Portal Learning provides a practical solution to the out-of-distribution (OOD) problem in statistical machine learning. Here, we have implemented Portal Learning to predict chemical-protein interactions on a genome-wide scale. Systematic studies demonstrate that Portal Learning can effectively assign ligands to unexplored gene families (unknown functions), versus existing state-of-the-art methods, thereby allowing us to target previously "undruggable" proteins and design novel polypharmacological agents for disrupting interactions between SARS-CoV-2 and human proteins. Portal Learning is general-purpose and can be further applied to other areas of scientific inquiry.

Submitted 23 November, 2021; originally announced November 2021.

Comments: 18 pages, 6 figures

MSC Class: 68T07

arXiv:2105.04042 [pdf]

A modified two-leaf light use efficiency model for improving the simulation of GPP using a radiation scalar

Authors: Xiaobin Guan, Jing M. Chen, Huanfeng Shen, Xinyao Xie

Abstract: A TL-LUE model modified with a radiation scalar (RTL-LUE) is developed in this paper. The same maximum LUE is used for both sunlit and shaded leaves, and the difference in LUE between sunlit and shaded leaf groups is determined by the same radiation scalar. The RTL-LUE model was calibrated and validated at global 169 FLUXNET eddy covariance (EC) sites. Results indicate that although GPP simulations from the TL-LUE model match well with the EC GPP, the RTL-LUE model can further improve the simulation, for half-hour, 8-day, and yearly time scales. The TL-LUE model tends to overestimate GPP under conditions of high incoming photosynthetically active radiation (PAR), because the radiation-independent LUE values for both sunlit and shaded leaves are only suitable for low-medium (e.g. average) incoming PAR conditions. The errors in the RTL-LUE model show lower sensitivity to PAR, and its GPP simulations can better track the diurnal and seasonal variations of EC GPP by alleviating the overestimation at noon and growing seasons associated with the TL-LUE model. This study demonstrates the necessity of considering a radiation scalar in GPP simulation in LUE models even if the first-order effect of radiation is already considered through differentiating sunlit and shaded leaves. The simple RTL-LUE developed in this study would be a useful alternative to complex process-based models for global carbon cycle research.

Submitted 9 May, 2021; originally announced May 2021.

Comments: 40 pages, 9 figures

arXiv:2103.04376 [pdf, other]

Analyzing the Spatiotemporal Interaction and Propagation of ATN Biomarkers in Alzheimer's Disease using Longitudinal Neuroimaging Data

Authors: Qing Liu, Defu Yang, Jingwen Zhang, Ziming Wei, Guorong Wu, Minghan Chen

Abstract: Three major biomarkers: beta-amyloid (A), pathologic tau (T), and neurodegeneration (N), are recognized as valid proxies for neuropathologic changes of Alzheimer's disease. While there are extensive studies on cerebrospinal fluids biomarkers (amyloid, tau), the spatial propagation pattern across brain is missing and their interactive mechanisms with neurodegeneration are still unclear. To this end, we aim to analyze the spatiotemporal associations between ATN biomarkers using large-scale neuroimaging data. We first investigate the temporal appearances of amyloid plaques, tau tangles, and neuronal loss by modeling the longitudinal transition trajectories. Second, we propose linear mixed-effects models to quantify the pathological interactions and propagation of ATN biomarkers at each brain region. Our analysis of the current data shows that there exists a temporal latency in the build-up of amyloid to the onset of tau pathology and neurodegeneration. The propagation pattern of amyloid can be characterized by its diffusion along the topological brain network. Our models provide sufficient evidence that the progression of pathological tau and neurodegeneration share a strong regional association, which is different from amyloid.

Submitted 7 March, 2021; originally announced March 2021.

Comments: 4 pages, 2 figures, to be published in IEEE ISBI 2021

arXiv:2103.04283 [pdf, ps, other]

doi 10.1145/3388440.3412477

Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases

Authors: Junheng Hao, Chelsea Ju, Muhao Chen, Yizhou Sun, Carlo Zaniolo, Wei Wang

Abstract: The widespread of Coronavirus has led to a worldwide pandemic with a high mortality rate. Currently, the knowledge accumulated from different studies about this virus is very limited. Leveraging a wide-range of biological knowledge, such as gene ontology and protein-protein interaction (PPI) networks from other closely related species presents a vital approach to infer the molecular impact of a new species. In this paper, we propose the transferred multi-relational embedding model Bio-JOIE to capture the knowledge of gene ontology and PPI networks, which demonstrates superb capability in modeling the SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model components. The knowledge model encodes the relational facts from the protein and GO domains into separated embedding spaces, using a hierarchy-aware encoding technique employed for the GO terms. On top of that, the transfer model learns a non-linear transformation to transfer the knowledge of PPIs and gene ontology annotations across their embedding spaces. By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species. Furthermore, we also demonstrate the potential of leveraging the learned representations on clustering proteins with enzymatic function into enzyme commission families. Finally, we show that Bio-JOIE can accurately identify PPIs between the SARS-CoV-2 proteins and human proteins, providing valuable insights for advancing research on this new disease.

Submitted 7 March, 2021; originally announced March 2021.

Comments: ACM BCB 2020, Best Student Paper

Journal ref: In Procs of the 11th ACM BCB, pp. 1-10. 2020

arXiv:2101.10356 [pdf]

doi 10.1038/s41592-021-01220-5

Deep learning based mixed-dimensional GMM for characterizing variability in CryoEM

Authors: Muyuan Chen, Steven Ludtke

Abstract: Structural flexibility and/or dynamic interactions with other molecules is a critical aspect of protein function. CryoEM provides direct visualization of individual macromolecules sampling different conformational and compositional states. While numerous methods are available for computational classification of discrete states, characterization of continuous conformational changes or large numbers of discrete state without human supervision remains challenging. Here we present e2gmm, a machine learning algorithm to determine a conformational landscape for proteins or complexes using a 3-D Gaussian mixture model mapped onto 2-D particle images in known orientations. Using a deep neural network architecture, e2gmm can automatically resolve the structural heterogeneity within the protein complex and map particles onto a small latent space describing conformational and compositional changes. This system presents a more intuitive and flexible representation than other manifold methods currently in use. We demonstrate this method on both simulated data as well as three biological systems, to explore compositional and conformational changes at a range of scales. The software is distributed as part of EMAN2.

Submitted 23 May, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

Comments: 31 pages, 5 main figures and 8 supplementary figures

Journal ref: Nature Methods 18, 930-936 (2021)

arXiv:2101.00179 [pdf, other]

A Universal Framework for Reconstructing Complex Networks and Node Dynamics from Discrete or Continuous Dynamics Data

Authors: Yan Zhang, Yu Guo, Zhang Zhang, Mengyuan Chen, Shuo Wang, Jiang Zhang

Abstract: Many dynamical processes of complex systems can be understood as the dynamics of a group of nodes interacting on a given network structure. However, finding such interaction structure and node dynamics from time series of node behaviours is tough. Conventional methods focus on either network structure inference task or dynamics reconstruction problem, very few of them can work well on both. This paper proposes a universal framework for reconstructing network structure and node dynamics at the same time from observed time-series data of nodes. We use a differentiable Bernoulli sampling process to generate a candidate network structure, and use neural networks to simulate the node dynamics based on the candidate network. We then adjust all the parameters with a stochastic gradient descent algorithm to maximize the likelihood function defined on the data. The experiments show that our model can recover various network structures and node dynamics at the same time with high accuracy. It can also work well on binary, discrete and continuous time-series data, and the reconstruction results are robust against noise and missing information.

Submitted 25 June, 2022; v1 submitted 1 January, 2021; originally announced January 2021.

arXiv:2011.05595 [pdf]

Desires and Motivation: The Computational Rule, the Underlying Neural Circuitry, and the Relevant Clinical Disorders

Authors: Yu Liu, Yinghong Zhao, Mo Chen

Abstract: An organism is a dissipative system. The process from multi desires to exclusive motivation is of great importance among all sensory-action loops. In this paper we argued that a proper Desire-Motivation model should be a continuous dynamic mapping from the dynamic desire vector to the sparse motivation vector. Meanwhile, it should at least have specific stability and adjustability of motivation intensity. Besides, the neuroscience evidences suggest that the Desire-Motivation model should have dynamic information acquisition and should be a recurrent neural network. A five-equation model is built based on the above arguments, namely the Recurrent Gating Desire-Motivation (RGDM) model. Additionally, a heuristic speculation based on the RGDM model about corresponding brain regions is carried out. It believes that the tonic and phasic firing of ventral tegmental area dopamine neurons should execute the respective and collective feedback functions of recurrent processing. The analysis about the RGDM model shows the expectations about individual personality from three dimensions, namely stability, intensity, and motivation decision speed. These three dimensions can be combined and create eight different personalities, which is correspondent to Jung's personality structure theorem. Furthermore, the RGDM model can be used to predict three different brand-new types of depressive disorder with different phenotypes. Moreover, it can also explain several other psychiatry disorders from new perspectives.

Submitted 11 November, 2020; originally announced November 2020.

arXiv:2009.06313 [pdf, other]

doi 10.1063/1.5143004

Finding Acceptable Parameter Regions of Stochastic Hill functions for Multisite Phosphorylation Mechanism

Authors: Minghan Chen, Mansooreh Ahmadian, Layne Watson, Yang Cao

Abstract: Multisite phosphorylation plays an important role in regulating switchlike protein activity and has been used widely in mathematical models. With the development of new experimental techniques and more molecular data, molecular phosphorylation processes emerge in many systems with increasing complexity and sizes. These developments call for simple yet valid stochastic models to describe various multisite phosphorylation processes, especially in large and complex biochemical networks. To reduce model complexity, this work aims to simplify the multisite phosphorylation mechanism by a stochastic Hill function model. Further, this work optimizes regions of parameter space to match simulation results from the stochastic Hill function with the distributive multisite phosphorylation process. While traditional parameter optimization methods have been focusing on finding the best parameter vector, in most circumstances modelers would like to find a set of parameter vectors that generate similar system dynamics and results. This paper proposes a general $α$-$β$-$γ$ rule to return an acceptable parameter region of the stochastic Hill function based on a quasi-Newton stochastic optimization (QNSTOP) algorithm. Different objective functions are investigated characterizing different features of the simulation-based empirical data, among which the approximate maximum log-likelihood method is recommended for general applications. Numerical results demonstrate that with an appropriate parameter vector value, the stochastic Hill function model depicts the multisite phosphorylation process well except the initial (transient) period.

Submitted 14 September, 2020; originally announced September 2020.

Journal ref: The Journal of Chemical Physics 152.12 (2020): 124108

arXiv:2009.04854 [pdf, other]

A Network-Guided Reaction-Diffusion Model of AT[N] Biomarkers in Alzheimer's Disease

Authors: Jingwen Zhang, Defu Yang, Wei He, Guorong Wu, Minghan Chen

Abstract: Currently, many studies of Alzheimer's disease (AD) are investigating the neurobiological factors behind the acquisition of beta-amyloid (A), pathologic tau (T), and neurodegeneration ([N]) biomarkers from neuroimages. However, a system-level mechanism of how these neuropathological burdens promote neurodegeneration and why AD exhibits characteristic progression is largely elusive. In this study, we combined the power of systems biology and network neuroscience to understand the dynamic interaction and diffusion process of AT[N] biomarkers from an unprecedented amount of longitudinal Amyloid PET scan, MRI imaging, and DTI data. Specifically, we developed a network-guided biochemical model to jointly (1) model the interaction of AT[N] biomarkers at each brain region and (2) characterize their propagation pattern across the fiber pathways in the structural brain network, where the brain resilience is also considered as a moderator of cognitive decline. Our biochemical model offers a greater mathematical insight to understand the physiopathological mechanism of AD progression by studying the system dynamics and stability. Thus, an in-depth system-level analysis allows us to gain a new understanding of how AT[N] biomarkers spread throughout the brain, capture the early sign of cognitive decline, and predict the AD progression from the preclinical stage.

Submitted 2 October, 2020; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 11 pages, 8 figures

arXiv:2006.00523 [pdf, other]

The Effects of Stringent Interventions for Coronavirus Pandemic

Authors: Ting Tian, Wenxiang Luo, Yukang Jiang, Minqiong Chen, Canhong Wen, Wenliang Pan, Xueqin Wang

Abstract: The pandemic of COVID-19 has caused severe public health consequences around the world. Many interventions of COVID-19 have been implemented. It is of great public health and societal importance to evaluate the effects of interventions in the pandemic of COVID-19. In this paper, with help of synthetic control method, regression discontinuity and a Susceptible-Infected and infectious without isolation-Hospitalized in isolation-Removed (SIHR) model, we evaluate the horizontal and longitudinal effects of stringent interventions implemented in Wenzhou, a representative urban city of China, where stringent interventions were enforced to curb its own epidemic situation with rapidly increasing newly confirmed cases. We found that there were statistically significant treatment effects of those stringent interventions which reduced the cumulative confirmed cases of COVID-19. Those reduction effects would increase over time. Also, if the stringent interventions were delayed by 2 days or mild interventions were implemented instead, the expected number of cumulative confirmed cases would have been nearly 2 times or 5 times of the actual number. The effects of stringent interventions are significant in mitigating the epidemic situation of COVID-19. The slower the interventions were implemented, the more severe the epidemic would have been, and the stronger the interventions would have been required.

Submitted 31 May, 2020; originally announced June 2020.

Comments: 29 pages, 6 figures, Ting Tian, Wenxiang Luo and Yukang Jiang contributed equally to this article

arXiv:2005.05469 [pdf]

Causal Estimation of Stay-at-Home Orders on SARS-CoV-2 Transmission

Authors: M. Keith Chen, Yilin Zhuo, Malena de la Fuente, Ryne Rohla, Elisa F. Long

Abstract: Accurately estimating the effectiveness of stay-at-home orders (SHOs) on reducing social contact and disease spread is crucial for mitigating pandemics. Leveraging individual-level location data for 10 million smartphones, we observe that by April 30th, when nine in ten Americans were under a SHO, daily movement had fallen 70% from pre-COVID levels. One-quarter of this decline is causally attributable to SHOs, with wide demographic differences in compliance, most notably by political affiliation. Likely Trump voters reduce movement by 9% following a local SHO, compared to a 21% reduction among their Clinton-voting neighbors, who face similar exposure risks and identical government orders. Linking social distancing behavior with an epidemic model, we estimate that reductions in movement have causally reduced SARS-CoV-2 transmission rates by 49%.

Submitted 11 May, 2020; originally announced May 2020.

arXiv:1909.00023 [pdf]

High-resolution 3D refractive index microscopy of multiple-scattering samples from intensity images

Authors: Shwetadwip Chowdhury, Michael Chen, Regina Eckert, David Ren, Fan Wu, Nicole Repina, Laura Waller

Abstract: Optical diffraction tomography (ODT) reconstructs a sample's volumetric refractive index (RI) to create high-contrast, quantitative 3D visualizations of biological samples. However, standard implementations of ODT use interferometric systems, and so are sensitive to phase instabilities, complex mechanical design, and coherent noise. Furthermore, their reconstruction framework is typically limited to weakly-scattering samples, and thus excludes a whole class of multiple-scattering samples. Here, we implement a new 3D RI microscopy technique that utilizes a computational multi-slice beam propagation method to invert the optical scattering process and reconstruct high-resolution (NA>1.0) 3D RI distributions of multiple-scattering samples. The method acquires intensity-only measurements from different illumination angles, and then solves a non-linear optimization problem to recover the sample 3D RI distribution. We experimentally demonstrate reconstruction of samples with varying amounts of multiple scattering: a 3T3 fibroblast cell, a cluster of C. elegans embryos, and a whole C. elegans worm, with lateral and axial resolutions of 250 nm and 900 nm, respectively.

Submitted 9 September, 2019; v1 submitted 30 August, 2019; originally announced September 2019.

Comments: 24 pages, 8 figures

arXiv:1902.03978 [pdf]

doi 10.1038/s41592-019-0591-8

A complete data processing workflow for CryoET and subtomogram averaging

Authors: Muyuan Chen, James M. Bell, Xiaodong Shi, Stella Y. Sun, Zhao Wang, Steven J. Ludtke

Abstract: Electron cryotomography (CryoET) is currently the only method capable of visualizing cells in 3D at nanometer resolutions. While modern instruments produce massive amounts of tomography data containing extremely rich structural information, the data processing is very labor intensive and results are often limited by the skills of the personnel rather than the data. We present an integrated workflow that covers the entire tomography data processing pipeline, from automated tilt series alignment to subnanometer resolution subtomogram averaging. This workflow greatly reduces human effort and increases throughput, and is capable of determining protein structures at state-of-the-art resolutions for both purified macromolecules and cells.

Submitted 11 February, 2019; originally announced February 2019.

Comments: 21 pages, 4+2 figures

Journal ref: Nature Methods 16 (2019) 1161-1168

arXiv:1811.09326 [pdf, other]

doi 10.1371/journal.pcbi.1006395

Balance of Mechanical Forces Drives Endothelial Gap Formation and May Facilitate Cancer and Immune-Cell Extravasation

Authors: Jorge Escribano, Michelle B. Chen, Emad Moeendarbary, Xuan Cao, Vivek Shenoy, Jose Manuel Garcia-Aznar, Roger D. Kamm, Fabian Spill

Abstract: The formation of gaps in the endothelium is a crucial process underlying both cancer and immune cell extravasation, contributing to the functioning of the immune system during infection, the unfavorable development of chronic inflammation and tumor metastasis. Here, we present a stochastic-mechanical multiscale model of an endothelial cell monolayer and show that the dynamic nature of the endothelium leads to spontaneous gap formation, even without intervention from the transmigrating cells. These gaps preferentially appear at the vertices between three endothelial cells, as opposed to the border between two cells. We quantify the frequency and lifetime of these gaps, and validate our predictions experimentally. Interestingly, we find experimentally that cancer cells also preferentially extravasate at vertices, even when they first arrest on borders. This suggests that extravasating cells, rather than initially signaling to the endothelium, might exploit the autonomously forming gaps in the endothelium to initiate transmigration.

Submitted 22 November, 2018; originally announced November 2018.

Comments: 25 pages, 28 supplementary pages, 5 figures, 15 supplementary figures

arXiv:1803.01123 [pdf, other]

doi 10.1063/1.5009749

Relaxation rates of gene expression kinetics reveal the feedback signs of autoregulatory gene networks

Authors: Chen Jia, Hong Qian, Min Chen, Michael Q. Zhang

Abstract: The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.

Submitted 3 March, 2018; originally announced March 2018.

Comments: 17 pages
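
The no-feedback statement in the abstract above (relaxation rate equal to the protein decay rate) can be checked numerically on a toy model. The sketch below is not the authors' model or code: it assumes a plain birth-death master equation with constant synthesis rate `s`, per-molecule degradation rate `d`, and a copy-number cutoff `n_max`, all of which are illustrative choices. It symmetrizes the generator using detailed balance and finds the second-smallest eigenvalue by Sturm-sequence bisection.

```python
import math

def spectral_gap_no_feedback(s, d, n_max):
    """Spectral gap of a truncated birth-death master equation (toy model).

    States are protein copy numbers 0..n_max. Synthesis occurs at the
    constant rate s (no feedback); each molecule degrades at rate d.
    """
    birth = [s if n < n_max else 0.0 for n in range(n_max + 1)]
    death = [d * n for n in range(n_max + 1)]
    # Detailed balance makes the generator similar to a symmetric tridiagonal
    # matrix. We work with its negative, whose eigenvalues are 0 = l0 < l1 < ...
    diag = [birth[n] + death[n] for n in range(n_max + 1)]
    off = [math.sqrt(birth[n] * death[n + 1]) for n in range(n_max)]

    def count_below(x):
        # Sturm-sequence count of eigenvalues strictly below x.
        count, q = 0, 1.0
        for i in range(len(diag)):
            q = diag[i] - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
            if q == 0.0:
                q = 1e-300  # avoid division by zero in the next step
            if q < 0.0:
                count += 1
        return count

    lo = 0.0
    hi = max(diag) + 2.0 * max(off)  # Gershgorin upper bound on the spectrum
    # Bisect for the second-smallest eigenvalue, i.e. the spectral gap.
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 2:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Illustrative parameters; the printed gap should be close to d.
gap = spectral_gap_no_feedback(s=2.0, d=1.0, n_max=40)
print(gap)
```

The cutoff `n_max` only needs to sit well above the mean copy number `s / d` for the truncation to be negligible. Feedback would enter through a state-dependent synthesis rate, which this sketch deliberately leaves out.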

arXiv:1703.06532 [pdf, other]

Stochastic fluctuations can reveal the feedback signs of gene regulatory networks at the single-molecule level

Authors: Chen Jia, Peng Xie, Min Chen, Michael Q. Zhang

Abstract: Understanding the relationship between spontaneous stochastic fluctuations and the topology of the underlying gene regulatory network is of fundamental importance for the study of single-cell stochastic gene expression. Here, by solving for the analytical steady-state distribution of the protein copy number in a general kinetic model of stochastic gene expression with nonlinear feedback regulation, we reveal the relationship between stochastic fluctuations and feedback topology at the single-molecule level, which provides novel insights into how and to what extent a feedback loop can enhance or suppress molecular fluctuations. Based on this relationship, we also develop an effective method to extract the topological information of a gene regulatory network from single-cell gene expression data. The theory is demonstrated by numerical simulations and, more importantly, validated quantitatively by single-cell data analysis of a synthetic gene circuit integrated in human kidney cells.

Submitted 24 October, 2017; v1 submitted 19 March, 2017; originally announced March 2017.

Comments: 13 pages, 5 figures

arXiv:1701.05567 [pdf]

doi 10.1038/nmeth.4405

Convolutional Neural Networks for Automated Annotation of Cellular Cryo-Electron Tomograms

Authors: Muyuan Chen, Wei Dai, Ying Sun, Darius Jonasch, Cynthia Y He, Michael F. Schmid, Wah Chiu, Steven J Ludtke

Abstract: Cellular Electron Cryotomography (CryoET) offers the ability to look inside cells and observe macromolecules frozen in action. A primary challenge for this technique is identifying and extracting the molecular components within the crowded cellular environment. We introduce a method using neural networks to dramatically reduce the time and human effort required for subcellular annotation and feature extraction. Subsequent subtomogram classification and averaging yield in-situ structures of molecular components of interest.

Submitted 11 June, 2017; v1 submitted 19 January, 2017; originally announced January 2017.

Comments: 21 pages, 8 figures

Journal ref: Nature Methods volume 14, 983-985 (2017)

arXiv:1701.05503 [pdf]

The competitive nature of STAT complex formation drives phenotype switching of T cells

Authors: Ildar I Sadreev, Michael Z Q Chen, Yoshinori Umezawa, Vadim N Biktashev, Claudia Kemper, Diana V Salakhieva, Gavin I Welsh, Nikolay V Kotov

Abstract: Signal transducers and activators of transcription (STATs) are key molecular determinants of T cell fate and effector function. A number of inflammatory diseases are characterized by an altered balance of T cell phenotypes and cytokine secretion. STATs, therefore, represent viable therapeutic targets in numerous pathologies. However, the underlying mechanisms of how the same STAT proteins regulate both the development of different T cell phenotypes and their plasticity during changes in extracellular conditions remain unclear. In this study, we investigated the STAT-mediated regulation of T cell phenotype formation and plasticity using mathematical modeling and experimental data for intracellular STAT signaling proteins. The close fit of our model predictions to the experimental data for IFN-γ to IL-10 switching allows us to propose a potential mechanism for T cell switching that regulates human Th1/Tr1 responses. According to this mechanism, T cell phenotype switching is due to the relative redistribution of STAT dimer complexes caused by the extracellular cytokine-dependent STAT competition effects. The proposed model is applicable to a number of STAT signaling circuits.

Submitted 19 January, 2017; originally announced January 2017.

arXiv:1611.05443 [pdf, ps, other]

Bridging the Gap between Individuality and Joint Improvisation in the Mirror Game

Authors: Chao Zhai, Michael Z. Q. Chen, Francesco Alderisio, Alexei Yu. Uteshev, Mario di Bernardo

Abstract: Extensive experiments in Human Movement Science suggest that solo motions are characterized by unique features that define the individuality or motor signature of people. While interacting with others, humans tend to spontaneously coordinate their movement and unconsciously give rise to joint improvisation. However, the relationship between individuality and joint improvisation has yet to be clarified. By means of an ad-hoc virtual agent, in this work we uncover the internal mechanisms of the transition from solo to joint improvised motion in the mirror game, a simple yet effective paradigm for studying interpersonal human coordination. According to the analysis of experimental data, normalized segments of velocity in solo motion are regarded as individual motor signature, and the existence of velocity segments possessing a prescribed signature is theoretically guaranteed. In this work, we first develop a systematic approach based on velocity segments to generate in-silico trajectories of a given human participant playing solo. Then we present an online algorithm for the virtual player to produce joint improvised motion with another agent while exhibiting some desired kinematic characteristics, and to account for movement coordination and mutual adaptation during joint action tasks. Finally, we demonstrate that the proposed approach succeeds in revealing the transition of kinematic features from solo to joint improvised motions, thus revealing the existence of a tight relationship between individuality and joint improvisation.

Submitted 16 November, 2016; originally announced November 2016.

arXiv:1606.01684 [pdf, ps, other]

doi 10.1038/srep26096

Regulation of Irregular Neuronal Firing by Autaptic Transmission

Authors: Daqing Guo, Shengdun Wu, Mingming Chen, Matjaz Perc, Yangsong Zhang, Jingling Ma, Yan Cui, Peng Xu, Yang Xia, Dezhong Yao

Abstract: The importance of self-feedback autaptic transmission in modulating spike-time irregularity is still poorly understood. By using a biophysical model that incorporates autaptic coupling, we here show that self-innervation of neurons participates in the modulation of irregular neuronal firing, primarily by regulating the occurrence frequency of burst firing. In particular, we find that both excitatory and electrical autapses increase the occurrence of burst firing, thus reducing neuronal firing regularity. In contrast, inhibitory autapses suppress burst firing and therefore tend to improve the regularity of neuronal firing. Importantly, we show that these findings are independent of the firing properties of individual neurons, and as such can be observed for neurons operating in different modes. Our results provide an insightful mechanistic understanding of how different types of autapses shape irregular firing at the single-neuron level, and they highlight the functional importance of autaptic self-innervation in taming and modulating neurodynamics.

Submitted 6 June, 2016; originally announced June 2016.

Comments: 27 pages, 8 figures

Journal ref: Sci. Rep. 6 (2016) 26096

arXiv:1606.01358 [pdf, ps, other]

doi 10.1209/0295-5075/114/30001

Firing regulation of fast-spiking interneurons by autaptic inhibition

Authors: Daqing Guo, Mingming Chen, Matjaz Perc, Shengdun Wu, Chuan Xia, Yangsong Zhang, Peng Xu, Yang Xia, Dezhong Yao

Abstract: Fast-spiking (FS) interneurons in the brain are self-innervated by powerful inhibitory GABAergic autaptic connections. By computational modelling, we investigate how autaptic inhibition regulates the firing response of such interneurons. Our results indicate that autaptic inhibition both boosts the current threshold for action potential generation and modulates the input-output gain of FS interneurons. The autaptic transmission delay is identified as a key parameter that controls the firing patterns and determines multistability regions of FS interneurons. Furthermore, we observe that neuronal noise influences the firing regulation of FS interneurons by autaptic inhibition and extends their dynamic range for encoding inputs. Importantly, autaptic inhibition modulates noise-induced irregular firing of FS interneurons, such that coherent firing appears at an optimal autaptic inhibition level. Our results reveal the functional roles of autaptic inhibition in taming the firing dynamics of FS interneurons.

Submitted 4 June, 2016; originally announced June 2016.

Comments: 6 pages, 5 figures

Journal ref: EPL 114 (2016) 30001

arXiv:1604.03187 [pdf, ps, other]

Evoking complex neuronal networks by stimulating a single neuron

Authors: Mengjiao Chen, Weijie Lin, Hengtong Wang, Wei Ren, Xingang Wang

Abstract: The dynamical responses of complex neuronal networks to an external stimulus injected on a single neuron are investigated. Stimulating the largest-degree neuron in the network, it is found that as the intensity of the stimulus increases, the network transitions from the resting to the firing state and then returns to the resting state, showing a bounded firing region in the parameter space. Furthermore, it is found that as the coupling strength decreases, the firing region is gradually expanded and, at weak couplings, separated into disconnected subregions. By a simplified network model, we conduct a detailed analysis of the bifurcation diagram of the network dynamics in the two-dimensional parameter space spanned by stimulating intensity and coupling strength, and, by introducing a new coefficient named effective stimulus, explore the mechanisms of the modified firing region. It is revealed that the coupling strength and stimulating intensity are equally important in evoking the network, but with different mechanisms. Specifically, the effective stimuli are shifted up globally with the increase of the stimulating intensity, while they are drawn closer with the increase of the coupling strength. The dynamical responses of small-world and random complex networks to an external stimulus injected on the largest-degree neuron are also investigated, which confirm the generality of the observed phenomena.

Submitted 11 April, 2016; originally announced April 2016.

Comments: 5 figures, 8 pages

arXiv:1501.07074 [pdf, ps, other]

doi 10.1088/1478-3975/12/1/016009

Development of modularity in the neural activity of children's brains

Authors: Man Chen, Michael W. Deem

Abstract: We study how modularity of the human brain changes as children develop into adults. Theory suggests that modularity can enhance the response function of a networked system subject to changing external stimuli. Thus, greater cognitive performance might be achieved for more modular neural activity, and modularity might increase as children develop. The value of modularity calculated from fMRI data is observed to increase during childhood development and peak in young adulthood. Head motion is deconvolved from the fMRI data, and it is shown that the dependence of modularity on age is independent of the magnitude of head motion. A model is presented to illustrate how modularity can provide greater cognitive performance at short times, i.e., task switching. A fitness function is extracted from the model. Quasispecies theory is used to predict how the average modularity evolves with age, illustrating the increase of modularity during development from children to adults that arises from selection for rapid cognitive function in young adults. Experiments exploring the effect of modularity on cognitive performance are suggested. Modularity may be a potential biomarker for injury, rehabilitation, or disease.

Submitted 28 January, 2015; originally announced January 2015.

Comments: 29 pages, 11 figures

Journal ref: Phys. Biol. 12 (2015) 016009

arXiv:1501.04709 [pdf]

doi 10.1038/srep16361

Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Authors: Chris Gaiteri, Mingming Chen, Boleslaw Szymanski, Konstantin Kuzmin, Jierui Xie, Changkyu Lee, Timothy Blanche, Elias Chaibub Neto, Su-Chun Huang, Thomas Grabowski, Tara Madhyastha, Vitalina Komashko

Abstract: Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.

Submitted 25 February, 2015; v1 submitted 19 January, 2015; originally announced January 2015.

Journal ref: Scientific Reports 5, Article number: 16361 (2015)

arXiv:1501.04708 [pdf, ps, other]

doi 10.1088/1478-3975/12/2/025001

Modularity Enhances the Rate of Evolution in a Rugged Fitness Landscape

Authors: Jeong-Man Park, Man Chen, Dong Wang, Michael W. Deem

Abstract: Biological systems are modular, and this modularity affects the evolution of biological systems over time and in different environments. We here develop a theory for the dynamics of evolution in a rugged, modular fitness landscape. We show analytically how horizontal gene transfer couples to the modularity in the system and leads to more rapid rates of evolution at short times. The model, in general, analytically demonstrates a selective pressure for the prevalence of modularity in biology. We use this model to show how the evolution of the influenza virus is affected by the modularity of the proteins that are recognized by the human immune system. Approximately 25% of the observed rate of fitness increase of the virus could be ascribed to a modular viral landscape.

Submitted 19 January, 2015; originally announced January 2015.

Comments: 45 pages; 7 figures

arXiv:1412.1025 [pdf]

Taxonomic Provenance: Two Influential Primate Classifications Logically Aligned

Authors: Nico M. Franz, Naomi M. Pier, DeAnn M. Reeder, Mingmin Chen, Shizhuo Yu, Parisa Kianmajd, Shawn Bowers, Bertram Ludaescher

Abstract: Classification standards such as the Mammal Species of the World (MSW) aim to unify name usages at the global scale, but may nevertheless experience significant levels of taxonomic change from one edition to the next. This circumstance challenges the biodiversity and phylogenetic data communities to develop more granular identifiers to track taxonomic congruence and incongruence in ways that both humans and machines can process, i.e., to logically represent taxonomic provenance across multiple classification hierarchies. Here we show that reasoning over taxonomic provenance is feasible for two classifications of primates corresponding to the second and third MSW editions. Our approach entails three main components: (1) individuation of name usages as taxonomic concepts, (2) articulation of concepts via human-asserted Region Connection Calculus (RCC-5) relationships, and (3) the use of an Answer Set Programming toolkit to infer and visualize logically consistent alignments of these taxonomic input constraints. Our use case entails the Primates sec. Groves (1993; MSW2 - 317 taxonomic concepts; 233 at the species level) and Primates sec. Groves (2005; MSW3 - 483 taxonomic concepts; 376 at the species level). Using 402 concept-to-concept input articulations, the reasoning process yields a single, consistent alignment, and infers 153,111 Maximally Informative Relations that constitute a comprehensive provenance resolution map for every concept pair in the Primates sec. MSW2/MSW3.
The entire alignment and various partitions facilitate quantitative analyses of name/meaning dissociation, revealing that approximately one in three paired name usages across treatments is not reliable - in the sense of the same name identifying congruent taxonomic meanings. We conclude with an optimistic outlook for logic-based provenance tools in next-generation biodiversity and phylogeny data platforms.

Submitted 9 December, 2014; v1 submitted 2 December, 2014; originally announced December 2014.

Comments: Partial manuscript; figures down-sampled

arXiv:1411.0940 [pdf]

doi 10.1371/journal.pone.0118247

Reasoning over Taxonomic Change: Exploring Alignments for the Perelleschus Use Case

Authors: Nico M. Franz, Mingmin Chen, Shizhuo Yu, Parisa Kianmajd, Shawn Bowers, Bertram Ludaescher

Abstract: Classifications and phylogenetic inferences of organismal groups change in light of new insights. Over time these changes can result in an imperfect tracking of taxonomic perspectives through the re-/use of Code-compliant or informal names. To mitigate these limitations, we introduce a novel approach for aligning taxonomies through the interaction of human experts and logic reasoners. We explore the performance of this approach with the Perelleschus use case of Franz & Cardona-Duque (2013). The use case includes six taxonomies published from 1936 to 2013, 54 taxonomic concepts (i.e., circumscriptions of names individuated according to their respective source publications), and 75 expert-asserted Region Connection Calculus articulations (e.g., congruence, proper inclusion, overlap, or exclusion). An Open Source reasoning toolkit is used to analyze 13 paired Perelleschus taxonomy alignments under heterogeneous constraints and interpretations. The reasoning workflow optimizes the logical consistency and expressiveness of the input and infers the set of maximally informative relations among the entailed taxonomic concepts. The latter are then used to produce merge visualizations that represent all congruent and non-congruent taxonomic elements among the aligned input trees. In this small use case with 6-53 input concepts per alignment, the information gained through the reasoning process is on average one order of magnitude greater than in the input.
The approach offers scalable solutions for tracking provenance among succeeding taxonomic perspectives that may have differential biases in naming conventions, phylogenetic resolution, ingroup and outgroup sampling, or ostensive (member-referencing) versus intensional (property-referencing) concepts and articulations.

Submitted 3 November, 2014; originally announced November 2014.

Comments: 30 pages, 16 figures

arXiv:1401.7724 [pdf]

doi 10.1088/1478-3975/10/1/016006

Prediction of heart rate response to conclusion of spontaneous breathing trial by fluctuation dissipation theory

Authors: Man Chen, Liang Ren Niestemski, Robert Prevost, Michael McRae, Sharath Cholleti, Gabriel Najarro, Timothy G. Buchman, Michael W. Deem

Abstract: The non-equilibrium fluctuation dissipation theorem is applied to predict how critically ill patients respond to treatment, based upon data currently collected by standard hospital monitoring devices. This framework is demonstrated on a common procedure in critical care: the spontaneous breathing trial. It is shown that the responses of groups of similar patients to the spontaneous breathing trial can be predicted by the non-equilibrium fluctuation dissipation approach. This mathematical framework, when fully formed and applied to other clinical interventions, may serve as part of the basis for personalized critical care.

Submitted 29 January, 2014; originally announced January 2014.

Comments: 12 pages, 2 figures

Journal ref: Phys. Biol. 10 (2013) 016006

arXiv:1401.5048 [pdf, other]

doi 10.1088/1478-3975/10/5/056006

Hierarchy of Gene Expression Data is Predictive of Future Breast Cancer Outcome

Authors: Man Chen, Michael W. Deem

Abstract: We calculate measures of hierarchy in gene and tissue networks of breast cancer patients. We find that the likelihood of metastasis in the future is correlated with increased values of network hierarchy for expression networks of cancer-associated genes, due to correlated expression of cancer-specific pathways. Conversely, future metastasis and quick relapse times are negatively correlated with values of network hierarchy in the expression network of all genes, due to dedifferentiation of gene pathways and circuits. These results suggest that hierarchy of gene expression may be useful as an additional biomarker for breast cancer prognosis.

Submitted 20 January, 2014; originally announced January 2014.

Comments: 14 pages, 5 figures

Journal ref: Phys. Biol. 10 (2013) 056006

arXiv:1401.2231 [pdf, ps, other]

doi 10.1371/journal.pcbi.1003495

Bidirectional Control of Absence Seizures by the Basal Ganglia: A Computational Evidence

Authors: Mingming Chen, Daqing Guo, Tiebin Wang, Wei Jing, Yang Xia, Peng Xu, Cheng Luo, Pedro A. Valdes-Sosa, Dezhong Yao

Abstract: Absence epilepsy is believed to be associated with the abnormal interactions between the cerebral cortex and thalamus. Besides the direct coupling, anatomical evidence indicates that the cerebral cortex and thalamus also communicate indirectly through an important intermediate bridge: the basal ganglia. It has thus been postulated that the basal ganglia might play key roles in the modulation of absence seizures, but the relevant biophysical mechanisms are still not completely established. Using a biophysically based model, we demonstrate here that the typical absence seizure activities can be controlled and modulated by the direct GABAergic projections from the substantia nigra pars reticulata (SNr) to either the thalamic reticular nucleus (TRN) or the specific relay nuclei (SRN) of thalamus, through different biophysical mechanisms. Under certain conditions, these two types of seizure control are observed to coexist in the same network. More importantly, due to the competition between the inhibitory SNr-TRN and SNr-SRN pathways, we find that both decreasing and increasing the activation of SNr neurons from the normal level may considerably suppress the generation of spike-and-wave discharges (SWDs) in the coexistence region. Overall, these results highlight the bidirectional functional roles of basal ganglia in controlling and modulating absence seizures, and might provide novel insights into the therapeutic treatments of this brain disorder.

Submitted 10 January, 2014; originally announced January 2014.

Comments: 10 figures and 1 table. This paper has been accepted by PLoS Computational Biology

arXiv:1309.5337 [pdf, other]

Change Point Analysis of Histone Modifications Reveals Epigenetic Blocks Linking to Physical Domains

Authors: Mengjie Chen, Haifan Lin, Hongyu Zhao

Abstract: Histone modification is a vital epigenetic mechanism for transcriptional control in eukaryotes. High-throughput techniques have enabled whole-genome analysis of histone modifications in recent years. However, most studies assume one combination of histone modification invariantly translates to one transcriptional output regardless of local chromatin environment. In this study we hypothesize that the genome is organized into local domains that manifest similar enrichment pattern of histone modification, which leads to orchestrated regulation of expression of genes with relevant biological functions. We propose a multivariate Bayesian Change Point (BCP) model to segment the Drosophila melanogaster genome into consecutive blocks on the basis of combinatorial patterns of histone marks. By modeling the sparse distribution of histone marks across the chromosome with a zero-inflated Gaussian mixture, our partitions capture local BLOCKs that manifest relatively homogeneous enrichment pattern of histone modifications. We further characterized BLOCKs by their transcription levels, distribution of genes, degree of co-regulation and GO enrichment. Our results demonstrate that these BLOCKs, although inferred merely from histone modifications, reveal strong relevance with physical domains, which suggest their important roles in chromatin organization and coordinated gene regulation.

Submitted 9 May, 2014; v1 submitted 20 September, 2013; originally announced September 2013.

Comments: 23 pages, 6 figures

arXiv:1307.8229 [pdf, other]

Posterior Contraction Rates of the Phylogenetic Indian Buffet Processes

Authors: Mengjie Chen, Chao Gao, Hongyu Zhao

Abstract: By expressing prior distributions as general stochastic processes, nonparametric Bayesian methods provide a flexible way to incorporate prior knowledge and constrain the latent structure in statistical inference. The Indian buffet process (IBP) is such an example that can be used to define a prior distribution on infinite binary features, where the exchangeability among subjects is assumed. The phylogenetic Indian buffet process (pIBP), a derivative of IBP, enables the modeling of non-exchangeability among subjects through a stochastic process on a rooted tree, which is similar to that used in phylogenetics, to describe relationships among the subjects. In this paper, we study the theoretical properties of IBP and pIBP under a binary factor model. We establish the posterior contraction rates for both IBP and pIBP and substantiate the theoretical results through simulation studies. This is the first work addressing the frequentist property of the posterior behaviors of IBP and pIBP. We also demonstrated its practical usefulness by applying pIBP prior to a real data example arising in the field of cancer genomics where the exchangeability among subjects is violated.

Submitted 19 May, 2015; v1 submitted 31 July, 2013; originally announced July 2013.

Showing 1–50 of 50 results for author: Chen, M