-
Feasibility of Identifying Factors Related to Alzheimer's Disease and Related Dementia in Real-World Data
Authors:
Aokun Chen,
Qian Li,
Yu Huang,
Yongqiu Li,
Yu-neng Chuang,
Xia Hu,
Serena Guo,
Yonghui Wu,
Yi Guo,
Jiang Bian
Abstract:
A comprehensive view of factors associated with AD/ADRD will significantly aid in studies to develop new treatments for AD/ADRD and identify high-risk populations and patients for prevention efforts. In our study, we summarized the risk factors for AD/ADRD by reviewing existing meta-analyses and review articles on risk and preventive factors for AD/ADRD. In total, we extracted 477 risk factors in 10 categories from 537 studies. We constructed an interactive knowledge map to disseminate our study results. Most of the risk factors are accessible from structured Electronic Health Records (EHRs), and clinical narratives show promise as information sources. However, evaluating genomic risk factors using RWD remains a challenge, as genetic testing for AD/ADRD is still not a common practice and is poorly documented in both structured and unstructured EHRs. Considering the constantly evolving research on AD/ADRD risk factors, literature mining via NLP methods offers a solution to automatically update our knowledge map.
Submitted 3 February, 2024;
originally announced February 2024.
-
Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data
Authors:
Haoyang Liu,
Yijiang Li,
Jinglin Jian,
Yuxuan Cheng,
Jianrong Lu,
Shuyi Guo,
Jinglei Zhu,
Mianchen Zhang,
Miantong Zhang,
Haohan Wang
Abstract:
Machine learning has emerged as a powerful tool for scientific discovery, enabling researchers to extract meaningful insights from complex datasets. For instance, it has facilitated the identification of disease-predictive genes from gene expression data, significantly advancing healthcare. However, the traditional process for analyzing such datasets demands substantial human effort and expertise for the data selection, processing, and analysis. To address this challenge, we introduce a novel framework, a Team of AI-made Scientists (TAIS), designed to streamline the scientific discovery pipeline. TAIS comprises simulated roles, including a project manager, data engineer, and domain expert, each represented by a Large Language Model (LLM). These roles collaborate to replicate the tasks typically performed by data scientists, with a specific focus on identifying disease-predictive genes. Furthermore, we have curated a benchmark dataset to assess TAIS's effectiveness in gene identification, demonstrating our system's potential to significantly enhance the efficiency and scope of scientific exploration. Our findings represent a solid step towards automating scientific discovery through large language models.
Submitted 20 February, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
A dynamic model to study the potential TB infections and assessment of control strategies in China
Authors:
Chuanqing Xu,
Kedeng Cheng,
Songbai Guo,
Dehui Yuan,
Xiaoyu Zhao
Abstract:
China is one of the countries with a high burden of tuberculosis, and although the number of new cases of tuberculosis has been decreasing year by year, the number of new infections per year has remained high and the diagnosis rate of tuberculosis-infected patients has remained low. Based on the analysis of TB infection data, we develop a model of TB transmission dynamics that includes potentially infected individuals and BCG vaccination, fit the model parameters to the data on new TB cases, and calculate the basic reproduction number $\mathcal{R}_v = 0.4442$. A parametric sensitivity analysis of $\mathcal{R}_v$ is performed, and we obtain correlation coefficients of the BCG vaccination rate and effectiveness rate with $\mathcal{R}_v$ of -0.810 and -0.825, respectively. According to the model, we estimate that there are 614,186 (95% CI [562,631, 665,741]) potentially infected TB cases in China, accounting for about 39.5% of the total number of TB cases. We assess the feasibility of achieving the goals of the WHO strategy to end tuberculosis in China and find that reducing the number of new cases by 90 per cent by 2035 is very difficult with the current tuberculosis control measures. However, with an effective combination of control measures such as increased detection of potentially infected persons, improved drug treatment, and reduction of overall exposure to tuberculosis patients, it is feasible to reach the WHO strategic goal of ending tuberculosis by 2035.
Submitted 25 January, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
STW-MD: A Novel Spatio-Temporal Weighting and Multi-Step Decision Tree Method for Considering Spatial Heterogeneity in Brain Gene Expression Data
Authors:
Shanjun Mao,
Xiao Huang,
Runjiu Chen,
Chenyang Zhang,
Yizhu Diao,
Zongjin Li,
Qingzhe Wang,
Shan Tang,
Shuixia Guo
Abstract:
Motivation: Gene expression during brain development or abnormal development is a biological process that is highly dynamic in both space and time. Due to the lack of comprehensive integration of spatial and temporal dimensions of brain gene expression data, previous studies have mainly focused on individual brain regions or a certain developmental stage. Our motivation is to address this gap by incorporating spatio-temporal information to gain a more complete understanding of the mechanisms underlying brain development or disorders associated with abnormal brain development, such as Alzheimer's disease (AD), and to identify potential determinants of response.
Results: In this study, we propose a novel two-step framework based on spatial-temporal information weighting and multi-step decision trees. This framework can effectively exploit the spatial similarity and temporal dependence between different stages and different brain regions, and facilitate differential gene analysis in brain regions with high heterogeneity. We focus on two datasets: the AD dataset, which includes gene expression data from early, middle, and late stages, and the brain development dataset, spanning fetal development to adulthood. Our findings highlight the advantages of the proposed framework in discovering gene classes and elucidating their impact on brain development and AD progression across diverse brain regions and stages. These findings align with existing studies and provide insights into the processes of normal and abnormal brain development.
Availability: The code of STW-MD is available at https://github.com/tsnm1/STW-MD.
Submitted 18 October, 2023;
originally announced October 2023.
-
Diffusing on Two Levels and Optimizing for Multiple Properties: A Novel Approach to Generating Molecules with Desirable Properties
Authors:
Siyuan Guo,
Jihong Guan,
Shuigeng Zhou
Abstract:
In the past decade, Artificial Intelligence driven drug design and discovery has been a hot research topic, where an important branch is molecule generation by generative models, from GAN-based models and VAE-based models to the latest diffusion-based models. However, most existing models pursue only basic properties such as validity and uniqueness of the generated molecules, and a few go further to explicitly optimize one single important molecular property (e.g. QED or PlogP), which leaves most generated molecules of little use in practice. In this paper, we present a novel approach to generating molecules with desirable properties, which expands the diffusion model framework with multiple innovative designs. The novelty is two-fold. On the one hand, considering that the structures of molecules are complex and diverse, and molecular properties are usually determined by some substructures (e.g. pharmacophores), we propose to perform diffusion on two structural levels: molecules and molecular fragments respectively, with which a mixed Gaussian distribution is obtained for the reverse diffusion process. To get desirable molecular fragments, we develop a novel electronic effect based fragmentation method. On the other hand, we introduce two ways to explicitly optimize multiple molecular properties under the diffusion model framework. First, as potential drug molecules must be chemically valid, we optimize molecular validity by an energy-guidance function. Second, since potential drug molecules should be desirable in various properties, we employ a multi-objective mechanism to optimize multiple molecular properties simultaneously. Extensive experiments with two benchmark datasets QM9 and ZINC250k show that the molecules generated by our proposed method have better validity, uniqueness, novelty, Fréchet ChemNet Distance (FCD), QED, and PlogP than those generated by current SOTA models.
Submitted 5 October, 2023;
originally announced October 2023.
-
Towards Trustworthy Artificial Intelligence for Equitable Global Health
Authors:
Hong Qin,
Jude Kong,
Wandi Ding,
Ramneek Ahluwalia,
Christo El Morr,
Zeynep Engin,
Jake Okechukwu Effoduh,
Rebecca Hwa,
Serena Jingchuan Guo,
Laleh Seyyed-Kalantari,
Sylvia Kiwuwa Muyingo,
Candace Makeda Moore,
Ravi Parikh,
Reva Schwartz,
Dongxiao Zhu,
Xiaoqian Wang,
Yiye Zhang
Abstract:
Artificial intelligence (AI) can potentially transform global health, but algorithmic bias can exacerbate social inequities and disparity. Trustworthy AI entails the intentional design to ensure equity and mitigate potential biases. To advance trustworthy AI in global health, we convened a workshop on Fairness in Machine Intelligence for Global Health (FairMI4GH). The event brought together a global mix of experts from various disciplines, community health practitioners, policymakers, and more. Topics covered included managing AI bias in socio-technical systems, AI's potential impacts on global health, and balancing data privacy with transparency. Panel discussions examined the cultural, political, and ethical dimensions of AI in global health. FairMI4GH aimed to stimulate dialogue, facilitate knowledge transfer, and spark innovative solutions. Drawing from NIST's AI Risk Management Framework, it provided suggestions for handling AI risks and biases. The need to mitigate data biases from the research design stage, adopt a human-centered approach, and advocate for AI transparency was recognized. Challenges such as updating legal frameworks, managing cross-border data sharing, and motivating developers to reduce bias were acknowledged. The event emphasized the necessity of diverse viewpoints and multi-dimensional dialogue for creating a fair and ethical AI framework for equitable global health.
Submitted 10 September, 2023;
originally announced September 2023.
-
Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks
Authors:
Zhile Yang,
Shangqi Guo,
Ying Fang,
Jian K. Liu
Abstract:
One stream of reinforcement learning research is exploring biologically plausible models and algorithms to simulate biological intelligence and fit neuromorphic hardware. Among them, reward-modulated spike-timing-dependent plasticity (R-STDP) is a recent branch with good potential in energy efficiency. However, current R-STDP methods rely on heuristic designs of local learning rules, thus requiring task-specific expert knowledge. In this paper, we consider a spiking recurrent winner-take-all network, and propose a new R-STDP method, spiking variational policy gradient (SVPG), whose local learning rules are derived from the global policy gradient and thus eliminate the need for heuristic designs. In experiments on MNIST classification and Gym InvertedPendulum, our SVPG achieves good training performance, and also shows better robustness to various kinds of noise than conventional methods.
Submitted 21 October, 2022;
originally announced October 2022.
-
A study on the transmission dynamics of COVID-19 considering the impact of asymptomatic infection
Authors:
ZH. Zhang,
XT. Huang,
KD. Cheng,
CQ. Xu,
SB. Guo,
XJ. Wang
Abstract:
The COVID-19 epidemic has been spreading around the world for nearly three years, and asymptomatic infections have exacerbated the spread of the epidemic. To evaluate the role of asymptomatic infections in the spread of the epidemic, we develop mathematical models to assess the proportion of asymptomatic infections caused by different strains of the main COVID-19 variants. The analysis shows that when the control reproduction number is less than 1, the disease-free equilibrium point of the model is globally asymptotically stable; and when the control reproduction number is greater than 1, the endemic equilibrium point exists and is unique, and is locally asymptotically stable. We fit the epidemic data in the four time periods corresponding to the selected 614G, Alpha, Delta and Omicron variants. The fitting results show that, across the four time periods, the proportion of asymptomatic persons among the infected persons gradually increased. We also predict the peak time and peak value for the four time periods, and the results indicate that the transmission speed and transmission intensity of the variant strains increased to some extent. Finally, we discuss the impact of the detection ratio of symptomatic infections on the spread of the epidemic. The results show that as the detection ratio increases, the cumulative number of cases drops significantly, but the decline in the proportion of asymptomatic infections is not obvious. Therefore, in view of the hidden transmission of asymptomatic infections, cooperation between various epidemic prevention and control policies is required to effectively curb the spread of the epidemic.
Submitted 24 October, 2022;
originally announced October 2022.
-
Dynamics of COVID-19 models with asymptomatic infections and quarantine measures
Authors:
Songbai Guo,
Yuling Xue,
Xiliang Li,
Zuohuan Zheng
Abstract:
Considering the propagation characteristics of COVID-19 in different regions, the dynamics analysis and numerical demonstration of long-term and short-term models of COVID-19 are carried out, respectively. The long-term model is devoted to investigating the global stability of a COVID-19 model with asymptomatic infections and quarantine measures. By using the limit system of the model and the Lyapunov function method, it is shown that the COVID-19-free equilibrium $V^0$ is globally asymptotically stable if the control reproduction number $\mathcal{R}_{c}<1$ and globally attractive if $\mathcal{R}_{c}=1$, which means that COVID-19 will die out; the COVID-19 equilibrium $V^{\ast}$ is globally asymptotically stable if $\mathcal{R}_{c}>1$, which means that COVID-19 will be persistent. In particular, to obtain the local stability of $V^{\ast}$, we use proof by contradiction and the properties of complex modulus with some novel details, and we prove the weak persistence of the system to obtain the global attractivity of $V^{\ast}$. Moreover, the final size of the corresponding short-term model is calculated and the stability of its multiple equilibria is analyzed. Numerical simulations of COVID-19 cases show that quarantine measures and asymptomatic infections have a non-negligible impact on the transmission of COVID-19.
Submitted 6 November, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
A novel analysis approach of uniform persistence for a COVID-19 model with quarantine and standard incidence rate
Authors:
Songbai Guo,
Yuling Xue,
Xiliang Li,
Zuohuan Zheng
Abstract:
A coronavirus disease 2019 (COVID-19) model with quarantine and standard incidence rate is first developed, then a novel analysis approach for finding the ultimate lower bound of COVID-19 infectious individuals is proposed, which means that the COVID-19 pandemic is uniformly persistent if the control reproduction number $\mathcal{R}_{c}>1$. This approach can be applied to other related biomathematical models, and some existing works can be improved by using it. In addition, the COVID-19-free equilibrium $V^0$ is locally asymptotically stable (LAS) if $\mathcal{R}_{c}<1$ and linearly stable if $\mathcal{R}_{c}=1$, respectively; while $V^0$ is unstable if $\mathcal{R}_{c}>1$.
Submitted 31 October, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
The low-entropy hydration shell at the binding site of spike RBD determines the contagiousness of SARS-CoV-2 variants
Authors:
Lin Yang,
Shuai Guo,
Chengyu Hou,
Jiacheng Li,
Liping Shi,
Chenchen Liao,
Rongchun Shi,
Xiaoliang Ma,
Bing Zheng,
Yi Fang,
Lin Ye,
Xiaodong He
Abstract:
The infectivity of SARS-CoV-2 depends on the binding affinity of the receptor-binding domain (RBD) of the spike protein with the angiotensin converting enzyme 2 (ACE2) receptor. The calculated RBD-ACE2 binding energies indicate that the difference in transmission efficiency of SARS-CoV-2 variants cannot be fully explained by electrostatic interactions, hydrogen-bond interactions, van der Waals interactions, internal energy, and nonpolar solvation energies. Here, we demonstrate that low-entropy regions of hydration shells around proteins drive hydrophobic attraction between shape-matched low-entropy regions of the hydration shells, which essentially coordinates protein-protein binding in rotational-configurational space of mutual orientations and determines the binding affinity. An innovative method was used to identify the low-entropy regions of the hydration shells of the RBDs of multiple SARS-CoV-2 variants and the ACE2. We observed integral low-entropy regions of hydration shells covering the binding sites of the RBDs and matching in shape to the low-entropy region of hydration shell at the binding site of the ACE2. The RBD-ACE2 binding is thus found to be guided by hydrophobic collapse between the shape-matched low-entropy regions of the hydration shells. A measure of the low-entropy of the hydration shells can be obtained by counting the number of hydrophilic groups expressing hydrophilicity within the binding sites. The low-entropy level of hydration shells at the binding site of a spike protein is found to be an important indicator of the contagiousness of the coronavirus.
Submitted 27 April, 2022;
originally announced April 2022.
-
Activate index: an integrated index to reveal disrupted brain network organizations of major depressive disorder patients
Authors:
Yu Fu,
Yanyan Huang,
Meng Niu,
Le Xue,
Shunjie Dong,
Shunlin Guo,
Junqiang Lei,
Cheng Zhuo
Abstract:
Altered functional brain networks have been a typical manifestation that distinguishes major depressive disorder (MDD) patients from healthy control (HC) subjects in functional magnetic resonance imaging (fMRI) studies. Recently, rich club and diverse club metrics have been proposed for network or network neuroscience analyses. The rich club defines a set of nodes that tend to be the hubs of specific communities, and the diverse club defines the nodes that span more communities and have edges diversely distributed across different communities. Considering the heterogeneity of rich clubs and diverse clubs, combining them and deriving a novel indicator on that basis may reveal new evidence of brain functional integration and separation, which might provide new insights into MDD. This study is the first to examine the differences between MDD and HC using both rich club and diverse club metrics, and it finds that the two are complementary in analyzing brain networks. In addition, a novel index, termed "active index", is proposed in this study. The active index defines a group of nodes that tend to be diversely distributed across communities while avoiding being a hub of a community. Experimental results demonstrate the superiority of the active index in analyzing MDD brain mechanisms.
Submitted 14 February, 2022;
originally announced February 2022.
-
Space Layout of Low-entropy Hydration Shells Guides Protein Binding
Authors:
Lin Yang,
Shuai Guo,
Chengyu Hou,
Chencheng Liao,
Jiacheng Li,
Liping Shi,
Xiaoliang Ma,
Shenda Jiang,
Bing Zheng,
Yi Fang,
Lin Ye,
Xiaodong He
Abstract:
Protein-protein binding enables orderly and lawful biological self-organization, and is therefore considered a miracle of nature. Protein-protein binding is steered by electrostatic forces, hydrogen bonding, van der Waals forces, and hydrophobic interactions. Among these physical forces, only the hydrophobic interactions can be considered as long-range intermolecular attractions between proteins in intracellular and extracellular fluid. Low-entropy regions of hydration shells around proteins drive hydrophobic attraction among them, which essentially coordinates protein-protein docking in the rotational-conformational space of mutual orientations at the guidance stage of the binding. Here, an innovative method was developed for identifying the low-entropy regions of hydration shells of given proteins, and we discovered that the largest low-entropy regions of hydration shells on proteins typically cover the binding sites. According to an analysis of determined protein complex structures, shape matching between the largest low-entropy hydration shell region of a protein and that of its partner at the binding sites is revealed as a regular pattern. Protein-protein binding is thus found to be mainly guided by hydrophobic collapse between the shape-matched low-entropy hydration shells, which is verified by bioinformatics analyses of hundreds of structures of protein complexes. A simple algorithm is developed to precisely predict protein binding sites.
Submitted 21 February, 2022;
originally announced February 2022.
-
Network resilience in the aging brain
Authors:
Tao Liu,
Shu Guo,
Hao Liu,
Rui Kang,
Mingyang Bai,
Jiyang Jiang,
Wei Wen,
Xing Pan,
Jun Tai,
Jianxin Li,
Jian Cheng,
Jing Jing,
Zhenzhou Wu,
Haijun Niu,
Haogang Zhu,
Zixiao Li,
Yongjun Wang,
Henry Brodaty,
Perminder Sachdev,
Daqing Li
Abstract:
Degeneration and adaptation are two competing sides of the same coin called resilience in the progressive processes of brain aging or diseases. Degeneration accumulates during brain aging and other cerebral activities, causing structural atrophy and dysfunction. At the same time, adaptation allows the brain network to reorganize to compensate for structural loss and maintain cognitive function. Although the hidden resilience mechanism is critical and fundamental to uncovering the law of brain aging, due to the lack of datasets and appropriate methodology, it remains essentially unknown how these two processes interact dynamically across brain networks. To quantitatively investigate this complex process, we analyze aging brains based on a 6-year follow-up multimodal neuroimaging database of 63 persons. We reveal the critical mechanism of network resilience: various perturbations may cause fast brain structural atrophy, and the brain can then reorganize its functional layout to lower its operational efficiency, which helps to slow down the structural atrophy and finally recover its functional efficiency equilibrium. This empirical finding could be explained by our theoretical model, suggesting one universal resilience dynamical function. This resilience is achieved in the brain functional network with evolving percolation and rich-club features. Our findings can help to understand the brain aging process and design possible mitigation methods to adjust the interaction between degeneration and adaptation from a resilience viewpoint.
Submitted 3 February, 2022;
originally announced February 2022.
-
Spatiotemporal impacts of human activities and socio-demographics during the COVID-19 outbreak in the U.S
Authors:
Lu Ling,
Xinwu Qian,
Satish V. Ukkusuri,
Shuocheng Guo
Abstract:
Understanding influencing factors is essential for the surveillance and prevention of infectious diseases, and the factors are likely to vary spatially and temporally as the disease progresses. Taking daily cases and deaths data during the coronavirus disease 2019 (COVID-19) outbreak in the U.S. as a case study, we develop a mobility-augmented geographically and temporally weighted regression (M-GTWR) model to quantify the spatiotemporal impacts of social-demographic factors and human activities on the COVID-19 dynamics. Different from the base GTWR model, we incorporate a mobility-adjusted distance weight matrix where travel mobility is used in addition to the spatial adjacency to capture the correlations among local observations. The model residuals suggest that the proposed model achieves a substantial improvement over other benchmark methods in addressing the spatiotemporal nonstationarity. Our results reveal that the impacts of social-demographic and human activity variables present significant spatiotemporal heterogeneity. In particular, a 1% increase in population density may lead to 0.63% and 0.71% more daily cases and deaths, and a 1% increase in the mean commuting time may result in 0.22% and 0.95% increases in daily cases and deaths. Although increased human activities will, in general, intensify the disease outbreak, we report that the effects of grocery and pharmacy-related activities are insignificant in areas with high population density. And activities at the workplace and public transit are found to either increase or decrease the number of cases and deaths, depending on particular locations. The results of our study establish a quantitative framework for identifying influencing factors during a disease outbreak, and the obtained insights may have significant implications in guiding the policy-making against infectious diseases.
Submitted 26 April, 2021;
originally announced April 2021.
-
Hydrophobic interaction determines docking affinity of SARS CoV 2 variants with antibodies
Authors:
Jiacheng Li,
Chengyu Hou,
Menghao Wang,
Chencheng Liao,
Shuai Guo,
Liping Shi,
Xiaoliang Ma,
Hongchi Zhang,
Shenda Jiang,
Bing Zheng,
Lin Ye,
Lin Yang,
Xiaodong He
Abstract:
Preliminary epidemiologic, phylogenetic and clinical findings suggest that several novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants have increased transmissibility and reduce the efficacy of several existing vaccines. Four mutations in the receptor-binding domain (RBD) of the spike protein are reported to contribute to increased transmission. Understanding the physical mechanism responsible for the affinity enhancement between the SARS-CoV-2 variants and ACE2 is an urgent challenge for developing blockers, vaccines and therapeutic antibodies against the coronavirus disease 2019 (COVID-19) pandemic. Based on a hydrophobic-interaction-based protein docking mechanism, this study reveals that the mutation N501Y clearly increases the hydrophobic attraction and decreases the hydrophilic repulsion between the RBD and ACE2, which most likely caused the increased transmissibility of the variants. By analyzing the mutation-induced hydrophobic surface changes in the attraction and repulsion at the binding site of the complexes of the SARS-CoV-2 variants and antibodies, we found that all of the mutations N501Y, E484K, K417N and L452R can selectively decrease or increase binding affinity with some antibodies.
Submitted 28 February, 2021;
originally announced March 2021.
-
The role of hydrophobic interactions in folding of $β$-sheets
Authors:
Jiacheng Li,
Xiaoliang Ma,
Hongchi Zhang,
Chengyu Hou,
Liping Shi,
Shuai Guo,
Chenchen Liao,
Bing Zheng,
Lin Ye,
Lin Yang,
Xiaodong He
Abstract:
Exploring the protein-folding problem has been a long-standing challenge in molecular biology. Protein folding is highly dependent on the folding of secondary structures, which paves a native folding pathway. Here, we demonstrate that a large hydrophobic surface area covering most side-chains on one side or the other of adjacent $β$-strands of a $β$-sheet is prevalent in almost all experimentally determined $β$-sheets, indicating that folding of $β$-sheets is most likely triggered by multistage hydrophobic interactions among neighboring side-chains of unfolded polypeptides, enabling $β$-sheets to fold reproducibly following explicit physical folding codes in aqueous environments. $β$-turns often contain five types of residues characterized by relatively small exposed hydrophobic proportions of their side-chains, which is explained by the ability of these residues to block the hydrophobic effect among neighboring side-chains in the sequence. The temperature dependence of $β$-sheet folding is thus attributed to the temperature dependence of the strength of the hydrophobicity. The hydrophobic-effect-based mechanism responsible for $β$-sheet folding is verified by bioinformatics analyses of thousands of results available from experiments. The folding codes in the amino acid sequence that dictate formation of a $β$-hairpin can be deciphered by evaluating the hydrophobic interaction among side-chains of an unfolded polypeptide from a $β$-strand-like thermodynamically metastable state.
Submitted 16 September, 2020;
originally announced September 2020.
-
A hydrophobic-interaction-based mechanism trigger docking between the SARS CoV 2 spike and angiotensin-converting enzyme 2
Authors:
Jiacheng Li,
Xiaoliang Ma,
Shuai Guo,
Chengyu Hou,
Liping Shi,
Hongchi Zhang,
Bing Zheng,
Chencheng Liao,
Lin Yang,
Lin Ye,
Xiaodong He
Abstract:
A recent experimental study found that the binding affinity between the cellular receptor human angiotensin-converting enzyme 2 (ACE2) and the receptor-binding domain (RBD) in the spike (S) protein of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is more than 10-fold higher than that of the original severe acute respiratory syndrome coronavirus (SARS-CoV). However, the main-chain structures of the SARS-CoV-2 RBD are almost the same as those of the SARS-CoV RBD. Understanding the physical mechanism responsible for the outstanding affinity between the SARS-CoV-2 S and ACE2 is an urgent challenge for developing blockers, vaccines and therapeutic antibodies against the coronavirus disease 2019 (COVID-19) pandemic. Considering the mechanisms of hydrophobic interaction, hydration shell, surface tension, and the shielding effect of water molecules, this study reveals a hydrophobic-interaction-based mechanism by which SARS-CoV-2 S and ACE2 bind together in an aqueous environment. The hydrophobic interaction between the SARS-CoV-2 S and ACE2 proteins is found to be significantly greater than that between SARS-CoV S and ACE2. At the docking site, the hydrophobic portions of the hydrophilic side chains of SARS-CoV-2 S are found to be involved in the hydrophobic interaction between SARS-CoV-2 S and ACE2. We propose a method to design live attenuated viruses by mutating several key amino acid residues of the spike protein to decrease the hydrophobic surface areas at the docking site. Mutation of a small number of residues can greatly reduce the hydrophobic binding of the coronavirus to the receptor, which may significantly reduce the infectivity and transmissibility of the virus.
Submitted 26 August, 2020;
originally announced August 2020.
-
Methods for Joint Imaging and RNA-seq Data Analysis
Authors:
Junhai Jiang,
Nan Lin,
Shicheng Guo,
Jinyun Chen,
Momiao Xiong
Abstract:
The emerging integrative analysis of genomic and anatomical imaging data, which has not been well developed, provides invaluable information for the holistic discovery of the genomic structure of disease and has the potential to open a new avenue for discovering novel disease susceptibility genes that cannot be identified if the two data types are analyzed separately. A key issue for the success of imaging and genomic data analysis is how to reduce their dimensions. Most previous methods for imaging information extraction and RNA-seq data reduction do not explore imaging spatial information and often ignore gene expression variation at the genomic positional level. To overcome these limitations, we extend functional principal component analysis from one dimension to two dimensions (2DFPCA) for representing imaging data and develop a multiple functional linear model (MFLM) in which functional principal scores of images are taken as multiple quantitative traits and the RNA-seq profile across a gene is taken as a functional predictor for assessing the association of gene expression with images. The developed method has been applied to image and RNA-seq data of ovarian cancer and KIRC studies. We identified 24 and 84 genes whose expressions were associated with imaging variations in ovarian cancer and KIRC studies, respectively. Our results showed that many genes significantly associated with images were not differentially expressed, but revealed their morphological and metabolic functions. The results also demonstrated that the peaks of the estimated regression coefficient function in the MFLM often allowed the discovery of splicing sites and multiple isoforms of gene expression.
Submitted 12 September, 2014;
originally announced September 2014.
-
Large-scale simulation of RNA macroevolution by an energy-dependent fitness model
Authors:
Sheng Guo,
Li-San Wang,
Junhyong Kim
Abstract:
Simulated nucleotide sequences are widely used in theoretical and empirical molecular evolution studies. Conventional simulators generally use a fixed-parameter, time-homogeneous Markov model for sequence evolution. In this work, we use the folding free energy of the secondary structure of an RNA as a proxy for its phenotypic fitness, and simulate RNA macroevolution with a mutation-selection population genetics model. Because the two-step process is conditioned on an RNA and its mutant ensemble, we no longer have a global substitution matrix, nor do we explicitly assume one for this inhomogeneous stochastic process. After introducing the base model of RNA evolution, we outline the heuristic implementation algorithm and several model improvements. We then discuss the calibration of the model parameters and demonstrate that, in phylogeny reconstruction with both the parsimony method and the likelihood method, the sequences generated by our simulator, rnasim, have greater statistical complexity than those generated by two standard simulators, ROSE and Seq-Gen, and are close to empirical sequences.
Submitted 11 December, 2009;
originally announced December 2009.