-
Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences
Authors:
Zhanglu Yan,
Weiran Chu,
Yuhua Sheng,
Kaiwen Tang,
Shida Wang,
Yanfeng Liu,
Weng-Fai Wong
Abstract:
N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization such as rational design and statistics-guided approaches are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology co-designed few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity, and finally a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporter protein for NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.
Submitted 20 February, 2024;
originally announced February 2024.
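The NCS abstract above mentions encoding sequences before passing them to word2vec, but does not define "k-nearest encoding". As a rough illustration only, the sketch below assumes the step resembles splitting a nucleotide sequence into overlapping k-mer "words", which is a common way to prepare sequences for word-embedding models. The function name `kmer_tokens`, the default `k=3`, and the example sequence are all hypothetical and are not taken from the paper.

```python
def kmer_tokens(seq, k=3):
    """Split a nucleotide sequence into overlapping k-mers.

    Assumption: this stands in for the paper's encoding step, whose exact
    definition is not given in the abstract.
    """
    seq = seq.upper()
    # A sequence of length n yields n - k + 1 overlapping words.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical 6-nt sequence, used only to show the shape of the output.
tokens = kmer_tokens("ATGAAA", k=3)
print(tokens)
```

Each token list could then be treated as one "sentence" when training a word-embedding model; which embedding settings the authors used is not stated in the abstract.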
-
How enlightened self-interest guided global vaccine sharing benefits all: a modelling study
Authors:
Zhenyu Han,
Qianyue Hao,
Qiwei He,
Katherine Budeski,
Depeng Jin,
Fengli Xu,
Kun Tang
Abstract:
Background: Despite the consensus that vaccines play an important role in combating the global spread of infectious diseases, vaccine inequity is still rampant, with a deep-seated mentality of self-priority. This study aims to evaluate the existence and possible outcomes of a more equitable global vaccine distribution and explore a concrete incentive mechanism that promotes vaccine equity. Methods: We design a metapopulation epidemiological model that simultaneously considers global vaccine distribution and human mobility, which is then calibrated by the number of infections and real-world vaccination records during the COVID-19 pandemic from March 2020 to July 2021. We explore the possibility of the enlightened self-interest incentive mechanism, i.e., improving one's own epidemic outcomes by sharing vaccines with other countries, by evaluating the number of infections and deaths under various vaccine sharing strategies using the proposed model. To understand how these strategies affect national interests, we distinguish the imported and local cases for further cost-benefit analyses that rationalize the enlightened self-interest incentive mechanism behind vaccine sharing. ...
Submitted 13 October, 2023;
originally announced October 2023.
-
Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease
Authors:
Lan Wang,
Ruiling He,
Lili Zhao,
Jia Wang,
Zhengzi Geng,
Tao Ren,
Guo Zhang,
Peng Zhang,
Kaiqiang Tang,
Chaofei Gao,
Fei Chen,
Liting Zhang,
Yonghe Zhou,
Xin Li,
Fanbin He,
Hui Huan,
Wenjuan Wang,
Yunxiao Liang,
Juan Tang,
Fang Ai,
Tingyu Wang,
Liyun Zheng,
Zhongwei Zhao,
Jiansong Ji,
Wei Liu
, et al. (22 additional authors not shown)
Abstract:
Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV).
Design: A prospective multicenter study was conducted in patients with compensated advanced chronic liver disease. 305 patients were enrolled from 12 hospitals, and finally 265 patients were included, with 1136 liver stiffness measurement (LSM) images and 1042 spleen stiffness measurement (SSM) images generated by 2D-SWE. We leveraged deep learning methods to uncover associations between image features and patient risk, and thus constructed models to predict GEV and HRV.
Results: A multi-modality Deep Learning Risk Prediction model (DLRP) was constructed to assess GEV and HRV, based on LSM and SSM images, and clinical information. Validation analysis revealed that the AUCs of DLRP were 0.91 for GEV (95% CI 0.90 to 0.93, p < 0.05) and 0.88 for HRV (95% CI 0.86 to 0.89, p < 0.01), which were significantly and robustly better than canonical risk indicators, including the value of LSM and SSM. Moreover, DLRP was better than the model using individual parameters, including LSM and SSM images. In HRV prediction, the 2D-SWE images of SSM outperformed those of LSM (p < 0.01).
Conclusion: DLRP shows excellent performance in predicting GEV and HRV over canonical risk indicators LSM and SSM. Additionally, the 2D-SWE images of SSM provided more information for better accuracy in predicting HRV than the LSM.
Submitted 12 June, 2023;
originally announced June 2023.
-
Fomite transmission and disinfection strategies for SARS-CoV-2 and related viruses
Authors:
Nicolas Castaño,
Seth Cordts,
Myra Kurosu Jalil,
Kevin Zhang,
Saisneha Koppaka,
Alison Bick,
Rajorshi Paul,
Sindy KY Tang
Abstract:
Contaminated objects or surfaces, referred to as fomites, play a critical role in the spread of viruses, including SARS-CoV-2, the virus responsible for the COVID-19 pandemic. The long persistence of viruses (hours to days) on surfaces creates an urgent need for surface disinfection strategies to intercept virus transmission and the spread of the disease. Elucidating the physicochemical processes and surface science underlying the adsorption and transfer of virus between surfaces, as well as their inactivation, is important in understanding how the disease is transmitted, and in developing effective interception strategies. This review aims to summarize the current knowledge and underlying physicochemical processes of virus transmission, in particular via fomites, and common disinfection approaches. Gaps in knowledge and needs for further research are also identified. The review focuses on SARS-CoV-2, but will supplement the discussions with related viruses.
Submitted 22 May, 2020;
originally announced May 2020.
-
Impact of Temperature and Relative Humidity on the Transmission of COVID-19: A Modeling Study in China and the United States
Authors:
Jingyuan Wang,
Ke Tang,
Kai Feng,
Xin Li,
Weifeng Lv,
Kun Chen,
Fei Wang
Abstract:
Objectives: We aim to assess the impact of temperature and relative humidity on the transmission of COVID-19 across communities after accounting for community-level factors such as demographics, socioeconomic status, and human mobility status. Design: A retrospective cross-sectional regression analysis via the Fama-MacBeth procedure is adopted. Setting: We use the data for COVID-19 daily symptom-onset cases for 100 Chinese cities and COVID-19 daily confirmed cases for 1,005 U.S. counties. Participants: A total of 69,498 cases in China and 740,843 cases in the U.S. are used for calculating the effective reproductive numbers. Primary outcome measures: Regression analysis of the impact of temperature and relative humidity on the effective reproductive number (R value). Results: Statistically significant negative correlations are found between temperature/relative humidity and the effective reproductive number (R value) in both China and the U.S. Conclusions: Higher temperature and higher relative humidity potentially suppress the transmission of COVID-19. Specifically, an increase in temperature by 1 degree Celsius is associated with a reduction in the R value of COVID-19 by 0.026 (95% CI [-0.0395,-0.0125]) in China and by 0.020 (95% CI [-0.0311, -0.0096]) in the U.S.; an increase in relative humidity by 1% is associated with a reduction in the R value by 0.0076 (95% CI [-0.0108,-0.0045]) in China and by 0.0080 (95% CI [-0.0150,-0.0010]) in the U.S. Therefore, the potential impact of temperature/relative humidity on the effective reproductive number alone is not strong enough to stop the pandemic.
Submitted 30 May, 2021; v1 submitted 9 March, 2020;
originally announced March 2020.
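The conclusion of the temperature and humidity abstract above, that weather alone is not strong enough to stop the pandemic, can be made concrete with a little arithmetic on the quoted point estimates. The sketch below assumes the per-unit effects extrapolate linearly and add together, which the abstract does not claim; the 10-unit changes and the function name `projected_r_reduction` are hypothetical illustrations.

```python
# Point estimates quoted in the abstract: reduction in the effective
# reproductive number R per 1 degree Celsius and per 1% relative humidity.
EFFECTS = {
    "China": {"per_deg_c": 0.026, "per_rh_pct": 0.0076},
    "U.S.": {"per_deg_c": 0.020, "per_rh_pct": 0.0080},
}


def projected_r_reduction(region, delta_temp_c, delta_rh_pct):
    """Projected reduction in R for given increases in temperature and humidity.

    Assumption: effects are linear and additive (a simplification made here
    for illustration, not a result from the study).
    """
    effect = EFFECTS[region]
    return effect["per_deg_c"] * delta_temp_c + effect["per_rh_pct"] * delta_rh_pct


# Hypothetical scenario: 10 degrees Celsius warmer and 10 points more humid.
for region in EFFECTS:
    print(region, round(projected_r_reduction(region, 10, 10), 3))
```

Under these assumptions, even a large seasonal shift lowers R by only about 0.3, so an outbreak with R well above 1.3 would still be growing, which is consistent with the abstract's closing statement.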
-
Alignment-Free Sequence Analysis and Applications
Authors:
Jie Ren,
Xin Bai,
Yang Young Lu,
Kujin Tang,
Ying Wang,
Gesine Reinert,
Fengzhu Sun
Abstract:
Genome and metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pose significant challenges for alignment-based approaches due to the huge data size and the relatively short length of the reads. Alignment-free approaches based on the counts of word patterns in NGS data do not depend on the complete genome and are generally computationally efficient. Thus, they contribute significantly to genome and metagenome comparison. Recently, novel statistical approaches have been developed for the comparison of both long and shotgun sequences. These approaches have been applied to many problems including the comparison of gene regulatory regions, genome sequences, metagenomes, binning contigs in metagenomic data, identification of virus-host interactions, and detection of horizontal gene transfers. We provide an updated review of these applications and other related developments of word-count based approaches for alignment-free sequence analysis.
Submitted 26 March, 2018;
originally announced March 2018.
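The word-count idea in the alignment-free abstract above can be sketched in a few lines. The example counts overlapping words (k-mers) in two sequences and computes D2, the inner product of the two count vectors, which is one of the simplest word-count statistics. It is shown only as a minimal illustration; the review covers more refined statistics that this sketch does not implement, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """D2 statistic: sum over words of (count in seq_a) * (count in seq_b)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so they contribute nothing.
    return sum(n * counts_b[word] for word, n in counts_a.items())


# Toy sequences: no alignment is computed, only shared word counts.
print(d2("ACGTACGT", "ACGTTTTT", k=2))
```

Because only word counts are compared, the same functions work on unassembled reads as well as on complete sequences, which is the property the abstract highlights.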
-
Transcriptional Similarity in Couples Reveals the Impact of Shared Environment and Lifestyle on Gene Regulation through Modified Cytosines
Authors:
Ke Tang,
Wei Zhang
Abstract:
Gene expression is a complex and quantitative trait that is influenced by both genetic and non-genetic regulators including environmental factors. Evaluating the contribution of environment to gene expression regulation and identifying which genes are more likely to be influenced by environmental factors are important for understanding human complex traits. We hypothesize that by living together as couples, there can be commonly co-regulated genes that may reflect the shared living environment (e.g., diet, indoor air pollutants, behavioral lifestyle). The lymphoblastoid cell lines (LCLs) derived from unrelated couples of African ancestry (YRI, Yoruba people from Ibadan, Nigeria) from the International HapMap Project provided a unique model for us to characterize gene expression patterns in couples by comparing gene expression levels between husbands and wives. Strikingly, 778 genes were found to show much smaller variances in couples than random pairs of individuals at a false discovery rate (FDR) of 5%. Since genetic variation between unrelated family members in a general population is expected to be the same assuming a random-mating society, non-genetic factors (e.g., epigenetic systems) are more likely to be the mediators for the observed transcriptional similarity in couples. We thus evaluated the contribution of modified cytosines to those genes showing transcriptional similarity in couples as well as the relationships of these CpG sites with other gene regulatory elements, such as transcription factor binding sites (TFBS). Our findings suggested that transcriptional similarity in couples likely reflected shared common environment partially mediated through cytosine modifications.
Submitted 24 May, 2016;
originally announced May 2016.
-
Automatic landmark annotation and dense correspondence registration for 3D human facial images
Authors:
Jianya Guo,
Xi Mei,
Kun Tang
Abstract:
Dense surface registration of three-dimensional (3D) human facial images holds great potential for studies of human trait diversity, disease genetics, and forensics. Non-rigid registration is particularly useful for establishing dense anatomical correspondences between faces. Here we describe a novel non-rigid registration method for fully automatic 3D facial image mapping. This method comprises two steps: first, seventeen facial landmarks are automatically annotated, mainly via PCA-based feature recognition following 3D-to-2D data transformation. Second, an efficient thin-plate spline (TPS) protocol is used to establish the dense anatomical correspondence between facial images, under the guidance of the predefined landmarks. We demonstrate that this method is robust and highly accurate, even for different ethnicities. The average face is calculated for individuals of Han Chinese and Uyghur origins. While fully automatic and computationally efficient, this method enables high-throughput analysis of human facial feature variation.
Submitted 19 December, 2012;
originally announced December 2012.