-
Module control of network analysis in psychopathology
Authors:
Chunyu Pan,
Quan Zhang,
Yue Zhu,
Shengzhou Kong,
Juan Liu,
Changsheng Zhang,
Fei Wang,
Xizhe Zhang
Abstract:
The network approach to characterizing psychopathology departs from traditional latent categorical and dimensional approaches. Causal interplay among symptoms contributes to a dynamic psychopathology system. Therefore, analyzing symptom clusters is critical for understanding mental disorders. Furthermore, despite extensive research on the topological features of symptom networks, the control relationships between symptoms remain largely unclear. Here, we present a novel systematizing concept, module control, to analyze the control principle of the symptom network at a module level. We introduce the Module Control Network (MCN) to identify key modules that regulate the network's behavior. By applying our approach to a multivariate psychological dataset, we discover that non-emotional modules, such as sleep-related and stress-related modules, are the primary controlling modules in the symptom network. Our findings indicate that module control can expose the central symptom clusters governing the psychopathology network, offering novel insights into the underlying mechanisms of mental disorders and an individualized approach to psychological interventions.
Submitted 30 May, 2024;
originally announced July 2024.
-
Interpretable Online Network Dictionary Learning for Inferring Long-Range Chromatin Interactions
Authors:
Vishal Rana,
Jianhao Peng,
Chao Pan,
Hanbaek Lyu,
Albert Cheng,
Minji Kim,
Olgica Milenkovic
Abstract:
Dictionary learning (DL) is commonly used in computational biology to tackle ubiquitous clustering problems due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability and are not optimized for large-scale graph-structured data. We propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL) that can handle extremely large datasets and enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to network-structured data via specialized subnetwork sampling techniques.
To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data to identify important long-range interaction patterns. ChIA-Drop probes higher-order interactions, and produces hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis creates an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Using dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions.
Our results offer two key insights. First, we demonstrate that online cvxNDL retains the accuracy of classical DL methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology enrichment analysis and perform RNA coexpression studies.
Submitted 16 December, 2023;
originally announced December 2023.
-
Semi-Quantitative Group Testing for Efficient and Accurate qPCR Screening of Pathogens with a Wide Range of Loads
Authors:
Ananthan Nambiar,
Chao Pan,
Vishal Rana,
Mahdi Cheraghchi,
João Ribeiro,
Sergei Maslov,
Olgica Milenkovic
Abstract:
Pathogenic infections pose a significant threat to global health, affecting millions of people every year and presenting substantial challenges to healthcare systems worldwide. Efficient and timely testing plays a critical role in disease control and transmission prevention. Group testing is a well-established method for reducing the number of tests needed to screen large populations when the disease prevalence is low. However, it does not fully utilize the quantitative information provided by qPCR methods, nor is it able to accommodate a wide range of pathogen loads. To address these issues, we introduce a novel adaptive semi-quantitative group testing (SQGT) scheme to efficiently screen populations via two-stage qPCR testing. The SQGT method quantizes cycle threshold ($Ct$) values into multiple bins, leveraging the information from the first stage of screening to improve the detection sensitivity. Dynamic $Ct$ threshold adjustments mitigate dilution effects and enhance test accuracy. Comparisons with traditional binary outcome GT methods show that SQGT reduces the number of tests by 24% while maintaining a negligible false negative rate.
Submitted 2 August, 2023; v1 submitted 30 July, 2023;
originally announced July 2023.
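The two ideas named in the abstract above, quantizing $Ct$ values into bins and adjusting thresholds for dilution, can be illustrated with a minimal sketch. This is not the authors' SQGT scheme: the bin thresholds, the function names, and the assumption of ideal amplification (each cycle doubles the template, so pooling n samples with one positive delays $Ct$ by about log2(n) cycles) are all hypothetical choices made for illustration.

```python
import math
from bisect import bisect_right

def dilution_shift(pool_size, efficiency=1.0):
    """Cycles by which Ct is delayed when one positive sample is diluted
    into a pool of `pool_size` samples.

    Assumption: each cycle multiplies the template by (1 + efficiency),
    so efficiency=1.0 means perfect doubling per cycle.
    """
    return math.log(pool_size) / math.log(1.0 + efficiency)

def pooled_ct(individual_ct, pool_size, efficiency=1.0):
    """Expected Ct of a pool containing exactly one positive sample."""
    return individual_ct + dilution_shift(pool_size, efficiency)

def quantize_ct(ct, thresholds):
    """Map a Ct value to a bin index.

    `thresholds` is an ascending list of hypothetical cut-offs.
    Bin 0 is the lowest Ct (heaviest load); the last bin,
    len(thresholds), holds everything at or above the final cut-off.
    """
    return bisect_right(thresholds, ct)

def adjusted_thresholds(thresholds, pool_size, efficiency=1.0):
    """Shift every cut-off by the dilution delay so that a pooled Ct is
    binned as the undiluted sample would have been (illustrative only)."""
    shift = dilution_shift(pool_size, efficiency)
    return [t + shift for t in thresholds]

if __name__ == "__main__":
    cutoffs = [25.0, 32.0, 38.0]           # hypothetical bin edges
    ct_pool = pooled_ct(31.0, pool_size=8)  # one positive among 8 samples
    print(quantize_ct(ct_pool, cutoffs))                          # naive binning
    print(quantize_ct(ct_pool, adjusted_thresholds(cutoffs, 8)))  # dilution-aware
```

In this toy setup the pooled sample lands in a weaker-load bin under the fixed cut-offs and returns to its original bin once the cut-offs are shifted, which is the kind of effect a dynamic threshold adjustment is meant to correct.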
-
Deep neural network improves the estimation of polygenic risk scores for breast cancer
Authors:
Adrien Badré,
Li Zhang,
Wellington Muchero,
Justin C. Reynolds,
Chongle Pan
Abstract:
Polygenic risk scores (PRS) estimate the genetic risk of an individual for a complex disease based on many genetic variants across the whole genome. In this study, we compared a series of computational models for estimation of breast cancer PRS. A deep neural network (DNN) was found to outperform alternative machine learning techniques and established statistical algorithms, including BLUP, BayesA and LDpred. In the test cohort with 50% prevalence, the Area Under the receiver operating characteristic Curve (AUC) was 67.4% for DNN, 64.2% for BLUP, 64.5% for BayesA, and 62.4% for LDpred. BLUP, BayesA, and LDpred all generated PRS that followed a normal distribution in the case population. However, the PRS generated by DNN in the case population followed a bi-modal distribution composed of two normal distributions with distinctly different means. This suggests that DNN was able to separate the case population into a high-genetic-risk case sub-population with an average PRS significantly higher than the control population and a normal-genetic-risk case sub-population with an average PRS similar to the control population. This allowed DNN to achieve 18.8% recall at 90% precision in the test cohort with 50% prevalence, which can be extrapolated to 65.4% recall at 20% precision in a general population with 12% prevalence. Interpretation of the DNN model identified salient variants that were assigned insignificant p-values by association studies, but were important for DNN prediction. These variants may be associated with the phenotype through non-linear relationships.
Submitted 24 July, 2023;
originally announced July 2023.
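The abstract above reports precision and recall at two different prevalences. The sketch below shows, under stated assumptions, why precision depends on prevalence even when the classifier and its score threshold are unchanged. It uses only the standard identity precision = TP / (TP + FP); the function names are hypothetical. It does not reproduce the abstract's 65.4% recall at 20% precision, which corresponds to a different operating threshold that cannot be recovered from the abstract alone.

```python
def fpr_from_precision(recall, precision, prevalence):
    """Recover the false-positive rate implied by a (recall, precision)
    pair measured at a given prevalence.

    From precision = r*p / (r*p + f*(1-p)), solving for f gives
    f = r*p*(1-precision) / (precision*(1-p)).
    """
    return recall * prevalence * (1.0 - precision) / (precision * (1.0 - prevalence))

def precision_at_prevalence(recall, fpr, prevalence):
    """Precision of the same threshold (same recall and FPR) when the
    share of true cases in the population is `prevalence`."""
    true_pos = recall * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    # Operating point quoted in the abstract: 18.8% recall at 90% precision
    # in a cohort with 50% prevalence.
    fpr = fpr_from_precision(recall=0.188, precision=0.90, prevalence=0.50)
    # Same threshold applied to a population with 12% prevalence.
    print(round(precision_at_prevalence(0.188, fpr, 0.12), 3))
```

Recall and false-positive rate are properties of the classifier at a fixed threshold, so they carry over between populations; precision does not, which is why an extrapolation to the general population has to restate both numbers.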
-
NetMoST: A network-based machine learning approach for subtyping schizophrenia using polygenic SNP allele biomarkers
Authors:
Xinru Wei,
Shuai Dong,
Zhao Su,
Lili Tang,
Pengfei Zhao,
Chunyu Pan,
Fei Wang,
Yanqing Tang,
Weixiong Zhang,
Xizhe Zhang
Abstract:
Subtyping neuropsychiatric disorders like schizophrenia is essential for improving the diagnosis and treatment of complex diseases. Subtyping schizophrenia is challenging because it is polygenic and genetically heterogeneous, rendering the standard symptom-based diagnosis often unreliable and unrepeatable. We developed a novel network-based machine-learning approach, netMoST, for subtyping psychiatric disorders. NetMoST identifies polygenic risk SNP-allele modules from genome-wide genotyping data as polygenic haplotype biomarkers (PHBs) for disease subtyping. We applied netMoST to subtype a cohort of schizophrenia subjects into three distinct biotypes with differentiable genetic, neuroimaging and functional characteristics. The PHBs of the first biotype (36.9% of all patients) were related to neurodevelopment and cognition, the PHBs of the second biotype (28.4%) were enriched for neuroimmune functions, and the PHBs of the third biotype (34.7%) were associated with the transport of calcium ions and neurotransmitters. Neuroimaging patterns provided additional support to the new biotypes, with unique regional homogeneity (ReHo) patterns observed in the brains of each biotype compared with healthy controls. Our findings demonstrated netMoST's capability for uncovering novel biotypes of complex diseases such as schizophrenia. The results also showed the power of exploring polygenic allelic patterns that transcend the conventional GWAS approaches.
Submitted 10 March, 2023; v1 submitted 31 January, 2023;
originally announced February 2023.
-
Identification of cancer-keeping genes as therapeutic targets by finding network control hubs
Authors:
Xizhe Zhang,
Chunyu Pan,
Xinru Wei,
Meng Yu,
Shuangjie Liu,
Jun An,
Jieping Yang,
Baojun Wei,
Wenjun Hao,
Yang Yao,
Yuyan Zhu,
Weixiong Zhang
Abstract:
Finding cancer driver genes has been a focal theme of cancer research and clinical studies. One of the recent approaches is based on network structural controllability that focuses on finding a control scheme and driver genes that can steer the cell from an arbitrary state to a designated state. While theoretically sound, this approach is impractical for many reasons, e.g., the control scheme is often not unique and half of the nodes may be driver genes for the cell. We developed a novel approach that transcends structural controllability. Instead of considering driver genes for one control scheme, we considered control hub genes that reside in the middle of a control path of every control scheme. Control hubs are the most vulnerable spots for controlling the cell and exogenous stimuli on them may render the cell uncontrollable. We adopted control hubs as cancer-keeping genes (CKGs) and applied them to a gene regulatory network of bladder cancer (BLCA). All the genes on the cell cycle and p53 signaling pathways in BLCA are CKGs, confirming the importance of these genes and the two pathways in cancer. A smaller set of 35 sensitive CKGs (sCKGs) for BLCA was identified by removing network links. Six sCKGs (RPS6KA3, FGFR3, N-cadherin (CDH2), EP300, caspase-1, and FN1) were subjected to small-interfering-RNA knockdown in four cell lines to validate their effects on the proliferation or migration of cancer cells. Knocking down RPS6KA3 in a mouse model of BLCA significantly inhibited the growth of tumor xenografts. Combined, our results demonstrated the value of CKGs as therapeutic targets for cancer therapy and the potential of CKGs as an effective means for studying and characterizing cancer etiology.
Submitted 13 June, 2022;
originally announced June 2022.
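The abstract above builds on network structural controllability. As background only, the sketch below shows the classic maximum-matching construction from that framework (Liu, Slotine and Barabási, 2011): nodes left unmatched by a maximum matching of the directed network form one minimum set of driver nodes. It is not the authors' control-hub algorithm, and the function names and toy networks are hypothetical. Because a maximum matching is generally not unique, different matchings yield different driver sets, which is the non-uniqueness of control schemes the abstract refers to.

```python
def maximum_matching(n_nodes, edges):
    """Maximum matching of a directed graph viewed as a bipartite graph
    (out-copy of u -> in-copy of v), found with augmenting paths.

    Returns match_in, where match_in[v] is the node u whose edge u -> v
    is in the matching, or -1 if v is unmatched.
    """
    adjacency = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adjacency[u].append(v)
    match_in = [-1] * n_nodes

    def try_augment(u, visited):
        # Try to match u, re-routing earlier matches where necessary.
        for v in adjacency[u]:
            if v in visited:
                continue
            visited.add(v)
            if match_in[v] == -1 or try_augment(match_in[v], visited):
                match_in[v] = u
                return True
        return False

    for u in range(n_nodes):
        try_augment(u, set())
    return match_in

def driver_nodes(n_nodes, edges):
    """One minimum driver-node set: the nodes with no matched incoming edge.

    If the matching is perfect, a single driver suffices; node 0 is
    returned as an arbitrary choice in that case.
    """
    match_in = maximum_matching(n_nodes, edges)
    unmatched = [v for v in range(n_nodes) if match_in[v] == -1]
    return unmatched if unmatched else [0]

if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]  # toy regulatory cascade
    star = [(0, 1), (0, 2), (0, 3)]   # one regulator, three targets
    print(driver_nodes(4, chain))
    print(driver_nodes(4, star))
```

The recursive augmenting-path search is adequate for small illustrative networks; a genome-scale regulatory network would call for an iterative or Hopcroft-Karp implementation.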
-
Total controllability analysis discovers explainable drugs for Covid-19 treatment
Authors:
Xinru Wei,
Chunyu Pan,
Xizhe Zhang,
Weixiong Zhang
Abstract:
Network medicine has been pursued for Covid-19 drug repurposing. One such approach adopts structural controllability, a theory for controlling a network (the cell). Motivated to protect the cell from viral infections, we extended this theory to total controllability and introduced a new concept of control hubs. Perturbation to any control hub renders the cell uncontrollable by exogenous stimuli, e.g., viral infections, so control hubs are ideal drug targets. We developed an efficient algorithm for finding all control hubs and applied it to the largest homogeneous human protein-protein interaction network. Our new method outperforms several popular gene-selection methods, including that based on structural controllability. The final 65 druggable control hubs are enriched with functions of cell proliferation, regulation of apoptosis, and responses to cellular stress and nutrient levels, revealing critical pathways induced by SARS-CoV-2. These druggable control hubs led to drugs in 4 major categories: antiviral and anti-inflammatory agents, drugs acting on the central nervous system, and dietary supplements and hormones that boost immunity. Their functions also provided deep insights into the therapeutic mechanisms of the drugs for Covid-19 therapy, making the new approach an explainable drug repurposing method. A remarkable example is Fostamatinib, which has been shown to lower mortality, shorten the length of ICU stay, and reduce disease severity of hospitalized Covid-19 patients. The drug targets 10 control hubs, 9 of which are kinases that play key roles in cell differentiation and programmed death. One such kinase is RIPK1, which directly interacts with viral protein nsp12, the RdRp of the virus. The study produced many control hubs that were not targets of existing drugs but were enriched with proteins on membranes and the NF-κB pathway, so are excellent candidate targets for new drugs.
Submitted 1 June, 2023; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Improvement of Resting-state EEG Analysis Process with Spectrum Weight-Voting based on LES
Authors:
Yumeng Ye,
Haichun Liu,
TianHong Zhang,
Changchun Pan,
Genke Yang,
JiJun Wang,
Robert C. Qiu
Abstract:
EEG is a non-invasive technique for recording brain bioelectric activity, with potential applications in fields such as human-computer interaction and neuroscience. However, EEG data are difficult to analyze because of their complex composition, low amplitude, and low signal-to-noise ratio. Some existing analysis methods use feature extraction and machine learning to determine the phase of schizophrenia to which a sample belongs. Medical research, however, requires machine learning not only to give more accurate classification results but also to give results that can be applied to pathological studies. The main purpose of this study is to obtain weight values that represent the influence of each frequency band on the classification of schizophrenia phases, building on a more effective classification method that uses LES feature extraction; the weight values are then processed and applied to improve the accuracy of machine learning classification. We propose a method called weight-voting, which obtains the weights of sub-band features by using classification results as votes to fit the actual categories of the EEG data, and then uses the weights for reclassification. Through this method, we can first obtain the influence of each band in distinguishing three schizophrenia phases and analyze the effect of band features on the risk of schizophrenia, contributing to the study of psychopathology. Our results show a high correlation between the change in weight of the low gamma band and the differences between HC, CHR and FES. When the features revised according to the weights are used for reclassification, accuracy improves compared with the original classifier, which confirms the role of the band weight distribution.
Submitted 17 January, 2018; v1 submitted 20 December, 2017;
originally announced December 2017.
-
A Data Driven Approach for Resting-state EEG signal Classification of Schizophrenia with Control Participants using Random Matrix Theory
Authors:
Haichun Liu,
TianHong Zhang,
Yumeng Ye,
Changchun Pan,
Genke Yang,
JiJun Wang,
Robert C. Qiu
Abstract:
Resting-state electroencephalogram (EEG) abnormalities in clinically high-risk individuals (CHR), clinically stable first-episode patients with schizophrenia (FES), and healthy controls (HC) suggest alterations in neural oscillatory activity. However, few studies directly compare these anomalies across the groups. Therefore, this study investigated whether these electrophysiological characteristics differentiate clinical populations from one another and from non-psychiatric controls. To address this question, resting EEG power and coherence were assessed in 40 CHR individuals, 40 FES patients, and 40 HC. The findings suggest that resting EEG can be a sensitive measure for differentiating between clinical disorders. This paper proposes a novel data-driven supervised learning method to identify the patient's mental status in schizophrenia research. According to the Marchenko-Pastur law, the distribution of the eigenvalues of EEG data is divided into a signal subspace and a noise subspace. A test statistic named LES, which embodies the characteristics of all eigenvalues, is adopted. Different classifiers and different features (LES test functions) were selected for experiments; we show that using von Neumann entropy as the LES test function combined with an SVM classifier obtains the best average classification accuracy in three-class classification among the HC, FES, and CHR groups using EEG signals. It is worth noting that LES feature extraction achieves its highest classification accuracy of around 90% in two-class classification (HC versus FES) and around 70% in three-class classification. A classification accuracy higher than 70% could be used to assist clinical diagnosis.
Submitted 17 January, 2018; v1 submitted 13 December, 2017;
originally announced December 2017.
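The feature pipeline described in the abstract above (Marchenko-Pastur bounds to separate signal from noise eigenvalues, then a statistic summed over all eigenvalues) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: LES is taken here to be a sum of a test function over the eigenvalues, the von Neumann entropy test function is taken as -x log x, the eigenvalues are assumed to come from a sample covariance matrix of standardized data with fewer channels than time samples, and all function names are hypothetical.

```python
import math

def marchenko_pastur_bounds(n_channels, n_samples, sigma2=1.0):
    """Support edges of the Marchenko-Pastur law for a sample covariance
    matrix built from pure noise of variance sigma2.

    Assumes n_channels <= n_samples, so the ratio c is at most 1.
    """
    c = n_channels / n_samples
    lower = sigma2 * (1.0 - math.sqrt(c)) ** 2
    upper = sigma2 * (1.0 + math.sqrt(c)) ** 2
    return lower, upper

def split_eigenvalues(eigenvalues, upper_edge):
    """Eigenvalues above the noise edge are treated as signal, the rest as noise."""
    signal = [x for x in eigenvalues if x > upper_edge]
    noise = [x for x in eigenvalues if x <= upper_edge]
    return signal, noise

def von_neumann_term(x):
    """Assumed test function: -x log x, with the limit 0 at x = 0."""
    return -x * math.log(x) if x > 0.0 else 0.0

def les(eigenvalues, test_function):
    """Eigenvalue statistic: the test function summed over all eigenvalues."""
    return sum(test_function(x) for x in eigenvalues)

if __name__ == "__main__":
    # Hypothetical covariance eigenvalues for 4 channels and 16 time samples.
    eigs = [0.3, 0.8, 1.4, 3.1]
    lower, upper = marchenko_pastur_bounds(n_channels=4, n_samples=16)
    signal, noise = split_eigenvalues(eigs, upper)
    print(signal, noise)
    print(les(eigs, von_neumann_term))  # one scalar feature for a classifier
```

In a full pipeline the scalar returned by `les` would be computed per EEG segment and passed to a classifier such as the SVM mentioned in the abstract; computing the covariance eigenvalues themselves is left out to keep the sketch dependency-free.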
-
Intuitive representation of surface properties of biomolecules using BioBlender
Authors:
Raluca Mihaela Andrei,
Marco Callieri,
Maria Francesca Zini,
Tiziana Loni,
Giuseppe Maraziti,
Mike Chen Pan,
Monica Zoppè
Abstract:
In this and the associated article 'BioBlender: Fast and Efficient All Atom Morphing of Proteins Using Blender Game Engine', by Zini et al., we present BioBlender, a complete instrument for the elaboration of motion (Zini et al.) and the visualization (here) of proteins and other macromolecules, using instruments of computer graphics. The availability of protein structures enables the study of their surfaces and surface properties such as electrostatic potential (EP) and hydropathy (MLP), based on atomic contribution. Recent advances in 3D animation and rendering software have not yet been exploited for the representation of proteins and other biological molecules in an intuitive, animated form. Taking advantage of an open-source, 3D animation and rendering software, Blender, we developed BioBlender, a package dedicated to biological work: elaboration of proteins' motions with the simultaneous visualization of chemical and physical features. EP and MLP are calculated using physico-chemical programs and custom programs and scripts, organized and accessed within BioBlender interface. A new visual code is introduced for MLP visualization: a range of optical features that permits a photorealistic rendering of its spatial distribution on the surface of the protein. EP is represented as animated line particles that flow along field lines proportional to the total charge of the protein. Our system permits EP and MLP visualization of molecules and, in the case of moving proteins, the continuous perception of these features, calculated for each intermediate conformation. Using real world tactile/sight feelings, the nanoscale world of proteins becomes more understandable, familiar to our everyday life, making it easier to introduce "un-seen" phenomena (concepts) such as hydropathy or charges.
Submitted 27 June, 2012; v1 submitted 23 September, 2010;
originally announced September 2010.