Search | arXiv e-print repository

Heterogeneous graph attention network improves cancer multiomics integration

Authors: Sina Tabakhi, Charlotte Vandermeulen, Ian Sudbery, Haiping Lu

Abstract: The increase in high-dimensional multiomics data demands advanced integration models to capture the complexity of human diseases. Graph-based deep learning integration models, despite their promise, struggle with small patient cohorts and high-dimensional features, often applying independent feature selection without modeling relationships among omics. Furthermore, conventional graph-based omics models focus on homogeneous graphs, lacking multiple types of nodes and edges to capture diverse structures. We introduce a Heterogeneous Graph ATtention network for omics integration (HeteroGATomics) to improve cancer diagnosis. HeteroGATomics performs joint feature selection through a multi-agent system, creating dedicated networks of feature and patient similarity for each omic modality. These networks are then combined into one heterogeneous graph for learning holistic omic-specific representations and integrating predictions across modalities. Experiments on three cancer multiomics datasets demonstrate HeteroGATomics' superior performance in cancer diagnosis. Moreover, HeteroGATomics enhances interpretability by identifying important biomarkers contributing to the diagnosis outcomes.

Submitted 5 August, 2024; originally announced August 2024.

Comments: 29 pages, 13 figures

arXiv:2407.13637 [pdf]

DREAM: a biomedical data-driven self-evolving autonomous research system

Authors: Luojia Deng, Yijie Wu, Yongyong Ren, Hui Lu

Abstract: In contemporary biomedical research, the efficiency of data-driven approaches is hindered by large data volumes, tool selection complexity, and human resource limitations, necessitating the development of fully autonomous research systems to meet complex analytical needs. Such a system should include the ability to autonomously generate research questions, write analytical code, configure the computational environment, judge and interpret the results, and iteratively generate in-depth questions or solutions, all without human intervention. Here we developed DREAM, the first biomedical Data-dRiven self-Evolving Autonomous systeM, which can independently conduct scientific research without human involvement. Utilizing a clinical dataset and two omics datasets, DREAM demonstrated its ability to raise and deepen scientific questions, with difficulty scores for clinical data questions surpassing top published articles by 5.7% and outperforming GPT-4 and bioinformatics graduate students by 58.6% and 56.0%, respectively. Overall, DREAM has a success rate of 80% in autonomous clinical data mining. Humans can also participate in different steps of DREAM to achieve more personalized goals. After evolution, 10% of the questions exceeded the average scores of top published article questions on originality and complexity. In the autonomous environment configuration of the eight bioinformatics workflows, DREAM exhibited an 88% success rate, whereas GPT-4 failed to configure any workflows. On the clinical dataset, DREAM was over 10,000 times more efficient than the average scientist with a single computer core, and capable of revealing new discoveries. As a self-evolving autonomous research system, DREAM provides an efficient and reliable solution for future biomedical research. This paradigm may also have a revolutionary impact on other data-driven scientific research fields.

Submitted 18 July, 2024; originally announced July 2024.

Comments: 11 pages, 4 figures

arXiv:2407.13037 [pdf, other]

Dispersion Relations for Active Undulators in Overdamped Environments

Authors: Christopher J. Pierce, Daniel Irvine, Lucinda Peng, Xuefei Lu, Hang Lu, Daniel I. Goldman

Abstract: Organisms that locomote by propagating waves of body bending can maintain performance across heterogeneous environments by modifying their gait frequency $ω$ or wavenumber $k$. We identify a unifying relationship between these parameters for overdamped undulatory swimmers (including nematodes, spermatozoa, and mm-scale fish) moving in diverse environmental rheologies, in the form of an active `dispersion relation' $ω\propto k^{\pm2}$. A model treating the organisms as actively driven viscoelastic beams reproduces the experimentally observed scaling. The relative strength of rate-dependent dissipation in the body and the environment determines whether $k^2$ or $k^{-2}$ scaling is observed. The existence of these scaling regimes reflects the $k$ and $ω$ dependence of the various underlying force terms and how their relative importance changes with the external environment and the neuronally commanded gait.

Submitted 17 July, 2024; originally announced July 2024.

arXiv:2404.05781 [pdf, other]

Group-specific discriminant analysis reveals statistically validated sex differences in lateralization of brain functional network

Authors: Shuo Zhou, Junhao Luo, Yaya Jiang, Haolin Wang, Haiping Lu, Gaolang Gong

Abstract: Lateralization is a fundamental feature of the human brain, where sex differences have been observed. Conventional studies in neuroscience on sex-specific lateralization are typically conducted on univariate statistical comparisons between male and female groups. However, these analyses often lack effective validation of group specificity. Here, we formulate modeling sex differences in lateralization of functional networks as a dual-classification problem, consisting of first-order classification for left vs. right functional networks and second-order classification for male vs. female models. To capture sex-specific patterns, we develop the Group-Specific Discriminant Analysis (GSDA) for first-order classification. The evaluation on two public neuroimaging datasets demonstrates the efficacy of GSDA in learning sex-specific models from functional networks, achieving a significant improvement in group specificity over baseline methods. The major sex differences are in the strength of lateralization and the interactions within and between lobes. The GSDA-based method is generic in nature and can be adapted to other group-specific analyses such as handedness-specific or disease-specific analyses.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2401.09500 [pdf, other]

MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation

Authors: Nianzu Yang, Kaipeng Zeng, Haotian Lu, Yexin Wu, Zexin Yuan, Danni Chen, Shengdian Jiang, Jiaxiang Wu, Yimin Wang, Junchi Yan

Abstract: Neuronal morphology is essential for studying brain functioning and understanding neurodegenerative disorders. As acquiring real-world morphology data is expensive, computational approaches for morphology generation have been studied. Traditional methods heavily rely on expert-set rules and parameter tuning, making it difficult to generalize across different types of morphologies. Recently, MorphVAE was introduced as the sole learning-based method, but its generated morphologies lack plausibility, i.e., they do not appear realistic enough and most of the generated samples are topologically invalid. To fill this gap, this paper proposes MorphGrower, which mimics the natural growth mechanism of neurons for generation. Specifically, MorphGrower generates morphologies layer by layer, with each subsequent layer conditioned on the previously generated structure. During each layer generation, MorphGrower utilizes a pair of sibling branches as the basic generation block and generates branch pairs synchronously. This approach ensures topological validity and allows for fine-grained generation, thereby enhancing the realism of the final generated morphologies. Results on four real-world datasets demonstrate that MorphGrower outperforms MorphVAE by a notable margin. Importantly, the electrophysiological response simulation demonstrates the plausibility of our generated samples from a neuroscience perspective. Our code is available at https://github.com/Thinklab-SJTU/MorphGrower.

Submitted 27 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.08738 [pdf]

Machine Learning-Based Analysis of Ebola Virus' Impact on Gene Expression in Nonhuman Primates

Authors: Mostafa Rezapour, Muhammad Khalid Khan Niazi, Hao Lu, Aarthi Narayanan, Metin Nafi Gurcan

Abstract: This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis. SMAS effectively combines gene selection based on statistical significance and expression changes, employing linear classifiers such as logistic regression to accurately differentiate between RT-qPCR positive and negative NHP samples. A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6 and IFI27, genes, including MX1, OAS1, and ISG15, were significantly upregulated, highlighting their essential roles in the immune response to EBOV. Our results underscore the efficacy of the SMAS method in revealing complex genetic interactions and response mechanisms during EBOV infection. This research provides valuable insights into EBOV pathogenesis and aids in developing more precise diagnostic tools and therapeutic strategies to address EBOV infection in particular and viral infection in general.

Submitted 22 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

arXiv:2309.05768 [pdf]

The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)

Authors: Russell A. Poldrack, Christopher J. Markiewicz, Stefan Appelhoff, Yoni K. Ashar, Tibor Auer, Sylvain Baillet, Shashank Bansal, Leandro Beltrachini, Christian G. Benar, Giacomo Bertazzoli, Suyash Bhogawar, Ross W. Blair, Marta Bortoletto, Mathieu Boudreau, Teon L. Brooks, Vince D. Calhoun, Filippo Maria Castelli, Patricia Clement, Alexander L Cohen, Julien Cohen-Adad, Sasha D'Ambrosio, Gilles de Hollander, María de la iglesia-Vayá, Alejandro de la Vega, Arnaud Delorme, et al. (89 additional authors not shown)

Abstract: The Brain Imaging Data Structure (BIDS) is a community-driven standard for the organization of data and metadata from a growing range of neuroscience modalities. This paper is meant as a history of how the standard has developed and grown over time. We outline the principles behind the project, the mechanisms by which it has been extended, and some of the challenges being addressed as it evolves. We also discuss the lessons learned through the project, with the aim of enabling researchers in other domains to learn from the success of BIDS.

Submitted 8 January, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2309.00483 [pdf, other]

Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction

Authors: Peizhen Bai, Xianyuan Liu, Haiping Lu

Abstract: Molecular property prediction with deep learning has gained much attention over the past years. Owing to the scarcity of labeled molecules, there has been growing interest in self-supervised learning methods that learn generalizable molecular representations from unlabeled data. Molecules are typically treated as 2D topological graphs in modeling, but it has been discovered that their 3D geometry is of great importance in determining molecular functionalities. In this paper, we propose the Geometry-aware line graph transformer (Galformer) pre-training, a novel self-supervised learning framework that aims to enhance molecular representation learning with 2D and 3D modalities. Specifically, we first design a dual-modality line graph transformer backbone to encode the topological and geometric information of a molecule. The designed backbone incorporates effective structural encodings to capture graph structures from both modalities. Then we devise two complementary pre-training tasks at the inter and intra-modality levels. These tasks provide properly supervised information and extract discriminative 2D and 3D knowledge from unlabeled molecules. Finally, we evaluate Galformer against six state-of-the-art baselines on twelve property prediction benchmarks via downstream fine-tuning. Experimental results show that Galformer consistently outperforms all baselines on both classification and regression tasks, demonstrating its effectiveness.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 9 pages, 5 figures

arXiv:2303.07540 [pdf, other]

doi 10.1007/978-3-031-43990-2_20

Tensor-based Multimodal Learning for Prediction of Pulmonary Arterial Wedge Pressure from Cardiac MRI

Authors: Prasun C. Tripathi, Mohammod N. I. Suvon, Lawrence Schobs, Shuo Zhou, Samer Alabed, Andrew J. Swift, Haiping Lu

Abstract: Heart failure is a serious and life-threatening condition that can lead to elevated pressure in the left ventricle. Pulmonary Arterial Wedge Pressure (PAWP) is an important surrogate marker indicating high pressure in the left ventricle. PAWP is determined by Right Heart Catheterization (RHC) but it is an invasive procedure. A non-invasive method is useful in quickly identifying high-risk patients from a large population. In this work, we develop a tensor learning-based pipeline for identifying PAWP from multimodal cardiac Magnetic Resonance Imaging (MRI). This pipeline extracts spatial and temporal features from high-dimensional scans. For quality control, we incorporate an epistemic uncertainty-based binning strategy to identify poor-quality training samples. To improve the performance, we learn complementary information by integrating features from multimodal data: cardiac MRI with short-axis and four-chamber views, and Electronic Health Records. The experimental analysis on a large cohort of $1346$ subjects who underwent the RHC procedure for PAWP estimation indicates that the proposed pipeline has a diagnostic value and can produce promising performance with significant improvement over the baseline in clinical practice (i.e., $Δ$AUC $=0.10$, $Δ$Accuracy $=0.06$, and $Δ$MCC $=0.39$). The decision curve analysis further confirms the clinical utility of our method.

Submitted 6 April, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

arXiv:2211.16509 [pdf, other]

doi 10.1142/S2811032322500047

Multimodal Learning for Multi-Omics: A Survey

Authors: Sina Tabakhi, Mohammod Naimul Islam Suvon, Pegah Ahadian, Haiping Lu

Abstract: With advanced imaging, sequencing, and profiling technologies, multiple omics data become increasingly available and hold promises for many healthcare applications such as cancer diagnosis and treatment. Multimodal learning for integrative multi-omics analysis can help researchers and practitioners gain deep insights into human diseases and improve clinical decisions. However, several challenges are hindering the development in this area, including the availability of easily accessible open-source tools. This survey aims to provide an up-to-date overview of the data challenges, fusion approaches, datasets, and software tools from several new perspectives. We identify and investigate various omics data challenges that can help us understand the field better. We categorize fusion approaches comprehensively to cover existing methods in this area. We collect existing open-source tools to facilitate their broader utilization and development. We explore a broad range of omics data modalities and a list of accessible datasets. Finally, we summarize future directions that can potentially address existing gaps and answer the pressing need to advance multimodal learning for multi-omics data analysis.

Submitted 19 December, 2022; v1 submitted 29 November, 2022; originally announced November 2022.

Comments: 52 pages, 3 figures; Revised matrix factorization fusion section

arXiv:2208.02194 [pdf, other]

Interpretable bilinear attention network with domain adaptation improves drug-target prediction

Authors: Peizhen Bai, Filip Miljković, Bino John, Haiping Lu

Abstract: Predicting drug-target interaction is key for drug discovery. Recent deep learning-based methods show promising performance but two challenges remain: (i) how to explicitly model and learn local interactions between drugs and targets for better prediction and interpretation; (ii) how to generalize prediction performance on novel drug-target pairs from a different distribution. In this work, we propose DrugBAN, a deep bilinear attention network (BAN) framework with domain adaptation to explicitly learn pair-wise local interactions between drugs and targets, and adapt to out-of-distribution data. DrugBAN works on drug molecular graphs and target protein sequences to perform prediction, with conditional domain adversarial learning to align learned interaction representations across different distributions for better generalization on novel drug-target pairs. Experiments on three benchmark datasets under both in-domain and cross-domain settings show that DrugBAN achieves the best overall performance against five state-of-the-art baselines. Moreover, visualizing the learned bilinear attention map provides interpretable insights from prediction results.

Submitted 19 January, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

Comments: 19 pages, 7 figures

arXiv:2206.02788 [pdf]

doi 10.1073/pnas.2118836119

Accurate Virus Identification with Interpretable Raman Signatures by Machine Learning

Authors: Jiarong Ye, Yin-Ting Yeh, Yuan Xue, Ziyang Wang, Na Zhang, He Liu, Kunyan Zhang, RyeAnne Ricker, Zhuohang Yu, Allison Roder, Nestor Perea Lopez, Lindsey Organtini, Wallace Greene, Susan Hafenstein, Huaguang Lu, Elodie Ghedin, Mauricio Terrones, Shengxi Huang, Sharon Xiaolei Huang

Abstract: Rapid identification of newly emerging or circulating viruses is an important first step toward managing the public health response to potential outbreaks. A portable virus capture device coupled with label-free Raman Spectroscopy holds the promise of fast detection by rapidly obtaining the Raman signature of a virus followed by a machine learning approach applied to recognize the virus based on its Raman spectrum, which is used as a fingerprint. We present such a machine learning approach for analyzing Raman spectra of human and avian viruses. A Convolutional Neural Network (CNN) classifier specifically designed for spectral data achieves very high accuracy for a variety of virus type or subtype identification tasks. In particular, it achieves 99% accuracy for classifying influenza virus type A vs. type B, 96% accuracy for classifying four subtypes of influenza A, 95% accuracy for differentiating enveloped and non-enveloped viruses, and 99% accuracy for differentiating avian coronavirus (infectious bronchitis virus, IBV) from other avian viruses. Furthermore, interpretation of neural net responses in the trained CNN model using a full-gradient algorithm highlights Raman spectral ranges that are most important to virus identification. By correlating ML-selected salient Raman ranges with the signature ranges of known biomolecules and chemical functional groups (for example, amide, amino acid, carboxylic acid), we verify that our ML model effectively recognizes the Raman signatures of proteins, lipids and other vital functional groups present in different viruses and uses a weighted combination of these signatures to identify viruses.

Submitted 5 June, 2022; originally announced June 2022.

Comments: 23 pages, 8 figures

Journal ref: Proceedings of the National Academy of Sciences of the United States of America (2022)

arXiv:2102.09380 [pdf, other]

doi 10.3389/frai.2021.668395

Topological data analysis of C. elegans locomotion and behavior

Authors: Ashleigh Thomas, Kathleen Bates, Alex Elchesen, Iryna Hartsock, Hang Lu, Peter Bubenik

Abstract: We apply topological data analysis to the behavior of C. elegans, a widely-studied model organism in biology. In particular, we use topology to produce a quantitative summary of complex behavior which may be applied to high-throughput data. Our methods allow us to distinguish and classify videos from various environmental conditions and we analyze the trade-off between accuracy and interpretability. Furthermore, we present a novel technique for visualizing the outputs of our analysis in terms of the input. Specifically, we use representative cycles of persistent homology to produce synthetic videos of stereotypical behaviors.

Submitted 21 July, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 27 pages, 15 figures/tables

MSC Class: 55N31 (Primary); 62R40 (Secondary)

Journal ref: Front. Artif. Intell. 4:668395 (2021)

arXiv:2102.04069 [pdf]

Reconstructing large networks with time-varying interactions

Authors: Chun-Wei Chang, Takeshi Miki, Masayuki Ushio, Hsiao-Pei Lu, Fuh-Kwo Shiah, Chih-hao Hsieh

Abstract: Reconstructing interactions from observational data is a critical need for investigating natural biological networks, wherein network dimensionality (i.e. number of interacting components) is usually high and interactions are time-varying. These pose a challenge to existing methods that can quantify only small interaction networks or assume static interactions under steady state. Here, we proposed a novel approach to reconstruct high-dimensional, time-varying interaction networks using empirical time series. This method, named "multiview distance regularized S-map", generalized the state space reconstruction to accommodate high dimensionality and overcome difficulties in quantifying massive interactions with limited data. When we evaluated this method using the time series generated from a large theoretical model involving hundreds of interacting species, estimated interaction strengths were in good agreement with theoretical expectations. As a result, reconstructed networks preserved important topological properties, such as centrality, strength distribution and derived stability measures. Moreover, our method effectively forecasted the dynamic behavior of network nodes. Applying this method to a natural bacterial community helped identify keystone species from the interaction network and revealed the mechanisms governing the dynamical stability of the bacterial community. Our method overcame the challenge of high dimensionality and disentangled complex time-varying interactions in large natural dynamical systems.

Submitted 8 February, 2021; originally announced February 2021.

Comments: The first 28 pages are the main text (including four figures and one table) and followed by SI Texts, including Fig. S1-S12 and Table S1-S2

arXiv:2006.03611 [pdf, other]

Neuropsychiatric Disease Classification Using Functional Connectomics -- Results of the Connectomics in NeuroImaging Transfer Learning Challenge

Authors: Markus D. Schirmer, Archana Venkataraman, Islem Rekik, Minjeong Kim, Stewart H. Mostofsky, Mary Beth Nebel, Keri Rosch, Karen Seymour, Deana Crocetti, Hassna Irzan, Michael Hütel, Sebastien Ourselin, Neil Marlow, Andrew Melbourne, Egor Levchenko, Shuo Zhou, Mwiza Kunda, Haiping Lu, Nicha C. Dvornek, Juntang Zhuang, Gideon Pinto, Sandip Samal, Jennings Zhang, Jorge L. Bernal-Rusiel, Rudolph Pienaar, et al. (1 additional author not shown)

Abstract: Large, open-source consortium datasets have spurred the development of new and increasingly powerful machine learning approaches in brain connectomics. However, one key question remains: are we capturing biologically relevant and generalizable information about the brain, or are we simply overfitting to the data? To answer this, we organized a scientific challenge, the Connectomics in NeuroImaging Transfer Learning Challenge (CNI-TLC), held in conjunction with MICCAI 2019. CNI-TLC included two classification tasks: (1) diagnosis of Attention-Deficit/Hyperactivity Disorder (ADHD) within a pre-adolescent cohort; and (2) transference of the ADHD model to a related cohort of Autism Spectrum Disorder (ASD) patients with an ADHD comorbidity. In total, 240 resting-state fMRI time series averaged according to three standard parcellation atlases, along with clinical diagnosis, were released for training and validation (120 neurotypical controls and 120 ADHD). We also provided demographic information of age, sex, IQ, and handedness. A second set of 100 subjects (50 neurotypical controls, 25 ADHD, and 25 ASD with ADHD comorbidity) was used for testing. Models were submitted in a standardized format as Docker images through ChRIS, an open-source image analysis platform. Utilizing an inclusive approach, we ranked the methods based on 16 different metrics. The final rank was calculated using the rank product for each participant across all measures. Furthermore, we assessed the calibration curves of each method. Five participants submitted their model for evaluation, with one outperforming all other methods in both ADHD and ASD classification. However, further improvements are needed to reach the clinical translation of functional connectomics. We are keeping the CNI-TLC open as a publicly available resource for developing and validating new classification methodologies in the field of connectomics.

Submitted 25 November, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: CNI-TLC was held in conjunction with MICCAI 2019

arXiv:1904.01096 [pdf]

Digging Deeper: Methodologies for High-Content Phenotyping and Knowledge-Abstraction in C. elegans

Authors: Nan Xu, Dhaval S. Patel, Hang Lu

Abstract: Deep phenotyping is an emerging conceptual paradigm and experimental approach that seeks to measure many aspects of phenotypes and link them to understand the underlying biology. Successful deep phenotyping has mostly been applied in cultured cells, less so in multicellular organisms. Recently, however, it has been recognized that such an approach could lead to better understanding of how genetics, the environment, and stochasticity affect development, physiology, and behavior of an organism. Over the last 50 years, the nematode Caenorhabditis elegans has become an invaluable model system for understanding the role of the genes underlying a phenotypic trait. Recent technological innovation has taken advantage of the worm's physical attributes to increase the throughput and informational content of experiments. Coupling these technical advancements with computational or analytical tools has enabled a boom in deep phenotyping studies of C. elegans. In this review, we highlight how these new technologies and tools are digging into the biological origins of complex multidimensional phenotypes seen in the worm.

Submitted 1 April, 2019; originally announced April 2019.

arXiv:1902.10291 [pdf]

Accurate Target Localization by using Artificial Pinnae of brown long-eared bat

Authors: Sen Zhang, Xin Ma, Hongwang Lu, Weikai He, Weidong Zhou

Abstract: Echolocating bats locate targets by echolocation. Many theoretical frameworks have suggested that the abilities of bats are related to the shapes of their ears, but few artificial bat-like ears have been made to mimic these abilities; the difficulty lies in determining the elevation angle of the target. In this study, we present a device with artificial bat pinnae modeled on the ears of the brown long-eared bat (Plecotus auritus), which can accurately estimate the elevation angle of an aerial target by means of active sonar. An artificial neural network, trained and tested on labeled data obtained from echoes, is used and optimized by tenfold cross-validation. A decision method, which we call the sliding window averaging algorithm, is designed to obtain the elevation estimates. Finally, a right-angle pinnae construction is designed to determine the direction of the target. The results show a higher accuracy for the direction determination of a single target. The results also demonstrate that, for the Plecotus auritus bat, not only the binaural shapes but also the binaural relative orientations play important roles in target localization.

Submitted 26 February, 2019; originally announced February 2019.

Comments: 22 pages, 10 figures

arXiv:1802.08565 [pdf]

Additional contributions from: Nobel Symposium 162 - Microfluidics

Authors: Stefan Löfås, Amy E. Herr, Jianhua Qin, Tuomas Knowles, Takehiko Kitamori, Hang Lu, David J. Beebe, Jongyoon Han, James Landers, Andreas Manz, Roland Zengerle, David A. Weitz, Johan Elf, Thomas Laurell

Abstract: Series of short contributions that are part of Nobel Symposium 162 - Microfluidics arXiv:1712.08369.

Submitted 22 February, 2018; originally announced February 2018.

Comments: Nobel Symposium 162, Stockholm, Sweden, 2017 arXiv:1712.08369v1 Report-no: Nobel162/2017/00

Report number: Nobel162/2017/00

arXiv:1710.05722 [pdf, ps, other]

A High Order Stochastic Asymptotic Preserving Scheme for Chemotaxis Kinetic Models with Random Inputs

Authors: Shi Jin, Hanqing Lu, Lorenzo Pareschi

Abstract: In this paper, we develop a stochastic Asymptotic-Preserving (sAP) scheme for the kinetic chemotaxis system with random inputs, which will converge to the modified Keller-Segel model with random inputs in the diffusive regime. Based on the generalized Polynomial Chaos (gPC) approach, we design a high order stochastic Galerkin method using implicit-explicit (IMEX) Runge-Kutta (RK) time discretization with a macroscopic penalty term. The new schemes improve the parabolic CFL condition to a hyperbolic type when the mean free path is small, which shows significant efficiency especially in uncertainty quantification (UQ) with multi-scale problems. The stochastic Asymptotic-Preserving property will be shown asymptotically and verified numerically in several tests. Many other numerical tests are conducted to explore the effect of the randomness in the kinetic system, with the aim of providing more intuition for the theoretical study of the chemotaxis models.

Submitted 16 October, 2017; originally announced October 2017.

arXiv:0706.0194 [pdf]

Comparing Classical Pathways and Modern Networks: Towards the Development of an Edge Ontology

Authors: Long J. Lu, Andrea Sboner, Yuanpeng J. Huang, Hao Xin Lu, Tara A. Gianoulis, Kevin Y. Yip, Philip M. Kim, Gaetano T. Montelione, Mark B. Gerstein

Abstract: Pathways are integral to systems biology. Their classical representation has proven useful but is inconsistent in the meaning assigned to each arrow (or edge) and inadvertently implies the isolation of one pathway from another. Conversely, modern high-throughput experiments give rise to standardized networks facilitating topological calculations. Combining these perspectives, we can embed classical pathways within large-scale networks and thus demonstrate the crosstalk between them. As more diverse types of high-throughput data become available, we can effectively merge both perspectives, embedding pathways simultaneously in multiple networks. However, the original problem still remains - the current edge representation is inadequate to accurately convey all the information in pathways. Therefore, we suggest that a standardized, well-defined, edge ontology is necessary and propose a prototype here, as a starting point for reaching this goal.

Submitted 1 June, 2007; originally announced June 2007.

Comments: 30 pages including 5 figures and supplemental material

arXiv:q-bio/0601018 [pdf, ps, other]

doi 10.1103/PhysRevLett.96.058106

Protein folding dynamics via quantification of kinematic energy landscape

Authors: Sëma Kachalo, Hsiao-Mei Lu, Jie Liang

Abstract: We study the folding dynamics of protein-like sequences on a square lattice using a physical move set that exhausts all possible conformational changes. By analytically solving the master equation, we follow the time-dependent probabilities of occupancy of all 802,075 conformations of 16-mers over a time span of 7 orders of magnitude. We find that (i) folding rates of these protein-like sequences of the same length can differ by 4 orders of magnitude, (ii) folding rates of sequences of the same conformation can differ by a factor of 190, and (iii) parameters of the native structures, designability, and thermodynamic properties are weak predictors of the folding rates; rather, basin analysis of the kinematic energy landscape defined by the moves can provide an excellent account of the observed folding rates.

Submitted 12 January, 2006; originally announced January 2006.

Comments: 4 pages, 4 figures

arXiv:q-bio/0409011 [pdf]

SUMO Substrates and Sites Prediction Combining Pattern Recognition and Phylogenetic Conservation

Authors: Yu Xue, Fengfeng Zhou, Hualei Lu, Guoliang Chen, Xuebiao Yao

Abstract: Small Ubiquitin-related modifier (SUMO) proteins are widely expressed in eukaryotic cells and are reversibly coupled to their substrates by motif recognition, a process called sumoylation. Two interesting questions are 1) how many potential SUMO substrates may be included in mammalian proteomes, such as human and mouse, and 2) given a SUMO substrate, can we recognize its sumoylation sites? To answer these two questions, previous prediction systems for SUMO substrates mainly adopted pattern recognition methods, which achieve high sensitivity but yield relatively many potential false positives. We therefore use phylogenetic conservation between mouse and human to reduce the number of potential false positives.

Submitted 9 September, 2004; originally announced September 2004.

Comments: 15 pages (including 1 figure and 2 tables)

arXiv:physics/9907042 [pdf, ps, other]

Some Exact Results of Hopfield Neural Networks and Applications

Authors: Hong-Liang Lu, Xi-Jun Qiu

Abstract: A set of fixed points of the Hopfield-type neural network is investigated. Its connection matrix is constructed according to the Hebb rule from a highly symmetric set of memorized patterns. An analytic description of the fixed-point set is obtained as a function of the external parameter. In conclusion, some exact results for Hopfield neural networks are obtained.

Submitted 23 July, 1999; originally announced July 1999.

Comments: 4 pages, latex, no figures

Showing 1–23 of 23 results for author: Lu, H