
Showing 1–14 of 14 results for author: Theis, F J

Searching in archive q-bio.
  1. arXiv:2311.07621  [pdf, other]

    q-bio.GN cs.LG

    To Transformers and Beyond: Large Language Models for the Genome

    Authors: Micaela E. Consens, Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J. Theis, Alan Moses, Bo Wang

    Abstract: In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based on the transformer architecture, in genomics. Building on the foundation of traditional convolutional neural networks and recurrent neural networks, we explore…

    Submitted 12 November, 2023; originally announced November 2023.

  2. arXiv:2311.02455  [pdf, other]

    cs.LG q-bio.GN q-bio.QM stat.AP

    Mixed Models with Multiple Instance Learning

    Authors: Jan P. Engelmann, Alessandro Palma, Jakub M. Tomczak, Fabian J. Theis, Francesco Paolo Casale

    Abstract: Predicting patient features from single-cell data can help identify cellular states implicated in health and disease. Linear models and average cell type expressions are typically favored for this task for their efficiency and robustness, but they overlook the rich cell heterogeneity inherent in single-cell data. To address this gap, we introduce MixMIL, a framework integrating Generalized Linear…

    Submitted 8 March, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: AISTATS 2024 Oral, Code: https://github.com/AIH-SGML/MixMIL

  3. arXiv:2310.14935  [pdf]

    cs.LG q-bio.GN

    Causal machine learning for single-cell genomics

    Authors: Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis

    Abstract: Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells. When combined with large-scale perturbation screens, through which specific biological mechanisms can be targeted, these technologies allow for measuring the effect of targeted perturbations on the whole transcriptome. These advances provide an opportunity to better understand the ca…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 35 pages, 7 figures, 3 tables, 1 box

  4. arXiv:2307.00558  [pdf, other]

    cs.LG q-bio.QM

    Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity

    Authors: Hananeh Aliee, Ferdinand Kapl, Soroor Hediyeh-Zadeh, Fabian J. Theis

    Abstract: This paper presents a novel approach that leverages domain variability to learn representations that are conditionally invariant to unwanted variability or distractors. Our approach identifies both spurious and invariant latent features necessary for achieving accurate reconstruction by placing distinct conditional priors on latent features. The invariant signals are disentangled from noise by enf…

    Submitted 2 July, 2023; originally announced July 2023.

  5. arXiv:2205.07110  [pdf, other]

    cs.LG q-bio.QM

    SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching

    Authors: Scott Gigante, Varsha G. Raghavan, Amanda M. Robinson, Robert A. Barton, Adeeb H. Rahman, Drausin F. Wulsin, Jacques Banchereau, Noam Solomon, Luis F. Voloch, Fabian J. Theis

    Abstract: Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduc…

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: Published at the MLDD workshop, ICLR 2022

  6. arXiv:2104.11364  [pdf]

    q-bio.OT cs.CY

    A field guide to cultivating computational biology

    Authors: Anne E Carpenter, Casey S Greene, Piero Carninci, Benilton S Carvalho, Michiel de Hoon, Stacey Finley, Kim-Anh Le Cao, Jerry SH Lee, Luigi Marchionni, Suzanne Sindi, Fabian J Theis, Gregory P Way, Jean YH Yang, Elana J Fertig

    Abstract: Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model. This interdisciplina…

    Submitted 22 April, 2021; originally announced April 2021.

  7. arXiv:1910.01791  [pdf, other]

    cs.LG eess.IV q-bio.CB q-bio.GN stat.ML

    Conditional out-of-sample generation for unpaired data using trVAE

    Authors: Mohammad Lotfollahi, Mohsen Naghipourfar, Fabian J. Theis, F. Alexander Wolf

    Abstract: While generative models have shown great success in generating high-dimensional samples conditional on low-dimensional descriptors (learning e.g. stroke thickness in MNIST, hair color in CelebA, or speaker identity in Wavenet), their generation out-of-sample poses fundamental problems. The conditional variational autoencoder (CVAE) as a simple conditional generative model does not explicitly relat…

    Submitted 30 October, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

    Comments: Added reference to Johansson et al. (2016) and removed sentences from Lopez et al. (2018) in the background section (see acknowledgements)

  8. arXiv:1909.12550  [pdf]

    q-bio.GN q-bio.MN q-bio.PE

    Single-cell eQTLGen Consortium: a personalized understanding of disease

    Authors: Monique G. P. van der Wijst, Dylan H. de Vries, Hilde E. Groot, Gosia Trynka, Chung-Chau Hon, Martijn C. Nawijn, Youssef Idaghdour, Pim van der Harst, Chun J. Ye, Joseph Powell, Fabian J. Theis, Ahmed Mahfouz, Matthias Heinig, Lude Franke

    Abstract: In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes…

    Submitted 27 September, 2019; originally announced September 2019.

    Comments: 26 pages, 5 figures, position paper of sc-eQTLGen consortium

  9. arXiv:1810.04281  [pdf, other]

    stat.AP q-bio.QM

    Fully integrative data analysis of NMR metabolic fingerprints with comprehensive patient data: a case report based on the German Chronic Kidney Disease (GCKD) study

    Authors: Helena U. Zacharias, Michael Altenbuchinger, Stefan Solbrig, Andreas Schäfer, Mustafa Buyukozkan, Ulla T. Schultheiß, Fruzsina Kotsis, Anna Köttgen, Jan Krumsiek, Fabian J. Theis, Rainer Spang, Peter J. Oefner, Wolfram Gronwald, GCKD study investigators

    Abstract: Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To that end, it is necessary to integrate omics data with other data types such as clinical, phenotypic, and demographic parameters of categorical or continuous nature. Here, we exemplify this data integration issue for a study on chronic kidney disea…

    Submitted 8 October, 2018; originally announced October 2018.

  10. arXiv:1511.01658  [pdf, other]

    math.OC q-bio.MN

    A simulation-based approach for solving optimisation problems with ODE-type steady state constraints

    Authors: Anna Fiedler, Fabian J. Theis, Jan Hasenauer

    Abstract: Ordinary differential equations (ODEs) are widely used to model biological, (bio-)chemical and technical processes. The parameters of these ODEs are often estimated from experimental data using ODE-constrained optimisation. This article proposes a simple simulation-based approach for solving optimisation problems with steady state constraints relying on an ODE. This simulation-based optimisation m…

    Submitted 5 November, 2015; originally announced November 2015.

    Comments: 11 pages, 3 figures

  11. arXiv:1506.06392  [pdf, other]

    q-bio.MN q-bio.QM

    Data-driven modelling of biological multi-scale processes

    Authors: Jan Hasenauer, Nick Jagiella, Sabrina Hross, Fabian J. Theis

    Abstract: Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the rel…

    Submitted 21 June, 2015; originally announced June 2015.

    Comments: This manuscript will appear in the Journal of Coupled Systems and Multiscale Dynamics (American Scientific Publishers)

    MSC Class: 92Bxx; 93A30

  12. arXiv:1407.2112  [pdf]

    cs.GR cs.HC q-bio.QM

    MCA: Multiresolution Correlation Analysis, a graphical tool for subpopulation identification in single-cell gene expression data

    Authors: Justin Feigelman, Fabian J. Theis, Carsten Marr

    Abstract: Background: Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable. Results: We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopul…

    Submitted 8 July, 2014; originally announced July 2014.

    Comments: BioVis 2014 conference

  13. Stability and multi-attractor dynamics of a toggle switch based on a two-stage model of stochastic gene expression

    Authors: Michael K. Strasser, Fabian J. Theis, Carsten Marr

    Abstract: A toggle switch consists of two genes that mutually repress each other. This regulatory motif is active during cell differentiation and is thought to act as a memory device, being able to choose and maintain cell fate decisions. In this contribution, we study the stability and dynamics of a two-stage gene expression switch within a probabilistic framework inspired by the properties of the Pu/Gata…

    Submitted 1 December, 2011; originally announced December 2011.

    Comments: to appear in the Biophysical Journal

  14. Patterns of subnet usage reveal distinct scales of regulation in the transcriptional regulatory network of Escherichia coli

    Authors: Carsten Marr, Fabian J. Theis, Larry S. Liebovitch, Marc-Thorsten Hütt

    Abstract: The set of regulatory interactions between genes, mediated by transcription factors, forms a species' transcriptional regulatory network (TRN). By comparing this network with measured gene expression data one can identify functional properties of the TRN and gain general insight into transcriptional control. We define the subnet of a node as the subgraph consisting of all nodes topologically downs…

    Submitted 24 May, 2010; originally announced May 2010.

    Comments: 14 pages, 8 figures, to be published in PLoS Computational Biology