-
Non-Negative Universal Differential Equations With Applications in Systems Biology
Authors:
Maren Philipps,
Antonia Körner,
Jakob Vanhoefer,
Dilan Pathirana,
Jan Hasenauer
Abstract:
Universal differential equations (UDEs) leverage the respective advantages of mechanistic models and artificial neural networks and combine them into one dynamic model. However, these hybrid models can suffer from unrealistic solutions, such as negative values for biochemical quantities. We present non-negative UDEs (nUDEs), a constrained UDE variant that guarantees non-negative values. Furthermore, we explore regularisation techniques to improve generalisation and interpretability of UDEs.
Submitted 20 June, 2024;
originally announced June 2024.
-
Assessment of Uncertainty Quantification in Universal Differential Equations
Authors:
Nina Schmid,
David Fernandes del Pozo,
Willem Waegeman,
Jan Hasenauer
Abstract:
Scientific Machine Learning is a new class of approaches that integrate physical knowledge and mechanistic models with data-driven techniques for uncovering governing equations of complex processes. Among the available approaches, Universal Differential Equations (UDEs) are used to combine prior knowledge in the form of mechanistic formulations with universal function approximators, like neural networks. Integral to the efficacy of UDEs is the joint estimation of parameters within mechanistic formulations and the universal function approximators using empirical data. The robustness and applicability of resultant models, however, hinge upon the rigorous quantification of uncertainties associated with these parameters, as well as the predictive capabilities of the overall model or its constituent components. With this work, we provide a formalisation of uncertainty quantification (UQ) for UDEs and investigate important frequentist and Bayesian methods. By analysing three synthetic examples of varying complexity, we evaluate the validity and efficiency of ensembles, variational inference and Markov chain Monte Carlo sampling as epistemic UQ methods for UDEs.
Submitted 13 June, 2024;
originally announced June 2024.
-
Exploration of methods for computing sensitivities in ODE models at dynamic and steady states
Authors:
Polina Lakrisenko,
Dilan Pathirana,
Daniel Weindl,
Jan Hasenauer
Abstract:
Estimating parameters of dynamic models from experimental data is a challenging, and often computationally-demanding task. It requires a large number of model simulations and objective function gradient computations, if gradient-based optimization is used. The gradient depends on derivatives of the state variables with respect to parameters, also called state sensitivities, which are expensive to compute. In many cases, steady-state computation is a part of model simulation, either due to steady-state data or an assumption that the system is at steady state at the initial time point. Various methods are available for steady-state and gradient computation. Yet, the most efficient pair of methods (one for steady states, one for gradients) for a particular model is often not clear. Moreover, depending on the model and the available data, some methods may not be applicable or sufficiently robust. In order to facilitate the selection of methods, we explore six method pairs for computing the steady state and sensitivities at steady state using six real-world problems. The method pairs involve numerical integration or Newton's method to compute the steady-state, and -- for both forward and adjoint sensitivity analysis -- numerical integration or a tailored method to compute the sensitivities at steady-state. Our evaluation shows that the two method pairs that combine numerical integration for the steady-state with a tailored method for the sensitivities at steady-state were the most robust, and amongst the most computationally-efficient. We also observed that while Newton's method for steady-state computation yields a substantial speedup compared to numerical integration, it may lead to a large number of simulation failures. Overall, our study provides a concise overview across current methods for computing sensitivities at steady state, guiding modelers to choose the right methods.
Submitted 26 May, 2024;
originally announced May 2024.
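The two routes to a steady state named in the abstract above (numerical integration and Newton's method), together with a sensitivity computed directly at the steady state, can be illustrated on a toy problem. This is a minimal sketch under stated assumptions, not the paper's implementation: the one-state model dx/dt = k_prod - k_deg * x, all function names, and the tolerances are hypothetical, and the sensitivity formula is the generic implicit-function-theorem relation rather than any specific tailored method from the study.

```python
# Toy model (assumed for illustration): dx/dt = k_prod - k_deg * x
def f(x, k_prod, k_deg):
    return k_prod - k_deg * x


def steady_state_by_integration(x0, k_prod, k_deg, dt=0.01, tol=1e-10,
                                max_steps=1_000_000):
    """Forward-Euler integration until the right-hand side is ~0."""
    x = x0
    for _ in range(max_steps):
        dx = f(x, k_prod, k_deg)
        if abs(dx) < tol:
            return x
        x += dt * dx
    raise RuntimeError("integration did not reach a steady state")


def steady_state_by_newton(x0, k_prod, k_deg, tol=1e-12, max_iter=50):
    """Newton's method on f(x) = 0, using the analytic Jacobian df/dx."""
    x = x0
    for _ in range(max_iter):
        fx = f(x, k_prod, k_deg)
        if abs(fx) < tol:
            return x
        jac = -k_deg  # df/dx for the toy model
        x -= fx / jac
    raise RuntimeError("Newton's method did not converge")


def steady_state_sensitivities(x_ss, k_prod, k_deg):
    """Differentiate f(x*(theta), theta) = 0: (df/dx) * s = -(df/dtheta)."""
    jac = -k_deg
    df_dtheta = {"k_prod": 1.0, "k_deg": -x_ss}
    return {name: -value / jac for name, value in df_dtheta.items()}


k_prod, k_deg = 2.0, 0.5
x_int = steady_state_by_integration(0.0, k_prod, k_deg)
x_newton = steady_state_by_newton(0.0, k_prod, k_deg)
sens = steady_state_sensitivities(x_newton, k_prod, k_deg)
```

For this linear toy model Newton's method converges in a single step, whereas integration needs several thousand Euler steps. For nonlinear models Newton's method may fail to converge from a poor starting point, which is consistent with the simulation failures the abstract reports.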
-
pyPESTO: A modular and scalable tool for parameter estimation for dynamic models
Authors:
Yannik Schälte,
Fabian Fröhlich,
Paul J. Jost,
Jakob Vanhoefer,
Dilan Pathirana,
Paul Stapor,
Polina Lakrisenko,
Dantong Wang,
Elba Raimúndez,
Simon Merkt,
Leonard Schmiester,
Philipp Städter,
Stephan Grein,
Erika Dudkin,
Domagoj Doresic,
Daniel Weindl,
Jan Hasenauer
Abstract:
Mechanistic models are important tools to describe and understand biological processes. However, they typically rely on unknown parameters, the estimation of which can be challenging for large and complex systems. We present pyPESTO, a modular framework for systematic parameter estimation, with scalable algorithms for optimization and uncertainty quantification. While tailored to ordinary differential equation problems, pyPESTO is broadly applicable to black-box parameter estimation problems. Besides its own implementations, it provides a unified interface to various popular simulation and inference methods. pyPESTO is implemented in Python and is open-source under a 3-Clause BSD license. Code and documentation are available on GitHub (https://github.com/icb-dcm/pypesto).
Submitted 2 May, 2023;
originally announced May 2023.
-
A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation
Authors:
Emad Alamoudi,
Felipe Reck,
Nils Bundgaard,
Frederik Graw,
Lutz Brusch,
Jan Hasenauer,
Yannik Schälte
Abstract:
Approximate Bayesian Computation (ABC) is a widely applicable and popular approach to estimating unknown parameters of mechanistic models. As ABC analyses are computationally expensive, parallelization on high-performance infrastructure is often necessary. However, the existing parallelization strategies leave resources unused at times and thus do not yet leverage them optimally. We present look-ahead scheduling, a wall-time minimizing parallelization strategy for ABC Sequential Monte Carlo algorithms, which utilizes all available resources at practically all times by proactive sampling for prospective tasks. Our strategy can be integrated in e.g. adaptive distance function and summary statistic selection schemes, which is essential in practice. Evaluation of the strategy on different problems and numbers of parallel cores reveals speed-ups of typically 10-20% and up to 50% compared to the best established approach. Thus, the proposed strategy substantially improves the cost and run-time efficiency of ABC methods on high-performance infrastructure.
Submitted 30 April, 2023;
originally announced May 2023.
-
pyABC: Efficient and robust easy-to-use approximate Bayesian computation
Authors:
Yannik Schälte,
Emmanuel Klinger,
Emad Alamoudi,
Jan Hasenauer
Abstract:
The Python package pyABC provides a framework for approximate Bayesian computation (ABC), a likelihood-free parameter inference method popular in many research areas. At its core, it implements a sequential Monte-Carlo (SMC) scheme, with various algorithms to adapt to the problem structure and automatically tune hyperparameters. To scale to computationally expensive problems, it provides efficient parallelization strategies for multi-core and distributed systems. The package is highly modular and designed to be easily usable. In this major update to pyABC, we implement several advanced algorithms that facilitate efficient and robust inference on a wide range of data and model types. In particular, we implement algorithms to account for noise, to adaptively scale-normalize distance metrics, to robustly handle data outliers, to elucidate informative data points via regression models, to circumvent summary statistics via optimal transport based distances, and to avoid local optima in acceptance threshold sequences by predicting acceptance rate curves. Further, we provide, besides previously existing support of Python and R, interfaces in particular to the Julia language, the COPASI simulator, and the PEtab standard.
Submitted 24 March, 2022;
originally announced March 2022.
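The likelihood-free idea behind ABC, as described in the abstract above, can be sketched with the simplest variant, plain rejection sampling. This is not pyABC's API and not its sequential Monte Carlo scheme; every name below (abc_rejection, the Gaussian toy model, the uniform prior, the threshold eps) is a hypothetical choice made for illustration, using only the Python standard library.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, distance, eps, n_accept, rng):
    """Keep prior draws whose simulated data lie within eps of the observation."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)          # draw a candidate parameter
        simulated = simulate(theta, rng)   # run the model, no likelihood needed
        if distance(simulated, observed) <= eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)

# Assumed toy setup: the summary statistic is the mean of 50 Gaussian draws.
def simulate_mean(mu, r):
    return statistics.fmean(r.gauss(mu, 1.0) for _ in range(50))


observed = simulate_mean(2.0, rng)  # synthetic "data" with true mean 2.0

posterior = abc_rejection(
    observed=observed,
    simulate=simulate_mean,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    distance=lambda a, b: abs(a - b),
    eps=0.2,
    n_accept=200,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The accepted draws approximate the posterior and concentrate around the observed summary statistic. Sequential schemes such as the one the abstract mentions shrink the acceptance threshold over several generations instead of fixing it in advance.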
-
A protocol for dynamic model calibration
Authors:
Alejandro F. Villaverde,
Dilan Pathirana,
Fabian Fröhlich,
Jan Hasenauer,
Julio R. Banga
Abstract:
Ordinary differential equation models are nowadays widely used for the mechanistic description of biological processes and their temporal evolution. These models typically have many unknown and non-measurable parameters, which have to be determined by fitting the model to experimental data. In order to perform this task, known as parameter estimation or model calibration, the modeller faces challenges such as poor parameter identifiability, lack of sufficiently informative experimental data, and the existence of local minima in the objective function landscape. These issues tend to worsen with larger model sizes, increasing the computational complexity and the number of unknown parameters. An incorrectly calibrated model is problematic because it may result in inaccurate predictions and misleading conclusions. For non-expert users, there are a large number of potential pitfalls. Here, we provide a protocol that guides the user through all the steps involved in the calibration of dynamic models. We illustrate the methodology with two models, and provide all the code required to reproduce the results and perform the same analysis on new models. Our protocol provides practitioners and researchers in biological modelling with a one-stop guide that is at the same time compact and sufficiently comprehensive to cover all aspects of the problem.
Submitted 26 May, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.
-
AMICI: High-Performance Sensitivity Analysis for Large Ordinary Differential Equation Models
Authors:
Fabian Fröhlich,
Daniel Weindl,
Yannik Schälte,
Dilan Pathirana,
Łukasz Paszkowski,
Glenn Terje Lines,
Paul Stapor,
Jan Hasenauer
Abstract:
Ordinary differential equation models facilitate the understanding of cellular signal transduction and other biological processes. However, for large and comprehensive models, the computational cost of simulating or calibrating can be limiting. AMICI is a modular toolbox implemented in C++/Python/MATLAB that provides efficient simulation and sensitivity analysis routines tailored for scalable, gradient-based parameter estimation and uncertainty quantification.
AMICI is published under the permissive BSD-3-Clause license with source code publicly available on https://github.com/AMICI-dev/AMICI. Citeable releases are archived on Zenodo.
Submitted 16 December, 2020;
originally announced December 2020.
-
PEtab -- interoperable specification of parameter estimation problems in systems biology
Authors:
Leonard Schmiester,
Yannik Schälte,
Frank T. Bergmann,
Tacio Camba,
Erika Dudkin,
Janine Egert,
Fabian Fröhlich,
Lara Fuhrmann,
Adrian L. Hauber,
Svenja Kemmer,
Polina Lakrisenko,
Carolin Loos,
Simon Merkt,
Wolfgang Müller,
Dilan Pathirana,
Elba Raimúndez,
Lukas Refisch,
Marcus Rosenblatt,
Paul L. Stapor,
Philipp Städter,
Dantong Wang,
Franz-Georg Wieland,
Julio R. Banga,
Jens Timmer,
Alejandro F. Villaverde
, et al. (4 additional authors not shown)
Abstract:
Reproducibility and reusability of the results of data-based modeling studies are essential. Yet, there has been -- so far -- no broadly supported format for the specification of parameter estimation problems in systems biology. Here, we introduce PEtab, a format which facilitates the specification of parameter estimation problems using Systems Biology Markup Language (SBML) models and a set of tab-separated value files describing the observation model and experimental data as well as parameters to be estimated. We have already implemented PEtab support in eight well-established model simulation and parameter estimation toolboxes with hundreds of users in total. We provide a Python library for validation and modification of a PEtab problem and currently 20 example parameter estimation problems based on recent studies. Specifications of PEtab, the PEtab Python library, as well as links to examples and all supporting software tools are available at https://github.com/PEtab-dev/PEtab; a snapshot is available at https://doi.org/10.5281/zenodo.3732958. All original content is available under permissive licenses.
Submitted 7 August, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Mathematical modeling of variability in intracellular signaling
Authors:
Carolin Loos,
Jan Hasenauer
Abstract:
Cellular signaling is essential in information processing and decision making. Therefore, a variety of experimental approaches have been developed to study signaling on bulk and single-cell level. Single-cell measurements of signaling molecules demonstrated a substantial cell-to-cell variability, raising questions about its causes and mechanisms and about how cell populations cope with or exploit cellular heterogeneity. To gain insights from single-cell signaling data, analysis and modeling approaches have been introduced. This review discusses these modeling approaches, with a focus on recent advances in the development and calibration of mechanistic models. Additionally, it outlines current and future challenges.
Submitted 17 April, 2019;
originally announced April 2019.
-
Scalable Inference of Ordinary Differential Equation Models of Biochemical Processes
Authors:
Fabian Fröhlich,
Carolin Loos,
Jan Hasenauer
Abstract:
Ordinary differential equation models have become a standard tool for the mechanistic description of biochemical processes. If parameters are inferred from experimental data, such mechanistic models can provide accurate predictions about the behavior of latent variables or the process under new experimental conditions. Complementarily, inference of model structure can be used to identify the most plausible model structure from a set of candidates, and thus gain novel biological insight. Several toolboxes can infer model parameters and structure for small- to medium-scale mechanistic models out of the box. However, models for highly multiplexed datasets can require hundreds to thousands of state variables and parameters. For the analysis of such large-scale models, most algorithms require intractably high computation times. This chapter provides an overview of state-of-the-art methods for parameter and model inference, with an emphasis on scalability.
Submitted 11 October, 2018; v1 submitted 21 November, 2017;
originally announced November 2017.
-
A simulation-based approach for solving optimisation problems with ODE-type steady state constraints
Authors:
Anna Fiedler,
Fabian J. Theis,
Jan Hasenauer
Abstract:
Ordinary differential equations (ODEs) are widely used to model biological, (bio-)chemical and technical processes. The parameters of these ODEs are often estimated from experimental data using ODE-constrained optimisation. This article proposes a simple simulation-based approach for solving optimisation problems with steady state constraints relying on an ODE. This simulation-based optimisation method is tailored to the problem structure and exploits the local geometry of the steady state manifold and its stability properties. A parameterisation of the steady state manifold is not required. We prove local convergence of the method for locally strictly convex objective functions. Efficiency and reliability of the proposed method are demonstrated in two examples. The proposed method demonstrated better convergence properties than existing general purpose methods and a significantly higher number of converged starts per time.
Submitted 5 November, 2015;
originally announced November 2015.
-
Data-driven modelling of biological multi-scale processes
Authors:
Jan Hasenauer,
Nick Jagiella,
Sabrina Hross,
Fabian J. Theis
Abstract:
Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the relation between spatial and temporal scales and its implications for multi-scale modelling. Based upon this overview of state-of-the-art modelling approaches, we formulate key challenges in mathematical and computational modelling of biological multi-scale and multi-physics processes. In particular, we consider the availability of analysis tools for multi-scale models and model-based multi-scale data integration. We provide a compact review of methods for model-based data integration and model-based hypothesis testing. Furthermore, novel approaches and recent trends are discussed, including computation time reduction using reduced order and surrogate models, which contribute to the solution of inference problems. We conclude the manuscript by providing a few ideas for the development of tailored multi-scale inference methods.
Submitted 21 June, 2015;
originally announced June 2015.
-
A computational model for proliferation dynamics of division- and label-structured populations
Authors:
J. Hasenauer,
D. Schittler,
F. Allgower
Abstract:
In most biological studies and processes, cell proliferation and population dynamics play an essential role. Due to this ubiquity, a multitude of mathematical models has been developed to describe these processes. While the simplest models only consider the size of the overall populations, others take division numbers and labeling of the cells into account. In this work, we present a modeling and computational framework for proliferating cell populations undergoing symmetric cell division. In contrast to existing models, the proposed model incorporates both the discrete age structure and the continuous label dynamics. Thus, it allows for the consideration of division-number-dependent parameters as well as the direct comparison of the model prediction with labeling experiments, e.g., performed with Carboxyfluorescein succinimidyl ester (CFSE). We prove that under mild assumptions the resulting system of coupled partial differential equations (PDEs) can be decomposed into a system of ordinary differential equations (ODEs) and a set of decoupled PDEs, which reduces the computational effort drastically. Furthermore, the PDEs are solved analytically and the ODE system is truncated, which allows for the prediction of the label distribution of complex systems using a low-dimensional system of ODEs. In addition to modeling of labeling dynamics, we link the label-induced fluorescence to the measured fluorescence, which includes autofluorescence. For the resulting numerically challenging convolution integral, we provide an analytical approximation. This is illustrated by modeling and simulating a proliferating population with a division-number-dependent proliferation rate.
Submitted 22 February, 2012;
originally announced February 2012.
-
Density-based modeling and identification of biochemical networks in cell populations
Authors:
J. Hasenauer,
S. Waldherr,
M. Doszczak,
P. Scheurich,
F. Allgower
Abstract:
In many biological processes heterogeneity within cell populations is an important issue. In this work we consider populations where the behavior of every single cell can be described by a system of ordinary differential equations. Heterogeneity among individual cells is accounted for by differences in parameter values and initial conditions. Hereby, parameter values and initial conditions are subject to a distribution function which is part of the model specification. Based on the single cell model and the considered parameter distribution, a partial differential equation model describing the distribution of cells in the state and in the output space is derived.
For the estimation of the parameter distribution within the model, we consider experimental data as obtained from flow cytometric analysis. From these noise-corrupted data a density-based statistical data model is derived. Using this data model the parameter distribution within the cell population is computed using convex optimization techniques.
To evaluate the proposed method, a model for the caspase activation cascade is considered. It is shown that for known noise properties the unknown parameter distributions in this model are well estimated by the proposed method.
Submitted 24 February, 2010;
originally announced February 2010.
-
Estimation of biochemical network parameter distributions in cell populations
Authors:
Steffen Waldherr,
Jan Hasenauer,
Frank Allgöwer
Abstract:
Populations of heterogeneous cells play an important role in many biological systems. In this paper we consider systems where each cell can be modelled by an ordinary differential equation. To account for heterogeneity, parameter values are different among individual cells, subject to a distribution function which is part of the model specification.
Experimental data for heterogeneous cell populations can be obtained from flow cytometric fluorescence microscopy. We present a heuristic approach to use such data for estimation of the parameter distribution in the population. The approach is based on generating simulation data for samples in parameter space. By convex optimisation, a suitable probability density function for these samples is computed.
To evaluate the proposed approach, we consider artificial data from a simple model of the tumor necrosis factor (TNF) signalling pathway. Its main characteristic is a bimodality in the TNF response: a certain percentage of cells undergoes apoptosis upon stimulation, while the remaining part stays alive. We show how our modelling approach makes it possible to identify the reasons that underlie the differential response.
Submitted 27 September, 2009; v1 submitted 8 May, 2009;
originally announced May 2009.