Quiver Laplacians and Feature Selection

Authors: Otto Sumray, Heather A. Harrington, Vidit Nanda

Abstract: The challenge of selecting the most relevant features of a given dataset arises ubiquitously in data analysis and dimensionality reduction. However, features found to be of high importance for the entire dataset may not be relevant to subsets of interest, and vice versa. Given a feature selector and a fixed decomposition of the data into subsets, we describe a method for identifying selected features which are compatible with the decomposition into subsets. We achieve this by re-framing the problem of finding compatible features to one of finding sections of a suitable quiver representation. In order to approximate such sections, we then introduce a Laplacian operator for quiver representations valued in Hilbert spaces. We provide explicit bounds on how the spectrum of a quiver Laplacian changes when the representation and the underlying quiver are modified in certain natural ways. Finally, we apply this machinery to the study of peak-calling algorithms which measure chromatin accessibility in single-cell data. We demonstrate that eigenvectors of the associated quiver Laplacian yield locally and globally compatible features.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 40 pages, 7 figures

MSC Class: 16G20; 05C50; 62P05; 62H25

arXiv:2401.00078 [pdf, other]

Absolute concentration robustness: Algebra and geometry

Authors: Luis David García Puente, Elizabeth Gross, Heather A Harrington, Matthew Johnston, Nicolette Meshkat, Mercedes Pérez Millán, Anne Shiu

Abstract: Motivated by the question of how biological systems maintain homeostasis in changing environments, Shinar and Feinberg introduced in 2010 the concept of absolute concentration robustness (ACR). A biochemical system exhibits ACR in some species if the steady-state value of that species does not depend on initial conditions. Thus, a system with ACR can maintain a constant level of one species even as the environment changes. Despite a great deal of interest in ACR in recent years, the following basic question remains open: How can we determine quickly whether a given biochemical system has ACR? Although various approaches to this problem have been proposed, we show that they are incomplete. Accordingly, we present new methods for deciding ACR, which harness computational algebra. We illustrate our results on several biochemical signaling networks.

Submitted 29 December, 2023; originally announced January 2024.

Comments: 44 pages

MSC Class: 37N25; 92E20; 12D10; 37C25; 65H14; 14Q20
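
Worked example for the ACR entry above: a minimal sketch of the definition on a two-reaction toy network that is often used to illustrate ACR. The mass-action rate constants $k_1$, $k_2$ and the conserved total $T$ are notation assumed here for illustration; they do not come from the listing.

```latex
% Toy network under mass-action kinetics (assumed notation):
%   A + B -> 2B  with rate constant k_1
%   B     -> A   with rate constant k_2
\begin{align*}
\dot a &= -k_1 a b + k_2 b, \\
\dot b &= \phantom{-}k_1 a b - k_2 b, \qquad a + b = T \ \text{(conserved)}.
\end{align*}
% At a positive steady state (b^* > 0) the common factor b cancels:
\begin{align*}
b^*\,(k_1 a^* - k_2) = 0 \;\Longrightarrow\; a^* = \frac{k_2}{k_1}, \qquad b^* = T - \frac{k_2}{k_1}.
\end{align*}
```

Whenever $T > k_2/k_1$, the steady-state value $a^*$ depends only on the rate constants and not on the initial total $T$, so this toy system has ACR in species A, while $b^*$ absorbs any change in initial conditions.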

arXiv:2308.06205 [pdf, other]

Relational persistent homology for multispecies data with application to the tumor microenvironment

Authors: Bernadette J. Stolz, Jagdeep Dhesi, Joshua A. Bull, Heather A. Harrington, Helen M. Byrne, Iris H. R. Yoon

Abstract: Topological data analysis (TDA) is an active field of mathematics for quantifying shape in complex data. Standard methods in TDA such as persistent homology (PH) are typically focused on the analysis of data consisting of a single entity (e.g., cells or molecular species). However, state-of-the-art data collection techniques now generate exquisitely detailed multispecies data, prompting a need for methods that can examine and quantify the relations among them. Such heterogeneous data types arise in many contexts, ranging from biomedical imaging, geospatial analysis, to species ecology. Here, we propose two methods for encoding spatial relations among different data types that are based on Dowker complexes and Witness complexes. We apply the methods to synthetic multispecies data of a tumor microenvironment and analyze topological features that capture relations between different cell types, e.g., blood vessels, macrophages, tumor cells, and necrotic cells. We demonstrate that relational topological features can extract biological insight, including the dominant immune cell phenotype (an important predictor of patient prognosis) and the parameter regimes of a data-generating model. The methods provide a quantitative perspective on the relational analysis of multispecies spatial data, overcome the limits of traditional PH, and are readily computable.

Submitted 12 September, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

MSC Class: 55N31; 92C17

arXiv:2308.05294 [pdf, other]

Topological classification of tumour-immune interactions and dynamics

Authors: Jingjie Yang, Heidi Fang, Jagdeep Dhesi, Iris H. R. Yoon, Joshua A. Bull, Helen M. Byrne, Heather A. Harrington, Gillian Grindstaff

Abstract: The complex and dynamic crosstalk between tumour and immune cells results in tumours that can exhibit distinct qualitative behaviours - elimination, equilibrium, and escape - and intricate spatial patterns, yet share similar cell configurations in the early stages. We offer a topological approach to analyse time series of spatial data of cell locations (including tumour cells and macrophages) in order to predict malignant behaviour. We propose four topological vectorisations specialised to such cell data: persistence images of Vietoris-Rips and radial filtrations at static time points, and persistence images for zigzag filtrations and persistence vineyards varying in time. To demonstrate the approach, synthetic data are generated from an agent-based model with varying parameters. We compare the performance of topological summaries in predicting - with logistic regression at various time steps - whether tumour niches surrounding blood vessels are present at the end of the simulation, as a proxy for metastasis (i.e., tumour escape). We find that both static and time-dependent methods accurately identify perivascular niche formation, significantly earlier than simpler markers such as the number of tumour cells and the macrophage phenotype ratio. We find additionally that dimension 0 persistence applied to macrophage data, representing multi-scale clusters of the spatial arrangement of macrophages, performs best at this classification task at early time steps, prior to full tumour development, and performs even better when time-dependent data are included; in contrast, topological measures capturing the shape of the tumour, such as tortuosity and punctures in the cell arrangement, perform best at intermediate and later stages. The logistic regression coefficients reveal detailed shape differences between the classes.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 29 pages, 12 figures

MSC Class: 92C17; 55N31

arXiv:2212.10883 [pdf, other]

Detecting Temporal shape changes with the Euler Characteristic Transform

Authors: Lewis Marsh, Felix Y. Zhou, Xiao Qin, Xin Lu, Helen M. Byrne, Heather A. Harrington

Abstract: Organoids are multi-cellular structures which are cultured in vitro from stem cells to resemble specific organs (e.g., brain, liver) in their three-dimensional composition. Dynamic changes in the shape and composition of these model systems can be used to understand the effect of mutations and treatments in health and disease. In this paper, we propose a new technique in the field of topological data analysis for DEtecting Temporal shape changes with the Euler Characteristic Transform (DETECT). DETECT is a rotationally invariant signature of dynamically changing shapes. We demonstrate our method on a data set of segmented videos of mouse small intestine organoid experiments and show that it outperforms classical shape descriptors. We verify our method on a synthetic organoid data set and illustrate how it generalises to 3D. We conclude that DETECT offers rigorous quantification of organoids and opens up computationally scalable methods for distinguishing different growth regimes and assessing treatment effects.

Submitted 22 December, 2022; v1 submitted 21 December, 2022; originally announced December 2022.

arXiv:2212.06505 [pdf, other]

Multiscale topology classifies and quantifies cell types in subcellular spatial transcriptomics

Authors: Katherine Benjamin, Aneesha Bhandari, Zhouchun Shang, Yanan Xing, Yanru An, Nannan Zhang, Yong Hou, Ulrike Tillmann, Katherine R. Bull, Heather A. Harrington

Abstract: Spatial transcriptomics has the potential to transform our understanding of RNA expression in tissues. Classical array-based technologies produce multiple-cell-scale measurements requiring deconvolution to recover single cell information. However, rapid advances in subcellular measurement of RNA expression at whole-transcriptome depth necessitate a fundamentally different approach. To integrate single-cell RNA-seq data with nanoscale spatial transcriptomics, we present a topological method for automatic cell type identification (TopACT). Unlike popular decomposition approaches to multicellular resolution data, TopACT is able to pinpoint the spatial locations of individual sparsely dispersed cells without prior knowledge of cell boundaries. Pairing TopACT with multiparameter persistent homology landscapes predicts immune cells forming a peripheral ring structure within kidney glomeruli in a murine model of lupus nephritis, which we experimentally validate with immunofluorescent imaging. The proposed topological data analysis unifies multiple biological scales, from subcellular gene expression to multicellular tissue organization.

Submitted 13 December, 2022; originally announced December 2022.

Comments: Main text: 8 pages, 4 figures. Supplement: 12 pages, 5 figures

MSC Class: 92-08; 55N31; 62R40; 68T09

arXiv:2212.02601 [pdf, other]

Algebraic network reconstruction of discrete dynamical systems

Authors: Heather A. Harrington, Mike Stillman, Alan Veliz-Cuba

Abstract: We present a computational algebra solution to reverse engineering the network structure of discrete dynamical systems from data. We use monomial ideals to determine dependencies between variables that encode constraints on the possible wiring diagrams underlying the process generating the discrete-time, continuous-space data. Our work assumes that each variable is either monotone increasing or decreasing. We prove that with enough data, even in the presence of small noise, our method can reconstruct the correct unique wiring diagram.

Submitted 5 December, 2022; originally announced December 2022.

Comments: 19 pages, 5 figures

MSC Class: 13P25; 37N25; 92B05; 05E40; 46N60; 92C42; 68R10; 90B10; 97N70; 62-07

arXiv:2211.09058 [pdf, other]

Stability of topological descriptors for neuronal morphology

Authors: David Beers, Heather A. Harrington, Alain Goriely

Abstract: The topological morphology descriptor of a neuron is a multiset of intervals associated to the shape of the neuron represented as a tree. In practice, topological morphology descriptors are vectorized using persistence images, which can help classify and characterize the morphology of broad groups of neurons. We study the stability of topological morphology descriptors under small changes to neuronal morphology. We show that the persistence diagram arising from the topological morphology descriptor of a neuron is stable for the 1-Wasserstein distance against a range of perturbations to the tree. These results guarantee that persistence images of topological morphology descriptors are stable against the same set of perturbations and reliable.

Submitted 16 November, 2022; originally announced November 2022.

Comments: 11 pages, 4 figures

arXiv:2210.07545 [pdf, other]

Hypergraphs for multiscale cycles in structured data

Authors: Agnese Barbensi, Iris H. R. Yoon, Christian Degnbol Madsen, Deborah O. Ajayi, Michael P. H. Stumpf, Heather A. Harrington

Abstract: Scientific data has been growing in both size and complexity across the modern physical, engineering, life and social sciences. Spatial structure, for example, is a hallmark of many of the most important real-world complex systems, but its analysis is fraught with statistical challenges. Topological data analysis can provide a powerful computational window on complex systems. Here we present a framework to extend and interpret persistent homology summaries to analyse spatial data across multiple scales. We introduce hyperTDA, a topological pipeline that unifies local (e.g. geodesic) and global (e.g. Euclidean) metrics without losing spatial information, even in the presence of noise. Homology generators offer an elegant and flexible description of spatial structures and can capture the information computed by persistent homology in an interpretable way. Here the information computed by persistent homology is transformed into a weighted hypergraph, where hyperedges correspond to homology generators. We consider different choices of generators (e.g. matroid or minimal) and find that centrality and community detection are robust to either choice. We compare hyperTDA to existing geometric measures and validate its robustness to noise. We demonstrate the power of computing higher-order topological structures on spatial curves arising frequently in ecology, biophysics, and biology, but also in high-dimensional financial datasets. We find that hyperTDA can select between synthetic trajectories from the landmark 2020 AnDi challenge and quantifies movements of different animal species, even when data is limited.

Submitted 14 October, 2022; originally announced October 2022.

Comments: 6 Figures, 15 pages and Supplementary Information (including figures) as an Appendix. Associated GitHub repositories: github.com/degnbol/hyperTDA and github.com/irishryoon/minimal_generators_curves

MSC Class: 55N31; 62R40; 55P10; 60C05; 92B05; 92-10

arXiv:2209.08974 [pdf, other]

Zigzag persistence for coral reef resilience using a stochastic spatial model

Authors: Robert A. McDonald, Rosanna Neuhausler, Martin Robinson, Laurel G. Larsen, Heather A. Harrington, Maria Bruna

Abstract: A complex interplay between species governs the evolution of spatial patterns in ecology. An open problem in the biological sciences is characterising spatio-temporal data and understanding how changes at the local scale affect global dynamics/behaviour. Here, we extend a well-studied temporal mathematical model of coral reef dynamics to include stochastic and spatial interactions and generate data to study different ecological scenarios. We present descriptors to characterise patterns in heterogeneous spatio-temporal data surpassing spatially averaged measures. We apply these descriptors to simulated coral data and demonstrate the utility of two topological data analysis techniques--persistent homology and zigzag persistence--for characterising mechanisms of reef resilience. We show that the introduction of local competition between species leads to the appearance of coral clusters in the reef. We use our analyses to distinguish temporal dynamics stemming from different initial configurations of coral, showing that the neighbourhood composition of coral sites determines their long-term survival. Using zigzag persistence, we determine which spatial configurations protect coral from extinction in different environments. Finally, we apply this toolkit of multi-scale methods to empirical coral reef data, which distinguish spatio-temporal reef dynamics in different locations, and demonstrate the applicability to a range of datasets.

Submitted 12 August, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

arXiv:2208.12748 [pdf, other]

Brain Chains as Topological Signatures for Alzheimer's Disease

Authors: Christian Goodbrake, David Beers, Travis B. Thompson, Heather A. Harrington, Alain Goriely

Abstract: We propose a topological framework to study the evolution of Alzheimer's disease, the most common neurodegenerative disease. The modeling of this disease starts with the representation of the brain connectivity as a graph and the seeding of a toxic protein in a specific region represented by a vertex. Over time, the accumulation of toxic proteins at vertices and their propagation along edges are modeled by a dynamical system on this graph. These dynamics provide an order on the edges of the graph according to the damage created by high concentrations of proteins. This sequence of edges defines a filtration of the graph. We consider different filtrations given by different disease seeding locations. To study this filtration we propose a new combinatorial and topological method. A filtration defines a maximal chain in the partially ordered set of spanning subgraphs ordered by inclusion. To identify similar graphs, and define a topological signature, we quotient this poset by graph homotopy equivalence, which gives maximal chains in a smaller poset. We provide an algorithm to compute this direct quotient without computing all subgraphs and then propose bounds on the total number of graphs up to homotopy equivalence. To compare the maximal chains generated by this method, we extend Kendall's $d_K$ metric for permutations to more general graded posets and establish bounds for this metric. We then demonstrate the utility of this framework on actual brain graphs by studying the dynamics of tau proteins on the structural connectome. We show that the proposed topological brain chain equivalence classes distinguish different simulated subtypes of Alzheimer's disease.

Submitted 18 September, 2023; v1 submitted 22 August, 2022; originally announced August 2022.

Comments: 33 pages, 13 figures, submitted to Journal of Applied and Computational Topology (APCT)

arXiv:2206.07760 [pdf, other]

doi 10.3390/e24081116

Multiscale methods for signal selection in single-cell data

Authors: Renee S. Hoekzema, Lewis Marsh, Otto Sumray, Thomas M. Carroll, Xin Lu, Helen M. Byrne, Heather A. Harrington

Abstract: Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between cell types may not be detected. We propose three topologically motivated mathematical methods for unsupervised feature selection that consider discrete and continuous transcriptional patterns on an equal footing across multiple scales simultaneously. Eigenscores ($\text{eig}_i$) rank signals or genes based on their correspondence to low-frequency intrinsic patterning in the data using the spectral decomposition of the Laplacian graph. The multiscale Laplacian score (MLS) is an unsupervised method for locating relevant scales in data and selecting the genes that are coherently expressed at these respective scales. The persistent Rayleigh quotient (PRQ) takes data equipped with a filtration, allowing the separation of genes with different roles in a bifurcation process (e.g., pseudo-time). We demonstrate the utility of these techniques by applying them to published single-cell transcriptomics data sets. The methods validate previously identified genes and detect additional biologically meaningful genes with coherent expression patterns. By studying the interaction between gene signals and the geometry of the underlying space, the three methods give multidimensional rankings of the genes and visualisation of relationships between them.

Submitted 6 October, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

Comments: 32 pages, 15 figures, 1 table. Revised and published in Entropy, special issue Applications of Topological Data Analysis in the Life Sciences

Journal ref: Entropy 2022, 24(8), 1116
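
Illustration for the signal-selection entry above: a minimal sketch of the classical single-scale Laplacian score, the kind of quantity that the multiscale Laplacian score in the abstract builds on. This is not the authors' implementation; the function name `laplacian_score`, the six-node path graph, and the two test signals are assumptions made purely for illustration.

```python
import numpy as np


def laplacian_score(W, f):
    """Classical Laplacian score of a signal f on a weighted graph W.

    Lower scores mean the signal varies smoothly along graph edges.
    Undefined (division by zero) for a constant signal.
    """
    d = W.sum(axis=1)                    # vertex degrees
    L = np.diag(d) - W                   # combinatorial graph Laplacian
    f_c = f - (f @ d) / d.sum()          # remove the degree-weighted mean
    return (f_c @ L @ f_c) / (f_c @ (d * f_c))


# Hypothetical toy graph: a path on six vertices with unit edge weights.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

smooth = np.arange(n, dtype=float)                    # gradual trend along the path
rough = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])   # alternates at every step

score_smooth = laplacian_score(W, smooth)
score_rough = laplacian_score(W, rough)
print(score_smooth, score_rough)
```

On this toy graph the gradual signal receives a much lower score than the alternating one, which is the sense in which such scores rank features by their agreement with low-frequency structure in the data.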

arXiv:2204.03348 [pdf, other]

Barcodes distinguish morphology of neuronal tauopathy

Authors: David Beers, Despoina Goniotaki, Diane P. Hanger, Alain Goriely, Heather A. Harrington

Abstract: The geometry of neurons is known to be important for their functions. Hence, neurons are often classified by their morphology. Two recent methods, persistent homology and the topological morphology descriptor, assign a morphology descriptor called a barcode to a neuron equipped with a given function, such as the Euclidean distance from the root of the neuron. These barcodes can be converted into matrices called persistence images, which can then be averaged across groups. We show that when the defining function is the path length from the root, both the topological morphology descriptor and persistent homology are equivalent. We further show that persistence images arising from the path length procedure provide an interpretable summary of neuronal morphology. We introduce topological morphology functions, a class of functions similar to Sholl functions, that can be recovered from the associated topological morphology descriptor. To demonstrate this topological approach, we compare healthy cortical and hippocampal mouse neurons to those affected by progressive tauopathy. We find a significant difference in the morphology of healthy neurons and those with a tauopathy at a postsymptomatic age. We use persistence images to conclude that the diseased group tends to have neurons with shorter branches as well as fewer branches far from the soma.

Submitted 7 April, 2022; originally announced April 2022.

Comments: 25 pages, 10 figures

arXiv:2201.07709 [pdf, other]

Homology of homologous knotted proteins

Authors: Katherine Benjamin, Lamisah Mukta, Gabriel Moryoussef, Christopher Uren, Heather A. Harrington, Ulrike Tillmann, Agnese Barbensi

Abstract: Quantification and classification of protein structures, such as knotted proteins, often requires noise-free and complete data. Here we develop a mathematical pipeline that systematically analyzes protein structures. We showcase this geometric framework on proteins forming open-ended trefoil knots, and we demonstrate that the mathematical tool, persistent homology, faithfully represents their structural homology. This topological pipeline identifies important geometric features of protein entanglement and clusters the space of trefoil proteins according to their depth. Persistence landscapes quantify the topological difference between a family of knotted and unknotted proteins in the same structural homology class. This difference is localized and interpreted geometrically with recent advancements in systematic computation of homology generators. The topological and geometric quantification we find is robust to noisy input data, which demonstrates the potential of this approach in contexts where standard knot theoretic tools fail.

Submitted 19 January, 2022; originally announced January 2022.

Comments: 3 figures + 2 SI figures

MSC Class: 62R40; 55N31; 57K10

arXiv:2112.00688 [pdf, other]

Algebra, Geometry and Topology of ERK Kinetics

Authors: Lewis Marsh, Emilie Dufresne, Helen M. Byrne, Heather A. Harrington

Abstract: The MEK/ERK signalling pathway is involved in cell division, cell specialisation, survival and cell death. Here we study a polynomial dynamical system describing the dynamics of MEK/ERK proposed by Yeung et al. with their experimental setup, data and known biological information. The experimental dataset is a time-course of ERK measurements in different phosphorylation states following activation of either wild-type MEK or MEK mutations associated with cancer or developmental defects. We demonstrate how methods from computational algebraic geometry, differential algebra, Bayesian statistics and computational algebraic topology can inform the model reduction, identification and parameter inference of MEK variants, respectively. Throughout, we show how this algebraic viewpoint offers a rigorous and systematic analysis of such models.

Submitted 1 December, 2021; originally announced December 2021.

arXiv:2111.00991 [pdf, ps, other]

Differential elimination for dynamical models via projections with applications to structural identifiability

Authors: Ruiwen Dong, Christian Goodbrake, Heather A Harrington, Gleb Pogudin

Abstract: Elimination of unknowns in a system of differential equations is often required when analysing (possibly nonlinear) dynamical systems models, where only a subset of variables are observable. One such analysis, identifiability, often relies on computing input-output relations via differential algebraic elimination. Determining identifiability, a natural prerequisite for meaningful parameter estimation, is often prohibitively expensive for medium to large systems due to the computationally expensive task of elimination. We propose an algorithm that computes a description of the set of differential-algebraic relations between the input and output variables of a dynamical system model. The resulting algorithm outperforms general-purpose software for differential elimination on a set of benchmark models from literature. We use the designed elimination algorithm to build a new randomized algorithm for assessing structural identifiability of a parameter in a parametric model. A parameter is said to be identifiable if its value can be uniquely determined from input-output data assuming the absence of noise and sufficiently exciting inputs. Our new algorithm allows the identification of models that could not be tackled before. Our implementation is publicly available as a Julia package at https://github.com/SciML/StructuralIdentifiability.jl.

Submitted 23 November, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

arXiv:2108.11640 [pdf, other]

Topological Approximate Bayesian Computation for Parameter Inference of an Angiogenesis Model

Authors: Thomas Thorne, Paul D. W. Kirk, Heather A. Harrington

Abstract: Inferring the parameters of models describing biological systems is an important problem in the reverse engineering of the mechanisms underlying these systems. Much work has focused on parameter inference of stochastic and ordinary differential equation models using Approximate Bayesian Computation (ABC). While there is some recent work on inference in spatial models, this remains an open problem. Simultaneously, advances in topological data analysis (TDA), a field of computational mathematics, have enabled spatial patterns in data to be characterised. Here we focus on recent work using topological data analysis to study different regimes of parameter space for a well-studied model of angiogenesis. We propose a method for combining TDA with ABC to infer parameters in the Anderson-Chaplain model of angiogenesis. We demonstrate that this topological approach outperforms ABC approaches that use simpler statistics based on spatial features of the data. This is a first step towards a general framework of spatial parameter inference for biological systems, for which there may be a variety of filtrations, vectorisations, and summary statistics to be considered. All code used to produce our results is available as a Snakemake workflow.

Submitted 8 November, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: 7 pages, 2 figures. For associated code see: https://github.com/tt104/tabc_angio

arXiv:2101.00523 [pdf, other]

doi 10.1371/journal.pcbi.1009094

Topological data analysis distinguishes parameter regimes in the Anderson-Chaplain model of angiogenesis

Authors: John T. Nardini, Bernadette J. Stolz, Kevin B. Flores, Heather A. Harrington, Helen M. Byrne

Abstract: Angiogenesis is the process by which blood vessels form from pre-existing vessels. It plays a key role in many biological processes, including embryonic development and wound healing, and contributes to many diseases including cancer and rheumatoid arthritis. The structure of the resulting vessel networks determines their ability to deliver nutrients and remove waste products from biological tissues. Here we simulate the Anderson-Chaplain model of angiogenesis at different parameter values and quantify the vessel architectures of the resulting synthetic data. Specifically, we propose a topological data analysis (TDA) pipeline for systematic analysis of the model. TDA is a vibrant and relatively new field of computational mathematics for studying the shape of data. We compute topological and standard descriptors of model simulations generated by different parameter values. We show that TDA of model simulation data stratifies parameter space into regions with similar vessel morphology. The methodologies proposed here are widely applicable to other synthetic and experimental data including wound healing, development, and plant biology.

Submitted 22 April, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

arXiv:2008.08667 [pdf, other]

Multiscale Topology Characterises Dynamic Tumour Vascular Networks

Authors: Bernadette J. Stolz, Jakob Kaeppler, Bostjan Markelc, Franziska Mech, Florian Lipsmeier, Ruth J. Muschel, Helen M. Byrne, Heather A. Harrington

Abstract: Advances in imaging techniques enable high resolution 3D visualisation of vascular networks over time and reveal abnormal structural features such as twists and loops, and their quantification is an active area of research. Here we showcase how topological data analysis (TDA), the mathematical field that studies `shape' of data, can characterise the geometric, spatial and temporal organisation of vascular networks. We propose two topological lenses to study vasculature, which capture inherent multi-scale features and vessel connectivity, and surpass the single scale analysis of existing methods. We analyse images collected using intravital and ultramicroscopy modalities and quantify spatio-temporal variation of twists, loops, and avascular regions (voids) in 3D vascular networks. This topological approach validates and quantifies known qualitative trends such as dynamic changes in tortuosity and loops in response to antibodies that modulate vessel sprouting; furthermore, it quantifies the effect of radiotherapy on vessel architecture.

Submitted 26 April, 2022; v1 submitted 19 August, 2020; originally announced August 2020.

arXiv:1909.05937 [pdf, other]

Grid diagrams as tools to investigate knot spaces and topoisomerase-mediated simplification of DNA topology

Authors: Agnese Barbensi, Daniele Celoria, Heather A. Harrington, Andrzej Stasiak, Dorothy Buck

Abstract: Grid diagrams with their relatively simple mathematical formalism provide a convenient way to generate and model projections of various knots. It has been an open question whether these 2D diagrams can be used to model a complex 3D process such as the topoisomerase-mediated preferential unknotting of DNA molecules. We model here topoisomerase-mediated passages of double-stranded DNA segments through each other using the formalism of grid diagrams. We show that this grid diagram-based modelling approach captures the essence of the preferential unknotting mechanism, based on topoisomerase selectivity of hooked DNA juxtapositions as the sites of intersegmental passages. We show that the grid diagram-based approach provides an important, new and computationally convenient framework for investigating entanglement in biopolymers.

Submitted 12 September, 2019; originally announced September 2019.

Comments: 22 pages, 9 figures, Supplementary Information at the end

arXiv:1907.08711 [pdf, other]

Topological Methods for Characterising Spatial Networks: A Case Study in Tumour Vasculature

Authors: Helen M Byrne, Heather A Harrington, Ruth Muschel, Gesine Reinert, Bernadette J Stolz, Ulrike Tillmann

Abstract: Understanding how the spatial structure of blood vessel networks relates to their function in healthy and abnormal biological tissues could improve diagnosis and treatment for diseases such as cancer. New imaging techniques can generate multiple, high-resolution images of the same tissue region, and show how vessel networks evolve during disease onset and treatment. Such experimental advances have created an exciting opportunity for discovering new links between vessel structure and disease through the development of mathematical tools that can analyse these rich datasets. Here we explain how topological data analysis (TDA) can be used to study vessel network structures. TDA is a growing field in the mathematical and computational sciences that consists of algorithmic methods for identifying global and multi-scale structures in high-dimensional data sets that may be noisy and incomplete. TDA has identified the effect of ageing on vessel networks in the brain and has more recently been proposed as a tool to study blood flow and stenosis. Here we present preliminary work which shows how TDA of spatial network structure can be used to characterise tumour vasculature.

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1810.12663 [pdf, other]

Coloured Noise from Stochastic Inflows in Reaction-Diffusion Systems

Authors: Michael F Adamer, Heather A Harrington, Eamonn A Gaffney, Thomas E Woolley

Abstract: In this paper we present a framework for investigating coloured noise in reaction-diffusion systems. We start by considering a deterministic reaction-diffusion equation and show how external forcing can cause temporally correlated or coloured noise. Here, the main source of external noise is considered to be fluctuations in the parameter values representing the inflow of particles to the system. First, we determine which reaction systems, driven by extrinsic noise, can admit only one steady state, so that effects, such as stochastic switching, are precluded from our analysis. To analyse the steady state behaviour of reaction systems, even if the parameter values are changing, necessitates a parameter-free approach, which has been central to algebraic analysis in chemical reaction network theory. To identify suitable models we use tools from real algebraic geometry that link the network structure to its dynamical properties. We then make a connection to internal noise models and show how power spectral methods can be used to predict stochastically driven patterns in systems with coloured noise. In simple cases we show that the power spectrum of the coloured noise process and the power spectrum of the reaction-diffusion system modelled with white noise multiply to give the power spectrum of the coloured noise reaction-diffusion system.

Submitted 30 November, 2018; v1 submitted 30 October, 2018; originally announced October 2018.

Comments: 31 pages, 8 figures

MSC Class: 92C42; 92C15; 60H30; 34C08

arXiv:1810.05575 [pdf, other]

Joining and decomposing reaction networks

Authors: Elizabeth Gross, Heather A Harrington, Nicolette Meshkat, Anne Shiu

Abstract: In systems and synthetic biology, much research has focused on the behavior and design of single pathways, while, more recently, experimental efforts have focused on how cross-talk (coupling two or more pathways) or inhibiting molecular function (isolating one part of the pathway) affects systems-level behavior. However, the theory for tackling these larger systems in general has lagged behind. Here, we analyze how joining networks (e.g., cross-talk) or decomposing networks (e.g., inhibition or knock-outs) affects three properties that reaction networks may possess---identifiability (recoverability of parameter values from data), steady-state invariants (relationships among species concentrations at steady state, used in model selection), and multistationarity (capacity for multiple steady states, which correspond to multiple cell decisions). Specifically, we prove results that clarify, for a network obtained by joining two smaller networks, how properties of the smaller networks can be inferred from or can imply similar properties of the original network. Our proofs use techniques from computational algebraic geometry, including elimination theory and differential algebra.

Submitted 14 August, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

Comments: 44 pages; extensive revision in response to referee comments

arXiv:1809.08504 [pdf, other]

Topological Data Analysis of Task-Based fMRI Data from Experiments on Schizophrenia

Authors: Bernadette J. Stolz, Tegan Emerson, Satu Nahkuri, Mason A. Porter, Heather A. Harrington

Abstract: We use methods from computational algebraic topology to study functional brain networks, in which nodes represent brain regions and weighted edges encode the similarity of fMRI time series from each region. With these tools, which allow one to characterize topological invariants such as loops in high-dimensional data, we are able to gain understanding into low-dimensional structures in networks in a way that complements traditional approaches that are based on pairwise interactions. In the present paper, we use persistent homology to analyze networks that we construct from task-based fMRI data from schizophrenia patients, healthy controls, and healthy siblings of schizophrenia patients. We thereby explore the persistence of topological structures such as loops at different scales in these networks. We use persistence landscapes and persistence images to create output summaries from our persistent-homology calculations, and we study the persistence landscapes and images using $k$-means clustering and community detection. Based on our analysis of persistence landscapes, we find that the members of the sibling cohort have topological features (specifically, their 1-dimensional loops) that are distinct from the other two cohorts. From the persistence images, we are able to distinguish all three subject groups and to determine the brain regions in the loops (with four or more edges) that allow us to make these distinctions.

Submitted 25 August, 2020; v1 submitted 22 September, 2018; originally announced September 2018.

arXiv:1808.00335 [pdf, other]

Linear compartmental models: input-output equations and operations that preserve identifiability

Authors: Elizabeth Gross, Heather A. Harrington, Nicolette Meshkat, Anne Shiu

Abstract: This work focuses on the question of how identifiability of a mathematical model, that is, whether parameters can be recovered from data, is related to identifiability of its submodels. We look specifically at linear compartmental models and investigate when identifiability is preserved after adding or removing model components. In particular, we examine whether identifiability is preserved when an input, output, edge, or leak is added or deleted. Our approach, via differential algebra, is to analyze specific input-output equations of a model and the Jacobian of the associated coefficient map. We clarify a prior determinantal formula for these equations, and then use it to prove that, under some hypotheses, a model's input-output equations can be understood in terms of certain submodels we call "output-reachable". Our proofs use algebraic and combinatorial techniques.

Submitted 24 May, 2019; v1 submitted 1 August, 2018; originally announced August 2018.

Comments: v2: 26 pages, 3 figures; expanded introduction to identifiability, including updating definitions; deleted two remarks from v1; improved exposition throughout

arXiv:1702.08747 [pdf, other]

Graph-Facilitated Resonant Mode Counting in Stochastic Interaction Networks

Authors: Michael F Adamer, Thomas E Woolley, Heather A Harrington

Abstract: Oscillations in a stochastic dynamical system, whose deterministic counterpart has a stable steady state, are a widely reported phenomenon. Traditional methods of finding parameter regimes for stochastically-driven resonances are, however, cumbersome for any but the smallest networks. In this letter we show by example of the Brusselator how to use real root counting algorithms and graph theoretic tools to efficiently determine the number of resonant modes and parameter ranges for stochastic oscillations. We argue that stochastic resonance is a network property by showing that resonant modes only depend on the squared Jacobian matrix $J^2$, unlike deterministic oscillations which are determined by $J$. By using graph theoretic tools, analysis of stochastic behaviour for larger networks is simplified and chemical reaction networks with multiple resonant modes can be identified easily.

Submitted 28 February, 2017; originally announced February 2017.

Comments: 5 pages, 4 figures

arXiv:1612.08116 [pdf, other]

doi 10.1098/rsif.2018.0661

Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer

Authors: Anna Seigal, Mariano Beguerisse-Díaz, Birgit Schoeberl, Mario Niepel, Heather A. Harrington

Abstract: We introduce a tensor-based clustering method to extract sparse, low-dimensional structure from high-dimensional, multi-indexed datasets. This framework is designed to enable detection of clusters of data in the presence of structural requirements which we encode as algebraic constraints in a linear program. Our clustering method is general and can be tailored to a variety of applications in science and industry. We illustrate our method on a collection of experiments measuring the response of genetically diverse breast cancer cell lines to an array of ligands. Each experiment consists of a cell line-ligand combination, and contains time-course measurements of the early-signalling kinases MAPK and AKT at two different ligand dose levels. By imposing appropriate structural constraints and respecting the multi-indexed structure of the data, the analysis of clusters can be optimized for biological interpretation and therapeutic understanding. We then perform a systematic, large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each cluster. This analysis allows us to quantify the heterogeneity of breast cancer cell subtypes, and leads to hypotheses about the signalling mechanisms that mediate the response of the cell lines to ligands.

Submitted 8 February, 2019; v1 submitted 23 December, 2016; originally announced December 2016.

Comments: 22 pages, 12 figures, 4 tables

Journal ref: Journal of The Royal Society Interface, volume 16 (2019) issue 151, 20180661

arXiv:1608.06146 [pdf, other]

doi 10.1371/journal.pcbi.1005400

The Role of the Hes1 Crosstalk Hub in Notch-Wnt Interactions of the Intestinal Crypt

Authors: Sophie K. Kay, Heather A. Harrington, Sarah Shepherd, Keith Brennan, Trevor Dale, James M. Osborne, David J. Gavaghan, Helen M. Byrne

Abstract: The Notch pathway plays a vital role in determining whether cells in the intestinal epithelium adopt a secretory or an absorptive phenotype. Cell fate specification is coordinated via Notch's interaction with the canonical Wnt pathway. Here, we propose a new mathematical model of the Notch and Wnt pathways, in which the Hes1 promoter acts as a hub for pathway crosstalk. Computational simulations of the model can assist in understanding how healthy intestinal tissue is maintained, and predict the likely consequences of biochemical knockouts upon cell fate selection processes. Chemical reaction network theory (CRNT) is a powerful, generalised framework which assesses the capacity of our model for monostability or multistability, by analysing properties of the underlying network structure without recourse to specific parameter values or functional forms for reaction rates. CRNT highlights the role of beta-catenin in stabilising the Notch pathway and damping oscillations, demonstrating that Wnt-mediated actions on the Hes1 promoter can induce dynamical transitions in the Notch system, from multistability to monostability. Time-dependent model simulations of cell pairs reveal the stabilising influence of Wnt upon the Notch pathway, in which beta-catenin- and Dsh-mediated action on the Hes1 promoter are key in shaping the subcellular dynamics. Where Notch-mediated transcription of Hes1 dominates, there is Notch oscillation and maintenance of fate flexibility; Wnt-mediated transcription of Hes1 favours bistability akin to cell fate selection. Cells could therefore regulate the proportion of Wnt- and Notch-mediated control of the Hes1 promoter to coordinate the timing of cell fate selection as they migrate through the intestinal epithelium and are subject to reduced Wnt stimuli.

Submitted 22 August, 2016; originally announced August 2016.

arXiv:1608.05679 [pdf, other]

The geometry of sloppiness

Authors: Emilie Dufresne, Heather A. Harrington, Dhruva V. Raman

Abstract: The use of mathematical models in the sciences often involves the estimation of unknown parameter values from data. Sloppiness provides information about the uncertainty of this task. In this paper, we develop a precise mathematical foundation for sloppiness and define rigorously its key concepts, such as `model manifold', in relation to concepts of structural identifiability. We redefine sloppiness conceptually as a comparison between the premetric on parameter space induced by measurement noise and a reference metric. This opens up the possibility of alternative quantification of sloppiness, beyond the standard use of the Fisher Information Matrix, which assumes that parameter space is equipped with the usual Euclidean metric and the measurement error is infinitesimal. Applications include parametric statistical models, explicit time dependent models, and ordinary differential equation models.

Submitted 14 March, 2018; v1 submitted 19 August, 2016; originally announced August 2016.

Comments: 31 pages, 5 figures, Small changes throughout the paper. A table summary of the main examples now appears as appendix

MSC Class: 93B30; 62B10; 62F25; 26B10; 08A99; 26E05

arXiv:1605.00562 [pdf, other]

doi 10.1063/1.4978997

Persistent homology of time-dependent functional networks constructed from coupled time series

Authors: Bernadette J. Stolz, Heather A. Harrington, Mason A. Porter

Abstract: We use topological data analysis to study "functional networks" that we construct from time-series data from both experimental and synthetic sources. We use persistent homology with a weight rank clique filtration to gain insights into these functional networks, and we use persistence landscapes to interpret our results. Our first example uses time-series output from networks of coupled Kuramoto oscillators. Our second example consists of biological data in the form of functional magnetic resonance imaging (fMRI) data that was acquired from human subjects during a simple motor-learning task in which subjects were monitored on three days in a five-day period. With these examples, we demonstrate that (1) using persistent homology to study functional networks provides fascinating insights into their properties and (2) the position of the features in a filtration can sometimes play a more vital role than persistence in the interpretation of topological features, even though conventionally the latter is used to distinguish between signal and noise. We find that persistent homology can detect differences in synchronization patterns in our data sets over time, giving insight both on changes in community structure in the networks and on increased synchronization between brain regions that form loops in a functional network during motor learning. For the motor-learning data, persistence landscapes also reveal that on average the majority of changes in the network loops take place on the second of the three days of the learning process.

Submitted 3 December, 2016; v1 submitted 2 May, 2016; originally announced May 2016.

Comments: 17 pages (+3 pages in Supplementary Information), 11 figures in main text (many with multiple parts) + others in SI, submitted

arXiv:1604.02623 [pdf, ps, other]

Decomposing the parameter space of biological networks via a numerical discriminant approach

Authors: Heather A. Harrington, Dhagash Mehta, Helen M. Byrne, Jonathan D. Hauenstein

Abstract: Many systems in biology, physics and engineering can be described by systems of ordinary differential equations containing many parameters. When studying the dynamic behavior of these large, nonlinear systems, it is useful to identify and characterize the steady-state solutions as the model parameters vary, a technically challenging problem in a high-dimensional parameter landscape. Rather than simply determining the number and stability of steady-states at distinct points in parameter space, we decompose the parameter space into finitely many regions, the steady-state solutions being consistent within each distinct region. From a computational algebraic viewpoint, the boundary of these regions is contained in the discriminant locus. We develop global and local numerical algorithms for constructing the discriminant locus and classifying the parameter landscape. We showcase our numerical approaches by applying them to molecular and cell-network models.

Submitted 9 April, 2016; originally announced April 2016.

Comments: 13 pages, 4 figures

arXiv:1603.09730 [pdf, other]

Differential Algebra for Model Comparison

Authors: Heather A. Harrington, Kenneth L. Ho, Nicolette Meshkat

Abstract: We present a method for rejecting competing models from noisy time-course data that does not rely on parameter inference. First we characterize ordinary differential equation models in only measurable variables using differential algebra elimination. Next we extract additional information from the given data using Gaussian Process Regression (GPR) and then transform the differential invariants. We develop a test using linear algebra and statistics to reject transformed models with the given data in a parameter-free manner. This algorithm exploits the information about transients that is encoded in the model's structure. We demonstrate the power of this approach by discriminating between different models from mathematical biology.

Submitted 31 March, 2016; originally announced March 2016.

Comments: 17 pages

arXiv:1509.04090 [pdf, ps, other]

Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences

Authors: Elizabeth Drellich, Andrew Gainer-Dewar, Heather A. Harrington, Qijun He, Christine Heitsch, Svetlana Poznanović

Abstract: Questions in computational molecular biology generate various discrete optimization problems, such as DNA sequence alignment and RNA secondary structure prediction. However, the optimal solutions are fundamentally dependent on the parameters used in the objective functions. The goal of a parametric analysis is to elucidate such dependencies, especially as they pertain to the accuracy and robustness of the optimal solutions. Techniques from geometric combinatorics, including polytopes and their normal fans, have been used previously to give parametric analyses of simple models for DNA sequence alignment and RNA branching configurations. Here, we present a new computational framework, and proof-of-principle results, which give the first complete parametric analysis of the branching portion of the nearest neighbor thermodynamic model for secondary structure prediction for real RNA sequences.

Submitted 16 June, 2016; v1 submitted 14 September, 2015; originally announced September 2015.

Comments: 17 pages, 8 figures

MSC Class: 92D10

arXiv:1507.04331 [pdf, other]

Numerical algebraic geometry for model selection and its application to the life sciences

Authors: Elizabeth Gross, Brent Davis, Kenneth L. Ho, Daniel J. Bates, Heather A. Harrington

Abstract: Researchers working with mathematical models are often confronted by the related problems of parameter estimation, model validation, and model selection. These are all optimization problems, well-known to be challenging due to non-linearity, non-convexity and multiple local optima. Furthermore, the challenges are compounded when only partial data is available. Here, we consider polynomial models (e.g., mass-action chemical reaction networks at steady state) and describe a framework for their analysis based on optimization using numerical algebraic geometry. Specifically, we use probability-one polynomial homotopy continuation methods to compute all critical points of the objective function, then filter to recover the global optima. Our approach exploits the geometric structures relating models and data, and we demonstrate its utility on examples from cell signaling, synthetic biology, and epidemiology.

Submitted 1 April, 2016; v1 submitted 15 July, 2015; originally announced July 2015.

Comments: References added, additional clarifications

arXiv:1506.08903 [pdf, other]

doi 10.1140/epjds/s13688-017-0109-5

A roadmap for the computation of persistent homology

Authors: Nina Otter, Mason A. Porter, Ulrike Tillmann, Peter Grindrod, Heather A. Harrington

Abstract: Persistent homology (PH) is a method used in topological data analysis (TDA) to study qualitative features of data that persist across multiple scales. It is robust to perturbations of input data, independent of dimensions and coordinates, and provides a compact representation of the qualitative features of the input. The computation of PH is an open area with numerous important and fascinating challenges. The field of PH computation is evolving rapidly, and new algorithms and software implementations are being updated and released at a rapid pace. The purposes of our article are to (1) introduce theory and computational methods for PH to a broad range of computational scientists and (2) provide benchmarks of state-of-the-art implementations for the computation of PH. We give a friendly introduction to PH, navigate the pipeline for the computation of PH with an eye towards applications, and use a range of synthetic and real-world data sets to evaluate currently available open-source implementations for the computation of PH. Based on our benchmarking, we indicate which algorithms and implementations are best suited to different types of data sets. In an accompanying tutorial, we provide guidelines for the computation of PH. We make publicly available all scripts that we wrote for the tutorial, and we make available the processed version of the data sets used in the benchmarking.

Submitted 12 September, 2017; v1 submitted 29 June, 2015; originally announced June 2015.

Comments: Final version; minor changes throughout, added a section to the tutorial

Journal ref: EPJ Data Science 2017 6:17, Springer Nature

arXiv:1502.03188 [pdf, other]

Algebraic Systems Biology: A Case Study for the Wnt Pathway

Authors: Elizabeth Gross, Heather A. Harrington, Zvi Rosen, Bernd Sturmfels

Abstract: Steady state analysis of dynamical systems for biological networks gives rise to algebraic varieties in high-dimensional spaces whose study is of interest in their own right. We demonstrate this for the shuttle model of the Wnt signaling pathway. Here the variety is described by a polynomial system in 19 unknowns and 36 parameters. Current methods from computational algebraic geometry and combinatorics are applied to analyze this model.

Submitted 10 February, 2015; originally announced February 2015.

Comments: 24 pages, 2 figures

arXiv:1502.01902 [pdf, other]

Mathematical and Statistical Techniques for Systems Medicine: The Wnt Signaling Pathway as a Case Study

Authors: Adam L. MacLean, Heather A. Harrington, Michael P. H. Stumpf, Helen M. Byrne

Abstract: The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation, and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine.

Submitted 30 July, 2015; v1 submitted 6 February, 2015; originally announced February 2015.

Comments: Submitted to 'Systems Medicine' as a book chapter

arXiv:1409.0269 [pdf, other]

doi 10.1073/pnas.1416655112

Parameter-free methods distinguish Wnt pathway models and guide design of experiments

Authors: Adam L. MacLean, Zvi Rosen, Helen M. Byrne, Heather A. Harrington

Abstract: The canonical Wnt signaling pathway, mediated by $β$-catenin, is crucially involved in development, adult stem cell tissue maintenance and a host of diseases including cancer. We undertake analysis of different mathematical models of Wnt from the literature, and compare them to a new mechanistic model of Wnt signaling that targets spatial localization of key molecules. Using Bayesian methods we infer parameters for each of the models to mammalian Wnt signaling data and find that all models can fit this time course. We are able to overcome this lack of data by appealing to algebraic methods (concepts from chemical reaction network theory and matroid theory) to analyze the models without recourse to specific parameter values. These approaches provide insight into Wnt signaling: The new model (unlike any other investigated) permits a bistable switch in the system via control of shuttling and degradation parameters, corresponding to stem-like vs committed cell states in the differentiation hierarchy. Our analysis also identifies groups of variables that must be measured to fully characterize and discriminate between competing models, and thus serves as a guide for performing minimal experiments for model comparison.

Submitted 8 February, 2015; v1 submitted 31 August, 2014; originally announced September 2014.

Comments: 37 pages, 6 figures; errors fixed and comparison with data

arXiv:1210.2993 [pdf, other]

Cellular compartments cause multistability in biochemical reaction networks and allow cells to process more information

Authors: Heather A. Harrington, Elisenda Feliu, Carsten Wiuf, Michael P. H. Stumpf

Abstract: Many biological, physical, and social interactions have a particular dependence on where they take place. In living cells, protein movement between the nucleus and cytoplasm affects cellular response (i.e., proteins must be present in the nucleus to regulate their target genes). Here we use recent developments from dynamical systems and chemical reaction network theory to identify and characterize the key role of the spatial organization of eukaryotic cells in cellular information processing. In particular the existence of distinct compartments plays a pivotal role in whether a system is capable of multistationarity (multiple response states), and is thus directly linked to the amount of information that the signaling molecules can represent in the nucleus. Multistationarity provides a mechanism for switching between different response states in cell signaling systems and enables multiple outcomes for cellular decision-making. We find that introducing species localization can alter the capacity for multistationarity and mathematically demonstrate that shuttling confers flexibility for and greater control of the emergence of an all-or-none response.

Submitted 10 October, 2012; originally announced October 2012.

arXiv:1110.3742 [pdf, other]

Dependence of MAPK mediated signaling on Erk isoforms and differences in nuclear shuttling

Authors: Heather A. Harrington, Michał Komorowski, Mariano Beguerisse Díaz, Gian Michele Ratto, Michael P. H. Stumpf

Abstract: The mitogen activated protein kinase (MAPK) family of proteins is involved in regulating cellular fate activities such as proliferation, differentiation and apoptosis. Their fundamental importance has attracted considerable attention on different aspects of the MAPK signaling dynamics; this is particularly true for the Erk/Mek system, which has become the canonical example for MAPK signaling systems. Erk exists in many different isoforms, of which the most widely studied are Erk1 and Erk2. Until recently, these two kinases were considered equivalent as they differ only subtly at the sequence level; however, these isoforms exhibit radically different trafficking between cytoplasm and nucleus. Here we use spatially resolved data on Erk1/2 to develop and analyze spatio-temporal models of these cascades; and we discuss how sensitivity analysis can be used to discriminate between mechanisms. We are especially interested in understanding why two such similar proteins should co-exist in the same organism, as their functional roles appear to be different. Our models elucidate some of the factors governing the interplay between processes and the Erk1/2 localization in different cellular compartments, including competition between isoforms. This methodology is applicable to a wide range of systems, such as activation cascades, where translocation of species occurs via signal pathways. Furthermore, our work may motivate additional emphasis for considering potentially different roles for isoforms that differ subtly at the sequence level.

Submitted 18 November, 2011; v1 submitted 17 October, 2011; originally announced October 2011.

Comments: 19 pages, 7 figures

arXiv:1109.3670 [pdf, ps, other]

doi 10.1073/pnas.1117073109

A parameter-free model discrimination criterion based on steady-state coplanarity

Authors: Heather A. Harrington, Kenneth L. Ho, Thomas Thorne, Michael P. H. Stumpf

Abstract: We describe a novel procedure for deciding when a mass-action model is incompatible with observed steady-state data that does not require any parameter estimation. Thus, we avoid the difficulties of nonlinear optimization typically associated with methods based on parameter fitting. The key idea is to use the model equations to construct a transformation of the original variables such that any set of steady states of the model under that transformation lies on a common plane, irrespective of the values of the model parameters. Model rejection can then be performed by assessing the degree to which the transformed data deviate from coplanarity. We demonstrate our method by applying it to models of multisite phosphorylation and cell death signaling. Although somewhat limited at present, our work provides an important first step towards a parameter-free framework for data-driven model selection.

Submitted 16 August, 2012; v1 submitted 16 September, 2011; originally announced September 2011.

Comments: 13 pages, 3 figures. In press, PNAS

arXiv:0912.1548 [pdf, other]

doi 10.1371/journal.pcbi.1000956

Bistability in Apoptosis by Receptor Clustering

Authors: Kenneth L. Ho, Heather A. Harrington

Abstract: Apoptosis is a highly regulated cell death mechanism involved in many physiological processes. A key component of extrinsically activated apoptosis is the death receptor Fas, which, on binding to its cognate ligand FasL, oligomerizes to form the death-inducing signaling complex. Motivated by recent experimental data, we propose a mathematical model of death ligand-receptor dynamics where FasL acts as a clustering agent for Fas, which forms locally stable signaling platforms through proximity-induced receptor interactions. Significantly, the model exhibits hysteresis, providing an upstream mechanism for bistability and robustness. At low receptor concentrations, the bistability is contingent on the trimerism of FasL. Moreover, irreversible bistability, representing a committed cell death decision, emerges at high concentrations, which may be achieved through receptor pre-association or localization onto membrane lipid rafts. Thus, our model provides a novel theory for these observed biological phenomena within the unified context of bistability. Importantly, as Fas interactions initiate the extrinsic apoptotic pathway, our model also suggests a mechanism by which cells may function as bistable life/death switches independently of any such dynamics in their downstream components. Our results highlight the role of death receptors in deciding cell fate and add to the signal processing capabilities attributed to receptor clustering.

Submitted 8 September, 2010; v1 submitted 8 December, 2009; originally announced December 2009.

Comments: Accepted by PLoS Comput Biol

Journal ref: PLoS Comput. Biol. 6 (10): e1000956, 2010

Showing 1–42 of 42 results for author: Harrington, H A