-
Designing Machine Learning Tools to Characterize Multistationarity of Fully Open Reaction Networks
Authors:
Shenghao Yao,
AmirHosein Sadeghimanesh,
Matthew England
Abstract:
We present the first use of machine learning tools to predict multistationarity of reaction networks.
Chemical Reaction Networks (CRNs) are the mathematical formulation of how the quantities associated with a set of species (molecules, proteins, cells, or animals) vary over time as a result of their interactions with each other. Their mathematics describes not just chemical reactions but also many other areas of the life sciences such as ecology, epidemiology, and population dynamics. We say a CRN is at a steady state when the concentrations (or numbers) of the species no longer vary. Some CRNs do not attain a steady state while others may have more than one possible steady state. The CRNs in the latter group are called multistationary. Multistationarity is an important property, e.g. switch-like behaviour in cells needs multistationarity to occur. Existing algorithms to detect whether a CRN is multistationary are either extremely expensive or restricted in the type of CRNs they can be used on, motivating a new machine learning approach.
We address the problem of representing variable-length CRN data to machine learning models by developing a new graph representation of CRNs for use with graph learning algorithms. We contribute a large dataset of labelled fully open CRNs whose production necessitated the development of new CRN theory. Then we present experimental results on the training and testing of a graph attention network model on this dataset, showing excellent levels of performance. We finish by testing the model predictions on validation data produced independently, demonstrating generalisability of the model to different types of CRN.
Submitted 1 July, 2024;
originally announced July 2024.
-
Resultant Tools for Parametric Polynomial Systems with Application to Population Models
Authors:
AmirHosein Sadeghimanesh,
Matthew England
Abstract:
We are concerned with the problem of decomposing the parameter space of a parametric system of polynomial equations, and possibly some polynomial inequality constraints, with respect to the number of real solutions that the system attains. Previous studies apply a two-step approach to this problem, where first the discriminant variety of the system is computed via a Groebner Basis (GB), and then a Cylindrical Algebraic Decomposition (CAD) of this is produced to give the desired computation. However, even on some reasonably small applied examples this process is too expensive, with computation of the discriminant variety alone infeasible. In this paper we develop new approaches to build the discriminant variety using resultant methods (the Dixon resultant and a new method using iterated univariate resultants). This reduces the complexity compared to GB and allows a previously infeasible example to be tackled. We demonstrate the benefit by giving a symbolic solution to a problem from population dynamics (the analysis of the steady states of three connected populations which exhibit Allee effects) which previously could only be tackled numerically.
Submitted 9 February, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Polynomial Superlevel Set Representation of the Multistationarity Region of Chemical Reaction Networks
Authors:
AmirHosein Sadeghimanesh,
Matthew England
Abstract:
In this paper we introduce a new representation for the multistationarity region of a reaction network, using polynomial superlevel sets. The advantages of this polynomial superlevel set representation over the existing representations (cylindrical algebraic decompositions, numeric sampling, rectangular divisions) are discussed, and algorithms to compute this new representation are provided. The results are given for the general mathematical formalism of a parametric system of equations and so may be applied to other application domains.
Submitted 12 September, 2022; v1 submitted 17 March, 2020;
originally announced March 2020.
-
PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets
Authors:
S. Deshpande,
J. Shuttleworth,
J. Yang,
S. Taramonli,
M. England
Abstract:
Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome, as most coding potential computation (CPC) tools fail to accurately identify them in transcriptomic data. Well-known CPC tools such as CPC2, lncScore and CPAT are primarily designed for prediction of lncRNAs based on the GENCODE, NONCODE and CANTATAdb databases. The prediction accuracy of these tools often drops when tested on transcriptomic datasets. This leads to more false positive results and inaccuracy in the function annotation process. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization and iterative Random Forests (iRF) classification for selection of optimal features. Based on sequence and codon-bias features, it classifies the RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained based on lncRNA and protein-coding transcripts from 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis exhibited superior accuracy when evaluated against currently available state-of-the-art CPC tools.
Submitted 12 February, 2019;
originally announced February 2019.