Data Science

Supporting Information for:
Discovering governing equations from data:

Sparse identification of nonlinear dynamical systems
Steven L. Brunton1 , Joshua L. Proctor2 , J. Nathan Kutz3
1
Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, United States
2
Institute for Disease Modeling, Intellectual Ventures Laboratory, Bellevue, WA 98004, United States
3
Department of Applied Mathematics, University of Washington, Seattle, WA 98195, United States
Contents
1 Technical introduction 2
2 Background 3
2.1 Symbolic regression and machine learning . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Sparse representation and compressive sensing . . . . . . . . . . . . . . . . . . . . . . 3
3 Nonlinear system identification using sparse representation 4

3.1 Algorithm for sparse representation of dynamics with noise . . . . . . . . . . . . . . 6
3.2 Cross-validation to determine parsimonious sparse solution on Pareto front . . . . . 7
3.3 Extensions and Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3.1 Discrete-time representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3.2 High-dimensional systems, partial differential equations, and dimensional-
ity reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3.3 External forcing, bifurcation parameters, and normal forms . . . . . . . . . . 8
4 Results 9
4.1 Example 1: Simple illustrative systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.1.1 Example 1a: Two-dimensional damped oscillator (linear vs. nonlinear) . . . . 9
4.1.2 Example 1b: Three-dimensional linear system . . . . . . . . . . . . . . . . . . 9
4.2 Example 2: Lorenz system (Nonlinear ODE) . . . . . . . . . . . . . . . . . . . . . . . 11
4.3 Example 3: Fluid wake behind a cylinder (Nonlinear PDE) . . . . . . . . . . . . . . . 14
4.3.1 Direct numerical simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3.2 Mean field model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3.3 Cubic nonlinearities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.4 Example 4: Bifurcations and normal forms . . . . . . . . . . . . . . . . . . . . . . . . 18
4.4.1 Logistic map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.4.2 Hopf normal form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.5 Sparse identification of the Lorenz system with time-delay coordinates . . . . . . . . 20
5 Discussion 22
Appendix A: Choice of basis functions 23
Appendix B: Limitations of the sparse identification framework 25
Appendix C: Identified coefficients of dynamics 29
1
1 Technical introduction
There is a long and fruitful history of modeling dynamics from data, resulting in powerful tech-
niques for system identification [1]. Many of these methods arose out of the need to understand
complex flexible structures, such as the Hubble space telescope or the international space station.
The resulting models have been widely applied in nearly every branch of engineering and applied
mathematics, most notably for model-based feedback control. However, methods for system iden-
tification typically require assumptions on the form of the model, and most often result in linear
dynamics, limiting their effectiveness to small amplitude transient perturbations around a fixed
point of the dynamics [2].
A recent breakthrough in nonlinear system identification has resulted in a new approach to
determine the underlying structure of a nonlinear dynamical system from data [3]. This method
uses symbolic regression to determine dynamics and conservation laws, and it balances the com-
plexity of the model (measured in the number of model terms) with the agreement with data.
The resulting identification algorithm realizes a long-sought goal of the physics and engineering
communities to discover dynamical systems from data. However, the symbolic regression prob-
lem is expensive, does not clearly scale well to large-scale dynamical systems of interest, and may
be prone to over-fitting unless care is taken explicitly balance model complexity with predictive
power. In [3], the Pareto front is used to isolate parsimonious models from a large family of can-
didate models. There are a host of additional techniques for modeling emergent behavior [4] and
the discovery of governing equations from time-series data [5]. These include statistical methods
of automated inference of dynamics [6, 7, 8], and equation-free modeling [9], including empirical
dynamic modeling [10, 11].
In the present work, we re-envision the dynamical system discovery problem from the per-
spective of sparse regression [12, 13, 14] and compressive sensing [15, 16, 17, 18, 19, 20]. In par-
ticular, we leverage the fact that most physical systems have only a few nonlinear terms in the
dynamics, making the right hand side of the equations sparse in a high-dimensional nonlinear
function space. Before the advent of compressive sampling, and related sparsity-promoting meth-
ods, determining the few non-zero terms in a nonlinear dynamical system would have involved
a combinatorial brute-force search, meaning that the methods would not scale to larger problems
with Moore’s law. However, powerful new theory guarantees that the sparse solution may be de-
termined with high-probability using convex methods that do scale favorably with problem size.
The resulting nonlinear model identification inherently balances model complexity (i.e., sparsity
of right hand side dynamics) with accuracy, and the underlying convex optimization algorithms
ensure that the method will be applicable to large-scale problems.
The method described here shares some similarity to the recent dynamic mode decomposition
(DMD), which is a linear dynamic regression [21, 22]. DMD is an example of an equation-free
method [9], since it only relies on measurement data, but not on knowledge of the governing
equations. Recent advances in the extended DMD have developed rigorous connections between
DMD built on nonlinear observable functions and the Koopman operator theory for nonlinear
dynamical systems [21, 23]. However, there is currently no theory for which nonlinear observable
functions to use, so that assumptions must be made on the form of the dynamical system. In
contrast, the method developed here results in a sparse, nonlinear regression that automatically
determines the relevant terms in the dynamical system. The trend to exploit sparsity in dynamical
systems is recent but growing [24, 25, 26, 27, 28, 29, 30]. In this work, promoting sparsity in the
dynamics results in parsimonious natural laws.
2
2 Background
This work combines methods from symbolic regression and sparse representation. Symbolic re-
gression is used to find nonlinear functions describing the relationships between variables and
measured dynamics (i.e., time derivatives). Traditionally, model complexity is balanced with de-
scribing capability using parsimony arguments such as the Pareto front. Here, we use sparse
representation to determine the relevant model terms in an efficient and scalable framework.
2.1 Symbolic regression and machine learning

Symbolic regression involves the determination of a function that relates input–output data, and it
may be viewed as a form of machine learning. Typically, the function is determined using genetic
programming, which is an evolutionary algorithm that builds and tests candidate functions out of
simple building blocks [31]. These functions are then modified according to a set of evolutionary
rules and generations of functions are tested until a pre-determined accuracy is achieved.
Recently, symbolic regression has been applied to data from dynamical systems, and ordinary
differential equations were discovered from measurement data [3]. Because it is possible to overfit
with symbolic regression and genetic programming, a parsimony constraint must be imposed,
and in [3], they accept candidate equations that are at the Pareto front of complexity.
2.2 Sparse representation and compressive sensing

In many regression problems, only a few terms in the regression are important, and a sparse feature
selection mechanism is required. For example, consider data measurements y ∈ Rm that may
be a linear combination of columns from a feature library Θ ∈ Rm×p ; the linear combination of
columns is given by entries of the vector ξ ∈ Rp so that:
y = Θξ. (1)
Performing a standard regression to solve for ξ will result in a solution with nonzero contributions
in each element. However, if sparsity of ξ is desired, so that most of the entries are zero, then it is
possible to add an L1 regularization term to the regression, resulting in the LASSO [12, 13, 14]:
ξ = argmin kΘξ 0 − yk2 + λkξ 0 k1 . (2)

ξ0
The parameter λ weights the sparsity constraint. This formulation is closely related to the com-
pressive sensing framework, which allows for the sparse vector ξ to be determined from relatively
few incoherent random measurements [15, 16, 17, 18, 19, 20]. The sparse solution ξ to Eq. 1 may
also be used for sparse classification schemes. Importantly, the compressive sensing and sparse
representation architectures are convex and scale well to large problems, as opposed to brute-force
combinatorial alternatives.
3
3 Nonlinear system identification using sparse representation
In this work, we are concerned with identifying the governing equations that underly a physical
system based on data that may be realistically collected in simulations or experiments. Generically,
we seek to represent the system as a nonlinear dynamical system
ẋ(t) = f (x(t)). (3)

T
The vector x(t) = x1 (t) x2 (t) · · · xn (t) ∈ Rn represents the state of the system at time t,
and the nonlinear function f (x(t)) represents the dynamic constraints that define the equations of
motion of the system. In the following sections, we will generalize Eq. (3) to allow the dynamics f
to vary in time, and also with respect to a set of bifurcation parameters µ ∈ Rq .
The key observation in this paper is that for many systems of interest, the function f often
consists of only a few terms, making it sparse in the space of possible functions. For example,
the Lorenz system in Eq. (22c) has very few terms in the space of polynomial functions. Recent
advances in compressive sensing and sparse regression make this viewpoint of sparsity favorable,
since it is now possible to determine which right hand side terms are non-zero without performing
a computationally intractable brute-force search.
To determine the function f from data, we collect a time-history of the state x(t) and either
measure the derivative ẋ(t) or approximate it numerically from x. The data is sampled at several
times t1 , t2 , · · · , tm and arranged into two large matrices:
state
 −−−−−−−−−−−−−−−−−−−−−−−−→
  

xT (t1 )
x1 (t1 ) x2 (t1 ) · · · xn (t1 ) 
 x1 (t2 ) x2 (t2 ) · · · xn (t2 ) 
 xT (t2 ) 
   
 (4a)
time
X =  . =  . . .
 ..   .. .. . .. ..  


xT (tm ) x1 (tm ) x2 (tm ) · · · xn (tm ) y
   
ẋT (t1 ) ẋ1 (t1 ) ẋ2 (t1 ) ··· ẋn (t1 )
 ẋT (t2 )   ẋ1 (t2 ) ẋ2 (t2 ) ··· ẋn (t2 ) 
   
Ẋ =  .  =  . .. .. ..  . (4b)
 ..   .. . . . 
ẋT (tm ) ẋ1 (tm ) ẋ2 (tm ) · · · ẋn (tm )
Next, we construct an augmented library Θ(X) consisting of candidate nonlinear functions of the
columns of X. For example, Θ(X) may consist of constant, polynomial and trigonometric terms:
 
Θ(X) =  1 X XP2 X P3 ··· sin(X) cos(X) sin(2X) cos(2X) · · ·  . (5)
Here, higher polynomials are denoted as XP2 , XP3 , etc. For example, XP2 denotes the quadratic
nonlinearities in the state variable x, given by:
 2 
x1 (t1 ) x1 (t1 )x2 (t1 ) · · · x22 (t1 ) x2 (t1 )x3 (t1 ) · · · x2n (t1 )
 x2 (t2 ) x1 (t2 )x2 (t2 ) · · · x2 (t2 ) x2 (t2 )x3 (t2 ) · · · x2 (t2 ) 
 1 2 n 
XP2 =  . .. . .. .. . ..  . (6)
 . . . . . . . . . . 
x21 (tm ) x1 (tm )x2 (tm ) · · · x22 (tm ) x2 (tm )x3 (tm ) · · · x2n (tm )
4
6 6 7 7 6 6
7 z) 6 y 6
77 (19)
(27) 7(27)
6 6 7
· · · ẏ ·=
· ·7 x(⇢
7
⇥(X) =⇥(X)
6 =6 . 7. 6
6 6 7 7 6 6
6 777
7
6
6
6
6 ż = 7 xy
7
7 z. 6 6
7 6 7 7
7 (20)
6 6 7 7⇥(X)6= 66 ··· 777. (27)
6 6
6 the6standard parameters
7
7
7 4 6
7= 10, 6 5 7
7
For this example, we use
6 7 = 8/3, ⇢ = 28, with an 7
6 initial condition
⇤T6 7 6 7
⇥ ⇤T ⇥ 4 4 5 5 6
6
7
7
: Lorenz system (Nonlinear
x y z = ODE)
8 7 27 . 6
4
7
5
r the nonlinear LorenzI.system

2 to explore the identification 22of
2 ẋ3 ẏ ż 3chaotic
23
2
2 1 xdynamics:
3 y z x2 xy xz y
2 5
33 3⇠2 ⇠33
2 z 3 2⇠1
'' 'xi_1' 'xi_2' 'xi_3'
True Lorenz System '1' [ 0] [ 0] [ 0]
6 76 7
6 7 6x(t)z(t)
4x(t) y(t) z(t) x(t)2 x(t)y(t) 6 7 6 7 y(t)72 y(t)z(t) z(t) · · ·5
6 76 (21)'x' 7
26 76 7 [-9.9996] [27.9980] [ 0]
ẋ = (y 2 x)
6 7 66 726 7
(18) 7 36
6
76
76
7
7
2 3 6 7366 766 7
6 7 66 766 7 6 7 7 6 7 76
6 76
76'y' 7 7
7 [ 9.9998] [-0.9997] [ 0]
ẏ = 6 6 y Data In 7
6 7 66 766 7
6 7766 766 7 6
6 7
7 6
(19)4.2 Example
7
7
76 76'z' 7
2: Lorenz system (Nonlinear ODE)
76 76
7
7 [ 0] [ 0] [-2.6665]
6x(⇢ 6 z) 6 7
76 76 7
7 6 7766 76 67 7 76 76'xx' 7 [ 0] [ 0] [ 0]
6 6 7 6 7766 766 7 6
6 7
6 7 76
76
76
76'xy' 7
7
7 [ 0] [ 0] [ 1.0000]
6 6 7 6 7766 766 7 6
6 7
(20) 7 76 76 7
ż = 6 xy 6
Full Simulation
6
6
6
6
z. 7 6 ⇥(X)
Identified
7766 7 6 7
7 6 7766 =76
7 6 7766System,
66 7
766 7
6 ⌘ =7
6 7
70.01 = 66 7Here, weSystem,
Identified
7
consider
76
76
76
76 ⌘=the
10nonlinear
76
76
76
76
'xz'
'yy'
7
7
7
7.
7
7
Lorenz
[
[
0]
(28)system
0] [
to explore
[-0.9999] [
0] [
the0] identification
0]
of chaotic
6 2 6 7 6 7766 76 66 7 7 76 76 7
we use the standard 50

parameters
6 6
6 = 6= 10, = 8/3,50
3
⇢ =7
7 6 77662 76
6 7with
28, 66 7an
⇥(X) 6
6=
6
7
initial
6
7 3
7condition
6 7
7.
76
76
(28)
76'yz'
726
(28)
3 7 7
(28)
7 [ 0] [
ẋ = (y
0] [
x)
0]
⇥(X) = ⇥(X) 7 66 7 . . 6 50 76
76
7
7
6
6... 7 7
7 ... ... ...
8 7 27 .
⇤T 6
6 6
6 6 6
6
6 7
7
7
7
6
6
7
7
7
7
6
6
6
6
7326 77666 7 6 7
7 6
6 746 7
7
6
6 32 7
7
7 7
7 ⇥(X) = 6 7
7 ···
56
6
6
76
766 7
'yzzzz'
766 7 7. 7
7
7 [ 0] [
ẏ0] =[ x(⇢ 0] (27)
0] [
3 z) y 0]
0]
z 6
6 6 6
6 6
2
7
7 z
7 6 7 76 66 7 6 6 76 7
7 7 7
7z 6 7
2
3 6
6
76
766 7
'zzzzz' 73 72
7 [ 32 [
25 6 6 6
6
6 7
2 2 7 25 7
7766 77666 7 6 7
77 66 77 666 7 6 7 6 7766 7 7
2 25
6 7
7
6
6 766
4 7
7
564
7776
Sparse
776
56
ż = xy 777of Dynamics
Coefficients
6 z.
2 2 3 36 6
4 6 4
6 6
6
7 5
766 3
766 7 7666 7 6 7
3
75666 7 6 7 6 7766 7
6 7
2 7
7 6 7
7
6
6 33 7
7 6
7
7 77
7
76
76
6
6 7
6 6 7 7 6
6
6 6 6 7
76
766 7 7 76 66 7 6 7 3 76 7 6 7
6(27)
6 6 7
7
6
6 7 7
77 7
6 ⌅ 7 77
7
76
76
6
6
7
7
Model Out
6 6 6 67 766 7 777666 7 6 7 76 7 6 6 7 6 776 6 7
6
6
6
6
6
77
77 =6
⇥(X)
77
6 6 67 .
6 · · ·6 6
650 6 7 · ⇥(X)
76
766 7
=76.6⇥(X)
7
776=66 7· ·6 7 76
· · 666 7
4=777666. ·7 506 7
7
6
6. 6
For 7 6 this
⇥(X)
⇥= 777666 . (27)
77 = 6
7 7example,
77 ⇤ 5·6
we use the77standard 77
76 parameters (27)
76 6
7 (27) 6 · · ·
. · · (27)
7 . = 650
7 = 10,
7 = 8/3, ⇢ 28, with an initia
= (27)
y(t) z(t) x(t)
6
6
6
6
2
6 0
x(t)y(t)
77
77
6 ⇥(X) = 66
6
7 7 x(t)z(t)
62 6 7
··
0 y 66 6 7y(t)z(t)
y(t) 0 764 7 777666 7 ⇥(X)
5
7
z(t)4 42
6 5
· 7
6
4
· · 5
· 5
7(27)
5
0 y 66 7
6 · ⇥(X)
6 0
7
··
(21) 6 7 7 T6
··
6
⇥ ⇤T7 ·⇥(X)
7.
7
=76.
⇥(X)
776
7 0 6
6
(27)
··· 7.
7
(27)
6
6
6
6
6
-20 70 7 6
· · · · · · 7 .7 . x 6
77 20 -50
6
6
6 6 67
6
-20 7
76
6
76 0
7
7
·· 7
7
7
7 6
7
76
. 6 20 -50 7⇥(X)
76
6
6 -20
6
6
6 =7 6 x 7
7
7
6 y 7z 7
6 · · ·0 7 7 . 20
6 7 7 6 = 8 y 7 27 7 77
-50
7
7
7
6
6
76
6
6
7
7
7
6
⇥(X) ⇥(X)
=6 =6
6 6 77 6
6
6 ⇥(X)
⇥(X)
6
6
=6 = (27)
6 67
6 67
(27)
6 · · ·7·6
76
76
7 .7x
77
77
6
6
76
76
= 6 (27)
⇥(X) (27)
6 7
6 7
6
6
6
4 ·7
7
6· ·
6
6
x77 77. 66
77 6
(27)7
(27)
7
7
57
7
7
76
76
76
6
6
7
7
7
6 6 77 4 6 67 54 77 56 6 6
7
77
6
6
6
6
6
77
7 7 6
4
6
6 65
6 7
77 Ẋ 4
7 6 4 6
6 5
7
6
6
⇥(X) 5 4
7 7 24
7 7 6 7 554 4 5
3
6
6
6
6
77
77 6 6 77 6 6 77 5
6
4
6
4 50
77
55
6
4
6
4 50
77
55
6
4
6
4 50
77
55 III. Identified System
2 32 3 2 32 3 2 32 3 4x(t)
2 3 2 y(t)
3 z(t) x(t)22 3 2x(t)y(t)
3 x(t)z(t)
2 32
T 3y(t)
2 y(t)z(t) z(t)2 · · ·5
tion Identified
z
System, ⌘ = 0.01 z 66 77Identified
6 7 System,
6 76 ⌘77 = 10 6 76 7 6 76 7 6 76 ẋ = ⇥(x
7 6 76 )⇠71
x32y3 6 3z2⇠23 66
76
2 ẋ 32y
23 23 2 2 3x 32⇠13 66 77 6662 ẏ
23 2
327
72 3 6 7 6 xz 7 6 2 ż 72
7 2 336
23 3
2 6 327 6
7 6z xy7
7
32⇠33 666 777 666 7 6 76 7
76 2 3
7 76 7 36 76 7
7 6 76 7
25 6 67 67 6 776
2566 7 66 67 6767 776 7 6 7 25 66 76
76
76 6
7
76
67 7
6 76
6 76
767 6
7
7 6 76
7
7 6 7 6 T )⇠7
ẏ = 77⇥(x
6 76 7
50 6 67 67 6 6 776 73666 77750 7 66 67 6767
76 7
6 76
76 7 6 7 6 7
73666 777 666 7 63 6
7
67 7
76
6 7 6 7
6 777766776677 662 6 7 6
7663 77 666 777 666
7 7 6 76 2
7
2 3 63 67 67 6 6 776 226 2 73663 67 6
7676 7 7 7 62 726 2 7 76 6 3 67 7
76
6 7 7 6 76 7
2
6 67 67 6 76 7 66
776
7 66 67 6
7 66 67 6767 776 7 6 7 6
7
7 7 76 6 6 7776
6 6 7767767 6 7 76 7 6 7 6 7 76 76 7
6 67 67 6 6 776 766 77 66 7676 7 76 7666 777 666
6 76 T 7 67 6 67 7
6 66 7 7
7 7 76 76 7
6
6
6
6
7 67 67 67 6 6 776
7 67 67 67 6 7
50 776 07 6
776 76 7 666
6
6
6 6 77667 67 6
76
77667 67 6
76
77667 67 6
76 7
7 7
76 7
776
6 76 6 7
Full Simulation
7 66 76
7 66 76
6 6 7
6 6507 ż = 777⇥(x
Identified System,
776 7 6
6 76 )⇠7773⌘ = 0.01
7 67 6
76 6
7
7 67 7
76
6
76 6
7 67 7
6
7
77
6
6
6
767 66
7
Identified System
77 767 6 6 7
7
750
6 7 6 76
7667 77 66 77 66
7
7
6
6
6
6 z 0 7 67 67 67 66 6 77 76 7z6
66 6
6 7=77666 6666 066
y ⇥(X) =777766 -20
-20777 666777 666777 6667770⇥(X)
6 76
77667 67 6
7676 7
76 7 7
76
76 7 6 76
7 66 76
6 6 7
60 6 7
=7
6
777607 6
76 76
76
7 67
7 67 67 7
7 67 67 776
6
6 666 7776
6 767 66
767 66 0 7 7677 7 6 7 6
7
7
7
77766-20
77667 67 6 6 6 76
76 76 7
7 67 7 7 6 7 6 7 6 76 7
6
6
6
6
25 7 67 67 67 6 6 776
x
6
=6 7
6 76
7766 7725
776
20 66
-50
76
6
6
6 x 6 y 7
77667 6706
77667 67 6
76
6
=
76
7
6
7
. =
7⇥(X)
6 7
7
776
20
776 7
x-50
6
6
7 66 766 6(28)7
6 766 6 7
7 6 y
7
.
⇥(X) 76 76
21
76 76
76
7766 777 666
(28)
7.
7
76 6
7 67
6
7 67 6 7 67
077(28)
7
6
67 7
6
67 7
76
6
6
.=
7⇥(X)(28)
66 7
7
7=6
76 62077 766
7
76
7-50
7 66
67 66
7 .(28) ⇥(X) =767 7 6(28)7 6
7
766677 777 666 777 50
6
7.
7
6
6
6
6
6
6
7 67 67 67 6
7 67 67 67 6 6 7
776
776 7 66
76 =
6
66
66
6 6
6 6 77667 67 6
76
77667 67 6
7
7
76 7
6 77
7
6 7 66
76 7 6 766 6 7
6 6
(28) 6 7 7
76 76
76 76
7 6 (28)
7
7 7
7 6750
67 67 7
6 7 67 50
6
76
7
6 6 7776 76
7
7 66
67 66
6 77776677.66. 77 66(28)
7
7 7 7
6
6
7
7
7⇥(X)
67 67 67= 66 7
(28) 776
(28) . 6 776
⇥(X) 7
7=6
7.. 7
6 66=766 = 6 7 (28)
(28)776. 7 6 76 7
7 67⇥(X) 7= 7 (28)(28)767. 7 6 7 6 7
time
⇥(X)
⇥(X) = 6 =6 ⇥(X) .
7. 6 ⇥(X)
76 =6
6 77667 67 6 7 6 ⇥(X)
77 7 6 ⇥(X)
7 6
76 7 66 766 6 7 7 6
7 6 7 6 7 67 6
7 7 67 7
76
6 6 7 67 7 6 7 6
6 6
Figure 3: Trajectories of the Lorenz
7 67 67 67 6 6 776 66
77766. 77system
6 6 77667 67 6
7676 7
for short-time integration from t = 0 to t = 20 (top)
(27)
6 7=66 766 6 7· · · 77766. 777 666
7(27)
6 76 7 67 6
7 7 67 7
76
6
6 6 ⇥(X)
7(27)
76 767=66 7 7 6
· · · 767 . 7 6 7 6
7 6 7
7
7
6
6
6
6
76 76 7 6
7 67 67 67 6 6 776
7 67 67 67 6 6 774
···
7 6 7⇥(X)
7 =
774 5 666
6
66
6
6
6
6 6 7766
7766
7766
7 67 6
76
7 67 6
76
7 67 6
76
76 7
7
76 7
77
7
⇥(X)
7
z6 6 766 6 7
74 5 6 6 766 6 7z 774 5 6
76 76
56 76
7
7
67 67 7
7 67 67 67 7
7 67 67 67 7
6
76 6 776
6
z
76
6 576
767 66
747 66
7
7
7477 5 666 777 666
6 7
7
t 77= 6 7
6 6 7 6 66 77 66 76 6 76
6 6
50 and long-time integration from
7 67 67 67 6 6 77
50 7 06 66 to t = 250 (bottom). The full
6 7766 50dynamics (left) are compared
7 67 6
7676 7 7 66 766 6 7 777 6
6 76 7
7 67 67 67 76
76 6 76
7 7 66 7
777 6 7 25
7
6
6
0y
6
6 0 7 67 67 67 6
7 67 67 67 6 6 77
0 systems
77
7 7 6 66606
6
6
67 67 7
6
77667 67 6
0 76
75665 67 6
6
7
7 6
77
7
77 6
6
7
6
6
6
66 766 6 7
7 77 6
6 76
6 76
7
7 67 67 67 7
7
7 6725
67 67 7
6
25
76 6 76
6
7 7 66
7 7 66
7
7 7 6
7
7
4 4
-20 the sparse
with identified
5 65 67 67 6 6 77 7 64-20
7 (middle,4
right) for various additive
76
noise.
56 7 7
The trajectories are
46 44 4 5
77 4 6 76 7
7 65 65 67 776
6
56 6 74
76 7 76
7 5 46
7
777 46 57 46 7
-50 y -50 isy a proxy for local sensitivity.
4 5 46 67 6 5
20 -50 0 6 67 67 6
6 67 67 6 6 77
77
colored by xt, the20adaptive Runge-Kutta
777 6
6 06 20 color6 67 6
76
7
77
7 6 6 77 6 77
4 54 5 4 6 67 7
5
6 67 7
6
6 6 6 77 6 5
77
5
(28)
6 6
6 67 67 6 4 77
6 67 67 6
4 45 45 4
77
55
55 4 ⇥(X) = 6 x This
time step.
6 67 6
6 67 6
4 45 4
76
54
77
7.
7 6 4 77
55
4 55 6 67 7
6 67 7
4 45 5
66
6 6 4 77
44
77
55
4 5 5
6 0 7 0y
50 0 0y
50 0
6 -20 7
0 27 20 -50 -20 0 -20 0
50
II. Sparse Regression
2
50
6
to Solve 3for Active Terms in2the Dynamics 3 x 3 x 20 -50 x 20 -5
6 7
Figure 1: Schematic
6 7
6
6 of sparse
21 7
7
21 6
6 6
dynamic representation,
7
7 demonstrated
6
67 on the Lorenz
7
7 equations. Data is col-
z 6
6
7
7z
6
6 6 7
7 67
6
50 states X667= (x, y, z) and
7
7
50
lected from measurements 7 of a complex system, 6 including derivatives Ẋ = (ẋ, ẏ, ż). 50
6 6 7 7
6 7 7
25 6
6 7 25
7 6
6 4 7
7
6
65
7
7
Next, a large collection
6
⇥(X) = 6
6 of7 nonlinear functions
7
7.
6
of the
⇥(X)
7
states,
6
(28)= 6
z
7
Θ(X)
7.
6
is constructed.z This nonlinear feature z
6
(28) = 6
⇥(X)
7
7
7. (28)
6 7 6 7 6 7
library is used to
6
6 find the fewest
7
7 dynamics
6
6 terms to 7
7 25 Ẋ = Θ(X)Ξ,
satisfy 6
6 resulting
7
7 25
in a sparse model. 25
50 0 6 50 7 6 6 507 7 6
0 7
6 7 6 7
0y -20 6 0y 7
-20 6 0y 7 6 7
20 -50 0 20 -50
4 5
0 4
20 -50
5 4
50
5
50
x x 0 0 0
-20 0y -20 0y -20
0 20 -50 0 20 -50 0
ories of the Lorenz system for short-time integration from t = 0 to t = 20 (top) x x x 20 -5
21 2110
tegration from t = 210 to Each
t = 250column
(bottom).ofThe represents a are
candidate function for the of right handsystem
side of for Eq. (3). There
21 21 21 21
Θ(X)full
21 21
dynamics (left) compared Figure
21 21 3: Trajectories the Lorenz short-time integration from t = 0 to t
21
dentified systems (middle, right) for various
is tremendous freedom additive noise.in
of choice The trajectories arethe
constructing entries integration
and long-time in this matrix of0 nonlinearities.
from t = to t = 250 (bottom). Since
The full dynamics (left) are
e adaptive Runge-Kutta time step. This color is a proxy for local sensitivity. with the sparse identified systems (middle, right) for various additive noise. The traj
we believe that only a few of these nonlinearities are by
colored active
t, thein each Runge-Kutta
adaptive row of f , time
we step.
mayThis
setcolor
upisaa proxy for local sens
21
sparse regression problem to determine the sparse vectors of coefficients Ξ = ξ 1 ξ 2 · · · ξ n
that determine which nonlinearities are active, as illustrated in Fig. 1.
Ẋ = Θ(X)Ξ. (7)
21 21 21
Each column ξ k of Ξ represents a sparse vector of coefficients determining which terms are
active in the right hand side for one of the row equations ẋk = fk (x) in Eq. (3). Once Ξ has been
determined, a model of each row of the governing equations may be constructed as follows:
21 21 21
ẋk = fk (x) = Θ(xT )ξ k . 10 (8)
10
Note that Θ(xT ) is a vector of symbolic functions of elements of x, as opposed to Θ(X), which is
a data matrix. This results in the overall model
ẋ = f (x) = ΞT (Θ(xT ))T . (9)
We may solve for Ξ in Eq. (7) using sparse regression. In many cases, we may need to normal-
ize the columns of Θ(X) first to ensure that the restricted isometry property holds [24]; this is
especially important when the entries in X are small, since powers of X will be minuscule.
21
5
3.1 Algorithm for sparse representation of dynamics with noise
There are a number of algorithms to determine sparse solutions Ξ to the regression problem in
Eq. (7). Each column of Eq. (7) requires a distinct optimization problem to find the sparse vector
of coefficients ξ k for the k th row equation.
For the examples in this paper, the matrix Θ(X) has dimensions m × p, where p is the number
of candidate nonlinear functions, and where m p since there are more time samples of data
than there are candidate nonlinear functions. Realistically, often only the data X is available, and
Ẋ must be approximated numerically, as in the examples below. Thus, both X and Ẋ will be
contaminated with noise so that Eq. (7) does not hold exactly. Instead,
Ẋ = Θ(X)Ξ + ηZ, (10)
where Z is a matrix of independent identically distributed Gaussian entries with zero mean, and
η is the noise magnitude. Thus we seek a sparse solution to an overdetermined system with noise.
The LASSO [12, 14] from statistics works well with this type of data, providing a sparse re-
gression. However, it may be computationally expensive for very large data sets.
An alternative is to implement the sequential thresholded least-squares algorithm in Code (1).
In this algorithm, we start with a least-squares solution for Ξ and then threshold all coefficients
that are smaller than some cutoff value λ. Once the indices of the remaining non-zero coefficients
are identified, we obtain another least-squares solution for Ξ onto the remaining indices. These
new coefficients are again thresholded using λ, and the procedure is continued until the non-zero
coefficients converge. This algorithm is computationally efficient, and it rapidly converges to a
sparse solution in a small number of iterations. The algorithm also benefits from simplicity, with
a single parameter λ required to determine the degree of sparsity in Ξ.
Depending on the noise, it may be necessary to filter X and Ẋ before solving for Ξ. In many
of the examples below, only the data X is available, and Ẋ is obtained by differentiation. To
counteract differentiation error, we use the total variation regularized derivative [32] to de-noise
the derivative; this is based on the total variation regularization [33]. This works quite well when
only state data X is available, as illustrated on the Lorenz system in Fig. 7. Alternatively, the
data X and Ẋ may be filtered, for example using the optimal hard threshold for singular values
described in [34]. It is important to note that previous algorithms to identify dynamics from data
have been quite sensitive to noise [3, 24]. The algorithm in Code 1 is remarkably robust to noise,
even when derivatives must be approximated from noisy data.
Code 1: Sparse representation algorithm in Matlab.

%% compute Sparse regression: sequential least squares
Xi = Theta\dXdt; % initial guess: Least-squares
% lambda is our sparsification knob.

for k=1:10
smallinds = (abs(Xi)<lambda); % find small coefficients
Xi(smallinds)=0; % and threshold
for ind = 1:n % n is state dimension
biginds = ˜smallinds(:,ind);
% Regress dynamics onto remaining terms to find sparse Xi
Xi(biginds,ind) = Theta(:,biginds)\dXdt(:,ind);
end
end
6
3.2 Cross-validation to determine parsimonious sparse solution on Pareto front
To determine the sparsification parameter λ in the algorithm in Code (1), it is helpful to use the
concept of cross-validation from machine learning. It is always possible to hold back some test
data apart from the training data to test the validity of models away from training values. In
addition, it is important to consider the balance of model complexity (given by the number of
nonzero coefficients in Ξ) with the model accuracy. There is an “elbow” in the curve of accuracy
vs. complexity parameterized by λ, the so-called Pareto front. This value of λ represents a good
tradeoff between complexity and accuracy, and it is similar to the approach taken in [3].
3.3 Extensions and Connections

There are a number of extensions to the basic theory above that generalize this approach to a
broader set of problems. First, the method is generalized to a discrete-time formulation, establish-
ing a connection with the dynamic mode decomposition (DMD). Next, high-dimensional systems
obtained from discretized partial differential equations are considered, extending the method to
incorporate dimensionality reduction techniques to handle big data. Finally, the sparse regression
framework is modified to include bifurcation parameters, time-dependence, and external forcing.
3.3.1 Discrete-time representation

The aforementioned strategy may also be implemented on discrete-time dynamical systems:
xk+1 = f (xk ). (11)
There are a number of reasons to implement Eq. (11). First, many systems, such as the logistic
map in Eq. (26) are inherently discrete-time systems. In addition, it may be possible to recover
specific integration schemes used to advance Eq. (3). The discrete-time formulation also foregoes
the calculation of a derivative from noisy data. The data collection will now involve two matrices
Xm−1
1 and Xm 2 :
   
xT1 xT2
 xT2   xT3 
   
Xm−1 =  .  , X m
=  . . (12)
1
 .
.  2
 .
. 
xTm−1 xTm
The continuous-time sparse regression problem in Eq. (7) now becomes:

m−1
Xm
2 = Θ(X1 )Ξ (13)
and the function f is the same as in Eq. (9).

In the discrete setting in Eq. (11), and for linear dynamics, there is a striking resemblance to
dynamic mode decomposition. In particular, if Θ(x) = x, so that the dynamical system is linear,
then Eq. (13) becomes
m−1 T T
Xm
2 = X1 Ξ =⇒ (Xm
2 ) =Ξ
T
Xm−1
1 . (14)
This is equivalent to the DMD, which seeks a dynamic regression onto linear dynamics ΞT . In
particular, ΞT is n × n dimensional, which may be prohibitively large for a high-dimensional state
x. Thus, DMD identifies the dominant terms in the eigendecomposition of ΞT .
7
3.3.2 High-dimensional systems, partial differential equations, and dimensionality reduction
Often, the physical system of interest may be naturally represented by a partial differential equa-
tion (PDE) in a few spatial variables. If data is collected from a numerical discretization or from
experimental measurements on a spatial grid, then the state dimension n may be prohibitively
large. For example, in fluid dynamics, even simple two-dimensional and three-dimensional flows
may require tens of thousands up to billions of variables to represent the discretized system.
The method described above is prohibitive for a large state dimension n, both because of the
factorial growth of Θ in n and because each of the n row equations in Eq. (8) requires a separate op-
timization. Fortunately, many high-dimensional systems of interest evolve on a low-dimensional
manifold or attractor that may be well-approximated using a dimensionally reduced low-rank ba-
sis Ψ. For example, if data X is collected for a high-dimensional system as in Eq. (4a), it is possible
to obtain a low-rank approximation using the singular value decomposition (SVD):
XT = ΨΣV∗ . (15)
In this case, the state x may be well approximated in a truncated modal basis Ψr , given by the first
r columns of Ψ from the SVD:
x ≈ Ψr a, (16)
where a is an r-dimensional vector of mode coefficients. We assume that this is a good approxi-
mation for a relatively low rank r. Thus, instead of using the original high-dimensional state x, it
is possible to obtain a sparse representation of the Galerkin projected dynamics fP in terms of the
coefficients a:
ȧ = fP (a). (17)
There are many choices for a low-rank basis, including proper orthogonal decomposition (POD) [35,
36], based on the SVD.
3.3.3 External forcing, bifurcation parameters, and normal forms

It is also possible to identify normal forms associated with a bifurcation parameter µ by suspend-
ing the parameter as:
ẋ = f (x; µ) (18a)
µ̇ = 0. (18b)
Here we consider the bifurcation parameter µ as a variable with zero time derivative. It is then
possible to identify the right hand side f (x; µ) as a sparse combination of functions of compo-
nents in x as well as the bifurcation parameter µ. This idea is illustrated on two examples, the
one-dimensional logistic map and the two-dimensional Hopf normal form. This is an important
generalization, since it is now possible to identify a normal form from data, allowing for the pre-
diction of bifurcation phenomena and dynamic unfoldings [2].
Similarly, time-dependence may be added to the vector field by suspending the time variable:
ẋ = f (x, t) (19a)
ṫ = 1. (19b)
This includes both time-varying vector fields as well as external forcing terms.
8
4 Results
The methods described in Sec. 3 to identify governing equations from data are now demonstrated
on a number of example systems of varying complexity. The first example illustrates the method
on simple systems including a comparison of two-dimensional linear vs. nonlinear damped os-
cillators, as well as a three-dimensional stable linear system. In the second example, the chaotic
Lorenz system is investigated, and the sparse identification algorithm accurately reproduces the
form of the governing equations, and hence the attractor dynamics. The third example demon-
strates the extension of this method to nonlinear partial differential equations (PDEs) by investi-
gating the fluid flow past a circular cylinder at Reynolds number 100. In this example, data from
direct numerical simulation of the Navier-Stokes equations are used to obtain a low-dimensional
proper orthogonal decomposition (POD) subspace. In POD coordinates, the identified system ac-
curately captures limit cycle dynamics as well as transients. Importantly, the identified nonlinear
terms are quadratic, which is consistent with the form of the Navier-Stokes equations; thus the
mean-field model captures the subtle slow-manifold dynamics of the system. In the fourth exam-
ple, normal forms are identified for both the logistic map and the Hopf normal form; in both cases,
the model is correctly parameterized. The examples in Sections 4.2-4.4 approximate derivatives ẋ
from noisy state measurements x using the total-variation regularized derivative [32].
4.1 Example 1: Simple illustrative systems

4.1.1 Example 1a: Two-dimensional damped oscillator (linear vs. nonlinear)
In this example, we consider the two-dimensional damped harmonic oscillator with either linear
dynamics, as in Eq. (20b), or with cubic dynamics, as in Eq. (20b). The dynamic data and the
sparse identified model are shown in Fig. 2. The correct form of the nonlinearity is obtained in
each case; the augmented nonlinear library Θ(x) includes polynomials in x up to fifth order. The
sparse identified model and algorithm parameters are shown in the Appendix in Tables 6 and 7.

d x −0.1 2 x
= (20a)
dt y −2 −0.1 y
3
d x −0.1 2 x
= (20b)
dt y −2 −0.1 y 3
4.1.2 Example 1b: Three-dimensional linear system

A linear system with three variables and the sparse approximation are shown in Fig. 3. In this
case, the dynamics are given by
    
x −0.1 −2 0 x
d  
y =  2 −0.1 0  y  . (21)
dt
z 0 0 −0.3 z
The sparse identification algorithm correctly identifies the system in the space of polynomials up
to second or third order, and the sparse model is given in Table 8. Interestingly, including poly-
nomial terms of higher order (e.g. orders 4 or 5) introduces a degeneracy in the sparse identifica-
tion algorithm, because linear combinations of powers of eλt may approximate other exponential
rates. This unexpected degeneracy motivates a hierarchical approach to identification, where sub-
sequently higher order terms are included until the algorithm either converges or diverges.
9
Linear System Cubic Nonlinearity
Linear System Cubic Nonlinearity
2 2
2 x1 2
x21
model
x2
model
xk xk
xk xk
0 0
0 0
-2 -2
-20 5 10 15 20 25 -20 5 10 15 20 25
0 5 10Time15 20 25 0 5 10Time15 20 25
Time Time
2 2
2 xk 2
model
xk
model
x2 x2
x2 x2
0 0
0 0
-2 -2
-2-2 0 2 -2-2 0 2
-2 0 x1 2 -2 0 x1 2
x1 x1
Figure 2: Comparison of linear system (left) against system with cubic nonlinearity (right). The sparse
Figure 2: system
identified Comparison of linear
correctly system
identifies the (left)
form against system with
of the dynamics andcubic nonlinearity
accurately (right).
reproduces the The sparse
phase por-
identified system correctly identifies the form of the dynamics and accurately reproduces the phase por-
traits.
traits.
2 1
2 1
x3
x3
xk 0.5
xk 0.5
0
0
0
0
-2 2
-2 2
0 0
-2 0x1 2 -2 0 x2
-20 25 50 x1 2 -2 x2
0 25
Time 50
Time
Figure 3:
Figure 3: Three-dimensional
Three-dimensional linear
3: Three-dimensional linear system
system (solid
(solid colored
colored lines)
lines) is
is well-captured
well-captured by
by sparse
sparse identified
identified system
system
Figure
(dashed black line). linear system (solid colored lines) is well-captured by sparse identified system
(dashed black line).
(dashed black line).
10
10
4.2 Example
4.2 Example 2:
2: Lorenz
Lorenz system
system (Nonlinear
(Nonlinear ODE)
ODE)
Here, we consider the nonlinear Lorenz system [37] to explore the identification of chaotic dynam-
Here, we consider the nonlinear Lorenz system [40] to explore the identification of chaotic dynam-
ics evolving on an attractor:
ics evolving on an attractor:
ẋ = σ(y − x) (22a)
ẋ = (y x) (22a)
ẏ = x(ρ − z) − y (22b)
ẏ = x(⇢ z) y (22b)
ż = xy − βz. (22c)
ż = xy z. (22c)
For this example, we use the standard parameters σ = 10, β = 8/3, ρ = 28, with an initial condition
For
this example,
we use the standard parameters = 10, = 8/3, ⇢ = 28, with an initial condition
⇥ x y z⇤TT = ⇥ −8 7 27⇤TT . Data is collected from t = 0 to t = 100 with a time-step of ∆t = 0.001.
x y z = 8 7 27 . Data is collected from t = 0 to t = 100 with a time-step of t = 0.001.
The system is identified in the space of polynomials in (x, y, z) up to fifth order:
The system is identified in the space of polynomials in (x, y, z) up to fifth order:
 
2 3
Θ(X) = x(t) y(t) z(t) x(t)2 x(t)y(t) x(t)z(t) y(t)2 y(t)z(t) z(t)2 · · · z(t)5  . (23)
⇥(X) = 4x(t) y(t) z(t) x(t)2 x(t)y(t) x(t)z(t) y(t)2 y(t)z(t) z(t)2 · · · z(t)5 5 . (23)
To explore the effect of noisy derivatives in a controlled setting, we add zero-mean Gaussian mea-
To explore the effect of noisy derivatives in a controlled setting, we add zero-mean Gaussian mea-
surement noise with variance η to the exact derivatives. The short-time (t = 0 to t = 20) and
surement noise with variance ⌘ to the exact derivatives. The short-time (t = 0 to t = 20) and
long-time (t = 0 to t = 250) system reconstruction is shown in Fig. 4 for two different noise values,
long-time (t = 0 to t = 250) system reconstruction is shown in Fig. 4 for two different noise values,
η = 0.01 and η = 10. The trajectories are also shown in dynamo view in Fig. 5, and the ` error vs.
⌘ = 0.01 and ⌘ = 10. The trajectories are also shown in dynamo view in Fig. 5, and the `22 error vs.
time for increasing noise η is shown in Fig. 6. Although the `2 error increases for large noise val-
time for increasing noise ⌘ is shown in Fig. 6. Although the `2 error increases for large noise val-
ues η, the form of the equations, and hence the attractor dynamics, are accurately captured. The
ues ⌘, the form of the equations, and hence the attractor dynamics, are accurately captured. The
system has a positive Lyapunov exponent, and small differences in model coefficients or initial
system has a positive Lyapunov exponent, and small differences in model coefficients or initial
conditions grow exponentially, until saturation, even though the attractor remains intact.
conditions grow exponentially, until saturation, even though the attractor remains intact.
Full Simulation Identified System, ⌘ = 0.01 Identified System, ⌘ = 10
50 50 50
z z z
25 25 25
0 50 0 50 0 50
-20 0y -20 0y -20 0y
0 20 -50 0 20 -50 0 20 -50
x x x
50 50 50
z z z
25 25 25
0 50 0 50 0 50
-20 0y -20 0y -20 0y
0 0 0
x 20 -50 x 20 -50 x 20 -50
Figure 4: Trajectories of the Lorenz system for short-time integration from t = 0 to t = 20 (top) and long-
Figure 4: Trajectories
time integration from of
t =the0 Lorenz
to t = system for short-time
250 (bottom). The fullintegration
dynamics from
(left) tare to t = 20 with
= 0compared (top) the
andsparse
long-
time integration
identified systemsfrom to t =for
t = 0 right)
(middle, (bottom).
250various The full
additive dynamics
noise, (left)measurements
assuming are comparedofwith the ẋ.
x and sparse
The
identified systems (middle, right) for various additive noise, assuming measurements of x and ẋ. The
trajectories are colored by ∆t, the adaptive Runge-Kutta time step. This color is a proxy for local sensitivity.
trajectories are colored by t, the adaptive Runge-Kutta time step. This color is a proxy for local sensitivity.
11
11
⌘=
⌘=0.01
0.01 ⌘ =⌘ 10
= 10
30 30 30 30
xx x x
0 0 0 0
-30-30 -30-30
0 0 5 5 10 10 15 15 20 20 0 0 5 5 10 10 15 15 20 20
30 30 30 30
y y y y
0 0 0 0
-30-30 -30-30
0 0 5 5 10 10 15 15 20 20 0 0 5 5 10 10 15 15 20 20
Time
Time Time
Time
Figure 5: 5:
Figure
Figure 5:Dynamo
Dynamoview
Dynamo of of
view
view trajectories
of of of
trajectories
trajectories thethe
of Lorenz
the system
Lorenz
Lorenz forfor
system
system thethe
for illustrative
the case
illustrative
illustrative where
case
case x and
where
where ẋ are
x and
x and ẋ are
ẋ are
measured
measured
measuredwith noise.
with
with The
noise.
noise. The
Theexact system
exact
exact is shown
system
system is in black
is shown
shown in ( )((−)
in black
black and thethe
) and
and sparse
the identified
sparse
sparse system
identified
identified is shown
system
system is in in
is shown
shown in
thethe
dashed
the red
dashed arrow
red arrow( ( ).
dashed red arrow (−−). ).
Measuring
Measuringx and
x and
ẋ ẋ Measuring
Measuringx and computing
x and derivatives
computing derivatives
102102 102102
100100 101101
100100
10 102 2
Error
Error
Error
Error
1
10 10 1
10 104 4
2
Increasing ⌘ ⌘
Increasing 10 10 2
10 106 6 3
10 10 3
10 108 8 4 4
10 10
0 0 5 5 10 10 15 15 20 20 0 0 5 5 10 10 15 15 20 20
Time
Time Time
Time
Figure 6: 6:
Figure
Figure 6:Error vs.vs.
Error
Error vs.time forfor
time
time sparse
for identified
sparse
sparse systems
identified
identified generated
systems
systems from
generated
generated data
from
from with
data
data increasing
with
with noise
increasing
increasing mag-
noise
noise mag-
mag-
nitude
nitude
nitude ⌘.When
⌘. η. When
When x and
x and
x ẋ are
and ẋ measured,
ẋ are
are measured,
measured, then
then
thennoise is added
noise
noise is added
is addedto to
thethe
to derivative
the derivative
derivative (left).
(left).This
(left). error
This
This corre-
error
error corre-
corre-
sponds
sponds to the
to difference
the difference between
between solid black
solid and
black anddashed
dashedred curves
red curves in Fig.
in 5.
Fig. Sensor
5.
sponds to the difference between solid black and dashed red curves in Fig. 5. Sensor noise values are Sensor noise
noisevalues are
values are
⌘ 2η⌘ {0.0001,
∈ {0.0001, 0.001, 0.01,
0.001, 0.1,
0.01, 1.0,
0.1, 10.0}.
1.0, When
10.0}. When only
only
x is
x measured,
is measured,noise
noiseis added
is added to the
to state,
the and
state,
2 {0.0001, 0.001, 0.01, 0.1, 1.0, 10.0}. When only x is measured, noise is added to the state, and the deriva- the
and deriva-
the deriva-
tives ẋ are
tives
tives ẋ are
ẋ arecomputed
computed
computed using
using
usingthethetotal
the variation
total
total variation
variationregularized
regularized
regularizedderivative
derivative
derivative [34][34]
(right).
[32] In In
(right).
(right). this
In thiscase,
this thethe
case,
case, largest
the largest
largest
noise
noisemagnitude
magnitude ⌘ =
η 10.0
= is
10.0 omitted,
is omitted,because
becausethe approximation
the approximation
noise magnitude ⌘ = 10.0 is omitted, because the approximation fails. fails.
fails.
12
12 12
Noisy Computed
Measurements Derivatives
z x ẋ
TVdiff
y ẏ
y x
z ż
y
x
SINDy
‘xi_1’ ‘xi_2’ ‘xi_3’
‘x’ [-9.9614] [27.5343] [ 0]
‘y’ [ 9.9796] [-0.8038] [ 0]
‘z’ [ 0] [ 0] [-2.6647]
‘xx’ [ 0] [ 0] [ 0]
‘xy’ [ 0] [ 0] [ 1.0003]
‘xz’ [ 0] [-0.9900] [ 0]
Figure 7: SINDy procedure when only noisy state measurements are available for the Lorenz
system. Gaussian noise with σ = 1 is added to the state, and derivatives are computed using
the total variation derivative [32]. The exact system without noise is shown in red, and the noisy
measurements and approximated derivatives are shown in black.
Next, we explore the SINDy algorithm on the Lorenz equation when only noisy measurements
of the state x are available. Gaussian noise with variance η is added to the state x, and derivatives
ẋ are computed using the total-variation regularized derivative [32]. This procedure is illustrated
for a relatively large noise magnitude η = 1.0 in Fig. 7. The correct terms are identified, and the
attractor is captured by these sparse identified dynamics. A systematic investigation for varying
η is shown in the right panel of Fig. 6. Again, the `2 error rapidly grows because of the chaotic
nature of the Lorenz attractor, so that individual trajectories rapidly diverge. However, even for
large noise magnitudes, the attractor dynamics are captured.
We also explore the ability to capture the attractor dynamics using time-delay coordinates
when incomplete measurements are taken. This extension is presented in Section 4.5.
13
4.3
4.3 Example
Example 3:
3: Fluid
Fluid wake
wake behind
behind aa cylinder
cylinder (Nonlinear
(Nonlinear PDE)
PDE)
Here we demonstrate the generalization of the sparse dynamics method to partial differential
Here we demonstrate the generalization of the sparse dynamics method to partial differential
equations (PDEs). Data is collected for the fluid flow past a cylinder at Reynolds number 100 using
equations (PDEs). Data is collected for the fluid flow past a cylinder at Reynolds number 100 using
direct numerical simulations of the two-dimensional Navier-Stokes equations [44, 45]. Then, the
direct numerical simulations of the two-dimensional Navier-Stokes equations [38, 39]. Then, the
dynamic relationship between low-rank coherent structures is determined.
dynamic relationship between low-rank coherent structures is determined.
The low-Reynolds number flow past a cylinder is a particularly interesting example because
The low-Reynolds number flow past a cylinder is a particularly interesting example because
of its rich history in fluid mechanics and dynamical systems. It has long been theorized that
of its rich history in fluid mechanics and dynamical systems. It has long been theorized that
turbulence may be the result of a sequence of Hopf bifurcations that occur as the Reynolds number
turbulence may be the result of a sequence of Hopf bifurcations that occur as the Reynolds number
of the flow increases [46]. The Reynolds number is a rough measure of the ratio of inertial and
of the flow increases [40]. The Reynolds number is a rough measure of the ratio of inertial and
viscous forces, and an increasing Reynolds number may correspond, for example, to increasing
viscous forces, and an increasing Reynolds number may correspond, for example, to increasing
flow velocity, giving rise to more rich and intricate structures in the fluid.
flow velocity, giving rise to more rich and intricate structures in the fluid.
It took roughly 15 years to find the first Hopf bifurcation in fluid mechanics, in the transition
It took roughly 15 years to find the first Hopf bifurcation in fluid mechanics, in the transition
from a laminar steady wake to laminar periodic vortex shedding behind a cylinder at Reynolds
from a laminar steady wake to laminar periodic vortex shedding behind a cylinder at Reynolds
number 47 [47, 48, 49]. This discovery led to another long-standing debate about how a Hopf
number 47 [41, 42]. This discovery led to another long-standing debate about how a Hopf bi-
bifurcation, with cubic nonlinearity, can be exhibited in a Navier-Stokes fluid with quadratic non-
furcation, with cubic nonlinearity, can be exhibited in a Navier-Stokes fluid with quadratic non-
linearities. After 15 more years, this issue was finally resolved using a separation of time-scales
linearities. After 15 more years, this issue was finally resolved using a separation of time-scales
argument and a mean-field model [50]. It was demonstrated that coupling between oscillatory
argument and a mean-field model [43]. It was demonstrated that coupling between oscillatory
wake modes with the base flow gives rise to a slow manifold (see Fig. 8), and this slow manifold
wake modes with the base flow gives rise to a slow manifold (see Fig. 8), and this slow manifold
produces algebraic terms that approximate cubic nonlinearities on slow timescales.
produces algebraic terms that approximate cubic nonlinearities on slow timescales.
This example provides a compelling test-case for the proposed algorithm, since the under-
This example provides a compelling test-case for the proposed algorithm, since the under-
lying form of the dynamics took nearly three decades to uncover. Indeed, the sparse dynam-
lying form of the dynamics took nearly three decades to uncover. Indeed, the sparse dynam-
ics algorithm identifies the on-attractor and off-attractor dynamics using quadratic nonlinearities
ics algorithm identifies the on-attractor and off-attractor dynamics using quadratic nonlinearities
and reproduces a parabolic slow manifold. It is interesting to note that when the off-attractor
and reproduces a parabolic slow manifold. It is interesting to note that when the off-attractor
trajectories are not included in the system identification, the algorithm incorrectly identifies the
trajectories are not included in the system identification, the algorithm incorrectly identifies the
dynamics using cubic nonlinearities, and fails to correctly identify the dynamics associated with
dynamics using cubic nonlinearities, and fails to correctly identify the dynamics associated with
the shift mode, which connects the mean flow to the unstable steady state.
the shift mode, which connects the mean flow to the unstable steady state.
z A - vortex shedding ux - POD mode 1

Limit cycle A
B
B - mean flow uy - POD mode 2
C - unstable fixed point uz - shift mode

2
Slow 1
manifold 0
C
y -1
x
-2
-1 0 1 2 3 4 5 6 7 8
Figure
Figure 8:
8: Illustration
Illustration of
of the
the low-rank
low-rank dynamics
dynamics underlying
underlying the
the periodic
periodic vortex
vortex shedding
shedding behind
behind aa circular
circular
cylinder
cylinder at
at low
low Reynolds
Reynolds number,
number, Re
Re =
= 100.
100.
14
14
4.3.1 Direct
4.3.1 Direct numerical
numerical simulation
simulation
The direct
The direct numerical
simulation involves
involves aa fast
fast multi-domain
multi-domain immersed
immersed boundary
boundary projection
projection
method [44,
method [38, 45].
39]. Four
Four grids
grids are
are used,
used, each
each with
with aa resolution
resolution ofof 450
450 ⇥ with the
200, with
× 200, the finest
finest grid
grid
having dimensions of 9 4 cylinder diameters and the largest grid having dimensions
having dimensions of 9 ⇥ 4 cylinder diameters and the largest grid having dimensions of 72 ⇥ 32
× of 72 × 32
diameters. The
diameters. The finest
finest grid
grid has
has 90,000
90,000 points,
points, and
and each
each subsequent
subsequent coarser
coarser grid
grid has
has 67,500
67,500 distinct
distinct
points. Thus, if the state includes the vorticity at each grid point, then the state
points. Thus, if the state includes the vorticity at each grid point, then the state dimension is dimension is
292,500. The vorticity field on the finest grid is shown in Fig. 8. The code is non-dimensionalized
292,500. The vorticity field on the finest grid is shown in Fig. 8. The code is non-dimensionalized
so that
so that the
the cylinder
cylinder diameter
diameter and and free-stream
free-stream velocity
velocity are
are both
both equal
equal toto one:
one: DD= and U
= 11 and U1
∞= = 1,1,
respectively. The simulation time-step is non dimensional
respectively. The simulation time-step is t = 0.02 non dimensional time units.
∆t = 0.02 time units.
4.3.2 Mean
4.3.2 Mean field
field model
model
To develop
To develop aa mean-field
mean-field model
model forfor the
the cylinder
cylinder wake,
wake, first
first we
we must
must reduce
reduce the
the dimension
dimension of of
the system. The proper orthogonal decomposition (POD) [36], provides a low-rank
the system. The proper orthogonal decomposition (POD) [36], provides a low-rank basis that is basis that is
optimal in
optimal in the
the L 2 sense; for
L sense;
2 for fluid
fluid velocity
velocity fields,
fields, the
the POD
POD results
results in
in aa hierarchy
hierarchy ofof orthonormal
orthonormal
modes that, when truncated, capture the most energy of the original system
modes that, when truncated, capture the most energy of the original system for the given for the given rank
rank
truncation. The first two most energetic POD modes capture a significant portion
truncation. The first two most energetic POD modes capture a significant portion of the energy; of the energy;
the steady-state
the steady-state vortex
vortex shedding
shedding is is aa limit
limit cycle
cycle in
in these
these coordinates.
coordinates. An An additional
additional mode,
mode, called
called
the shift mode, is included to capture the transient dynamics connecting the unstable
the shift mode, is included to capture the transient dynamics connecting the unstable steady state steady state
with the
with the mean
mean of
of the
the limit
limit cycle
cycle [50].
[43].
Full Simulation Identified System
0 0
z z
-75 -75
200 200
-150 -150
-200 0 -200 0
0 -200 y 0 -200 y
x 200 x 200
Figure 9:
Figure 9: Evolution
Evolution of
of the
the cylinder
cylinder wake
wake trajectory
trajectory in
in reduced
reduced coordinates.
coordinates. The
The full
full simulation
simulation (left)
(left) comes
comes
from direct
from direct numerical
simulation of
of the
the Navier-Stokes
Navier-Stokes equations,
equations, and
and the
the identified
identified system
system (right)
(right) captures
captures
the dynamics on the slow manifold. Color indicates simulation time.
the dynamics on the slow manifold. Color indicates simulation time.
In the
In the three-dimensional
three-dimensional coordinate
coordinate system
system described
described above,
above, the
the mean-field
mean-field model
model for
for the
the
cylinder dynamics
cylinder dynamics are
are given
given by:
by:
ẋẋ =
= µx − !y
µx ωy +
+ Axz
Axz (24a)
(24a)
ẏẏ =
= ωx +
!x + µy
µy +
+ Ayz
Ayz (24b)
(24b)
2 2
(z − xx2 − yy2 ).
= −λ(z
żż = ). (24c)
(24c)
If λ is
If is large,
large, so
so that
that the
the z-dynamics are fast,
z-dynamics are fast, then
then the
the mean
mean flow
flow rapidly
rapidly corrects
corrects to
to be
be on
on the
the (slow)
(slow)
manifold z = x 2 + y22 given by the amplitude of vortex shedding. When substituting this algebraic
manifold z = x + y given by the amplitude of vortex shedding. When substituting this algebraic
2
relationship into
relationship into Eqs.
Eqs. 24a
24a and
and 24b,
24b, we
we recover
recover the
the Hopf
Hopf normal
normal form
form on
on the
the slow
slow manifold.
manifold.
15
15
Remarkably,
Remarkably, similar
similar dynamics
dynamics are
dynamics are discovered
are discovered by
discovered by the
by the sparse
the sparse dynamics
sparse dynamics algorithm,
dynamics algorithm, purely
algorithm, purely from
purely from
from
data collected from simulations. The identified
data collected from simulations. The identified model model coefficients,
modelcoefficients,
coefficients,shown shown
shownin in Table
inTable
Table10,9, only include
only include
9, only include
quadratic
quadratic nonlinearity, consistent with the Navier-Stokes equations. Moreover, the transient be-
nonlinearity, consistent
consistent with
with the
the Navier-Stokes
Navier-Stokes equations.
equations. Moreover,
Moreover, the
the transient
transient be-
be-
havior, shown in Figs. 10 and
and 11,
11, is
is captured
captured qualitatively
qualitatively for
for solutions
solutions
havior, shown in Figs. 10 and 11, is captured qualitatively for solutions that do not start on the that
that do
do not
not start
start on
on the
the
slow
slow manifold.
manifold. When
When the the off-attractor
off-attractor dynamics
off-attractor dynamics in
dynamics in Fig.
in Fig. 10
Fig. 10 are
10 are not
are not included
not included in
included in the
in the training
the training data,
training data,
data,
then
then the model only recovers a simple Hopf normal form in x and y with cubic terms, but does not
the model only recovers
recovers a
a simple
simple Hopf
Hopf normal
normal form
form in
in xx and
and yy with
with cubic
cubic terms,
terms, but
but does
does not
not
correctly identify the slow-manifold
correctly identify the slow-manifold
slow-manifold with with quadratic
with quadratic nonlinearities.
quadratic nonlinearities.
nonlinearities. Note Note that
Note that time
that time derivatives
derivatives of
time derivatives of
of
the POD coefficients are approximated
approximated numerically
numerically using
using a
a fourth
fourth order
order
the POD coefficients are approximated numerically using a fourth order central difference scheme.central
central difference
difference scheme.
scheme.
The
The data
data from
from Fig.
Fig. 11
11 was
was not
was not included
not included in
included in the
in the training
the training data,
training data, and
data, and although
and although qualitatively
although qualitatively similar,
qualitatively similar,
similar,
the identified model
the identified model does does
does notnot exactly
not exactly reproduce
reproduce the
exactly reproduce the transients.
the transients.
transients. Since Since this
Since this initial
this initial condition
condition had
initial condition had
had
twice
twice the fluctuation energy in the x and y directions, the slow manifold approximation may not
the fluctuation energy
energy in
in the
the x
x and
and y
y directions,
directions, the
the slow
slow manifold
manifold approximation
approximation may
may not
not
be valid here. Finally, reducing
reducing the
the sparsifying
sparsifying parameter
parameter ,
λ, it
it is
is possible
possible
be valid here. Finally, reducing the sparsifying parameter , it is possible to obtain models that to
to obtain
obtain models
models that
that
agree
agree almost
almost perfectly
perfectly with
with the
with the data
the data in
data in Figs.
in Figs. 9-11,
Figs. 9-11, although
9-11, although the
although the model
the model will
model will then
will then include
then include higher
include higher
higher
order nonlinearities; this is
is discussed
discussed in
in Sec.
Sec.
order nonlinearities; this is discussed in Sec. 4.3.3. 4.3.3.
4.3.3.
Full
Full Simulation
Simulation Identified
Identified System
System
00 00
zz zz
-75
-75 -75
-75
200
200 200
200
-150
-150 -150
-150
-200 00 -200 00
-200 00 -200 yy -200 00 -200 y
y
xx 200 -200 xx 200 -200
200 200
Figure Evolution of
of the cylinder
cylinder wake trajectory
trajectory starting from
from a flow state
state initialized at
at the mean
mean of
Figure 10:
10: Evolution
Evolution of the
the cylinder wake
wake trajectory starting
starting from aa flow
flow state initialized
initialized at the
the mean of
of
the limit cycle.
cycle. Both the
the full simulation
simulation and sparse
sparse model capture
capture the off-attractor
off-attractor dynamics,
the steady-state
steady-state limit
limit cycle. Both
Both the full
full simulation and
and sparse model
model capture the
the off-attractor dynamics,
dynamics,
characterized by rapid attraction
rapid attraction of
of the
the trajectory
trajectory onto
onto the
the slow
slow manifold.
manifold.
characterized by rapid attraction of the trajectory onto the slow manifold.
Full
Full Simulation
Simulation Identified
Identified System
System
50
50 50
50
zz zz
00 00
200
200 200
200
-50
-50 -50
-50
-200 00 -200 00
-200 00 -200 yy -200 00 -200 yy
xx 200 -200 x 200 -200
200 x 200
Figure 11: This simulation corresponds to an initial condition obtained by doubling the magnitude of the
Figure
Figure
limit 11: This
This simulation
11:behavior.
cycle corresponds
This data
simulation to
to an
an initial
was not included
corresponds in the
initial condition
trainingobtained
condition by
by doubling
of the sparse
obtained model. the
doubling the magnitude
magnitude of
of the
the
limit cycle behavior. This data was not included in the training of the sparse
limit cycle behavior. This data was not included in the training of the sparse model.model.
16
16
4.3.3 Cubic nonlinearities
4.3.3
4.3.3 Cubic Cubicnonlinearities
nonlinearities
It is important to note that although the nonlinear system in Figs. 9 and 10 is identified using
ItIt isis important to
to note
note that
that although the nonlinear system in
in Figs. 99 and 10
10 isis identified using
only important
quadratic nonlinearities, although
they alsothe nonlinear
contain system
constant forcing Figs.
terms, and
which identified
introduce anusing
extra
only quadratic
only quadratic nonlinearities,
nonlinearities, they also contain
they also contain constant
constanttheforcing
forcing terms, which
terms, which introduce
introduce an extra
spurious fixed point in the z direction. If we decrease sparsifying parameter , so an
that extra
we
spurious
spurious fixed
fixed point
point in
in the
the zz direction.
direction. IfIf we
we decrease
decreasethethe sparsifying
theidentified
sparsifying parameter
parameter λ,, so
so that
that we we
obtain a model that also includes cubic nonlinearities, system is more accurate in
obtain
obtain aamodel
model that
that also
also includes
includes cubic
cubic nonlinearities,
nonlinearities, the
the identified
identified system
system isis more
more accurate
accurate in
in
terms of the dynamic response and does not posses this spurious extra fixed point. This is not
terms
terms of the
the dynamic
ofsurprising
dynamic response and and does not
not posses this
this spurious extra fixed
fixed point. This isis not
entirely thatresponse
the quadratic does
response posses
has limitationsspurious
when extra
only using point.POD
three Thismodes, not
entirely
entirely surprising
surprising that
that the
the quadratic
quadratic response
response has
has limitations
limitations when
when only
only using
using three
three POD
POD modes,
modes,
since there is additional energy captured by higher POD pairs.
since
sincethere thereisisadditional
additionalenergy
energycaptured
capturedby byhigher
higherPODPODpairs.
pairs.

0 0
0 0
z z
z z
-75 -75
-75 -75
200 200
-150 200 -150 200
-150
-200 0 -150
-200 0
-200 0 -200 0 yy -200 0 -200 0 yy
0 x 200-200 0 x 200-200
x 200 x 200
Figure 12: Evolution of the cylinder wake trajectory in reduced coordinates. The full simulation (left)
Figure 12: Evolution
comes from of the cylinder
direct numerical wake
simulation trajectory
of the in reduced
Navier-Stokes coordinates.
equations, and theThe full simulation
identified (left)
system (right)
comes from direct numerical simulation of the Navier-Stokes equations, and the identified system (right)
comes from
captures thedirect numerical
dynamics on thesimulation of theColor
slow manifold. Navier-Stokes equations, time.
indicates simulation and the identified system (right)
captures the dynamics on the slow manifold. Color indicates simulation time.

0 0
0 0
z z
z z
-75 -75
-75 -75
200 200
-150 200 -150 200
-150
-200 0 -150
-200 0
-200 0 -200 0 yy -200 0 -200 0 yy
0 x 200-200 0 x 200-200
x 200 x 200
Figure 13: Evolution of the cylinder wake trajectory starting from a flow state initialized at the mean of
Figure
Figure
the 13: Evolution
13: Evolution
steady-state ofthe
of theBoth
limit cycle. cylinder
cylinder wake
wake
the full trajectoryand
trajectory
simulation starting
starting
sparse from
from aa flow
model flow statethe
state
capture initialized
initialized at the
at
off-attractorthe mean of
mean of
dynamics,
thesteady-state
the steady-state
characterized limit
bylimit cycle.
cycle.
rapid Boththe
Both
attraction thethe
of full
full simulation
simulation
trajectory and
and
onto sparse
thesparse modelcapture
model capturethe
slow manifold. the off-attractor
off-attractor dynamics,
dynamics,
characterizedby
characterized byrapid
rapidattraction
attractionof
ofthe
thetrajectory
trajectoryonto
ontothe
theslow
slowmanifold.
manifold.
17
17
17
4.4 Example 4: Bifurcations and Normal
normal forms
Forms
It is
It is also
also possible
possible to
to identify
identify normal
normal forms
forms associated
associated with
with aa bifurcation
bifurcation parameter µ by
parameter µ by suspend-
suspend-
ing it
ing it in
in the
the dynamics
dynamics asas aa variable:
variable:
ẋ = f (x; µ) (25a)
µ̇ = 0. (25b)
It is then possible to identify the right hand side f (x; µ) as a sparse combination of functions of
It
components in x as well as the bifurcation parameter µ. This idea is illustrated on two examples,
components
the one-dimensional logistic map and the two-dimensional Hopf normal form.
the
4.4.1 Logistic map

4.4.1
The logistic map is a classical model that exhibits a cascade of bifurcations, leading to chaotic
The
trajectories. The dynamics with stochastic forcing ⌘ηk and parameter r are given by
trajectories.
xk+1 = rxk (1 − xk ) + η⌘k . (26)
Sampling the stochastic system at ten parameter values of r, the algorithm correctly identifies the
Sampling
underlying parameterized dynamics, shown in Fig. 14 and Table 12.
underlying 11.
Stochastic System Sparse Identified System

1 1
2 2
r r
3 3
4 4
0 0.5 1 0 0.5 1
x x
3.45 3.45
3.63 3.63
r r
3.82 3.82
4 4
0 0.5 1 0 0.5 1
x x
Figure 14:
Figure 14: Attracting
Attracting sets of the logistic map vs. the parameter r. (left) Data from stochastically
sets of the logistic map vs. the parameter r. (left) Data from stochastically
forced system and (right) the sparse identified system. Data is sampled at rows indicated in red for
forced system and (right) the sparse identified system. Data is sampled at rows indicated in red for
r ∈ {2.5, 2.75, 3, 3.25, 3.5, 3.75, 3.8, 3.85, 3.9, 3.95}. The forcing η is Gaussian with magnitude 0.025.
r 2 {2.5, 2.75, 3, 3.25, 3.5, 3.75, 3.8, 3.85, 3.9, 3.95}. The forcing ⌘kk is Gaussian with magnitude 0.025.
18
18
4.4.2
4.4.2 Hopf
4.4.2 Hopfnormal
Hopf normalform
normal form
form
The
Thefinal
The finalexample
final exampleillustrating
example illustratingthe
illustrating theability
the abilityof
ofthe
thesparse
sparsedynamics
dynamicsmethod
methodtotoidentify
identifyparameterized
parameterized
normal
normal forms
forms is
is the
the Hopf
Hopf normal
normal form [53].
[44]. Noisy data is collected from the Hopf
normal forms is the Hopf normal form [53]. Noisy data is collected from the Hopf system system
2 2
ẋẋ =
= µx
µx++!y
!y − Ax(x
ωy Ax(x2 ++yy 2)) (27a)
(27a)
2 2
ẏẏ = !x
!x+
= −ωx +µy
µy − Ay(x
Ay(x2 + +yy 2)) (27b)
(27b)
for
for various
for various values
various values
values ofof the
of the parameter
the parameter
parameter µ. µ. Data
Data isis collected
collected on
on the blue and
the blue andred
redtrajectories
trajectoriesininFig.
Fig.15,
15,
and
and noise
noise is
is added
added to
to simulate
simulate sensor
sensor noise. The total variation derivative [32]
[34]
and noise is added to simulate sensor noise. The total variation derivative [34] is used to de-noiseis used to de-noise
the
the derivative
thederivative
derivativeforfor use
foruse
usein in the
inthe algorithm.
thealgorithm.
algorithm.
The
The sparse model identification
The sparse model identification algorithm
sparse model identification algorithm correctly identifies the
correctly identifies the Hopf
Hopf normal
normal form,
form, with
with
model
model parameters given in Table 12. The noise-free model reconstruction is shown in Fig. 16.
model parameters
parameters given
given in
in Table
Table 13.
12. The noise-free model reconstruction is shown in Fig. 16.
Note
Note that
Note that with
that with noise
with noise
noise inin the
in the training
the training
training data,
data, although
although the the model terms are
model terms are correctly
correctly identified,
identified, the
the
actual
actual values
actual values
values of of the
of the cubic
the cubic terms
terms are
cubic terms are off
off by
by almost
almost 8%. 8%. Collecting
Collecting more
more training
trainingdatadataor
orreducing
reducing
the
the noise
thenoise magnitude
noisemagnitude
magnitudeboth both improve
bothimprove
improvethe themodel
modelagreement.
agreement.
11
yy
00
11
00
-1 xx
-1
-0.2
-0.2 0. 0.2 -1
-1
0. µµ 0.2 0.4 0.6
0.4 0.6
Figure 15:
Figure 15: Training
Training data
data to
to identify
identify Hopf
Hopf normal
normal form.
form. Blue
Blue trajectories
trajectories denote
denote solutions
solutions that
that start
start outside
outside
Figure
of the 15: Training
fixed point for data
µ < tooridentify
0 the Hopf
limit normal
cycle for µ >form.
0, Blue
and red trajectories
trajectoriesdenote
denote solutions
solutions that
thatstart
startoutside
inside
of
of the
the fixed
fixed point
point for
forµµ <
< 00 or
or the
the limit
limitcycle
cycle for
forµµ >
> 0,
0, and
and red
red trajectories
trajectoriesdenote
denote solutions
solutions that
thatstart
startinside
inside
of
of the
the limit
limit cycle.
cycle.
of the limit cycle.
11
yy
00
11
00
-1 xx
-1
-0.2
-0.2 0. 0.2 -1
-1
0. µµ 0.2 0.4 0.6
0.4 0.6
Figure
Figure16:
Figure 16: Sparse
16: Sparsemodel
Sparse modelcaptures
model capturesthe
captures theHopf
the Hopfnormal
normalform.
form. Initial
Initialconditions
conditionsare
arethe
thesame
sameas
asin
inFig.
Fig.15
15
19
19
19
4.5 Sparse identification of the Lorenz system with time-delay coordinates
It is not always clear what measurements of a dynamical system to take, and even if we did know,
these
4.5 measurements may be prohibitively
Sparse Identification of the Lorenz expensive
System with to collect. Here, we
Time-Delay explore the ability to
Coordinates
extract dynamics in the Lorenz system if only the first variable x(t) is measured. It is well-known
thatIt is not alwayscoordinates
time-delay clear what measurements to take of a additional
allow us to synthesize dynamical system, dynamic and even if we
variables did know,
using a time-
these measurements may be prohibitively expensive to collect.
series measurement from a single variable x(t) [11]. The dynamics in these time-delay coordinates Here, we explore the ability to
extractadynamics
produce new attractorin thewith
Lorenz
the system if only the
same topology, first variable
according x is measured.
to Takens’ theorem [45] It is.well-known
In particular,
that time-delay coordinates allow us to synthesize additional dynamic variables using a time-
we construct a Hankel matrix by stacking delayed time-series of x as rows:
series measurement from a single variable x(t). The dynamics in these time-delay coordinates
produce a new attractor with thesame x1 topology,
x2 x3 according
··· to 
xpTakens’ theorem [38] . In particular,
we construct a Hankel matrix bystacking delayed
x4 time-series
··· xp+1of x as rows:
x2 x3 
 x5 · · · xp+2 3
H = 2x3 x4  (28)
 ..x1 x.. 2 .. 3 ·. ·. ·
x x ..p 
.
6x2 x. 3 . . . 
6xq xq+1 xq+2 x4 · · · xp+1 7
6 · · · xp+q−17
H = 6x3 x4 x5 · · · xp+2 7 7 (28)
6 .. .. .. . . .. 7
Taking the singular value decomposition 4. . (SVD),. we . obtain . 5
xq xq+1 xq+2 · · · xp+q 1
H = ΨΣV∗ , (29)
Taking the singular value decomposition (SVD), we obtain
where we may think of columns of V as a hierarchical set of eigen-time-series. For this example,
H = ⌃V⇤ , (29)
we collect measurements from t = 0 to t = 100 with ∆t = 0.001, and we stack q = 10 rows in H; it
is possible
where wetomay stack more
think ofrows in H,
columns ofalthough this is not set
V as a hierarchical relevant for this discussion.
of eigen-time-series. For this example,
we collect measurements from t = 0 to t = 100 with t = 0.001, and we stack q = 10columns
We choose the first three dominant eigen-time-series given by the first three rows in H; ofitV,
and is we denote
possible these more
to stack coordinates
rows in asH,u, v, and wthis
although forisconvenience.
not relevant for The new
this time-delay embedding
discussion.
is shown, for short
We choose thetime
firstup to tdominant
three = 5, in Fig. 17.
eigen-time-series given by the first three columns of V,
and
Usingwe denote these coordinates
these time-delay as u, v,itand
coordinates, w for convenience.
is possible to computeThe thenew time-delay
derivatives u̇, embedding
v̇, and ẇ nu-
is shown,
merically for short
using time up to tcentral
a fourth-order = 5, in difference;
Fig. 17. in cases with noise, we recommend the total-
variation Using these time-delay
regularized coordinates,
derivative. Next, we it is possible
use to compute
our time-delay the derivatives
coordinates u̇, v̇, and ẇ as
and derivatives nu-in-
merically using a fourth-order central difference; in cases with noise,
puts to the SINDy algorithm, and the resulting model coefficients identified up to cubic order are we recommend the total-
variation
shown regularized
in Table 1. Thesederivative.
coefficients Next,
havewe beenuseidentified
our time-delay coordinates and
after normalizing the derivatives
columns ofasΘ(V). in-
puts to the sparse identification of nonlinear dynamics algorithm,
We use a third order polynomial basis since increasing the polynomial order results in over-fitting and the resulting model coef-
forficients identified
this case. Since we updetermine
to cubic order are shown
time-delay in Table 1.using
coordinates These coefficients
the SVD, therehave is a been
smallidentified
amount of
after normalizing the columns of ⇥(V).
information missing from the three coordinates chosen that are captured in lower energy columns
0.01 0.015 0.015
0.01 0.01
0.005
0.005 0.005
0
w
w
v
0 0
-0.005
-0.005 -0.005
-0.01 -0.01 -0.01

-0.01 0 0.01 -0.01 0 0.01 -0.01 0 0.01
u u v
Figure
Figure17:
17:Lorenz
Lorenzattractor
attractorin
in time-delay coordinatesfor
time-delay coordinates forshort
shorttime.
time.
20
20
0.01 0.015 0.015
0.01 0.01
0.005
0.005 0.005
0
w
v
0 0
-0.005
-0.005 -0.005
-0.01 -0.01 -0.01

-0.01 0 0.01 -0.01 0 0.01 -0.01 0 0.01
u u v
Figure 18:18:SINDy
Figure SINDyreconstruction
reconstructionof
of Lorenz attractorin
Lorenz attractor intime-delay
time-delaycoordinates.
coordinates.
of V. Thus, when selecting higher order polynomials, we may encounter overfit models, when in
realityThethere is a small amount
reconstruction using theof missing information
sparse identified in thefrom
dynamics lower energy
Table coordinates.
1 is shown It would
in Fig. 18. For
be short
interesting
times, to
theinvestigate the interplay
identified dynamics between thequite
are qualitatively number andto
similar energy
the trueof time-delay
the time-delay
embed-coor-
dinates, the polynomial order, and the attainable model fidelity.
ding, capturing the skeleton of the attractor. For longer times, the identified dynamics do not have
The
the reconstruction
same measure on the using the sparse
attractor as theidentified
time-delaydynamics
embedding, frombutTable 1 isattract
instead shown in Fig.
onto 18. For
a compli-
cated
short quasi-periodic
times, the identifiedorbitdynamics
on the attractor. We believequite
are qualitatively that this behavior
similar to theintrue
the time-delay
identified system
embed-
is the
ding, result ofthe
capturing small numerical
skeleton of theerrors introduced
attractor. by truncating
For longer times, the the SVD anddynamics
identified the nonlinear terms.
do not have
However, the short-time behavior is extremely promising, indicating that it
the same measure on the attractor as the time-delay embedding, but instead attract onto a compli- is possible to capture
veryquasi-periodic
cated similar dynamics orbiteven without
on the collecting
attractor. the right
We believe measurements
that this behavior up-front. Finally, Fig.
in the identified 19
system
shows the accuracy of the identified dynamics in satisfying the equations V̇
is the result of small numerical errors introduced by truncating the SVD and the nonlinear terms. = ⇥(V)⌅.
However, the short-time behavior is extremely promising, indicating that it is possible to capture
Table 1: SINDy coefficients for Lorenz system in time-delay coordinates.
very similar dynamics even without collecting the right measurements up-front. Finally, Fig. 19
shows the ’’ accuracy’udot’ of the identified dynamics in’wdot’
’vdot’ satisfying the equations V̇ = Θ(V)Ξ, and Fig. 20
’1’ [ 0] [ 0] [ 0]
shows the correlation between measured and approximated derivatives for various λ.
’u’ [ 0] [ -5.1787] [-12.0796]
’v’ [ 5.2307] [ 0] [ -7.2660]
’w’ Table
[ 1: SINDy
0] coefficients
[-10.3782]for Lorenz system in time-delay coordinates.
[ -5.1039]
’uu’ [ 0] [ 0] [ 0]
’’ ’uv’ ’udot’
[ 0] ’vdot’
[ 0] ’wdot’
[ 0]
’1’’uw’ [[ 0]0] [[ 0]
0] [[ 0]
0]
’u’’vv’ [[ 0]0] [[ -5.1787] 0] [-12.0796]
[ 0]
’v’’vw’ [ [ 5.2307]0] [[ 0]
0] [[ -7.2660]
0]
’w’’ww’ [[ 0]0] [-10.3782]
[ 0] [[ -5.1039]
0]
’uu’
’uuu’ [[ 0]0] [[ 0]
0] [[ 17.3748]
0]
’uv’
’uuv’ [[ 0]0] [[ 0]
0] [[ 20.0657]
0]
’uw’
’uuw’ [[ 0]0] [[ 0]
0] [[ 0]
0]
’uvv’
’vv’ [[ 0]0] [[ 0]
0] [[ -4.9360]
0]
’uvw’
’vw’ [[ 0]0] [[ 0]
0] [[ 0]
0]
’uww’
’ww’ [[ 0]0] [[ 0]
0] [[ 0]
0]
’vvv’
’uuu’ [[ 0]0] [[ 0]
0] [[ 17.3748]
0]
’vvw’
’uuv’ [[ 0]0] [[ 0]
0] [[ 20.0657]
0]
’vww’
’uuw’ [ [ 0]0] [[ 0]
0] [[ 0]
0]
’www’
’uvv’ [[ 0]0] [[ 0]
0] [[ -4.9360]
0]
’uvw’ [ 0] [ 0] [ 0]
’uww’ [ 0] [ 0] [ 0]
’vvv’ [ 0] [ 0] [ 0]
’vvw’ [ 0] [ 0] [ 0]
’vww’ [ 0] [ 0] [ 0]
’www’ [ 0] [ 0] 21
[ 0]
21
0.05
u
u 0
-0.05
40 42 44 46 48 50 52 54 56 58 60
0.2
0.1
v 0
v
-0.1
-0.2
40 42 44 46 48 50 52 54 56 58 60
0.4
Θ(V )Ξ
0.2
V̇
w 0
w
-0.2
-0.4
40 42 44 46 48 50 52 54 56 58 60
Time
Figure
Figure 19:
19: Accuracy
Accuracy of
of sparse
sparse dynamics
dynamics coefficients
coefficients in
in capturing
capturing V̇
V̇ =
= Θ(V)Ξ.
⇥(V)⌅.
1 20
5 Discussion
0.8
15
Number of terms
In this work, we have demonstrated a powerful new technique to identify nonlinear dynamical
Correlation
0.6
systems from data without assumptions on the form of 10 the nonlinearity. This builds on prior work
in symbolic regression but with innovations related to sparse representation, which allow our al-
0.4
gorithms to 0.2
scale to high-dimensional complex U systems.
5 The new method is demonstrated on a
V
number of example systems exhibiting chaos, W big data with low-rank coherence, and parameter-
ized dynamics.0
-3
The identification
-2 -1
of sparse
0
nonlinearities
1
0 and parameterizations mark a significant
-3 -2 -1 0 1
10 10 10 10 10 10 10 10 10 10
step toward the long-heldLambda, goal ofλ intelligent, unassisted identification of dynamical
Lambda, λ systems.
There
Figure 20: are numerous
(left) fields
Correlation ofwhere
computed this method
derivatives mayΘ(V)Ξ
be applied,
withwhere
measured therederivatives
is ample data and
V̇ and
the absence of governing equations. These applications include neuroscience,
(right) number of terms in the differential equations as a function of the sparsifying parameter λ. climate science, epi-
demiology, and financial markets. As shown in the Lorenz example, the ability to predict a specific
trajectory, may be less important than the ability to capture the attractor dynamics. The method
5also Discussion
generalizes to partial differential equations, as demonstrated on an example from fluid me-
chanics. Finally, normal forms may be discovered by including parameters in the optimization, as
In this work,
shown on twowe have demonstrated a powerful new technique to identify nonlinear dynamical
examples.
systems from
In each ofdata without assumptions
the examples shown, we on theinvestigated
have form of the nonlinearity.
the robustness Thisof builds on prior
the sparse work
dynamics
in symbolic regression but with innovations related to sparse representation,
algorithm to measurement noise and the unavailability of derivative measurements. In each case, which allow our al-
gorithms to scale to high-dimensional complex systems. The new method
the sparse regression framework appears well-suited to measurement and process noise, espe- is demonstrated on a
number of example
cially when derivatives systems exhibitingusing
are smoothed chaos,the bigtotal-variation
data with low-rank coherence,
regularized and parameter-
derivative. However,
ized dynamics. The identification of sparse nonlinearities and parameterizations
we do find that larger noise magnitude increases the data required for accurate model mark a significant
identifica-
step
tion. toward the long-held goal of intelligent, unassisted identification of dynamical systems.
There are numerous fields where this method may be applied, where there is ample data and
22
22
the absence of governing equations. These applications include neuroscience, climate science, epi-
demiology, and financial markets. As shown in the Lorenz example, the ability to predict a specific
trajectory, may be less important than the ability to capture the attractor dynamics. The method
also generalizes to partial differential equations, as demonstrated on an example from fluid me-
chanics. Finally, normal forms may be discovered by including parameters in the optimization, as
shown on two examples.
In each of the examples shown, we have investigated the robustness of the sparse dynamics
algorithm to measurement noise and the unavailability of derivative measurements. In each case,
the sparse regression framework appears well-suited to measurement and process noise, espe-
cially when derivatives are smoothed using the total-variation regularized derivative. We also
find that larger noise magnitude increases the data required for accurate model identification.
There are significant implications of this method for fields that are already using symbolic re-
gression or genetic programming. The inclusion of genetic programming and symbolic regression
in a convex framework may allow these methods to generalize to much larger systems.
A number of open problems remain surrounding the dynamical systems aspects of this pro-
cedure. For example, many systems possess dynamical symmetries and conserved quantities that
may alter the form of the identified dynamics. For example, the degenerate identification of a
linear system in a space of high-order polynomial nonlinearities suggest a connection with near-
identity transformations and dynamic similarity. We believe that this may be a fruitful line of re-
search. A significant outstanding issue in the above approach is the correct choice of measurement
coordinates and the choice of sparsifying function basis for the dynamics. There is no simple so-
lution to this challenge, and there must be a coordinated effort to incorporate expert knowledge,
feature extraction, and inference based methods to tackle this in general. However, in practice,
there may be some hope of obtaining the correct coordinate system and function basis without
knowing the solution ahead of time, since we often know something about the physics that guide
the choice of function space. In the case that we have few measurements, these may be augmented
using time delay coordinates, and when we have too many measurements, we may extract coher-
ent structures using advanced methods from dimensionality reduction and machine learning. It
may also be possible to make the dynamics more sparse through subsequent coordinate trans-
formations [2]. We hope that this connection between sparsity methods, machine learning, and
dynamical systems will spur developments to automate and improve these choices.
Appendix A: Choice of basis functions

In real-world systems, the correct choice of basis functions to sparsely represent the dynamics
might not be clear, although physical intuition may be leveraged in many systems. Here, we
explore a simple ODE where the right hand side dynamics are given by a trigonometric function:
d
x = − sin(x) (30a)
dt
1 1
= −x + x3 − x5 + O(x7 ). (30b)
3! 5!
On this test problem, we investigate the SINDy algorithm with different bases, including polyno-
mial basis functions, trigonometric functions, and a combination of polynomial and trigonometric
functions. In the case of a polynomial basis, shown in Table 2, the SINDy algorithm identifies the
correct terms in the Taylor expansion of f (x) = − sin(x). The dynamic reconstruction, shown in
Fig. 21 is excellent. In both of the other cases of a purely trigonometric basis or a basis consisting
of polynomials and trigonometric functions, the correct term f (x) = − sin(x) is identified.
23
Table 2: Polynomial basis.
’’ ’xdot’
’1’ [ 0]
’x’ [ -1.0000]
’xx’ [ 0]
’xxx’ [ 0.1664]
’xxxx’ [ 0]
’xxxxx’ [ -0.0079]
Table 3: Trigonometric basis.

’’ ’xdot’
’sin(x)’ [ -1.0000]
’cos(x)’ [ 0]
’sin(2x)’ [ 0]
’cos(2x)’ [ 0]
Table 4: Polynomial and trigonometric basis.

’’ ’xdot’
’1’ [ 0]
’x’ [ 0]
’xx’ [ 0]
’xxx’ [ 0]
’xxxx’ [ 0]
’xxxxx’ [ 0]
’sin(x)’ [ -1.0000]
’cos(x)’ [ 0]
’sin(2x)’ [ 0]
’cos(2x)’ [ 0]
x 0
-1
-2
0 5 10 15 20 25 30 35 40 45 50
ẋ 0
-1
-2
0 5 10 15 20 25 30 35 40 45 50
Time
Figure 21: Data (black) and sparse dynamics reconstruction (red) for a sequence of initial condi-
Figure 20: Data (black)
tions initialized every 5and
timesparse
units. dynamics reconstruction
Initial conditions (red)from
are chosen for a−1.25
sequence of in
to 1.25 initial condi-
increments
tions initialized every 5 time
of 0.25 (excluding x0 = 0). units. Initial conditions are chosen from 1.25 to 1.25 in increments
of 0.25 (excluding x0 = 0).
Appendix B: Identified Coefficients24of Dynamics
Table 5: Damped harmonic oscillator with linear terms.

Appendix B: Limitations of the sparse identification framework
The sparse identification algorithm above relies on having measurements in a sensible coordinate
system where the dynamics are sparse in the chosen function basis. Here, we explore the limi-
tations of this modeling framework when the coordinates or function basis are not amenable to
sparse representation of the dynamics.
B-1: Lorenz system in nonlinear coordinates

The limitations of the method are clear when we transform the simple Lorenz example from the
natural coordinates (x, y, z) nonlinearly into the new coordinates (A, B, C) according to the map:
A = x sin(x) (31a)
B = y cos(y) (31b)
C = z sin(z). (31c)
In these new coordinates the Lorenz system has complicated nonlinear behavior that is not well
approximated by a dynamical system with polynomial nonlinearities. The system response in
(A, B, C) coordinates is shown in Fig. 22.
The sparse identification algorithm fails to identify a model that agrees with the measured
derivatives for any value of the sparsity promoting parameter λ, as shown in the correlation plot
in Fig. 23. For this problem, various search spaces were explored, including polynomial nonlin-
earities up to fifth order as well as trigonometric functions.
30 50 50
20
10
0 0 0
C
C
B
-10
-20
-30 -50 -50

-20 -10 0 10 20 -20 -10 0 10 20 -20 0 20
A A B
Figure 22: Lorenz attractor in nonlinear coordinates (A, B, C) from Eq. (31).
1 60
A
B 50
0.8
C
Number of terms
40
Correlation
0.6
30
0.4
20
0.2
10
0 0
-6 -4 -2 0 -6 -4 -2 0
10 10 10 10 10 10 10 10
Lambda, λ Lambda, λ
Figure 23: Correlation of sparse prediction of derivatives Θ(X)Ξ and measured derivatives Ẋ of
Lorenz system in nonlinear coordinates (A, B, C) from Eq. (31).
25
As in Sec. 4.5, the use of generalized eigen-time-delay coordinates present a promising tech-
nique to find a natural coordinate system. Figure 24 shows the first three time-delay coordinates
obtained from the eigen-decomposition of the Hankel matrix in Eq. (28); for this example, the
number of rows is q = 100 and the Lorenz system is simulated from t = 0 to t = 100 with
∆t = 0.001. Using these time-delay coordinates results in much better correlation between mea-
sured and modeled derivatives, even for relatively sparse models, as shown in Fig. 25. Increasing
the number of time-delay coordinates to the leading four results in improved correlation, shown
in Fig. 26. However, the resulting dynamical systems for each of these cases does not accurately
reproduce the attractor dynamics, motivating further research to identify natural coordinate sys-
tems to measure in and natural function bases to represent dynamics sparsely.
0.01 0.015 0.015
0.01 0.01
0.005
0.005 0.005
0 0 0
w
w
v
-0.005 -0.005
-0.005
-0.01 -0.01
-0.01 -0.015 -0.015

-0.02 0 0.02 -0.02 0 0.02 -0.01 0 0.01
u u v
Figure 24: Lorenz attractor in time-delay coordinates obtained from measurements (A, B, C) from Eq. (31).
1 20
u
0.8 v
w 15
Number of terms
Correlation
0.6
10
0.4
5
0.2
0 0
-3 -2 -1 0 1 -3 -2 -1 0 1
10 10 10 10 10 10 10 10 10 10
Figure 25: Correlation of sparse model prediction Θ(X)Ξ and measured derivatives Ẋ of Lorenz system
using the first three time-delay coordinates obtained from measurements (A, B, C) from Eq. (31).
1 70
u
v
60
0.8
w
Number of terms
x 50
Correlation
0.6 40
0.4 30
20
0.2
10
0 0
10 -3 10 -2 10 -1 10 0 10 1 10 -3 10 -2 10 -1 10 0 10 1
Figure 26: Correlation of sparse model prediction Θ(X)Ξ and measured derivatives Ẋ of Lorenz system
using the first four time-delay coordinates obtained from measurements (A, B, C) from Eq. (31).
26
B-2: Glycolytic oscillator model
The glycolytic oscillator model is a standard benchmark problem for model prediction, system
identification and automatic inference [6, 8, 7]. We simulate the system presented in Daniels and
Nemenman [8] (Eq. (19) in [8]) :
dS1 k1 S1 S6
= J0 − , (32a)
dt 1 + (S6 /K1 )q
dS2 k1 S1 S6
= 2 − k2 S2 (N − S5 ) − k6 S2 S5 , (32b)
dt 1 + (S6 /K1 )q
dS3
= k2 S2 (N − S5 ) − k3 S3 (A − S6 ), (32c)
dt
dS4
= k3 S3 (A − S6 ) − k4 S4 S5 − κ(S4 − S7 ), (32d)
dt
dS5
= k2 S2 (N − S5 ) − k4 S4 S5 − k6 S2 S5 , (32e)
dt
dS6 k1 S1 S6
= −2 + 2k3 S3 (A − S6 ) − k5 S6 , (32f)
dt 1 + (S6 /K1 )q
dS7
= ψκ(S4 − S7 ) − kS7 . (32g)
dt
Daniels and Nemenman [8] provide the various parameters (Table 1 in [8]) and initial con-
ditions (Table 2 in [8]) to match yeast glycolysis. Data from a simulation of Eq. (32) using these
parameters and initial conditions and a time step of ∆t = 0.001 minutes is shown in Fig. 27.
3.5
S1
3
S2
S3
2.5
Concentration (mM)
S4
2 S5
S6
1.5 S7
0.5
0
0 1 2 3 4 5 6 7 8 9 10
Time (minutes)
Figure 27: Glycolytic oscillator network dynamics for random initial conditions chosen from the
ranges provided in Table 2 of Daniels and Nemenman [8].
27
The results of the sparse identification of nonlinear dynamics algorithm are shown in Tab. 5.
The algorithm accurately identifies the dynamics for S3 , S4 , S5 , and S7 , since each of these vari-
ables have dynamics that are sparse in the polynomial search basis. However, the algorithm does
not identify sparse dynamics for the S1 , S2 , and S6 terms, which each have a rational function
in their dynamics. Although the identified model in Tab. 5 produces derivatives that accurately
match the measured derivatives, as seen in the correlation plot in Fig. 28, the dynamic model does
not agree with the true system, except for a very short time at the beginning of the simulation.
The fact that the algorithm produces accurate sparse dynamics in some of the variables (S3 , S4 , S5 ,
and S7 ) is a good indication that the measurement coordinates are correct. The fact that the dynam-
ics are not sparse in the remaining equations indicates that the function basis is not appropriate for
sparse representation of the dynamics for the remaining equations (S1 , S2 , and S6 ). Investigating
how to generalize the SINDy algorithm to include a broader function search space is an important
area of current and future work.
Table 5: Identified dynamics for metabolic network.
’’ ’S1dot’ ’S2dot’ ’S3dot’ ’S4dot’ ’S5dot’ ’S6dot’ ’S7dot’

’1’ [ -780.181] [ 1.565e+03] [ 0] [ 0] [ 0] [-1.565e+03] [ 0]
’S1’ [ -82.317] [ 164.633] [ 0] [ 0] [ 0] [ -164.633] [ 0]
’S2’ [ -70.328] [ 134.657] [ 6.00] [ 0] [ 6.00] [ -140.657] [ 0]
’S3’ [-1.578e+04] [ 3.156e+04] [-64.00] [ 64.00] [ 0] [-3.143e+04] [ 0]
’S4’ [-1.044e+03] [ 2.087e+03] [ 0] [ -13.00] [ 0] [-2.087e+03] [ 1.3000]
’S5’ [ 1.309e+04] [-2.618e+04] [ 0] [ 0] [ 0] [ 2.618e+04] [ 0]
’S6’ [ -439.231] [ 878.461] [ 0] [ 0] [ 0] [ -879.741] [ 0]
’S7’ [ 3.982e+04] [-7.964e+04] [ 0] [ 13.00] [ 0] [ 7.964e+04] [-3.1000]
’S1S1’ [ 18.308] [ -36.615] [ 0] [ 0] [ 0] [ 36.615] [ 0]
’S1S2’ [ 253.763] [ -507.527] [ 0] [ 0] [ 0] [ 507.526] [ 0]
’S1S3’ [ 3.706e+03] [-7.412e+03] [ 0] [ 0] [ 0] [ 7.412e+03] [ 0]
’S1S4’ [ -607.006] [ 1.214e+03] [ 0] [ 0] [ 0] [-1.214e+03] [ 0]
’S1S5’ [-1.752e+03] [ 3.505e+03] [ 0] [ 0] [ 0] [-3.505e+03] [ 0]
’S1S6’ [ 311.284] [ -622.568] [ 0] [ 0] [ 0] [ 622.568] [ 0]
’S1S7’ [-9.857e+03] [ 1.972e+04] [ 0] [ 0] [ 0] [-1.972e+04] [ 0]
’S2S2’ [ 231.996] [ -463.993] [ 0] [ 0] [ 0] [ 463.993] [ 0]
’S2S3’ [ 1.107e+04] [-2.213e+04] [ 0] [ 0] [ 0] [ 2.213e+04] [ 0]
’S2S4’ [ -242.407] [ 484.813] [ 0] [ 0] [ 0] [ -484.813] [ 0]
’S2S5’ [-8.786e+03] [ 1.757e+04] [ -6.00] [ 0] [ -18.00] [-1.757e+04] [ 0]
’S2S6’ [ 434.818] [ -869.636] [ 0] [ 0] [ 0] [ 869.636] [ 0]
’S2S7’ [-2.041e+04] [ 4.082e+04] [ 0] [ 0] [ 0] [-4.082e+04] [ 0]
’S3S3’ [-1.086e+03] [ 2.171e+03] [ 0] [ 0] [ 0] [-2.171e+03] [ 0]
’S3S4’ [-3.535e+04] [ 7.071e+04] [ 0] [ 0] [ 0] [-7.071e+04] [ 0]
’S3S5’ [ 2.056e+04] [-4.112e+04] [ 0] [ 0] [ 0] [ 4.112e+04] [ 0]
’S3S6’ [ 5.193e+03] [-1.039e+04] [ 16.00] [ -16.00] [ 0] [ 1.035e+04] [ 0]
’S3S7’ [ 4.116e+03] [-8.232e+03] [ 0] [ 0] [ 0] [ 8.232e+03] [ 0]
’S4S4’ [ 7.587e+03] [-1.517e+04] [ 0] [ 0] [ 0] [ 1.517e+04] [ 0]
’S4S5’ [ 1.342e+04] [-2.684e+04] [ 0] [-100.00] [-100.00] [ 2.684e+04] [ 0]
’S4S6’ [ 1.267e+03] [-2.533e+03] [ 0] [ 0] [ 0] [ 2.533e+03] [ 0]
’S4S7’ [-1.320e+04] [ 2.640e+04] [ 0] [ 0] [ 0] [-2.640e+04] [ 0]
’S5S5’ [-7.955e+03] [ 1.591e+04] [ 0] [ 0] [ 0] [-1.591e+04] [ 0]
’S5S6’ [-5.507e+03] [ 1.101e+04] [ 0] [ 0] [ 0] [-1.101e+04] [ 0]
’S5S7’ [ 2.174e+04] [-4.348e+04] [ 0] [ 0] [ 0] [ 4.348e+04] [ 0]
’S6S6’ [ 214.779] [ -429.558] [ 0] [ 0] [ 0] [ 429.558] [ 0]
’S6S7’ [-1.561e+04] [ 3.122e+04] [ 0] [ 0] [ 0] [-3.122e+04] [ 0]
’S7S7’ [ 8.259e+04] [-1.652e+05] [ 0] [ 0] [ 0] [ 1.652e+05] [ 0]
28
1 40
S1
0.8 S3 S1 S2
30
Number of terms
S5 S4 S2 S3
Correlation
0.6
S7 S6 S4 S2
20
0.4 S5 S6
S6 10
0.2
S7
0 0
-2 0 2 4 -2 0 2 4
10 10 10 10 10 10 10 10
Figure 28: Correlation of sparse model prediction Θ(X)Ξ and measured derivatives Ẋ for gly-
colytic oscillator model.
Appendix C: Identified coefficients of dynamics
Table 6: Damped harmonic oscillator with linear terms.
’’ ’xdot’ ’ydot’
’1’ [ 0] [ 0]
’x’ [-0.1015] [-1.9990]
’y’ [ 2.0027] [-0.0994]
’xx’ [ 0] [ 0]
’xy’ [ 0] [ 0]
’yy’ [ 0] [ 0]
’xxx’ [ 0] [ 0]
’xxy’ [ 0] [ 0]
’xyy’ [ 0] [ 0]
’yyy’ [ 0] [ 0]
’xxxx’ [ 0] [ 0]
’xxxy’ [ 0] [ 0]
’xxyy’ [ 0] [ 0]
’xyyy’ [ 0] [ 0]
’yyyy’ [ 0] [ 0]
’xxxxx’ [ 0] [ 0]
’xxxxy’ [ 0] [ 0]
’xxxyy’ [ 0] [ 0]
’xxyyy’ [ 0] [ 0]
’xyyyy’ [ 0] [ 0]
’yyyyy’ [ 0] [ 0]
29
Table 7: Damped harmonic oscillator with cubic nonlinearity.
’’ ’xdot’ ’ydot’
’1’ [ 0] [ 0]
’x’ [ 0] [ 0]
’y’ [ 0] [ 0]
’xx’ [ 0] [ 0]
’xy’ [ 0] [ 0]
’yy’ [ 0] [ 0]
’xxx’ [-0.0996] [-1.9994]
’xxy’ [ 0] [ 0]
’xyy’ [ 0] [ 0]
’yyy’ [ 1.9970] [-0.0979]
’xxxx’ [ 0] [ 0]
’xxxy’ [ 0] [ 0]
’xxyy’ [ 0] [ 0]
’xyyy’ [ 0] [ 0]
’yyyy’ [ 0] [ 0]
’xxxxx’ [ 0] [ 0]
’xxxxy’ [ 0] [ 0]
’xxxyy’ [ 0] [ 0]
’xxyyy’ [ 0] [ 0]
’xyyyy’ [ 0] [ 0]
’yyyyy’ [ 0] [ 0]
Table 8: Three-dimensional linear system.
’’ ’xdot’ ’ydot’ ’zdot’

’1’ [ 0] [ 0] [ 0]
’x’ [-0.0996] [-1.9997] [ 0]
’y’ [ 2.0005] [-0.0994] [ 0]
’z’ [ 0] [ 0] [-0.3003]
’xx’ [ 0] [ 0] [ 0]
’xy’ [ 0] [ 0] [ 0]
’xz’ [ 0] [ 0] [ 0]
’yy’ [ 0] [ 0] [ 0]
’yz’ [ 0] [ 0] [ 0]
’zz’ [ 0] [ 0] [ 0]
30
Table 9: Lorenz system identified using SINDy, assuming measurements of x and ẋ, with η = 1.0.
’1’ [ 0] [ 0] [ 0]
’x’ [-9.9996] [27.9980] [ 0]
’y’ [ 9.9998] [-0.9997] [ 0]
’z’ [ 0] [ 0] [-2.6665]
’xx’ [ 0] [ 0] [ 0]
’xy’ [ 0] [ 0] [ 1.0000]
’xz’ [ 0] [-0.9999] [ 0]
’yy’ [ 0] [ 0] [ 0]
’yz’ [ 0] [ 0] [ 0]
’zz’ [ 0] [ 0] [ 0]
’xxx’ [ 0] [ 0] [ 0]
’xxy’ [ 0] [ 0] [ 0]
’xxz’ [ 0] [ 0] [ 0]
’xyy’ [ 0] [ 0] [ 0]
’xyz’ [ 0] [ 0] [ 0]
’xzz’ [ 0] [ 0] [ 0]
’yyy’ [ 0] [ 0] [ 0]
’yyz’ [ 0] [ 0] [ 0]
’yzz’ [ 0] [ 0] [ 0]
’zzz’ [ 0] [ 0] [ 0]
’xxxx’ [ 0] [ 0] [ 0]
’xxxy’ [ 0] [ 0] [ 0]
’xxxz’ [ 0] [ 0] [ 0]
’xxyy’ [ 0] [ 0] [ 0]
’xxyz’ [ 0] [ 0] [ 0]
’xxzz’ [ 0] [ 0] [ 0]
’xyyy’ [ 0] [ 0] [ 0]
’xyyz’ [ 0] [ 0] [ 0]
’xyzz’ [ 0] [ 0] [ 0]
’xzzz’ [ 0] [ 0] [ 0]
’yyyy’ [ 0] [ 0] [ 0]
’yyyz’ [ 0] [ 0] [ 0]
’yyzz’ [ 0] [ 0] [ 0]
’yzzz’ [ 0] [ 0] [ 0]
’zzzz’ [ 0] [ 0] [ 0]
’xxxxx’ [ 0] [ 0] [ 0]
’xxxxy’ [ 0] [ 0] [ 0]
’xxxxz’ [ 0] [ 0] [ 0]
’xxxyy’ [ 0] [ 0] [ 0]
’xxxyz’ [ 0] [ 0] [ 0]
’xxxzz’ [ 0] [ 0] [ 0]
’xxyyy’ [ 0] [ 0] [ 0]
’xxyyz’ [ 0] [ 0] [ 0]
’xxyzz’ [ 0] [ 0] [ 0]
’xxzzz’ [ 0] [ 0] [ 0]
’xyyyy’ [ 0] [ 0] [ 0]
’xyyyz’ [ 0] [ 0] [ 0]
’xyyzz’ [ 0] [ 0] [ 0]
’xyzzz’ [ 0] [ 0] [ 0]
’xzzzz’ [ 0] [ 0] [ 0]
’yyyyy’ [ 0] [ 0] [ 0]
’yyyyz’ [ 0] [ 0] [ 0]
’yyyzz’ [ 0] [ 0] [ 0]
’yyzzz’ [ 0] [ 0] [ 0]
’yzzzz’ [ 0] [ 0] [ 0]
’zzzzz’ [ 0] [ 0] [ 0]
31
Table 10: Identified dynamics of cylinder wake modes. Notice that quadratic terms are identified.
’1’ [ -0.1225] [ -0.0569] [ -20.8461]
’x’ [ -0.0092] [ 1.0347] [-4.6476e-04]
’y’ [ -1.0224] [ 0.0047] [ 2.4057e-04]
’z’ [-9.2203e-04] [-4.4932e-04] [ -0.2968]
’xx’ [ 0] [ 0] [ 0.0011]
’xy’ [ 0] [ 0] [ 0]
’xz’ [ 2.1261e-04] [ 0.0022] [ 0]
’yy’ [ 0] [ 0] [ 8.6432e-04]
’yz’ [ -0.0019] [ -0.0018] [ 0]
’zz’ [ 0] [ 0] [ -0.0010]
’xxx’ [ 0] [ 0] [ 0]
’xxy’ [ 0] [ 0] [ 0]
’xxz’ [ 0] [ 0] [ 0]
’xyy’ [ 0] [ 0] [ 0]
’xyz’ [ 0] [ 0] [ 0]
’xzz’ [ 0] [ 0] [ 0]
’yyy’ [ 0] [ 0] [ 0]
’yyz’ [ 0] [ 0] [ 0]
’yzz’ [ 0] [ 0] [ 0]
’zzz’ [ 0] [ 0] [ 0]
’xxxx’ [ 0] [ 0] [ 0]
’xxxy’ [ 0] [ 0] [ 0]
’xxxz’ [ 0] [ 0] [ 0]
’xxyy’ [ 0] [ 0] [ 0]
’xxyz’ [ 0] [ 0] [ 0]
’xxzz’ [ 0] [ 0] [ 0]
’xyyy’ [ 0] [ 0] [ 0]
’xyyz’ [ 0] [ 0] [ 0]
’xyzz’ [ 0] [ 0] [ 0]
’xzzz’ [ 0] [ 0] [ 0]
’yyyy’ [ 0] [ 0] [ 0]
’yyyz’ [ 0] [ 0] [ 0]
’yyzz’ [ 0] [ 0] [ 0]
’yzzz’ [ 0] [ 0] [ 0]
’zzzz’ [ 0] [ 0] [ 0]
’xxxxx’ [ 0] [ 0] [ 0]
’xxxxy’ [ 0] [ 0] [ 0]
’xxxxz’ [ 0] [ 0] [ 0]
’xxxyy’ [ 0] [ 0] [ 0]
’xxxyz’ [ 0] [ 0] [ 0]
’xxxzz’ [ 0] [ 0] [ 0]
’xxyyy’ [ 0] [ 0] [ 0]
’xxyyz’ [ 0] [ 0] [ 0]
’xxyzz’ [ 0] [ 0] [ 0]
’xxzzz’ [ 0] [ 0] [ 0]
’xyyyy’ [ 0] [ 0] [ 0]
’xyyyz’ [ 0] [ 0] [ 0]
’xyyzz’ [ 0] [ 0] [ 0]
’xyzzz’ [ 0] [ 0] [ 0]
’xzzzz’ [ 0] [ 0] [ 0]
’yyyyy’ [ 0] [ 0] [ 0]
’yyyyz’ [ 0] [ 0] [ 0]
’yyyzz’ [ 0] [ 0] [ 0]
’yyzzz’ [ 0] [ 0] [ 0]
’yzzzz’ [ 0] [ 0] [ 0]
’zzzzz’ [ 0] [ 0] [ 0]
32
Table 11: Identified dynamics of cylinder wake modes with smaller λ, resulting in cubic nonlinearities.
’1’ [ 0] [ 0] [ 0]
’x’ [ 0] [ 0] [ 0]
’y’ [ -1.0420] [ 0.0062] [ 2.5451e-04]
’z’ [ 1.9812e-05] [-3.5585e-05] [ 0.4750]
’xx’ [ 0] [ 0] [ 6.0153e-05]
’xy’ [ 0] [ 0] [-1.9444e-04]
’xz’ [ 0.0014] [ -0.0074] [ 0]
’yy’ [ 0] [ 0] [-5.7268e-05]
’yz’ [ -0.0037] [ -0.0037] [ 0]
’zz’ [ 0] [ 0] [ 0.0053]
’xxx’ [ 0] [ 4.5311e-05] [ 0]
’xxy’ [ 0] [ 0] [ 0]
’xxz’ [ 0] [ 0] [-3.0965e-05]
’xyy’ [ 0] [ 4.9559e-05] [ 0]
’xyz’ [ 0] [ 0] [-2.3562e-05]
’xzz’ [ 1.0918e-05] [-2.1442e-05] [ 0]
’yyy’ [ 0] [ 0] [ 0]
’yyz’ [ 0] [ 0] [-2.4035e-05]
’yzz’ [-1.5787e-05] [-1.6271e-05] [ 0]
’zzz’ [ 0] [ 0] [ 1.4677e-05]
’xxxx’ [ 0] [ 0] [ 0]
’xxxy’ [ 0] [ 0] [ 0]
’xxxz’ [ 0] [ 0] [ 0]
’xxyy’ [ 0] [ 0] [ 0]
’xxyz’ [ 0] [ 0] [ 0]
’xxzz’ [ 0] [ 0] [ 0]
’xyyy’ [ 0] [ 0] [ 0]
’xyyz’ [ 0] [ 0] [ 0]
’xyzz’ [ 0] [ 0] [ 0]
’xzzz’ [ 0] [ 0] [ 0]
’yyyy’ [ 0] [ 0] [ 0]
’yyyz’ [ 0] [ 0] [ 0]
’yyzz’ [ 0] [ 0] [ 0]
’yzzz’ [ 0] [ 0] [ 0]
’zzzz’ [ 0] [ 0] [ 0]
’xxxxx’ [ 0] [ 0] [ 0]
’xxxxy’ [ 0] [ 0] [ 0]
’xxxxz’ [ 0] [ 0] [ 0]
’xxxyy’ [ 0] [ 0] [ 0]
’xxxyz’ [ 0] [ 0] [ 0]
’xxxzz’ [ 0] [ 0] [ 0]
’xxyyy’ [ 0] [ 0] [ 0]
’xxyyz’ [ 0] [ 0] [ 0]
’xxyzz’ [ 0] [ 0] [ 0]
’xxzzz’ [ 0] [ 0] [ 0]
’xyyyy’ [ 0] [ 0] [ 0]
’xyyyz’ [ 0] [ 0] [ 0]
’xyyzz’ [ 0] [ 0] [ 0]
’xyzzz’ [ 0] [ 0] [ 0]
’xzzzz’ [ 0] [ 0] [ 0]
’yyyyy’ [ 0] [ 0] [ 0]
’yyyyz’ [ 0] [ 0] [ 0]
’yyyzz’ [ 0] [ 0] [ 0]
’yyzzz’ [ 0] [ 0] [ 0]
’yzzzz’ [ 0] [ 0] [ 0]
’zzzzz’ [ 0] [ 0] [ 0]
33
Table 12: Logistic map identified using SINDy.
’’ ’x_{k+1}’ ’r_{k+1}’
’1’ [ 0] [ 0]
’x’ [ 0] [ 0]
’r’ [ 0] [1.0000]
’xx’ [ 0] [ 0]
’xr’ [ 0.9993] [ 0]
’rr’ [ 0] [ 0]
’xxx’ [ 0] [ 0]
’xxr’ [-0.9989] [ 0]
’xrr’ [ 0] [ 0]
’rrr’ [ 0] [ 0]
’xxxx’ [ 0] [ 0]
’xxxr’ [ 0] [ 0]
’xxrr’ [ 0] [ 0]
’xrrr’ [ 0] [ 0]
’rrrr’ [ 0] [ 0]
’xxxxx’ [ 0] [ 0]
’xxxxr’ [ 0] [ 0]
’xxxrr’ [ 0] [ 0]
’xxrrr’ [ 0] [ 0]
’xrrrr’ [ 0] [ 0]
’rrrrr’ [ 0] [ 0]
34
Table 13: Hopf normal form identified with SINDy. Here u represents the bifurcation parameter µ.
’’ ’xdot’ ’ydot’ ’udot’
’1’ [ 0] [ 0] [ 0]
’x’ [ 0] [ 0.9914] [ 0]
’y’ [-0.9920] [ 0] [ 0]
’u’ [ 0] [ 0] [ 0]
’xx’ [ 0] [ 0] [ 0]
’xy’ [ 0] [ 0] [ 0]
’xu’ [ 0.9269] [ 0] [ 0]
’yy’ [ 0] [ 0] [ 0]
’yu’ [ 0] [ 0.9294] [ 0]
’uu’ [ 0] [ 0] [ 0]
’xxx’ [-0.9208] [ 0] [ 0]
’xxy’ [ 0] [-0.9244] [ 0]
’xxu’ [ 0] [ 0] [ 0]
’xyy’ [-0.9211] [ 0] [ 0]
’xyu’ [ 0] [ 0] [ 0]
’xuu’ [ 0] [ 0] [ 0]
’yyy’ [ 0] [-0.9252] [ 0]
’yyu’ [ 0] [ 0] [ 0]
’yuu’ [ 0] [ 0] [ 0]
’uuu’ [ 0] [ 0] [ 0]
’xxxx’ [ 0] [ 0] [ 0]
’xxxy’ [ 0] [ 0] [ 0]
’xxxu’ [ 0] [ 0] [ 0]
’xxyy’ [ 0] [ 0] [ 0]
’xxyu’ [ 0] [ 0] [ 0]
’xxuu’ [ 0] [ 0] [ 0]
’xyyy’ [ 0] [ 0] [ 0]
’xyyu’ [ 0] [ 0] [ 0]
’xyuu’ [ 0] [ 0] [ 0]
’xuuu’ [ 0] [ 0] [ 0]
’yyyy’ [ 0] [ 0] [ 0]
’yyyu’ [ 0] [ 0] [ 0]
’yyuu’ [ 0] [ 0] [ 0]
’yuuu’ [ 0] [ 0] [ 0]
’uuuu’ [ 0] [ 0] [ 0]
’xxxxx’ [ 0] [ 0] [ 0]
’xxxxy’ [ 0] [ 0] [ 0]
’xxxxu’ [ 0] [ 0] [ 0]
’xxxyy’ [ 0] [ 0] [ 0]
’xxxyu’ [ 0] [ 0] [ 0]
’xxxuu’ [ 0] [ 0] [ 0]
’xxyyy’ [ 0] [ 0] [ 0]
’xxyyu’ [ 0] [ 0] [ 0]
’xxyuu’ [ 0] [ 0] [ 0]
’xxuuu’ [ 0] [ 0] [ 0]
’xyyyy’ [ 0] [ 0] [ 0]
’xyyyu’ [ 0] [ 0] [ 0]
’xyyuu’ [ 0] [ 0] [ 0]
’xyuuu’ [ 0] [ 0] [ 0]
’xuuuu’ [ 0] [ 0] [ 0]
’yyyyy’ [ 0] [ 0] [ 0]
’yyyyu’ [ 0] [ 0] [ 0]
’yyyuu’ [ 0] [ 0] [ 0]
’yyuuu’ [ 0] [ 0] [ 0]
’yuuuu’ [ 0] [ 0] [ 0]
’uuuuu’ [ 0] [ 0] [ 0]
35
References
[1] Ljung L (1999) System Identification: Theory for the User (Prentice Hall).
[2] Holmes P, Guckenheimer J (1983) Nonlinear oscillations, dynamical systems, and bifurcations of
vector fields, Applied Mathematical Sciences (Springer-Verlag, Berlin) Vol. 42.
[3] Schmidt M, Lipson H (2009) Distilling free-form natural laws from experimental data. Science
324:81–85.
[4] Roberts AJ (2014) Model emergent dynamics in complex systems (SIAM).
[5] Crutchfield JP, McNamara BS (1987) Equations of motion from a data series. Complex systems
1:417–452.
[6] Schmidt MD, et al. (2011) Automated refinement and inference of analytical models for
metabolic networks. Physical biology 8:055011.
[7] Daniels BC, Nemenman I (2015) Automated adaptive inference of phenomenological dynam-
ical models. Nature communications 6.
[8] Daniels BC, Nemenman I (2015) Efficient inference of parsimonious phenomenological mod-
els of cellular dynamics using s-systems and alternating regression. PloS one 10:e0119821.
[9] Kevrekidis IG, et al. (2003) Equation-free, coarse-grained multiscale computation: Enabling
microscopic simulators to perform system-level analysis. Communications in Mathematical
Science 1:715–762.
[10] Sugihara G, et al. (2012) Detecting causality in complex ecosystems. Science 338:496–500.
[11] Ye H, et al. (2015) Equation-free mechanistic ecosystem forecasting using empirical dynamic
modeling. PNAS 112:E1569–E1576.
[12] Hastie T, et al. (2009) The elements of statistical learning (Springer) Vol. 2.
[13] James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning

(Springer).
[14] Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. of the Royal Statistical
Society B pp 267–288.
[15] Donoho DL (2006) Compressed sensing. IEEE Trans. Information Theory 52:1289–1306.
[16] Candès EJ, Romberg J, Tao T (2006) Robust uncertainty principles: exact signal reconstruc-
tion from highly incomplete frequency information. IEEE Transactions on Information Theory
52:489–509.
[17] Candès EJ, Romberg J, Tao T (2006) Stable signal recovery from incomplete and inaccurate
measurements. Communications in Pure and Applied Mathematics 59:1207–1223.
[18] Candès EJ (2006) Compressive sensing. Proc. International Congress of Mathematics.
[19] Baraniuk RG (2007) Compressive sensing. IEEE Signal Processing Magazine 24:118–120.
36
[20] Tropp JA, Gilbert AC (2007) Signal recovery from random measurements via orthogonal
matching pursuit. IEEE Transactions on Information Theory 53:4655–4666.
[21] Rowley CW, Mezić I, Bagheri S, Schlatter P, Henningson D (2009) Spectral analysis of nonlin-
ear flows. J. Fluid Mech. 645:115–127.
[22] Schmid PJ (2010) Dynamic mode decomposition of numerical and experimental data. Journal
of Fluid Mechanics 656:5–28.
[23] Mezic I (2013) Analysis of fluid flows via spectral properties of the koopman operator. Annual
Review of Fluid Mechanics 45:357–378.
[24] Wang WX, Yang R, Lai YC, Kovanis V, Grebogi C (2011) Predicting catastrophes in nonlinear
dynamical systems by compressive sensing. PRL 106:154101.
[25] Schaeffer H, Caflisch R, Hauck CD, Osher S (2013) Sparse dynamics for partial differential
equations. Proceedings of the National Academy of Sciences USA 110:6634–6639.
[26] Ozoliņš V, Lai R, Caflisch R, Osher S (2013) Compressed modes for variational problems in
mathematics and physics. Proceedings of the National Academy of Sciences 110:18368–18373.
[27] Mackey A, Schaeffer H, Osher S (2014) On the compressive spectral method. Multiscale
Modeling & Simulation 12:1800–1827.
[28] Brunton SL, Tu JH, Bright I, Kutz JN (2014) Compressive sensing and low-rank libraries for
classification of bifurcation regimes in nonlinear dynamical systems. SIAM Journal on Applied
Dynamical Systems 13:1716–1732.
[29] Proctor JL, Brunton SL, Brunton BW, Kutz JN (2014) Exploiting sparsity and equation-free
architectures in complex systems (invited review). The European Physical Journal Special Topics
223:2665–2684.
[30] Bai Z, et al. (2014) Low-dimensional approach for reconstruction of airfoil data via compres-
sive sensing. AIAA Journal pp 1–14.
[31] Koza JR (1992) Genetic programming: on the programming of computers by means of natural selec-
tion (MIT press) Vol. 1.
[32] Chartrand R (2011) Numerical differentiation of noisy, nonsmooth data. ISRN Applied Math-
ematics 2011.
[33] Rudin LI, Osher S, Fatemi E (1992) Nonlinear total variation based noise removal algorithms.
Physica D: Nonlinear Phenomena 60:259–268.
√
[34] Gavish M, Donoho DL (2014) The optimal hard threshold for singular values is 4/ 3. ArXiv
e-prints.
[35] Berkooz G, Holmes P, Lumley JL (1993) The proper orthogonal decomposition in the analysis
of turbulent flows. Annual Review of Fluid Mechanics 23:539–575.
[36] Holmes PJ, Lumley JL, Berkooz G, Rowley CW (2012) Turbulence, coherent structures, dynamical
systems and symmetry, Cambridge Monographs in Mechanics (Cambridge University Press,
Cambridge, England), 2nd edition.
37
[37] Lorenz EN (1963) Deterministic nonperiodic flow. J. Atmos. Sciences 20:130–141.
[38] Taira K, Colonius T (2007) The immersed boundary method: a projection approach. Journal
of Computational Physics 225:2118–2137.
[39] Colonius T, Taira K (2008) A fast immersed boundary method using a nullspace approach
and multi-domain far-field boundary conditions. Computer Methods in Applied Mechanics and
Engineering 197:2131–2146.
[40] Ruelle D, Takens F (1971) On the nature of turbulence. Comm. Math. Phys. 20:167–192.
[41] Jackson CP (1987) A finite-element study of the onset of vortex shedding in flow past vari-
ously shaped bodies. Journal of Fluid Mechanics 182:23–45.
[42] Zebib Z (1987) Stability of viscous flow past a circular cylinder. Journal of Engineering Mathe-
matics 21:155–165.
[43] Noack BR, Afanasiev K, Morzynski M, Tadmor G, Thiele F (2003) A hierarchy of low-
dimensional models for the transient and post-transient cylinder wake. Journal of Fluid Me-
chanics 497:335–363.
[44] Marsden JE, McCracken M (1976) The Hopf bifurcation and its applications (Springer-Verlag)
Vol. 19.
[45] Takens F (1981) Detecting strange attractors in turbulence. Lecture Notes in Mathematics
898:366–381.
38

Data Science

Uploaded by

Copyright:

Available Formats

Data Science

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Data Science

Uploaded by

Copyright:

Available Formats

Supporting Information for:

Discovering governing equations from data:

3 Nonlinear system identification using sparse representation 4

Appendix A: Choice of basis functions 23

Appendix B: Limitations of the sparse identification framework 25

Appendix C: Identified coefficients of dynamics 29

2.1 Symbolic regression and machine learning

2.2 Sparse representation and compressive sensing

ξ = argmin kΘξ 0 − yk2 + λkξ 0 k1 . (2)

ẋ(t) = f (x(t)). (3)

Θ(X) =  1 X XP2 X P3 ··· sin(X) cos(X) sin(2X) cos(2X) · · ·  . (5)

r the nonlinear LorenzI.system

we use the standard 50

ẋ = f (x) = ΞT (Θ(xT ))T . (9)

Ẋ = Θ(X)Ξ + ηZ, (10)

Code 1: Sparse representation algorithm in Matlab.

% lambda is our sparsification knob.

3.3 Extensions and Connections

3.3.1 Discrete-time representation

xk+1 = f (xk ). (11)

The continuous-time sparse regression problem in Eq. (7) now becomes:

and the function f is the same as in Eq. (9).

3.3.3 External forcing, bifurcation parameters, and normal forms

4.1 Example 1: Simple illustrative systems

4.1.2 Example 1b: Three-dimensional linear system

Full Simulation Identified System, ⌘ = 0.01 Identified System, ⌘ = 10

z A - vortex shedding ux - POD mode 1

B - mean flow uy - POD mode 2

C - unstable fixed point uz - shift mode

Full Simulation Identified System

Full Simulation Identified System

Full Simulation Identified System

4.4.1 Logistic map

xk+1 = rxk (1 − xk ) + η⌘k . (26)

Stochastic System Sparse Identified System

0.01 0.015 0.015

-0.01 -0.01 -0.01

-0.01 -0.01 -0.01

Appendix A: Choice of basis functions

Table 3: Trigonometric basis.

Table 4: Polynomial and trigonometric basis.

Appendix B: Identified Coefficients24of Dynamics

Table 5: Damped harmonic oscillator with linear terms.

B-1: Lorenz system in nonlinear coordinates

-30 -50 -50

0.01 0.015 0.015

-0.01 -0.015 -0.015

Table 5: Identified dynamics for metabolic network.

’’ ’S1dot’ ’S2dot’ ’S3dot’ ’S4dot’ ’S5dot’ ’S6dot’ ’S7dot’

Appendix C: Identified coefficients of dynamics

Table 6: Damped harmonic oscillator with linear terms.

Table 8: Three-dimensional linear system.

’’ ’xdot’ ’ydot’ ’zdot’

[4] Roberts AJ (2014) Model emergent dynamics in complex systems (SIAM).

[13] James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning

[18] Candès EJ (2006) Compressive sensing. Proc. International Congress of Mathematics.

You might also like