Available online at www.sciencedirect.com

ScienceDirect

Comput. Methods Appl. Mech. Engrg. 304 (2016) 81–101


www.elsevier.com/locate/cma

Data-driven computational mechanics


T. Kirchdoerfer, M. Ortiz ∗
Graduate Aerospace Laboratories, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA

Received 10 September 2015; received in revised form 19 January 2016; accepted 1 February 2016
Available online 8 February 2016

Abstract

We develop a new computing paradigm, which we refer to as data-driven computing, according to which calculations are
carried out directly from experimental material data and pertinent constraints and conservation laws, such as compatibility and
equilibrium, thus bypassing the empirical material modeling step of conventional computing altogether. Data-driven solvers seek
to assign to each material point the state from a prespecified data set that is closest to satisfying the conservation laws. Equivalently,
data-driven solvers aim to find the state satisfying the conservation laws that is closest to the data set. The resulting data-driven
problem thus consists of the minimization of a distance function to the data set in phase space subject to constraints introduced
by the conservation laws. We motivate the data-driven paradigm and investigate the performance of data-driven solvers by means
of two examples of application, namely, the static equilibrium of nonlinear three-dimensional trusses and linear elasticity. In these
tests, the data-driven solvers exhibit good convergence properties both with respect to the number of data points and with regard
to local data assignment. The variational structure of the data-driven problem also renders it amenable to analysis. We show that,
as the data set approximates increasingly closely a classical material law in phase space, the data-driven solutions converge to the
classical solution. We also illustrate the robustness of data-driven solvers with respect to spatial discretization. In particular, we
show that the data-driven solutions of finite-element discretizations of linear elasticity converge jointly with respect to mesh size
and approximation by the data set.
© 2016 Elsevier B.V. All rights reserved.

Keywords: Data science; Big data; Approximation theory; Scientific computing

1. Introduction

Boundary-value problems in science and engineering typically combine two types of equations: (i) Conservation
laws, which derive from universal principles such as conservation of momentum or energy and are, therefore,
uncertainty-free; and (ii) material laws, formulated through physical modeling based on experimental observation,
that are, therefore, empirical and uncertain. The prevailing classical computational paradigm has been to calibrate
empirical material models using observational data and then use the calibrated material model in calculations. This
process of modeling a fortiori adds error and uncertainty to the solutions, especially in systems with high-dimensional
phase spaces and complex behavior. This modeling error and uncertainty arise from imperfect knowledge of the

∗ Correspondence to: 1200 E. California Blvd., MC 105-50, Pasadena, CA, 91125, USA.
E-mail address: ortiz@caltech.edu (M. Ortiz).

http://dx.doi.org/10.1016/j.cma.2016.02.001
0045-7825/© 2016 Elsevier B.V. All rights reserved.

functional form of the material laws, the phase space in which they are defined, and from scatter and noise in the
experimental data. Furthermore, often the models used to fit the data are ad hoc, without a clear basis in physics or a
mathematical criterion for their selection, and thus the process of modeling is mired in empiricism and arbitrariness.
Indeed, the entire process of empirical material modeling, and model validation thereof, is open-ended and no rigorous
mathematical theory exists to date that makes it precise and quantitative.
Previous work has been carried out with a view to incorporating observational data into boundary-value problem
solution methodologies, but typically with the aim of parametric identification, or augmenting and automating, rather
than replacing, the use and generation of material models. Material informatics uses database techniques to first
identify parameters of correlation and then use machine-learning regression techniques [1] to ultimately provide
predictive quantitative models [2]. Principal-component analysis provides methods of dimensional reduction that
allow such modeling techniques to be applied [3]. These approaches have been extended to the generation of multiscale
modeling correlations between macroscopic and microscopic constitutive properties [4–8].
These efforts, and others like them, may be understood as instances of Data Science, the extraction of ‘knowledge’
from large volumes of unstructured data [9,10]. Data science often requires sorting through big-data sets and extracting
‘insights’ from these data. Data science uses data management, statistics and machine learning to derive mathematical
models for subsequent use in decision making. Data Science currently influences primarily fields such as marketing,
advertising, finance, social sciences, security, policy, medical informatics, whereas the full potential of Data Science
as it relates to high-performance scientific computing is yet to be realized. Despite these limitations, reference to
Data Science does effectively serve the purpose of bringing data and artificial intelligence considerations to the
forefront.
In this work, we propose a new and different paradigm, which we refer to as data-driven computing, consisting of
formulating calculations directly from experimental material data and pertinent essential constraints and conservation
laws, thus bypassing the empirical material modeling step of conventional computing altogether. In this new computing
paradigm, essential constraints and conservation laws such as compatibility and equilibrium remain unchanged, as
do all the numerical schemes used in their discretization, such as finite elements, time-integrators, etc. Such conservation
laws confer mathematical structure to the calculations, and this mathematical structure carries over to the present
data-driven paradigm. However, in sharp contrast to conventional computing, in data-driven computing the experimental
material-data points are used directly in calculations in lieu of an empirical material model. In this manner,
material modeling empiricism, error and uncertainty are eliminated entirely and no loss of experimental information
is incurred. Specifically, data-driven solvers seek to assign to each material point the state from a prespecified data set
that is closest to satisfying the conservation laws. Equivalently, data-driven solvers aim to find the state satisfying the
conservation laws that is closest to the data set. The resulting data-driven problem thus consists of the minimization
of a distance function to the data set in phase space subject to the satisfaction of essential constraints and conservation
laws.
We provide an efficient implementation of data-driven computing and demonstrate the practicality of the approach
by means of two examples of application, namely, the static equilibrium of nonlinear three-dimensional trusses and
linear elasticity. In these tests, the data-driven solvers exhibit good convergence properties both with respect to the
number of data points and with regard to local data assignment. The variational structure of the data-driven problem
also renders it amenable to analysis. We show that, as the data set approximates increasingly closely a classical material
law in phase space, the data-driven solutions converge to the classical solutions. We also illustrate the robustness
of data-driven solvers with respect to spatial discretization. In particular, we show that the data-driven solutions of
finite-element discretizations of linear elasticity converge jointly with respect to mesh size and approximation by the
data set. The mathematical analysis is also suggestive of a number of generalizations and extensions of the data-driven
computing paradigm.

2. Truss structures

We proceed to introduce and motivate the general approach with the aid of a simple non-linear elastic truss problem.
Trusses are assemblies of articulated bars that deform in uniaxial tension or compression. Therefore, the material
behavior of a bar is characterized by a particularly simple relation between uniaxial strain $\varepsilon$ and uniaxial stress $\sigma$. We
refer to the space of pairs $(\varepsilon, \sigma)$ as phase space. We assume that the behavior of the material of each bar $e = 1, \dots, m$,
where $m$ is the number of bars in the truss, is characterized by (possibly different) sets $E_e$ of pairs $(\varepsilon, \sigma)$, or local
states. For instance, each point in the data set may correspond to, e.g., an experimental measurement, a subgrid
multiscale calculation, or some other means of characterizing material behavior. A typical data set is notionally depicted
in Fig. 1.

Fig. 1. Typical material data set for truss bar.

2.1. Data-driven solver

For a given material data set, the proposed data-driven solvers seek to assign to each bar $e = 1, \dots, m$ of the truss
the best possible local state $(\varepsilon_e, \sigma_e)$ from the corresponding data set $E_e$, while simultaneously satisfying compatibility
and equilibrium. We understand optimality of the local state in terms of an appropriate figure of merit that penalizes
distance to the data set in phase space. For definiteness, we consider local penalty functions of the type

$$F_e(\varepsilon_e, \sigma_e) = \min_{(\varepsilon_e', \sigma_e') \in E_e} \left[ W_e(\varepsilon_e - \varepsilon_e') + W_e^*(\sigma_e - \sigma_e') \right], \tag{1}$$

for each bar $e = 1, \dots, m$ in the truss, with

$$W_e(\varepsilon_e) = \frac{1}{2} C_e \varepsilon_e^2 \quad \text{and} \quad W_e^*(\sigma_e) = \frac{1}{2} \frac{\sigma_e^2}{C_e}, \tag{2}$$

and with the minimum taken over all local states $(\varepsilon_e', \sigma_e')$ in the local data set $E_e$. We may regard $W_e$ and $W_e^*$ as
reference strain and complementary energy densities, respectively. We emphasize that the functions $W_e$ and $W_e^*$ are
introduced as part of the numerical scheme and need not represent any actual material behavior. In particular, the
constant $C_e$ is also numerical in nature and does not represent a material property.
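
The local penalty function of Eqs. (1) and (2) amounts to a nearest-point search in phase space. The following is a minimal Python sketch of that search for one bar, not the authors' implementation; the function name `local_penalty`, the brute-force loop and the sample data values are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (strain, stress) pair in phase space


def local_penalty(state: Point, data: List[Point], C: float) -> Tuple[float, Point]:
    """Return (F_e, closest data point) for one bar, following Eqs. (1)-(2).

    C is the purely numerical constant C_e; it is not a material property.
    """
    eps, sig = state
    best_val, best_pt = float("inf"), data[0]
    for eps_d, sig_d in data:
        # W_e(eps - eps') + W_e^*(sig - sig'), with W_e = C e^2 / 2 and W_e^* = s^2 / (2 C)
        val = 0.5 * C * (eps - eps_d) ** 2 + 0.5 * (sig - sig_d) ** 2 / C
        if val < best_val:
            best_val, best_pt = val, (eps_d, sig_d)
    return best_val, best_pt


# Hypothetical three-point data set and trial state, for illustration only.
data = [(0.0, 0.0), (0.01, 1.0), (0.02, 1.5)]
F, pt = local_penalty((0.012, 1.1), data, C=100.0)
```

A brute-force loop costs one pass over the data set per bar; for large data sets a spatial search structure could replace it without changing the result.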
Given a global state consisting of the collection of the local states $(\varepsilon_e, \sigma_e)$ of each one of its bars, the combined
penalty function

$$F = \sum_{e=1}^{m} w_e F_e(\varepsilon_e, \sigma_e) \tag{3}$$

simultaneously penalizes all local departures of the local states of the bars from their corresponding data sets. Here and
subsequently, $w_e = A_e L_e$ denotes the volume of truss member $e$, with $A_e$ its cross-sectional area and $L_e$ its length.
The aim of the data-driven solver is to minimize $F$ with respect to the global state $\{(\varepsilon, \sigma)\}$ subject to equilibrium and
compatibility constraints. These considerations lead to the constrained minimization problem

$$\text{Minimize:} \quad \sum_{e=1}^{m} w_e F_e(\varepsilon_e, \sigma_e), \tag{4a}$$

$$\text{subject to:} \quad \varepsilon_e = \sum_{i=1}^{n} B_{ei} u_i \quad \text{and} \quad \sum_{e=1}^{m} w_e B_{ei} \sigma_e = f_i, \tag{4b}$$

where $\{u_i, i = 1, \dots, n\}$ is the array of displacement degrees of freedom, $\{f_i, i = 1, \dots, n\}$ is the array of applied
forces and the coefficients $B_{ei}$ encode the connectivity and geometry of the truss.
The compatibility constraint can be enforced simply by expressing the strains in terms of displacements. The
equilibrium constraint can be enforced by means of Lagrange multipliers, leading to the stationary problem

$$\delta \left\{ \sum_{e=1}^{m} w_e F_e\!\left( \sum_{i=1}^{n} B_{ei} u_i, \sigma_e \right) - \sum_{i=1}^{n} \left( \sum_{e=1}^{m} w_e B_{ei} \sigma_e - f_i \right) \eta_i \right\} = 0. \tag{5}$$

Taking all possible variations, we obtain

$$\delta u_i \Rightarrow \sum_{e=1}^{m} w_e C_e \left( \sum_{j=1}^{n} B_{ej} u_j - \varepsilon_e^* \right) B_{ei} = 0, \tag{6a}$$

$$\delta \sigma_e \Rightarrow \frac{1}{C_e} (\sigma_e - \sigma_e^*) = \sum_{i=1}^{n} B_{ei} \eta_i, \tag{6b}$$

$$\delta \eta_i \Rightarrow \sum_{e=1}^{m} w_e B_{ei} \sigma_e = f_i, \tag{6c}$$

where $(\varepsilon_e^*, \sigma_e^*)$ denote (unknown) optimal data points for each of the bars, i.e., data points such that

$$F_e\!\left( \sum_{i=1}^{n} B_{ei} u_i, \sigma_e \right) = W_e\!\left( \sum_{i=1}^{n} B_{ei} u_i - \varepsilon_e^* \right) + W_e^*(\sigma_e - \sigma_e^*) \tag{7}$$

or

$$W_e\!\left( \sum_{i=1}^{n} B_{ei} u_i - \varepsilon_e^* \right) + W_e^*(\sigma_e - \sigma_e^*) \le W_e\!\left( \sum_{i=1}^{n} B_{ei} u_i - \varepsilon_e' \right) + W_e^*(\sigma_e - \sigma_e') \tag{8}$$

for all data points $(\varepsilon_e', \sigma_e')$ in the local data set $E_e$. Once all optimal data points are determined, Eqs. (6) define a system
of linear equations for the nodal displacements, the local stresses and the Lagrange multipliers. A straightforward
manipulation of these equations renders them in the equivalent form

$$\sum_{j=1}^{n} \left( \sum_{e=1}^{m} w_e C_e B_{ej} B_{ei} \right) u_j = \sum_{e=1}^{m} w_e C_e \varepsilon_e^* B_{ei}, \tag{9a}$$

$$\sum_{j=1}^{n} \left( \sum_{e=1}^{m} w_e C_e B_{ei} B_{ej} \right) \eta_j = f_i - \sum_{e=1}^{m} w_e B_{ei} \sigma_e^*. \tag{9b}$$

We recognize in these equations two standard linear-elastic truss-equilibrium problems with identical stiffness matrix
corresponding to the reference linear truss defined by $W_e$ and $W_e^*$, $e = 1, \dots, m$. The displacement problem (9a)
is driven by the optimal local strains, whereas the Lagrange multiplier problem (9b) is driven by the out-of-balance
forces attendant to the optimal local stresses.
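
The manipulation leading from Eqs. (6) to Eqs. (9) is short enough to spell out. The steps below are a sketch that uses only the definitions already introduced: (6a) is rearranged directly, while (6b) is solved for the stress and inserted into (6c).

```latex
% (6a): expand the bracket and move the data term to the right-hand side
\sum_{j=1}^{n} \Big( \sum_{e=1}^{m} w_e C_e B_{ej} B_{ei} \Big) u_j
  = \sum_{e=1}^{m} w_e C_e \varepsilon_e^{*} B_{ei}

% (6b): solve for the local stress
\sigma_e = \sigma_e^{*} + C_e \sum_{j=1}^{n} B_{ej} \eta_j

% insert this stress into (6c) and collect the multiplier terms
\sum_{e=1}^{m} w_e B_{ei} \Big( \sigma_e^{*} + C_e \sum_{j=1}^{n} B_{ej} \eta_j \Big) = f_i
\;\Longrightarrow\;
\sum_{j=1}^{n} \Big( \sum_{e=1}^{m} w_e C_e B_{ei} B_{ej} \Big) \eta_j
  = f_i - \sum_{e=1}^{m} w_e B_{ei} \sigma_e^{*}
```

Both systems share the matrix $\sum_{e} w_e C_e B_{ei} B_{ej}$, which is symmetric in $i$ and $j$.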
It remains to determine the optimal local data points, i.e., the stress and strain pairs $(\varepsilon_e^*, \sigma_e^*)$ in the local data sets $E_e$
that result in the closest possible satisfaction of compatibility and equilibrium. The determination of the optimal local
data points can be effected iteratively. Initially, all bars in the truss are assigned random points $(\varepsilon_e^{*(0)}, \sigma_e^{*(0)})$ from the
corresponding local data sets $E_e$. The displacements $u_i^{(0)}$ and Lagrange multipliers $\eta_i^{(0)}$ are then computed by solving
(9) and the stresses $\sigma_e^{(0)}$ are evaluated from (6b). The next local data assignment is then effected by determining, for
every member in the truss, the data points $(\varepsilon_e^{*(1)}, \sigma_e^{*(1)})$ in $E_e$ that are optimal with respect to the local state $(\varepsilon_e^{(0)}, \sigma_e^{(0)})$,
i.e., such that

$$W_e(\varepsilon_e^{(0)} - \varepsilon_e^{*(1)}) + W_e^*(\sigma_e^{(0)} - \sigma_e^{*(1)}) \le W_e(\varepsilon_e^{(0)} - \varepsilon_e') + W_e^*(\sigma_e^{(0)} - \sigma_e'), \tag{10}$$

for all data points $(\varepsilon_e', \sigma_e')$ in the local data set $E_e$. This operation entails simple local searches in phase space. The
iteration then proceeds by recursion and terminates when the local data assignments effect no change. A detailed
flowchart of the data-driven solver is listed in Algorithm 1.

Algorithm 1 Data-driven solver

Require: Local data sets $E_e$, $B_e$-matrices, $e = 1, \dots, m$. Applied loads $f_i$, $i = 1, \dots, n$.
i) Set $k = 0$. Initial local data assignment:
    for all $e = 1, \dots, m$ do
        Choose $(\varepsilon_e^{*(0)}, \sigma_e^{*(0)})$ randomly from $E_e$
    end for
ii) Solve:

$$\sum_{j=1}^{n} \left( \sum_{e=1}^{m} w_e C_e B_{ej} B_{ei} \right) u_j^{(k)} = \sum_{e=1}^{m} w_e C_e \varepsilon_e^{*(k)} B_{ei}, \tag{11a}$$

$$\sum_{j=1}^{n} \left( \sum_{e=1}^{m} w_e C_e B_{ei} B_{ej} \right) \eta_j^{(k)} = f_i - \sum_{e=1}^{m} w_e B_{ei} \sigma_e^{*(k)}, \tag{11b}$$

    for $u_i^{(k)}$ and $\eta_i^{(k)}$, $i = 1, \dots, n$.
iii) Compute local states:
    for all $e = 1, \dots, m$ do

$$\varepsilon_e^{(k)} = \sum_{i=1}^{n} B_{ei} u_i^{(k)}, \qquad \sigma_e^{(k)} = \sigma_e^{*(k)} + C_e \sum_{i=1}^{n} B_{ei} \eta_i^{(k)} \tag{12}$$

    end for
iv) Local state assignment:
    for all $e = 1, \dots, m$ do
        Choose $(\varepsilon_e^{*(k+1)}, \sigma_e^{*(k+1)})$ closest to $(\varepsilon_e^{(k)}, \sigma_e^{(k)})$ in $E_e$.
    end for
v) Test for convergence:
    if $(\varepsilon_e^{*(k+1)}, \sigma_e^{*(k+1)}) = (\varepsilon_e^{*(k)}, \sigma_e^{*(k)})$ for all $e = 1, \dots, m$, then
        v.a) $u_i = u_i^{(k)}$, $i = 1, \dots, n$.
        v.b) $(\varepsilon_e, \sigma_e) = (\varepsilon_e^{(k)}, \sigma_e^{(k)})$, $e = 1, \dots, m$.
        v.c) exit.
    else
        $k \leftarrow k + 1$, goto (ii).
    end if
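
The steps of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code: the dense Gaussian-elimination helper `solve`, the function names, the iteration cap `max_iter` and the two-bar example (two unit bars in series, a hypothetical linear data set with slope 100, end load 2.0) are all assumptions made for the example. The numerical constant $C_e = 50$ deliberately differs from the slope of the data to stress that it is not a material property.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]  # (strain, stress)


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def closest(state: Point, data: List[Point], C: float) -> Point:
    """Data point nearest to `state` in the sense of Eq. (10)."""
    eps, sig = state
    return min(data, key=lambda d: 0.5 * C * (eps - d[0]) ** 2 + 0.5 * (sig - d[1]) ** 2 / C)


def data_driven_truss(B, w, C, data, f, seed=0, max_iter=100):
    m, n = len(B), len(f)
    rng = random.Random(seed)
    # (i) random initial data assignment
    star = [rng.choice(data[e]) for e in range(m)]
    # stiffness matrix of the reference linear truss, shared by (11a) and (11b)
    K = [[sum(w[e] * C[e] * B[e][i] * B[e][j] for e in range(m)) for j in range(n)]
         for i in range(n)]
    for _ in range(max_iter):
        # (ii) solve (11a) for u and (11b) for eta
        rhs_u = [sum(w[e] * C[e] * star[e][0] * B[e][i] for e in range(m)) for i in range(n)]
        rhs_eta = [f[i] - sum(w[e] * B[e][i] * star[e][1] for e in range(m)) for i in range(n)]
        u, eta = solve(K, rhs_u), solve(K, rhs_eta)
        # (iii) local states, Eq. (12)
        states = [(sum(B[e][i] * u[i] for i in range(n)),
                   star[e][1] + C[e] * sum(B[e][i] * eta[i] for i in range(n)))
                  for e in range(m)]
        # (iv) local state assignment
        new_star = [closest(states[e], data[e], C[e]) for e in range(m)]
        # (v) convergence: the assignment no longer changes
        if new_star == star:
            return u, states, star
        star = new_star
    raise RuntimeError("data assignment did not converge")


# Hypothetical example: two unit bars in series, node 0 fixed, load 2.0 at the free end.
B = [[1.0, 0.0], [-1.0, 1.0]]
grid = [(i * 0.001, 100.0 * i * 0.001) for i in range(51)]
u, states, star = data_driven_truss(B, [1.0, 1.0], [50.0, 50.0], [grid, grid], [0.0, 2.0])
```

In this statically determinate example equilibrium fixes the bar stress at 2.0, so the iteration settles on the data point with strain 0.02 in both bars and an end displacement of 0.04.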

The geometry of the local data assignment (10) is illustrated in Fig. 2. Thus, given a trial local state $(\varepsilon_e^{(k)}, \sigma_e^{(k)})$ of
bar $e$, corresponding to the $k$th iteration of the solver, the next data point $(\varepsilon_e^{*(k+1)}, \sigma_e^{*(k+1)})$ assigned to the bar is the
point in $E_e$ that is closest to $(\varepsilon_e^{(k)}, \sigma_e^{(k)})$ in the norm

$$\|(\varepsilon_e, \sigma_e)\|_e = \left( W_e(\varepsilon_e) + W_e^*(\sigma_e) \right)^{1/2}. \tag{13}$$

This is precisely the data point $(\varepsilon_e^{*(k+1)}, \sigma_e^{*(k+1)})$ in $E_e$ whose Voronoi cell contains $(\varepsilon_e^{(k)}, \sigma_e^{(k)})$. Thus, the penalty
function (1) or, equivalently, the norm (13) divides the phase space into cells according to the Voronoi tessellation
of $E_e$. Each cell in that tessellation may be regarded as the 'domain of influence' of the corresponding data point.
The local state assignment then simply assigns material points according to their domain of influence and the iteration
terminates when the local states of all bars lie within the domain of influence of the corresponding data points assigned
to the bars.

Fig. 2. Voronoi tessellation of a data set.

2.2. Numerical analysis of convergence

A central question to be ascertained concerns the convergence of data-driven solvers with respect to the data
set. Specifically, suppose that the materials in the truss obey a well-defined constitutive law in the form of a graph,
or stress–strain curve, in (ε, σ )-phase space. Then, we expect the data-driven solutions to converge to the classical
solution when the data sets approximate the stress–strain curve increasingly closely, in some appropriate sense to be
made precise subsequently.
In this section, we exhibit this convergence property in a specific example of application. Fig. 3 shows the
geometry, boundary conditions and applied loads on a truss containing 1048 degrees of freedom. The truss undergoes
small deformations and the material in all bars obeys the nonlinear elastic law shown in Fig. 4. A
Newton–Raphson solver is used to calculate the reference solution. The reference solution values thus obtained are
plotted on the constitutive stress–strain curve to exhibit the extent of non-linearity and the range of local states covered
by the solution.
Suppose that, in actual practice, the stress–strain curve in Fig. 4 is not known exactly but, instead, sampled by
means of a finite collection of points, or data sets. We begin by considering a sequence (E k ) of increasingly fine data
sets consisting of points on the stress–strain curve at uniform distances ρk ↓ 0, with distance defined in the sense of
the norm (13).
The convergence of the local data assignment iteration is shown in Fig. 5 for data sets of sizes $10^2$, $10^3$, $10^4$, $10^5$.
In all cases, the initial local data assignment is random and convergence is monitored in terms of the penalty function
F, Eq. (3). We note that the problem of assigning data points optimally to each bar of the truss is of combinatorial
complexity. Therefore, it is remarkable that the local data assignment iteration converges after a relatively small
number of steps. As expected, the number of iterations to convergence increases with the size of the data set. However,
it bears emphasis that each local data assignment iteration entails a linear solve corresponding to the linear comparison
truss. The matrix of the system of equations, or stiffness matrix, can be factorized once and for all at the start of the
iteration and subsequent iterations require inexpensive back-substitutions only.
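
The factor-once, substitute-many pattern can be sketched as follows. The sketch assumes the reference stiffness matrix is symmetric positive definite (a properly supported truss with $C_e > 0$), so that a Cholesky factorization applies; the function names and the 2-by-2 matrix are hypothetical and chosen only for illustration.

```python
import math
from typing import List


def cholesky(K: List[List[float]]) -> List[List[float]]:
    """Lower-triangular factor L with K = L L^T (K assumed symmetric positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cho_solve(L: List[List[float]], b: List[float]) -> List[float]:
    """Solve L L^T x = b by forward and back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Hypothetical reference stiffness matrix; it is factorized a single time ...
K = [[100.0, -50.0], [-50.0, 50.0]]
L = cholesky(K)
# ... and the factor is reused for every right-hand side met during the iteration.
x1 = cho_solve(L, [0.0, 2.0])
x2 = cho_solve(L, [1.0, 0.0])
```

Each data-assignment iteration changes only the right-hand sides of the two linear problems, so the cost per iteration reduces to the two triangular substitutions.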

Fig. 3. Model problem geometry with boundary conditions.

Fig. 4. Material model with reference solution values superimposed.

Next, we turn to the question of convergence with respect to the number of data points. For definiteness, we
monitor the convergence of the resulting sequence of data-driven solutions to the reference solution in the sense of the
normalized percent root-mean-square stress and strain errors

$$\varepsilon_{(\%\mathrm{RMS})} = \frac{1}{\varepsilon_{\max}^{\mathrm{ref}}} \left( \frac{\sum_{e=1}^{m} w_e (\varepsilon_e - \varepsilon_e^{\mathrm{ref}})^2}{m} \right)^{1/2}, \tag{14a}$$

$$\sigma_{(\%\mathrm{RMS})} = \frac{1}{\sigma_{\max}^{\mathrm{ref}}} \left( \frac{\sum_{e=1}^{m} w_e (\sigma_e - \sigma_e^{\mathrm{ref}})^2}{m} \right)^{1/2}, \tag{14b}$$

respectively, where $(\varepsilon_e^{\mathrm{ref}}, \sigma_e^{\mathrm{ref}})$, $e = 1, \dots, m$, are the strains and stresses corresponding to the reference solution and
$(\varepsilon_{\max}^{\mathrm{ref}}, \sigma_{\max}^{\mathrm{ref}})$ are the corresponding maximum values.

Fig. 5. Convergence of the local data-assignment iteration.

Fig. 6. Convergence of strain and stress root-mean-square errors with number of sampling points. Histograms correspond to 30 different initial
random assignments of data points to the truss members.
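
A direct transcription of the error measure (14) into Python might look as follows. This is a sketch under assumptions: the function name `rms_error` is hypothetical, the "maximum value" is taken as the largest absolute reference value, and the result is returned as a fraction exactly as Eq. (14) is written, without a factor of 100.

```python
import math
from typing import Sequence


def rms_error(values: Sequence[float], ref: Sequence[float], weights: Sequence[float]) -> float:
    """Normalized root-mean-square error following Eq. (14)."""
    m = len(values)
    # Assumption: the normalizing "maximum value" is the largest absolute reference value.
    ref_max = max(abs(r) for r in ref)
    mean_sq = sum(w * (v - r) ** 2 for v, r, w in zip(values, ref, weights)) / m
    return math.sqrt(mean_sq) / ref_max


# Hypothetical two-bar values, for illustration only.
err = rms_error([1.0, 2.0], [1.0, 4.0], [1.0, 1.0])
```

The same function serves for strains and stresses, since (14a) and (14b) differ only in the quantity being compared.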
Fig. 6 shows convergence plots of the strain and stress root-mean-square errors with number of sampling points.
As may be observed from the figure, the convergence is close to linear in both strains and stresses, which verifies the
convergence of the method as the data set approaches the exact model. We recall that the data assignment Algorithm
1 starts by randomly assigning data points to the truss members. Evidently, the subsequent iteration depends on this
initial choice. In order to demonstrate insensitivity to such initialization, convergence plots for 30 initial random
assignments are shown in Fig. 6 and the resulting errors are binned into histograms. The tightness of these histograms
verifies the robustness of the iteration with respect to the initial data point selection.

Fig. 7. Typical data set with Gaussian random noise.

Fig. 8. Convergence of strain and stress root-mean-square errors with number of sampling points and data sets with Gaussian noise. Histograms
correspond to 100 data sets.

Next, we revisit the question of convergence with respect to the number of data points when the data set is noisy,
i. e., when it does not sample the limit stress–strain curve but is offset from the curve with some probability. In this
case, the data sets converge to the exact stress–strain curve as sets, in a manner to be made precise subsequently.
In calculations we specifically begin by sampling the limit stress–strain curve at uniform distances $\rho_k \downarrow 0$, as in the
preceding test cases, and subsequently add Gaussian noise of variance $\rho_k$ to the data points. A typical data set is shown
in Fig. 7 by way of illustration. Convergence plots corresponding to 100 data sets are shown in Fig. 8. As may be seen
from the figure, convergence is achieved with increasing number of points, albeit the convergence rate of roughly 1/2
is lower than the convergence rate in the case of noiseless data.
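
A noisy data set of this kind can be generated along the following lines. The sketch is a simplification under stated assumptions: the curve is sampled uniformly in strain rather than at uniform distances in the norm (13), the noise is added independently to the raw strain and stress coordinates, and the saturating law `law` is hypothetical, since the curve of Fig. 4 is not given in closed form here.

```python
import math
import random
from typing import Callable, List, Tuple


def sample_curve(law: Callable[[float], float], eps_max: float, n: int,
                 noise_var: float = 0.0, seed: int = 0) -> List[Tuple[float, float]]:
    """Sample (strain, stress) points on a curve and perturb them with Gaussian noise."""
    rng = random.Random(seed)
    sd = math.sqrt(noise_var)  # standard deviation from the prescribed variance
    points = []
    for i in range(n):
        eps = eps_max * i / (n - 1)
        points.append((eps + rng.gauss(0.0, sd), law(eps) + rng.gauss(0.0, sd)))
    return points


def law(eps: float) -> float:
    """Hypothetical saturating stress-strain law, for illustration only."""
    return 100.0 * eps / (1.0 + 20.0 * eps)


clean = sample_curve(law, 0.05, 11)
noisy = sample_curve(law, 0.05, 11, noise_var=1e-6, seed=1)
```

Fixing the seed makes each noisy data set reproducible, which is convenient when errors are collected over many data sets as in Fig. 8.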

Fig. 9. Distribution of values of local penalty functions Fe (ε, σ ) for converged data-driven solution.

Finally, we examine the question of sample quality, i. e., the ability of a given data set to sample closely all the
local states covered by the solution. Fig. 9 shows the distribution of the values of the local penalty function $F_e$, Eq. (1),
corresponding to data sets of sizes $10^2$, $10^3$, $10^4$, $10^5$. We recall that the value of the function $F_e$ provides a measure
of the distance of the local state (εe , σe ) to the data set. As may be seen from the figure, Fe tends to decrease with the
number of sampling points, as expected. However, for every data-set size there remains a certain spread in the values
of Fe , indicating that the states of certain truss members are better sampled by the data set than others. Specifically,
truss members for which no data point lies close to their states result in high values of Fe , indicative of poor coverage
by the data set. This analysis of the local values Fe of the penalty function suggests a criterion for improving data
sets adaptively so as to improve their quality vis-à-vis a particular application. Evidently, the optimal strategy is to
target for further testing the region of phase space corresponding to the truss members with highest values of Fe . In
particular, outliers, or truss members with states lying far from the data set, are targeted for further testing. In this
manner, the data set is adaptively expanded so as to provide the best possible coverage of the distribution of local
states corresponding to a particular application.
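
The targeting criterion described above can be sketched as a simple ranking of the truss members by their converged values of $F_e$. The function name `worst_covered`, the cut-off `k` and the sample numbers are hypothetical; how many members to target, and how new tests are then designed around their states, is left open by the text.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (strain, stress)


def worst_covered(states: List[Point], penalties: List[float], k: int) -> List[Tuple[int, Point]]:
    """Return the k members with the largest F_e, together with their local states."""
    order = sorted(range(len(penalties)), key=lambda e: penalties[e], reverse=True)
    return [(e, states[e]) for e in order[:k]]


# Hypothetical converged states and penalty values for three bars.
states = [(0.01, 1.0), (0.03, 2.0), (0.02, 1.5)]
penalties = [1e-4, 5e-2, 3e-3]
targets = worst_covered(states, penalties, k=2)
```

The returned states mark the regions of phase space where additional data points would most improve coverage for the application at hand.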

3. Linear elasticity

As a second motivational example of application of the data-driven paradigm, we consider three-dimensional linear
elasticity. In this case, the local phase space of the material consists of pairs (ϵ, σ ) of strain and stress, respectively.
Since both stresses and strains are symmetric tensors, it follows that the corresponding phase space is twelve-
dimensional. This dimensionality is high enough to start raising questions regarding material sampling and material-
data coverage of the relevant region of phase space. An additional issue that is raised by linear elasticity concerns
the infinite-dimensional character of the solution space. Thus, even if the problem is rendered finite-dimensional by
recourse to spatial discretization, the question of convergence with respect to mesh size must necessarily be elucidated
within an appropriate functional framework. In this section, we extend the truss data-driven solver to linear elasticity
and address the issue of data sampling in high dimensions by exploiting material and geometrical symmetry in the
problem. Finally, we address the question of convergence of the finite-element discretized data-driven solver with
respect to mesh size.

3.1. Data-driven solver

We consider a finite-element model of a nonlinear-elastic solid in the linearized kinematics approximation. The
material behavior of a solid is characterized by a relation between the strain tensor ϵ and the stress tensor σ . We refer
to the space of pairs (ϵ, σ ) as phase space. We assume that the behavior of the material or integration points in the

model is characterized by – possibly different – sets E e of pairs (ϵ, σ ), or local states, where e = 1, . . . , m labels the
material points and m is the number of material points in the finite-element model.
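We consider local penalty functions of the type

$$F_e(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e) = \min_{(\boldsymbol{\epsilon}_e', \boldsymbol{\sigma}_e') \in E_e} \left[ W_e(\boldsymbol{\epsilon}_e - \boldsymbol{\epsilon}_e') + W_e^*(\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e') \right], \tag{15}$$

for each integration point $e = 1, \dots, m$ in the solid, with

$$W_e(\boldsymbol{\epsilon}_e) = \frac{1}{2} \lambda (\operatorname{tr} \boldsymbol{\epsilon}_e)^2 + \mu\, \boldsymbol{\epsilon}_e \cdot \boldsymbol{\epsilon}_e \equiv \frac{1}{2} \mathbb{C}_e \boldsymbol{\epsilon}_e \cdot \boldsymbol{\epsilon}_e, \tag{16a}$$

$$W_e^*(\boldsymbol{\sigma}_e) = \frac{1}{4\mu} \boldsymbol{\sigma}_e \cdot \boldsymbol{\sigma}_e - \frac{1}{4\mu} \frac{\lambda}{3\lambda + 2\mu} (\operatorname{tr} \boldsymbol{\sigma}_e)^2 \equiv \frac{1}{2} \mathbb{C}_e^{-1} \boldsymbol{\sigma}_e \cdot \boldsymbol{\sigma}_e, \tag{16b}$$

with the minimum taken over all local states $(\boldsymbol{\epsilon}_e', \boldsymbol{\sigma}_e')$ in the local data set $E_e$. We may regard $W_e$ and $W_e^*$ as reference
strain and complementary energy densities, respectively.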
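
The reference energies (16) are straightforward to evaluate for 3-by-3 symmetric tensors. The sketch below assumes that $\lambda$ and $\mu$ play the role of Lamé-type constants of the reference solid, so that the associated linear map is $\boldsymbol{\sigma} = \lambda (\operatorname{tr} \boldsymbol{\epsilon}) \mathbf{I} + 2\mu \boldsymbol{\epsilon}$; the helper names and the sample strain are hypothetical. For a stress obtained from a strain through that map, the two energies coincide, which gives a quick consistency check on the two expressions.

```python
from typing import List

Tensor = List[List[float]]  # 3x3 symmetric tensor stored as nested lists


def trace(a: Tensor) -> float:
    return a[0][0] + a[1][1] + a[2][2]


def ddot(a: Tensor, b: Tensor) -> float:
    """Full contraction a . b = a_ij b_ij."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


def W(eps: Tensor, lam: float, mu: float) -> float:
    """Reference strain energy density, Eq. (16a)."""
    return 0.5 * lam * trace(eps) ** 2 + mu * ddot(eps, eps)


def W_star(sig: Tensor, lam: float, mu: float) -> float:
    """Reference complementary energy density, Eq. (16b)."""
    return ddot(sig, sig) / (4.0 * mu) - lam * trace(sig) ** 2 / (4.0 * mu * (3.0 * lam + 2.0 * mu))


def reference_stress(eps: Tensor, lam: float, mu: float) -> Tensor:
    """Assumed reference map: sigma = lam tr(eps) I + 2 mu eps."""
    t = trace(eps)
    return [[lam * t * (1.0 if i == j else 0.0) + 2.0 * mu * eps[i][j] for j in range(3)]
            for i in range(3)]


# Hypothetical strain and numerical constants, for illustration only.
eps = [[0.01, 0.002, 0.0], [0.002, -0.003, 0.0], [0.0, 0.0, 0.004]]
lam, mu = 1.0, 0.5
sig = reference_stress(eps, lam, mu)
w_strain, w_compl = W(eps, lam, mu), W_star(sig, lam, mu)
```

As in the truss case, these constants belong to the numerical scheme and need not describe the actual material.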
Given a global state consisting of a collection of local states $(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)$ at each material point, we define a global
penalty function as

$$F = \sum_{e=1}^{m} w_e F_e(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e), \tag{17}$$

where $w_e$ are quadrature or integration weights. This function penalizes jointly all departures of local states from their
corresponding data sets. The data-driven problem is to minimize $F$ with respect to the global state $\{(\boldsymbol{\epsilon}, \boldsymbol{\sigma})\}$ subject to
equilibrium and compatibility constraints, namely,

$$\text{Minimize:} \quad \sum_{e=1}^{m} w_e F_e(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e), \tag{18a}$$

$$\text{subject to:} \quad \boldsymbol{\epsilon}_e = \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a \quad \text{and} \quad \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \boldsymbol{\sigma}_e = \mathbf{f}_a, \tag{18b}$$

where $\{\mathbf{u}_a, a = 1, \dots, n\}$ is the array of nodal displacements, $\{\mathbf{f}_a, a = 1, \dots, n\}$ is the array of applied nodal forces,
$n$ is the number of nodes and the coefficients $\mathbf{B}_{ea}$ encode the connectivity and geometry of the finite-element mesh.
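As in the data-driven truss problem, the compatibility constraint can be enforced simply by expressing the strains
in terms of displacements. The equilibrium constraint can in turn be enforced by means of Lagrange multipliers,
resulting in the stationary problem

$$\delta \left\{ \sum_{e=1}^{m} w_e F_e\!\left( \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a, \boldsymbol{\sigma}_e \right) - \sum_{a=1}^{n} \left( \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \boldsymbol{\sigma}_e - \mathbf{f}_a \right) \cdot \boldsymbol{\eta}_a \right\} = 0. \tag{19}$$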

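Taking all possible variations, we obtain the system of Euler–Lagrange equations

$$\delta \mathbf{u}_a \Rightarrow \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \mathbb{C}_e \left( \sum_{b=1}^{n} \mathbf{B}_{eb} \mathbf{u}_b - \boldsymbol{\epsilon}_e^* \right) = 0, \tag{20a}$$

$$\delta \boldsymbol{\sigma}_e \Rightarrow \mathbb{C}_e^{-1} (\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e^*) = \sum_{a=1}^{n} \mathbf{B}_{ea} \boldsymbol{\eta}_a, \tag{20b}$$

$$\delta \boldsymbol{\eta}_a \Rightarrow \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \boldsymbol{\sigma}_e = \mathbf{f}_a, \tag{20c}$$

where $(\boldsymbol{\epsilon}_e^*, \boldsymbol{\sigma}_e^*)$ denote the unknown optimal data points at material point $e$, i.e., the data point such that

$$F_e\!\left( \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a, \boldsymbol{\sigma}_e \right) = W_e\!\left( \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a - \boldsymbol{\epsilon}_e^* \right) + W_e^*(\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e^*), \tag{21}$$

or

$$W_e\!\left( \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a - \boldsymbol{\epsilon}_e^* \right) + W_e^*(\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e^*) \le W_e\!\left( \sum_{a=1}^{n} \mathbf{B}_{ea} \mathbf{u}_a - \boldsymbol{\epsilon}_e' \right) + W_e^*(\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e'), \tag{22}$$

for all data points $(\boldsymbol{\epsilon}_e', \boldsymbol{\sigma}_e')$ in the local data set $E_e$. Once all optimal data points are determined, Eqs. (20) define
a system of linear equations for the nodal displacements, the local stresses and the Lagrange multipliers. As in the
data-driven truss problem, these equations can be rendered in the equivalent form

$$\sum_{b=1}^{n} \left( \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \mathbb{C}_e \mathbf{B}_{eb} \right) \mathbf{u}_b = \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \mathbb{C}_e \boldsymbol{\epsilon}_e^*, \tag{23a}$$

$$\sum_{b=1}^{n} \left( \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \mathbb{C}_e \mathbf{B}_{eb} \right) \boldsymbol{\eta}_b = \mathbf{f}_a - \sum_{e=1}^{m} w_e \mathbf{B}_{ea}^T \boldsymbol{\sigma}_e^*. \tag{23b}$$

Here we recognize two standard linear-elastic equilibrium problems with identical stiffness matrix corresponding to
the reference linear solid defined by $W_e$ and $W_e^*$, $e = 1, \dots, m$. The displacement problem (23a) is driven by the
optimal local strains, whereas the Lagrange multiplier problem (23b) is driven by the out-of-balance forces attendant
to the optimal local stresses.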

3.2. Using material symmetries to reduce data sets

Phase-space sampling requirements can be reduced if a priori knowledge of material behavior is available. In particular,
material symmetry can be effectively exploited for purposes of reducing material sampling requirements. A
simple and commonly encountered example of material symmetry is isotropy. For a three-dimensional isotropic material
in the linearized kinematics approximation, if $(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)$ is a material data point, then so are $(\mathbf{R}_e^T \boldsymbol{\epsilon}_e \mathbf{R}_e, \mathbf{R}_e^T \boldsymbol{\sigma}_e \mathbf{R}_e)$
for all rotation matrices $\mathbf{R}_e \in SO(3)$, the group of proper orthogonal matrices in three dimensions. Thus, if a point
$(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)$ is in the local data set $E_e$, then so is the entire orbit of the point by $SO(3)$.
Under these conditions, local optimality demands

$$F_e(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e) = \min_{(\boldsymbol{\epsilon}_e', \boldsymbol{\sigma}_e') \in E_e} \; \min_{\mathbf{R}_e \in SO(3)} \left[ W_e(\boldsymbol{\epsilon}_e - \mathbf{R}_e^T \boldsymbol{\epsilon}_e' \mathbf{R}_e) + W_e^*(\boldsymbol{\sigma}_e - \mathbf{R}_e^T \boldsymbol{\sigma}_e' \mathbf{R}_e) \right]. \tag{24}$$

The corresponding optimality condition is


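$$\frac{\partial W_e}{\partial \epsilon_{ij}} \frac{\partial}{\partial R_{mn}}\big(\epsilon_{kl}' R_{ki} R_{lj}\big) + \frac{\partial W_e^{*}}{\partial \sigma_{ij}} \frac{\partial}{\partial R_{mn}}\big(\sigma_{kl}' R_{ki} R_{lj}\big) - \frac{\partial}{\partial R_{mn}}\big(\Lambda_{ij} R_{ki} R_{kj}\big) = 0, \tag{25}$$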
where Λ = ΛT is a Lagrange multiplier enforcing the orthogonality of Re . Evaluating the derivatives, we obtain, in
matrix form,
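$$\big(\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e\big)\frac{\partial W_e}{\partial \boldsymbol{\epsilon}_e} + \big(\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e\big)\frac{\partial W_e^{*}}{\partial \boldsymbol{\sigma}_e} = \boldsymbol{\Lambda}. \tag{26}$$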
Transposing both sides and using tensor symmetry we obtain
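$$\frac{\partial W_e}{\partial \boldsymbol{\epsilon}_e}\big(\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e\big) + \frac{\partial W_e^{*}}{\partial \boldsymbol{\sigma}_e}\big(\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e\big) = \boldsymbol{\Lambda}, \tag{27}$$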
whence it follows that
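$$\frac{\partial W_e}{\partial \boldsymbol{\epsilon}_e}\big(\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e\big) + \frac{\partial W_e^{*}}{\partial \boldsymbol{\sigma}_e}\big(\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e\big) = \big(\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e\big)\frac{\partial W_e}{\partial \boldsymbol{\epsilon}_e} + \big(\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e\big)\frac{\partial W_e^{*}}{\partial \boldsymbol{\sigma}_e}. \tag{28}$$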
These equations are now to be solved for the local optimal principal directions {Re , e = 1, . . . , m}, e. g., by recourse
to a Newton–Raphson iteration based on a convenient parametrization of S O(3).
Fig. 10. (a) Sketch of the simulation set-up of a thin tensile specimen loaded in tension [11]. The thickness of the sample is 1 mm for the
three-dimensional model. (b) Isometric view of the simulation set-up in 3D, consisting of two rigid pins and the tensile specimen.

Fig. 11. (a) Coarse mesh with 811 elements and an average element edge length h ≈ 1 mm; (b) fine mesh with 6428 elements and an average
element edge length h = 0.5 mm.

A simple situation arises when the local state $(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)$ is itself isotropic, i.e., $\boldsymbol{\epsilon}_e$ and $\boldsymbol{\sigma}_e$ have the same principal
directions, and $W_e$ and $W_e^{*}$ are chosen to be isotropic. In this case, the optimality condition (28) is satisfied if
$\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e$ and $DW_e$ commute and $\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e$ and $DW_e^{*}$ commute, which in turn holds if and only if
$\boldsymbol{R}_e^{T}\boldsymbol{\epsilon}_e'\boldsymbol{R}_e$ and $\boldsymbol{\epsilon}_e$, and likewise $\boldsymbol{R}_e^{T}\boldsymbol{\sigma}_e'\boldsymbol{R}_e$ and $\boldsymbol{\sigma}_e$, have the same eigenvectors.
Introducing the representations

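$$\begin{aligned}
\boldsymbol{\epsilon}_e &= \boldsymbol{Q}_e^{T}\boldsymbol{e}_e\boldsymbol{Q}_e, & \boldsymbol{\sigma}_e &= \boldsymbol{Q}_e^{T}\boldsymbol{s}_e\boldsymbol{Q}_e, \\
\boldsymbol{\epsilon}_e' &= \boldsymbol{Q}_e'^{T}\boldsymbol{e}_e'\boldsymbol{Q}_e', & \boldsymbol{\sigma}_e' &= \boldsymbol{Q}_e'^{T}\boldsymbol{s}_e'\boldsymbol{Q}_e',
\end{aligned} \tag{29}$$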
with Qe , Q′e ∈ S O(3) and ee , se , e′e , s′e diagonal, local optimality then requires

$$\boldsymbol{R}_e = \boldsymbol{Q}_e'^{-1}\boldsymbol{Q}_e, \tag{30}$$
which determines explicitly the optimal data point in the S O(3)-orbit of (e′e , s′e ).
In general, since the local states $(\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)$ follow from independent Euler–Lagrange equations, Eqs. (20), they need
not be exactly isotropic, and the general optimality equations (28) must be solved in order to determine the optimal
principal directions of the local data points.
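
The alignment of principal directions in the isotropic case can be checked numerically. The sketch below is illustrative only and assumes the convention of (29), in which a symmetric tensor is written as Q^T d Q with d diagonal. Under that convention a rotation R = Q'^T Q brings the data tensor into the principal frame of the local state (with the opposite convention, Q d Q^T, the same rotation would read Q' Q^{-1}). All helper names and numerical values are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]


def from_principal(Q, d):
    """Tensor Q^T d Q with principal values d, as in the representation (29)."""
    return matmul(transpose(Q), matmul(d, Q))


def rotate(R, A):
    """Rotated tensor R^T A R, i.e. a point on the SO(3)-orbit of A."""
    return matmul(transpose(R), matmul(A, R))


Q = matmul(rot_z(0.3), rot_x(-0.7))     # principal frame of the local state
Qp = matmul(rot_x(1.1), rot_z(0.4))     # principal frame of the data point
eps = from_principal(Q, diag(1.0, 0.5, -0.2))
eps_data = from_principal(Qp, diag(0.9, 0.6, -0.1))

R = matmul(transpose(Qp), Q)            # aligning rotation for this convention
aligned = rotate(R, eps_data)           # equals Q^T e' Q
# Tensors with a common eigenbasis commute, so this difference vanishes.
commutator = [[x - y for x, y in zip(r1, r2)]
              for r1, r2 in zip(matmul(aligned, eps), matmul(eps, aligned))]
```

The same rotation aligns the stress data when strain and stress data share principal directions, which is the isotropic situation described above.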

3.3. Numerical analysis of convergence

Similarly to the case of truss analysis considered earlier, we revisit the question of convergence of the linear-
elasticity data-driven solver with respect to the data set. We specifically consider the problem of the thin tensile
specimen shown in Fig. 10, cf. [11]. The specimen is loaded by two rigid pins and contains a short gauge section
undergoing ostensibly homogeneous deformation. By contrast, the regions surrounding the pin-loaded holes undergo
complex heterogeneous deformations. Two finite-element discretizations are used in order to ascertain the influence of
mesh resolution. The coarse mesh, Fig. 11(a), consists of 811 elements with one element across the thickness, whereas
the fine mesh, Fig. 11(b), consists of 3214 elements in two dimensions and 6428 elements in three dimensions.
These discretizations correspond to average element sizes of h = 1 mm for the coarse mesh and
h = 0.5 mm for the fine mesh. Both meshes consist of eight-node hexahedral elements containing eight Gauss
quadrature points each.
Sampling requirements are reduced by virtue of the plane-stress conditions of the problem under consideration.
Specifically, only a neighborhood of the subspace σ13 = σ23 = σ33 = 0 in stress space needs to be covered by
the data. We accomplish this requirement by sampling an appropriate region of the (σ11 , σ22 , τ12 ) stress plane on a
uniform cubic grid. The corresponding strains (ϵ11 , ϵ22 , ϵ12 ) then obey an isotropic linear-elastic law. A reference
isotropic linear-elastic solid of the type (16), unrelated to the actual material behavior sampled by the material data,
is used in the data-driven calculations. These reductions effectively limit the material data set to a three-dimensional
space.
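
The sampling just described can be sketched as follows. This is a hedged illustration, not the data generator used for the reported results: the moduli `E` and `nu`, the stress range `s_max`, and the function name `plane_stress_data` are hypothetical, and the shear entry is taken to be the tensor shear strain ε12.

```python
def plane_stress_data(n, s_max, E=1.0, nu=0.3):
    """Sample (strain, stress) pairs on a uniform n x n x n grid of plane stresses.

    E and nu are hypothetical moduli of the isotropic law being sampled.
    """
    grid = [-s_max + 2.0 * s_max * i / (n - 1) for i in range(n)]
    data = []
    for s11 in grid:
        for s22 in grid:
            for t12 in grid:
                # Plane-stress Hooke's law solved for the in-plane strains.
                e11 = (s11 - nu * s22) / E
                e22 = (s22 - nu * s11) / E
                e12 = (1.0 + nu) * t12 / E   # tensor shear strain
                data.append(((e11, e22, e12), (s11, s22, t12)))
    return data


data = plane_stress_data(n=5, s_max=1.0)
# The grid spacing shrinks like N ** (-1/3), with N = n ** 3 data points.
spacing = 2.0 * 1.0 / (5 - 1)
```

With n points per axis the data set holds N = n³ points, so halving the grid spacing multiplies the size of the data set by eight.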

Fig. 12. Linear-elastic tensile specimen. Convergence of the local material-data assignment iteration. Functional F decays through increasing data
resolution in a three dimensional sampling of the plane stress space for both mesh resolutions.

As in the case of the data-driven truss problem, we focus on the questions of convergence for a data-driven linear-
elastic solver with respect to local data assignment, or step-wise convergence, and with respect to the data set. In
Fig. 12, the global functional F is again shown to decay through iteration for both mesh resolutions on increasingly
large data sets. The number of iterations to convergence increases with the material-data sample size but, remarkably,
remains modest in all cases.
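To calculate percent error with respect to the reference solution we re-define the RMS error metric as

$$\sigma_{(\%\mathrm{RMS})} = \left( \frac{\sum_{e=1}^{m} w_e W^{*}(\boldsymbol{\sigma}_e - \boldsymbol{\sigma}_e^{\mathrm{ref}})}{\sum_{e=1}^{m} w_e W^{*}(\boldsymbol{\sigma}_e^{\mathrm{ref}})} \right)^{1/2}, \tag{31a}$$

$$\epsilon_{(\%\mathrm{RMS})} = \left( \frac{\sum_{e=1}^{m} w_e W(\boldsymbol{\epsilon}_e - \boldsymbol{\epsilon}_e^{\mathrm{ref}})}{\sum_{e=1}^{m} w_e W(\boldsymbol{\epsilon}_e^{\mathrm{ref}})} \right)^{1/2}, \tag{31b}$$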

where $W$ and $W^{*}$ are the strain-energy and complementary-energy densities, respectively, as calculated using the
reference solution moduli. Plots of these errors against the cube root of the number of data points are shown in Fig. 13.
The plots are indicative of ostensibly linear convergence, in keeping with the analytical estimates derived next.
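
As a minimal sketch of the metric (31), consider scalar local states with a quadratic energy density W(x) = C x²/2 and a single modulus C, which is an assumption made here for brevity. The modulus then cancels between numerator and denominator. The function name and sample values are hypothetical.

```python
import math


def rms_error(values, reference, weights):
    """Relative RMS error of Eq. (31) for scalar local states.

    For a quadratic energy density with a single modulus, the modulus
    cancels between numerator and denominator.
    """
    num = sum(w * (v - r) ** 2 for w, v, r in zip(weights, values, reference))
    den = sum(w * r ** 2 for w, r in zip(weights, reference))
    return math.sqrt(num / den)


stress = [1.02, 1.98, 3.05]
stress_ref = [1.0, 2.0, 3.0]
weights = [1.0, 1.0, 1.0]
error_percent = 100.0 * rms_error(stress, stress_ref, weights)
```

For tensor-valued states the squares would be replaced by the corresponding quadratic forms in the reference moduli.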

4. Mathematical analysis of convergence

We proceed to abstract from the preceding examples a general class of data-driven problems and to establish some
of their fundamental properties by way of analysis. We consider systems whose state is characterized by points in a
certain phase space Z . For instance, in the case of linear elasticity, the system of interest is an elastic solid occupying
a certain domain Ω and the state of the system is defined by the pair (ϵ(x), σ (x)), where ϵ(x) is the strain field and
σ (x) is the stress field, both defined over Ω . In this case, the phase space Z of the elastic solid is an appropriate space
of pairs (ϵ(x), σ (x)) of strain and stress fields over Ω .
Fig. 13. Linear-elastic tensile specimen. Convergence with respect to sample size. RMS errors decay linearly in data resolution for both stresses
(σ) and strains (ϵ).

We particularly wish to characterize the states of the system that are in a constraint set C of states satisfying
essential constraints and conservation laws. For instance, in the running example of linear elasticity we may wish to
determine states (ϵ(x), σ(x)) ∈ Z satisfying compatibility, i.e., such that

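$$\boldsymbol{\epsilon}(x) = \tfrac{1}{2}\big(\nabla \boldsymbol{u}(x) + \nabla \boldsymbol{u}^{T}(x)\big), \quad x \in \Omega, \tag{32a}$$

$$\boldsymbol{u}(x) = \bar{\boldsymbol{u}}(x), \quad x \in \partial\Omega_D, \tag{32b}$$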

for some displacement field u over Ω and prescribed displacements ū over the Dirichlet boundary ∂Ω D , and satisfying
equilibrium, i. e., such that

∇ · σ (x) + f (x) = 0, x ∈ Ω, (33a)


σ (x)n(x) = t̄(x), x ∈ ∂Ω N , (33b)

for some applied body force field f over Ω and tractions t̄ over the Neumann boundary ∂Ω N , with unit outer
normal n.
Classically, the problem is closed by putting forth a material law restricting the set of admissible states to a graph
E in Z . For instance, in elasticity the material law may classically take the form of a nonlinear Hooke’s law

$$\boldsymbol{\sigma}(x) = DW\big(\boldsymbol{\epsilon}(x)\big), \quad x \in \Omega, \tag{34}$$

where W is the strain–energy density of the material and we consider linearized kinematics. The set E then consists
of the set of strain and stress fields satisfying the material law at all material points in Ω . The classical solution set
is then the intersection E ∩ C, consisting of states of the system satisfying the essential constraints, the conservation
laws and the material law simultaneously. In the case of linear elasticity, the classical solutions would consist of
compatible strain fields and equilibrium stress fields satisfying the material law at all material points. In general,
the cardinality of the solution set E ∩ C depends on the transversality of C with respect to E. Depending on that
transversality, the solution set may be empty or non-empty; in the latter case it may consist of a single
point, corresponding to uniqueness of the solution, or of multiple points.
Fig. 14. Schematic of a local material set E consisting of a finite number of states obtained, e.g., from experimental testing. Also shown is a
possible constraint set C and near intersections between E and C.

In contrast to the classical problem just formulated, here we suppose that the material response is not known exactly
and, instead, is imperfectly characterized by a set, also denoted E, consisting locally at every material point, e.g.,
of a finite collection of states obtained by means of experimental testing, cf. Fig. 14. Under such conditions, E ∩ C
is likely to be empty even in cases when solutions could reasonably be expected to exist. It is, therefore, necessary
to replace the overly rigid problem of determining E ∩ C by a suitable relaxation thereof. To this end, we begin by
introducing a norm | · | in phase space. For instance, for truss structures we may choose
$$|x|^2 = \sum_{e=1}^{m} w_e\left( C_e\,\epsilon_e^2 + \frac{\sigma_e^2}{C_e} \right), \tag{35}$$

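with $x \equiv (\epsilon_e, \sigma_e)_{e=1}^{m}$ denoting a generic point in phase space Z. For discretized linear-elastic solids we may choose

$$|x|^2 = \sum_{e=1}^{m} w_e\left( \mathbb{C}_e\boldsymbol{\epsilon}_e \cdot \boldsymbol{\epsilon}_e + \mathbb{C}_e^{-1}\boldsymbol{\sigma}_e \cdot \boldsymbol{\sigma}_e \right), \tag{36}$$

with e labeling the integration points in the discretization and $x \equiv (\boldsymbol{\epsilon}_e, \boldsymbol{\sigma}_e)_{e=1}^{m}$ denoting a generic point in phase space
Z. Finally, for continuum linear elasticity we may choose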
  
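$$|x|^2 = \int_{\Omega} \left( \mathbb{C}(\boldsymbol{x})\boldsymbol{\epsilon}(\boldsymbol{x}) \cdot \boldsymbol{\epsilon}(\boldsymbol{x}) + \mathbb{C}^{-1}(\boldsymbol{x})\boldsymbol{\sigma}(\boldsymbol{x}) \cdot \boldsymbol{\sigma}(\boldsymbol{x}) \right) d\boldsymbol{x}, \tag{37}$$

with $x \equiv \{ (\boldsymbol{\epsilon}(\boldsymbol{x}), \boldsymbol{\sigma}(\boldsymbol{x})),\ \boldsymbol{x} \in \Omega \}$ denoting a generic point in phase space Z.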


 

Based on this metrization of phase space, we define the data-driven problem as the double minimum problem
$$\min_{y \in E}\, \min_{x \in C} |y - x| = \min_{y \in E} \mathrm{dist}(y, C), \tag{38}$$

or, equivalently,
$$\min_{x \in C}\, \min_{y \in E} |x - y| = \min_{x \in C} \mathrm{dist}(x, E). \tag{39}$$

Thus, the aim of the data-driven problem, as expressed in (38), is to find the point in the material-data set that is
closest to satisfying the essential constraints and conservation laws, or, as expressed in (39), to find the point in the
constraint set that is closest to the material-data set. In the particular example of linear elasticity, the aim of the data-
driven problem, as expressed in (38), is to find the point in the material-data set that is closest to being compatible and
in equilibrium, or, as expressed in (39), to find the compatible equilibrium point that is closest to the material-data set.
We note that the data-driven problems considered in Sections 2 and 3 are indeed examples of (38) and (39) with
norms (35) and (36), respectively.
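
A brute-force version of the double minimization (38) can be written down directly for a two-dimensional phase space. The sketch below is an illustration under stated assumptions, not the solver of Sections 2 and 3: the constraint set is taken to be a hypothetical line σ + kε = f in the (ε, σ) plane, the norm is (35) for a single bar with unit weight and reference modulus `C`, and the data set is sampled from an assumed law σ = 2ε.

```python
import math


def project_onto_constraint(y, k, f, C):
    """Closest point to y on the line sigma + k * eps = f in the norm
    |(eps, sigma)|^2 = C * eps**2 + sigma**2 / C."""
    eps_y, sig_y = y
    r = f - sig_y - k * eps_y          # constraint residual of the data point
    eps = eps_y + r * k / (C * C + k * k)
    sig = sig_y + r * C * C / (C * C + k * k)
    return (eps, sig)


def distance(x, y, C):
    return math.sqrt(C * (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 / C)


def data_driven_solution(data, k, f, C):
    """Brute-force version of problem (38): best data point and its projection."""
    best = min(data,
               key=lambda y: distance(project_onto_constraint(y, k, f, C), y, C))
    return project_onto_constraint(best, k, f, C), best


# Hypothetical data set sampled from sigma = 2 * eps on a coarse strain grid.
data = [(0.1 * i, 0.2 * i) for i in range(11)]
x, y = data_driven_solution(data, k=1.0, f=1.0, C=1.5)
```

Here `y` is the data point closest to the constraint set and `x` its projection, which lies near the classical intersection (ε, σ) = (1/3, 2/3) of the assumed law with the constraint line.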

4.1. Finite-dimensional case: convergence with respect to sample size

We begin by considering systems whose local states take values in a finite-dimensional phase space Z . The global
state of the system is then characterized by a point x ∈ Z . The essential constraints and conservation laws pertaining
to the system have the effect of constraining its global state to lie on a subset C of Z . For instance, for linear-elastic
trusses such as considered in the preceding section, the local phase space Z e of bar e is the space of pairs (ϵ, σ ),
where ϵ is axial strain of a bar and σ the corresponding axial stress. The global phase space of the entire truss is then
Z = Z 1 × · · · × Z m , where m is the number of bars in the truss. In addition, the constraint set C is the affine space of
compatible and equilibrated states of stress and strain in the truss.
The data-driven problem (38) is now formulated by specifying a set E of possible material states in Z . For instance,
in the case of a truss a local material set E e of the form shown in Fig. 14 may be supplied for every bar e of the truss and
the global material set is then E = E 1 × · · · × E m . We note that, if E is compact, e. g., consisting of a finite collection
of points, then the corresponding data-driven problem has solutions by the Weierstrass extreme-value theorem.
We proceed to consider the question of convergence with respect to the data set. Specifically, we suppose that a
sequence (E k ) of data sets is supplied that approximates increasingly closely a limiting data set E. The particular
case in which E is a graph concerns convergence of data-driven solutions to classical solutions. For instance, the
approximations (E k ) may be the result of an increasing number of experimental tests sampling the behavior of a
material characterized by a – possibly unknown – stress–strain curve E. The sequence of approximate material data
sets (E k ) generates in turn a sequence of approximate data-driven problems
$$\min_{x_k \in C}\, \min_{y_k \in E_k} |x_k - y_k|, \tag{40}$$

and attendant approximate solutions (xk ). We wish to ascertain conditions under which (xk ) converges to solutions of
the E-problem.
Conditions ensuring such convergence at a well-defined convergence rate are given in the following proposition.
Henceforth, we denote by dist(x, E) the distance from a point x ∈ Z to a subset E ⊂ Z , i. e.,
$$\mathrm{dist}(x, E) = \inf_{y \in E} |x - y|, \tag{41}$$

and by PY x the projection of x ∈ Z onto a subspace Y of Z , i. e.,


|x − PY x| = dist(x, Y ). (42)

Proposition 1. Let $(E_k)$ be a sequence of finite subsets of Z, E a subset of Z and C a subspace of Z. Let x be an
isolated point of E ∩ C and let $x_k, y_k \in Z$ be such that
$$(x_k, y_k) \in \mathrm{argmin}\{ |x - y|,\ x \in C,\ y \in E_k \}. \tag{43}$$
Suppose that:
(i) There is a sequence $\rho_k \downarrow 0$ such that
$$\mathrm{dist}(z, E_k) \le \rho_k, \quad \forall z \in E. \tag{44}$$
(ii) There is a sequence $t_k \downarrow 0$ such that
$$\mathrm{dist}(z_k, E) \le t_k, \quad \forall z_k \in E_k. \tag{45}$$
(iii) (Transversality) There is a constant $0 \le \lambda < 1$ and a neighborhood U of x such that
$$|P_C z - x| \le \lambda |z - x|, \tag{46}$$
for all $z \in E \cap U$.
Then,
$$|x_k - x| \le \frac{t_k + \lambda (t_k + \rho_k)}{1 - \lambda}, \tag{47}$$
and, therefore, $\lim_{k \to \infty} |x_k - x| = 0$.

Proof. By assumption (i), we can find z k ∈ E k such that


|z k − x| ≤ ρk . (48)
By optimality,
dist(yk , C) ≤ dist(z k , C). (49)
Then, we have
|xk − yk | = dist(yk , C) ≤ dist(z k , C) ≤ |z k − x| ≤ ρk . (50)
By assumption (ii), we can find z k ∈ E such that
|yk − z k | ≤ tk . (51)
By the triangle inequality, we have
|xk − x| ≤ |xk − PC z k | + |PC z k − x|. (52)
By the contractivity of projections,
|xk − PC z k | = |PC yk − PC z k | = |PC (yk − z k )| ≤ |yk − z k | ≤ tk . (53)
In addition, by transversality, we have
|PC z k − x| ≤ λ|z k − x|, (54)
with 0 ≤ λ < 1. Triangulating again,
|z k − x| ≤ |z k − yk | + |yk − xk | + |xk − x|. (55)
Collecting all the preceding estimates, we obtain
|xk − x| ≤ tk + λ(tk + ρk + |xk − x|) (56)
whence (47) follows. 
The preceding proposition presumes that the limiting data set E and the constraint subspace C are transversal at
isolated intersections, which are identified with the limiting solutions. For instance, E may be a Lipschitz continuous
graph in a neighborhood of classical solutions with E not contained in C in that neighborhood. In the particular
case in which the limiting material response is linear, the transversality condition reduces to the requirement that the
displacement stiffness matrix of the system be non-singular. Assumptions (i) and (ii) set how the sequence of data sets
E k must approximate E. Thus, (i) ensures that there are approximate data points increasingly and uniformly closer
to any point of E, whereas (ii) ensures that there are no outliers in the approximate data sets such as could spoil the
approximate solutions, cf. Fig. 15. In particular, (ii) requires E k to be contained uniformly within the tk -neighborhood
of E.
Precise convergence rates with respect to, e.g., the number of sampling points $N_k = \# E_k$, are derived from the
preceding proposition if the sequences $\rho_k$ and $t_k$ are related to $N_k$. In particular, we have the following.

Corollary 1. Assume that there are constants $C_1 > 0$, $C_2 > 0$ and $\alpha > 0$ such that $\rho_k \le C_1 N_k^{-\alpha}$ and $t_k \le C_2 N_k^{-\alpha}$.
Then
$$|x_k - x| \le \frac{C_2 + \lambda (C_1 + C_2)}{1 - \lambda}\, N_k^{-\alpha}. \tag{57}$$

The numerical convergence rates of Sections 2 and 3 indeed conform to this estimate. Thus, for the case of truss
structures with noise-free data, $\rho_k$ and $t_k$ scale as $N_k^{-1}$, resulting in a convergence rate α = 1, cf. Fig. 6,
whereas for noisy data $\rho_k$ and $t_k$ scale as $N_k^{-1/2}$, resulting in a convergence rate α = 1/2, cf. Fig. 8. For the
case of plane-stress linear elasticity with noise-free data sampled on a uniform grid in a three-dimensional space,
$\rho_k$ and $t_k$ scale as $N_k^{-1/3}$, resulting in a convergence rate α = 1/3, i.e., linear convergence in the cube root
of the number of data points, cf. Fig. 13.
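
The estimate (47) can be checked numerically on a small example. The following is a sketch under stated assumptions rather than a reproduction of the reported experiments: the phase space is the (ε, σ) plane with the norm (35) for one bar of unit weight, the limiting set E is the segment σ = `E_MAT`·ε with 0 ≤ ε ≤ 1, the constraint set is the hypothetical line σ + `K`·ε = `F`, and the data are noise-free uniform samples of E, so that t_k = 0 and ρ_k is half the data spacing. For two lines the transversality constant is the absolute cosine of the angle between them in the inner product of the norm.

```python
import math

C_REF, E_MAT, K, F = 1.5, 2.0, 1.0, 1.0   # hypothetical moduli and load


def norm(d_eps, d_sig):
    return math.sqrt(C_REF * d_eps ** 2 + d_sig ** 2 / C_REF)


def project(y):
    """Projection of y onto the constraint line sigma + K * eps = F."""
    r = F - y[1] - K * y[0]
    s = C_REF ** 2 + K ** 2
    return (y[0] + r * K / s, y[1] + r * C_REF ** 2 / s)


def data_driven_error(n_points):
    """Error |x_k - x| and bound (47) for n_points samples of sigma = E_MAT * eps."""
    data = [(i / (n_points - 1), E_MAT * i / (n_points - 1))
            for i in range(n_points)]
    y_k = min(data,
              key=lambda y: norm(project(y)[0] - y[0], project(y)[1] - y[1]))
    x_k = project(y_k)
    x = (F / (E_MAT + K), E_MAT * F / (E_MAT + K))   # classical solution
    error = norm(x_k[0] - x[0], x_k[1] - x[1])
    # Transversality constant: |cosine| of the angle between the two lines.
    lam = abs(C_REF - E_MAT * K / C_REF) / (norm(1.0, E_MAT) * norm(1.0, -K))
    rho = 0.5 * norm(1.0, E_MAT) / (n_points - 1)    # half the data spacing
    t = 0.0                                          # noise-free samples
    bound = (t + lam * (t + rho)) / (1.0 - lam)
    return error, bound


results = {n: data_driven_error(n) for n in (11, 101, 1001)}
```

In this one-dimensional sampling the data spacing scales as 1/N, so the computed errors should fall roughly tenfold for each tenfold increase in the number of data points while staying below the bound.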

Fig. 15. Schematic of convergent sequence of material-data sets. The parameter tk controls the spread of the material-data sets away from the
limiting data set and the parameter ρk controls the density of material-data point.

4.2. Infinite-dimensional case: convergence with respect to mesh size

The linear-elastic case considered in Section 3 differs from the truss case of Section 2 in that it is obtained by
discretization of an infinite-dimensional problem. The question then naturally arises of convergence of the data-driven
problem with respect to the mesh size. Consider, for simplicity, a sequence of discretizations of the domain into
constant-strain triangles of size $h_k$. Let $x \equiv (\epsilon, \sigma)$ be the classical solution and $x_{h_k} \equiv (\epsilon_{h_k}, \sigma_{h_k})$ the corresponding
sequence of finite-element solutions. Simultaneously consider a sequence $(E_k)$ of local material-data sets satisfying
conditions (i) and (ii) of Proposition 1 for some sequences $\rho_k \downarrow 0$ and $t_k \downarrow 0$. Additionally suppose that the sequence
of discretizations is regular in the sense that

$$|x_{h_k} - x| \le C h_k \tag{58}$$

for some constant C > 0 and that the transversality constants $\lambda_k$ of the sequence of finite-element models do not
degenerate, i.e.,

$$0 \le \lambda_k \le \lambda, \tag{59}$$

for some λ < 1. Then, by Proposition 1 we have

$$|x_{k,h_k} - x| \le |x_{k,h_k} - x_{h_k}| + |x_{h_k} - x| \le \frac{t_k + \lambda (t_k + \rho_k)}{1 - \lambda} + C h_k, \tag{60}$$

where $x_{k,h_k} \equiv (\epsilon_{k,h_k}, \sigma_{k,h_k})$ denotes the data-driven solution corresponding to the $E_k$ material-data set and the $h_k$
discretization. It thus follows that if $\rho_k$ and $t_k$ are controlled by the mesh size, i.e., there is a constant C > 0 such that

$$\rho_k < C h_k, \quad t_k < C h_k, \tag{61}$$

then

$$|x_{k,h_k} - x| \le C h_k, \tag{62}$$

for some constant C > 0, not necessarily the same as in (58) and (61). We thus conclude that the data-driven paradigm
is robust with respect to spatial discretization, in the sense that it preserves convergence provided that the fidelity
of the data set increases appropriately with increasing mesh resolution.
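
The step from (60) to (62) can be made explicit by tracking the constants. In the worked estimate below, the labels $C_h$ (for the constant in (58)) and $C_d$ (for the constant in (61)) are introduced here only for illustration and do not appear in the original text.

```latex
\begin{aligned}
|x_{k,h_k} - x|
  &\le \frac{t_k + \lambda (t_k + \rho_k)}{1 - \lambda} + C_h h_k \\
  &\le \frac{C_d h_k + 2 \lambda C_d h_k}{1 - \lambda} + C_h h_k \\
  &= \left( \frac{1 + 2\lambda}{1 - \lambda}\, C_d + C_h \right) h_k .
\end{aligned}
```

The constant in (62) can therefore be taken as $(1 + 2\lambda) C_d / (1 - \lambda) + C_h$, which stays bounded as long as λ remains uniformly below 1, as required by (59).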

5. Summary and concluding remarks

We have formulated a new computing paradigm, which we refer to as data-driven computing, consisting of
formulating calculations directly from experimental material data and pertinent essential constraints and conservation
laws, thus bypassing the empirical material modeling step of conventional computing altogether. The data-driven
solver specifically seeks to assign to each material point of the computational model the closest possible state from
a prespecified material-data set, while simultaneously satisfying the essential constraints and conservation laws.
Optimality of the local state assignment is understood in terms of a figure of merit that penalizes distance to the
data set in phase space. The resulting data-driven problem thus consists of the minimization of a distance function to
the data set in phase space subject to constraints set forth by the essential constraints and conservation laws.
We have investigated the performance of the data-driven solver with the aid of two particular examples of
application, namely, the static equilibrium of nonlinear three-dimensional trusses and of finite-element discretized
linear-elastic solids. In these cases, the penalty function in phase space may be regarded as representing a linear
comparison solid with an initial state of strain and stress. The equilibrium constraint can be conveniently enforced
by means of Lagrange multipliers. The corresponding stationarity equations amount to two linear static
equilibrium problems for the comparison solid. We have formulated a local data assignment algorithm by which
each member of the truss is pegged to a particular point in the data set. The algorithm terminates when the local state
of every member of the truss is in the Voronoi cell of its assigned data point in phase space. We show, by way of
numerical testing, that the data-driven solver possesses good convergence properties both with respect to the number
of data points and with regard to the local data assignment iteration.
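
The local data assignment iteration summarized above can be sketched for a very small truss. This is an illustrative reconstruction under stated assumptions, not the authors' code: it assumes a hypothetical two-bar chain with two degrees of freedom, a single scalar reference modulus `C`, a data set shared by both bars and sampled from an assumed law σ = 2ε, and the stress update σ_e = σ*_e + C Σ_a B_ea η_a. The function names are hypothetical.

```python
def solve_2x2(K, r):
    """Solve the 2x2 linear system K x = r by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(r[0] * K[1][1] - K[0][1] * r[1]) / det,
            (K[0][0] * r[1] - K[1][0] * r[0]) / det]


def data_driven_truss(B, w, C, f, data, max_iter=50):
    """Fixed-point iteration over data assignments for a two-DOF truss."""
    m, n = len(B), len(B[0])
    K = [[sum(w[e] * B[e][a] * C * B[e][b] for e in range(m))
          for b in range(n)] for a in range(n)]
    assigned = [0] * m                  # start every bar at the first data point
    for iteration in range(1, max_iter + 1):
        eps_star = [data[i][0] for i in assigned]
        sig_star = [data[i][1] for i in assigned]
        # Two linear solves with the same reference stiffness.
        u = solve_2x2(K, [sum(w[e] * B[e][a] * C * eps_star[e] for e in range(m))
                          for a in range(n)])
        eta = solve_2x2(K, [f[a] - sum(w[e] * B[e][a] * sig_star[e]
                                       for e in range(m)) for a in range(n)])
        eps = [sum(B[e][a] * u[a] for a in range(n)) for e in range(m)]
        sig = [sig_star[e] + C * sum(B[e][a] * eta[a] for a in range(n))
               for e in range(m)]
        # Reassign each bar to the data point closest to its local state.
        new = [min(range(len(data)),
                   key=lambda i: C * (eps[e] - data[i][0]) ** 2
                   + (sig[e] - data[i][1]) ** 2 / C)
               for e in range(m)]
        if new == assigned:             # every state lies in its own Voronoi cell
            return eps, sig, assigned, iteration
        assigned = new
    return eps, sig, assigned, max_iter


# Two unit-length bars in series, fixed at the left end, tip load 0.5.
B = [[1.0, 0.0], [-1.0, 1.0]]
data = [(0.1 * i, 0.2 * i) for i in range(11)]
eps, sig, assigned, iterations = data_driven_truss(
    B, w=[1.0, 1.0], C=1.0, f=[0.0, 0.5], data=data)
```

In this example the equilibrium stress 0.5 falls between two data points, and the iteration settles on one of the neighboring points after a small number of reassignments.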
The variational structure of the data-driven problem confers robustness to the solver and renders it amenable to
analysis. By exploiting this connection, we show that data-driven solutions converge to classical solutions when the
data set approximates a limiting constitutive law with increasing fidelity. By virtue of this property, we may regard
data-driven problems as a generalization of classical problems in which the material behavior is defined by means of
an arbitrary data set in phase space, not necessarily a graph. In particular, classical solutions are recovered precisely
when the data set coincides with the graph of a material law.
Whereas the data-driven paradigm has been formulated in the context of computational mechanics and, specifically,
elastic quasistatic problems, we believe that its range and scope are much larger. Indeed, field theories governed by
linear or nonlinear elliptic partial-differential equations should be amenable, upon discretization, to an analogous
treatment. Extensions to dynamical problems are also straightforward. Indeed, dynamics essentially adds inertia forces
in the equations of motion that are independent – and do not affect the description – of the material behavior. By
contrast, inelastic materials raise the fundamental problem of sampling history-dependent material behavior. Such
sampling should provide appropriate coverage of possible processes and evolutions of the system and is thus likely to
result in exceedingly large and complex data sets. The use of tools from Data Science and Big Data management may
be expected to be particularly beneficial in dealing with such data sets.
We close by pointing out that the traditional computing paradigm has insulated problems from the data on which
their solution is based. Removing this barrier creates a powerful new tool in the arsenal of scientific computing. With
data-driven computing, data sets can be used directly to provide predictive analysis capability for unmodeled materials.
Traceability and inherent measures of data fidelity enable both deeper investigations into data-solution relationships
and natural alerts for appropriate model use. Having the ability to tie solution results back to specific data points within
a set allows for the creation of a new kind of causality in material analysis. The data-driven paradigm can also ensure
the collection of descriptive data sets for prospective uses. Error measures highlight data regions that require additional
resolution, as well as point the analyst toward sensitivities within the solution-source relations. These methods can be
used to check if a constitutive relation based on a certain data-set is capable of performing a desired simulation prior
to the analysis. Tying the solution back to the data set also establishes an elegant way of limiting model accuracy
to the resolution of the source data. Additionally, it should be noted that material models have specific regimes over
which they are developed. However, the models themselves are easily used outside this development range. Especially
with regard to empirical ad hoc curve fits, such overreach can neither be justified nor easily prevented. By directly
using a data set in calculations, attempts to simulate beyond the data regime are met and penalized by large calculated
errors, regardless of how the user receives the data set. These tangible and intangible benefits add considerable appeal
to data-driven solvers beyond their mere usefulness as numerical schemes.

Acknowledgment

The support of Caltech’s Center of Excellence on High-Rate Deformation Physics of Heterogeneous Materials,
AFOSR Award FA9550-12-1-0091, is gratefully acknowledged.

References

[1] C. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, 2006.
[2] K. Rajan, Materials informatics, Mater. Today 8 (10) (2005) 38–45. http://dx.doi.org/10.1016/S1369-7021(05)71123-8.
URL http://www.sciencedirect.com/science/article/pii/S1369702105711238.
[3] S. Curtarolo, D. Morgan, K. Persson, J. Rodgers, G. Ceder, Predicting crystal structures with data mining of quantum calculations, Phys. Rev.
Lett. 91 (2003) 135503. http://dx.doi.org/10.1103/PhysRevLett.91.135503. URL http://link.aps.org/doi/10.1103/PhysRevLett.91.135503.
[4] C.M. Breneman, L.C. Brinson, L.S. Schadler, B. Natarajan, M. Krein, K. Wu, L. Morkowchuk, Y. Li, H. Deng, H. Xu, Stalking the materials
genome: A data-driven approach to the virtual design of nanostructured polymers, Adv. Funct. Mater. 23 (46) (2013) 5746–5752.
http://dx.doi.org/10.1002/adfm.201301744.
[5] S. Kalidindi, S. Niezgoda, A. Salem, Microstructure informatics using higher-order statistics and efficient data-mining protocols, JOM 63 (4)
(2011) 34–41. http://dx.doi.org/10.1007/s11837-011-0057-7.
[6] M.P. Krein, B. Natarajan, L.S. Schadler, L.C. Brinson, H. Deng, D. Gai, Y. Li, C.M. Breneman, Development of materials informatics tools
and infrastructure to enable high throughput materials design, in: Symposium UU: Combinatorial and High-Throughput Methods in Materials
Science, in: MRS Online Proceedings Library, vol. 1425, 2012. http://dx.doi.org/10.1557/opl.2012.57.
[7] S.R. Kalidindi, Data science and cyberinfrastructure: critical enablers for accelerated development of hierarchical materials, Int. Mater. Rev.
60 (3) (2015) 150–168. http://dx.doi.org/10.1179/1743280414Y.0000000043.
[8] A. Gupta, A. Cecen, S. Goyal, A.K. Singh, S.R. Kalidindi, Structure–property linkages using a data science approach: Application to a
non-metallic inclusion/steel composite system, Acta Mater. 91 (0) (2015) 239–254. http://dx.doi.org/10.1016/j.actamat.2015.02.045.
URL http://www.sciencedirect.com/science/article/pii/S1359645415001603.
[9] R. Agarwal, V. Dhar, Big data, data science, and analytics: The opportunity and challenge for IS research, Inf. Syst. Res. 25 (3) (2014)
443–448. http://dx.doi.org/10.1287/isre.2014.0546.
[10] B. Baesens, Analytics in a Big Data World : The Essential Guide to Data Science and its Applications, Wiley & SAS business series, 2014.
[11] X. Feng, G. Fischer, R. Zielke, B. Svendsen, W. Tillmann, Investigation of PLC band nucleation in AA5754, Mater. Sci. Eng. A 539 (0) (2012)
205–210.
