Composing Modeling and Simulation With Machine Learning in Julia
submodels, and finally make it easy to check the results against the non-approximated model.

To address these issues, we introduce JuliaSim — a modeling and simulation environment, which merges elements of acausal modeling frameworks like Modelica with machine learning elements. The core of the environment is the open source ModelingToolkit.jl (Ma et al. 2021), an acausal modeling framework with an interactive compilation mechanism for including exact and inexact transformations. To incorporate machine learning, we describe the continuous-time echo state network (CTESN) architecture as an approximation transformation of time series data to a DAE component. Notably, the CTESN architecture allows for an implicit training to handle the stiff equations common in engineering simulations. To demonstrate the utility of this architecture, we showcase the CTESN as a methodology for translating a Room Air Conditioner model from a Functional Mock-up Unit (FMU) binary to an accelerated ModelingToolkit.jl¹ model with 4% error over the operating parameter range, accelerating it by 340x. We then show how the accelerated model can be used to speed up global parameter optimization by over two orders of magnitude. As a component within an acausal modeling framework, we demonstrate its ability to be composed with other models, here specifically in the context of the FMI co-simulation environment. We believe that these results indicate the promise of blending machine learning surrogate methods in the broader modeling and simulation workflow.

2 Overview of JuliaSim

The flow of the architecture (Figure 1) is described as follows. We start by describing the open ModelingToolkit.jl acausal modeling language as a language with composable transformation passes to include exact and approximate symbolic transformations. To incorporate machine learning into this acausal modeling environment, we describe the CTESN, which is a learnable DAE structure that can be trained on highly stiff time series to build a representation of a component. To expand the utility of components, we outline the interaction with the FMI standard to allow for connecting and composing models. Finally, we present the JuliaSim model library, which is a collection of acausal components that includes pre-trained surrogates of models so that users can utilize the acceleration without having to pay for the cost of training locally.

2.1 Interactive Acausal Modeling with ModelingToolkit.jl

ModelingToolkit.jl (Ma et al. 2021) (MTK) is a framework for equation-based acausal modeling written in the Julia programming language (Bezanson et al. 2017), which generates large systems of DAEs from symbolic models. MTK takes a different approach than Modia.jl², another Julia package for acausal modeling. For a comparison between MTK, Modia and Modelica, the reader is referred to this article³ as well as this section of the documentation⁴. Similarly to Modelica, MTK allows for building models hierarchically in a component-based fashion. For example, defining a component in MTK amounts to defining a function which generates an ODESystem:

function Capacitor(;name, C = 1.0)
    val = C
    @named p = Pin(); @named n = Pin()
    @variables v(t); @parameters C
    D = Differential(t)
    eqs = [v ~ p.v - n.v
           0 ~ p.i + n.i
           D(v) ~ p.i / C]
    ODESystem(eqs, t, [v], [C],
              systems=[p, n],
              defaults=Dict(C => val),
              name=name)
end

Systems can then be composed by declaring subsystems and defining the connections between them. For instance, the classic RC circuit can be built from standard electrical components as:

@named resistor = Resistor(R=100)
@named capacitor = Capacitor(C=0.001)
@named source = ConstantVoltage(V=10)
@named ground = Ground()
@named rc_model = ODESystem([
        connect(source.p, resistor.p)
        connect(resistor.n, capacitor.p)
        connect(capacitor.n, source.n,
                ground.g)],
    t, systems=[resistor, capacitor,
                source, ground])

The core of MTK's utility is its system of transformations, where a transformation is a function which takes an AbstractSystem type to another AbstractSystem type. Given this definition, transformations can be composed and chained. Transformations, such as dae_index_lowering, transform a higher-index DAE into an index-1 DAE via the Pantelides algorithm (Pantelides 1988). Nonlinear tearing and alias_elimination (Otter and Elmqvist 2017) are other commonly used transformations, which match the workflow of the Dymola Modelica compiler (Brück et al. 2002) (and together are given the alias structural_simplify). However, within this system the user can freely compose transformations with domain- and problem-specific transformations, such as "exponentiation of a variable to enforce positivity" or "extending the system to include the tangent space". After

1 https://github.com/SciML/ModelingToolkit.jl
2 https://github.com/ModiaSim/Modia.jl
3 http://www.stochasticlifestyle.com/modelingtoolkit-modelica-and-modia-the-composable-modeling-future-in-julia/
4 https://mtk.sciml.ai/stable/comparison/
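As an illustration of chaining these passes, the following sketch simplifies and simulates the RC model. This is a hedged usage example, not code from the paper: it assumes the `Pin`, `Resistor`, `ConstantVoltage`, and `Ground` components and the `rc_model` above have been defined, and that the ModelingToolkit and OrdinaryDiffEq packages are installed.

```julia
# Illustrative sketch: compose transformation passes and compile to a solvable
# problem (assumes rc_model and its components from the listings above).
using ModelingToolkit, OrdinaryDiffEq

sys  = structural_simplify(rc_model)   # alias elimination + tearing, composed
prob = ODEProblem(sys, [capacitor.v => 0.0], (0.0, 10.0))
sol  = solve(prob, Tsit5())            # RHS compiled to a native Julia function
sol[capacitor.v]                       # capacitor voltage trajectory
```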
Figure 1. Compiler passes in the JuliaSim Modeling and Simulation system. Ordinarily, most systems simulate equation-based
models, described in the “Training Data Preparation” and the “Simulation or Co-simulation” phases. We provide an additional set
of steps in our compiler to compute surrogates of models. Blue boxes represent code transformations, yellow represents user source
code, gray represents data sources, and gold represents surrogate models. The dotted line indicates a feature that is currently work
in progress.
transformations have been composed, the ODEProblem constructor compiles the resulting model to a native Julia function for usage with DifferentialEquations.jl (Rackauckas and Nie 2017).

2.2 Representing Surrogates as DAEs with Continuous-Time Echo State Networks

In order to compose a trained machine learning model with the components of ModelingToolkit.jl, one needs to represent such a trained model as a set of DAEs. To this end, one can make use of continuous machine learning architectures, such as neural ODEs (Chen et al. 2018) or physics-informed neural networks (Raissi, Perdikaris, and Karniadakis 2019). However, prior work has demonstrated that such architectures are prone to instabilities when being trained on stiff models (Wang, Teng, and Perdikaris 2020). In order to account for these difficulties, we have recently demonstrated a new architecture, CTESNs, which allows for implicit training in parameter space to stabilize the ill-conditioning present in stiff systems (Anantharaman et al. 2021). For this reason, CTESNs are the default surrogate algorithm of JuliaSim and will be the surrogate algorithm used throughout the rest of the paper. We provide an overview of the CTESN here, but for more details on the method, we refer the reader to (Anantharaman et al. 2021).

The CTESN is a continuous-time generalization of echo state networks (ESNs) (Lukoševičius 2012), a reservoir computing framework for learning a nonlinear map by projecting the inputs onto high-dimensional spaces through predefined dynamics of a nonlinear system (Lukoševičius and Jaeger 2009). CTESNs are effective at learning the dynamics of systems with widely separated time scales because their design eliminates the requirement of training via local optimization algorithms, like gradient descent, which are differential equation solvers in a stiff parameter space. Instead of using optimization, CTESNs are semi-implicit neural ODEs where the first layer is fixed, which results in an implicit training process.

To develop the CTESN, first a non-stiff dynamical system, called the reservoir, is chosen. This is given by the expression

    r′ = f(A r + W_hyb x(p∗, t))    (1)

where A is a fixed random sparse N_R × N_R matrix, W_hyb is a fixed random dense N_R × N matrix, x(p∗, t) is a solution of the system at a candidate set of parameters from the parameter space, and f is an activation function.

Projections (W_out) from the simulated reservoir time series to the truth solution time series are then computed, using the following equation:

    x(t) = g(W_out r(t))    (2)

where g is an activation function (usually the identity), r(t) represents the solution to the reservoir equation, and x(t) represents the solution to the full model. This projection is usually computed via least-squares minimization
using the singular value decomposition (SVD), which is robust to ill-conditioning by avoiding gradient-based optimization. A projection is computed for each point in the parameter space, and a map is constructed from the parameter space P to each projection matrix W_out (in our examples, we will use a radial basis function to construct this map). Thus our final prediction is the following:

    x̂(t) = g(W_out(p̂) r(t))    (3)

For a given test parameter p̂, a matrix W_out(p̂) is computed, the reservoir equation is simulated, and then the final prediction x̂ is given by the above matrix multiplication.

While the formulation above details linear projections from the reservoir time series (Linear Projection CTESN or LPCTESN), nonlinear projections in the form of parametrized functions can also be used to project from the reservoir time series to the reference solution (Nonlinear Projection CTESN). For this variation, a radial basis function can be applied to model the nonlinear projection r(t) ↦ x(t) in equation 2. The learned polynomial coefficients β_i from radial basis functions are used, and a mapping between the model parameter space and coefficients β_i's is constructed:

    rbf(β_i)(r(t)) ≈ x(p_i, t)  ∀i ∈ {1, . . . , k}    (4)
    rbf(p_i) ≈ β_i  ∀i ∈ {1, . . . , k}    (5)

where k is the total number of parameter samples used for training. Finally, during prediction, first the coefficients are predicted and a radial basis function for the prediction of the time series is constructed:

    β̂ = rbf(p̂)    (6)
    x̂(t) = rbf(β̂)(r(t))    (7)

Notice that both the LPCTESN and the NPCTESN represent the trained model as a set of DAEs, and thus can be represented as an ODESystem in MTK, and can be composed similarly to any other DAE model.

A significant advantage of applying NPCTESNs over LPCTESNs is the reduction of reservoir sizes, which creates a cheaper surrogate with respect to memory usage. LPCTESNs often use reservoirs whose dimensions reach an order of 1000. While this reservoir ODE is non-stiff, and is cheap to simulate, this leads to higher memory requirements. Consider the surrogatization of the Robertson equations (Robertson 1976), a canonical stiff benchmark problem:

    ẏ₁ = −0.04 y₁ + 10⁴ y₂ y₃    (8)
    ẏ₂ = 0.04 y₁ − 10⁴ y₂ y₃ − 3·10⁷ y₂²    (9)
    ẏ₃ = 3·10⁷ y₂²    (10)

where y₁, y₂, and y₃ are the concentrations of three reagents. This system has widely separated reaction rates (0.04, 10⁴, 3·10⁷), and is well-known to be very stiff (Gobbert 1996; Robertson and Williams 1975; Robertson 1976). It is commonly used as an example for evaluating integrators of stiff ODEs (Hosea and Shampine 1996). Finding an accurate surrogate for this system is difficult because it needs to capture both the stable slow-reacting system and the fast transients. This breaks many data-driven surrogate methods, such as PINNs and LSTMs (Anantharaman et al. 2021). We shall now demonstrate training a surrogate of this system with the reaction rates as inputs/design parameters.

Table 1 shows the result of surrogatization using the LPCTESN and the NPCTESN, while considering the following ranges of design parameters corresponding to the three reaction rates: (0.036, 0.044), (2.7·10⁷, 3.3·10⁷) and (0.9·10⁴, 1.1·10⁴). We observe a three orders of magnitude smaller reservoir equation size, resulting in a computationally cheaper surrogate model.

Table 1. Comparison between LPCTESN and NPCTESN on surrogatization of the Robertson equations. "Res" stands for reservoir.

Model      Res. ODE size    Avg Rel. Err %
LPCTESN    3000             0.1484
NPCTESN    3                0.0200

2.3 Composing with External Models via the FMI Standard

While these surrogatized CTESNs can be composed with other MTK models, more opportunities can be gained by composing with models from external languages. The Functional Mock-up Interface (FMI) (Blochwitz et al. 2011) is an open-source standard for coupled simulation, adopted and supported by many simulation tools⁵, both open source and commercial. Models can be exported as Functional Mock-up Units (FMUs), which can then be simulated in a shared environment. Two forms of coupled simulation are standardized. Model exchange uses a centralized time-integration algorithm to solve the coupled sets of differential-algebraic equations exported by the individual FMUs. The second approach, co-simulation, allows FMUs to export their own simulation routine, and synchronizes them using a master algorithm. Notice that as DAEs, the FMU interface is compatible with ModelingToolkit.jl components and, importantly, trained CTESN models.

JuliaSim can simulate an FMU in parallel at different points in the design space. For each independent simulation, the fmpy package⁶ was used to run the FMU in ModelExchange with CVODE (Cohen, Hindmarsh, and Dubois 1996) or co-simulation with the FMUs exported

5 https://fmi-standard.org/tools/
6 https://github.com/CATIA-Systems/FMPy
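To make the LPCTESN training procedure concrete, the following sketch trains one on a single Robertson trajectory. This is illustrative only, not JuliaSim's implementation: it assumes the OrdinaryDiffEq, SparseArrays, and LinearAlgebra packages, and uses a deliberately small reservoir rather than the 3000-dimensional one reported in Table 1.

```julia
# Illustrative LPCTESN training on the Robertson system: simulate the stiff
# system, drive the reservoir of Eq. (1) with its interpolated solution, then
# solve the least-squares projection of Eq. (2).
using OrdinaryDiffEq, SparseArrays, LinearAlgebra, Random

Random.seed!(1)
function robertson!(du, y, k, t)
    du[1] = -k[1]*y[1] + k[2]*y[2]*y[3]
    du[2] =  k[1]*y[1] - k[2]*y[2]*y[3] - k[3]*y[2]^2
    du[3] =  k[3]*y[2]^2
end
k   = [0.04, 1e4, 3e7]
sol = solve(ODEProblem(robertson!, [1.0, 0.0, 0.0], (0.0, 1e4), k), Rodas5())

NR   = 100                              # reservoir size (kept small here)
A    = 0.1 .* sprandn(NR, NR, 0.05)     # fixed random sparse reservoir matrix
Whyb = randn(NR, 3)                     # fixed random dense hybrid input matrix
res!(dr, r, _, t) = (dr .= tanh.(A*r .+ Whyb*sol(t)))   # Eq. (1), f = tanh
rsol = solve(ODEProblem(res!, randn(NR), (0.0, 1e4)), Tsit5())

ts   = range(0, 1e4; length = 200)
R    = reduce(hcat, [rsol(t) for t in ts])  # NR × 200 reservoir samples
X    = reduce(hcat, [sol(t)  for t in ts])  # 3 × 200 truth samples
Wout = X * pinv(R)                          # SVD-based least squares, Eq. (2)
```

The prediction step of equation (3) then amounts to `Wout * rsol(t)` for the trained parameter set; the parameter-space map replaces the single `Wout` with an interpolated `Wout(p̂)`.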
[Figure 2 panels: Room Air Temp., Compressor Shaft Power, Total heat dissipation outside, and Refrigerant Sat. Temp., each with relative error % over time; vertical axes in Temperature (K), Power (W), and Heat (J).]
Figure 2. Surrogate prediction of the room temperature of the RAC model in blue, while the ground truth is in red. This is a prediction for points over which the surrogate has not been trained. Relative error is calculated throughout the time span at 1000 uniformly spaced points. The CTESN surrogate was trained on a timespan of an entire day, using data from 100 simulations. The simulation parameters were sampled from a chosen input space using Latin hypercube sampling. The simulation time span goes from 188 days to 189 days at a fixed step size of 5 seconds. Table 3 presents the list and ranges of inputs the surrogate has been trained on. The relative error usually peaks at a point with a discontinuous derivative in time, usually induced by a step or ramp input (which, in this case, is the parametrized compressor speed ramp input). Another feature of the prediction error above is that it is sometimes stable throughout the time span (such as with the compressor shaft power, top right). This is a feature of how certain outputs vary through the parameter space. Sampling the space with more points or reducing the range of the chosen input space would reduce this error. Table 2 shows the maximum relative error computed for many other outputs of interest. Figure 3 computes and aggregates maximum errors across 100 new test points from the space.
Figure 3. Performance of surrogate when tested on 100 test parameters from the parameter space. The test parameters were chosen
via Sobol low discrepancy sampling, and maximum relative error across the time span was calculated for all output quantities. The
average maximum error across all output quantities was then plotted as a histogram. Our current test points may not be maximally
separated through the space, but we anticipate similar performance with more test examples and a maximal sampling scheme.
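The aggregation behind the Figure 3 histogram can be sketched in a few lines of plain Julia. The helper below is hypothetical (not JuliaSim's API); `truth` and `pred` hold one output quantity per row, sampled at the same time points, and the returned number is one histogram entry for one test parameter set.

```julia
# Average (across outputs) of the maximum relative error over the time span,
# in percent — computed once per test parameter set.
function avg_max_rel_err(truth::AbstractMatrix, pred::AbstractMatrix)
    per_output = [maximum(abs.(pred[i, :] .- truth[i, :]) ./ abs.(truth[i, :]))
                  for i in 1:size(truth, 1)]
    return 100 * sum(per_output) / length(per_output)
end

truth = [300.0 301.0 302.0; 10.0 11.0 12.0]   # two outputs, three time points
pred  = [300.3 300.7 302.6; 10.1 10.9 12.1]
avg_max_rel_err(truth, pred)                  # ≈ 0.6 (percent)
```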
Table 2. Relative errors when the surrogate is tested on parameters it has not been trained on. HEX stands for “heat exchanger”
and LEV stands for “linear expansion valve”.
Output quantity Max. Rel. Err % Output quantity Max. Rel. Err %
Air temp. in room 0.033 Rel. humidity in room 0.872
Outdoor dry bulb temp. 0.0001 Outdoor rel. humidity 0.003
Compressor inlet pressure 4.79 Compressor outlet pressure 3.50
LEV inlet pressure 3.48 LEV outlet pressure 4.84
LEV refrigerant outlet enthalpy 1.31 Compressor refrigerant mass flow rate 4.51
Evaporator refrigerant saturation temp. 0.205 Evaporator refrigerant outlet temp. 0.145
Total heat dissipation of outdoor HEX 8.15 Sensible heat load of indoor HEX 0.892
Latent heat load of indoor HEX 3.51 Outdoor coil outlet air temperature 0.432
Indoor coil outlet air temperature 0.070 Compressor shaft power 3.04
solver. The resultant time series was then fitted to cubic splines. Integration with state-of-the-art solvers from DifferentialEquations.jl (Rackauckas and Nie 2017) for simulating ModelExchange FMUs is planned in future releases.

Table 3. Surrogate Operating Parameters. The surrogate is expected to work over this entire range of design parameters.

Input Parameter             Range
Compressor Speed (ramp)     Start Time - (900, 1100) s
                            Start Value - (45, 55) rpm
                            Offset - (9, 11) rpm
LEV Position                (252, 300)
Outdoor Unit Fan Speed      (680, 820) rpm
Indoor Unit Fan Speed       (270, 330) rpm
Radiative Heat Gain         (0.0, 0.1)
Convective Heat Gain        (0.0, 0.1)
Latent Heat Gain            (0.3, 0.4)

2.4 Incorporating Surrogates into the JuliaSim Model Library

Reduced order modeling and surrogates in the space of simulation have traditionally targeted PDE problems because of the common reuse of standard PDE models such as Navier-Stokes equations. Since surrogates have a training cost, it is only beneficial to use them if that cost is amortized over many use cases. In equation-based modeling systems, such as Modelica or Simulink, it is common for each modeler to build and simulate a unique model. While at face value this may seem to defeat opportunities for amortizing the cost, the composability of components within these systems is what grants a new opportunity. For example, in Modelica it is common to hierarchically build models from components originating in libraries, such as the Modelica standard library. This means that large components, such as high-fidelity models of air conditioners, specific electrical components, or physiological organelles, could be surrogatized and accelerate enough workflows to overcome the training cost⁷. In addition, if the modeler is presented with both the component and its pre-trained surrogate with known accuracy statistics, such a modeler could effectively use the surrogate (e.g., to perform a parameter study) and easily swap back to the high-fidelity version for the final model. This allows users to test the surrogate in their downstream application, examine the resulting behaviour, and make a decision on whether the surrogate is good enough for their task. A discussion of error dynamics of the surrogate is left to future work.

Thus to complement the JuliaSim surrogatization architecture with a set of pre-trained components, we developed the JuliaSim Model Library and training infrastructure for large-scale surrogatization of DAE models. JuliaSim's automated model training pipeline can serve and store surrogates in the cloud. It consists of models from the Modelica Standard Library, the CellML Physiome model repository (Yu et al. 2011), and other benchmark problems defined using ModelingToolkit. In future work, we shall demonstrate workflows using these surrogates for accelerated design and development.

Each of the models in the library contains a source form which is checked by continuous integration scripts, and surrogates are regenerated using cloud resources whenever the source model is updated⁸. For some models, custom importers are also run in advance of the surrogate generation. For instance, the CellMLToolkit.jl importer translates the XML-based CellML schema into ModelingToolkit.jl. Components and surrogates from other sources, such as Systems Biology Markup Language libraries (SBML), are scheduled to be generated. Additionally, for each model, a diagnostic report is generated detailing:

1. the accuracy of the surrogate across all outputs of interest

7 We note that an additional argument can be made for pre-trained models in terms of user experience. If a user of a modeling software needs a faster model for real-time control, then having raised the total simulation cost to reduce the real-time user cost would still have a net benefit in terms of the application.
8 https://buildkite.com/
2. the parameter space which was trained on

3. and performance of the surrogate against the original model

is created to be served along with the models. With this information, a modeler can check whether the surrogatized form matches the operating requirements of their simulation and replace the usage of the original component with the surrogate as necessary. Note that a GUI exists for users of JuliaSim to surrogatize their own components through this same system.

3 Accelerating Building Simulation with Composable Surrogates

To demonstrate the utility of the JuliaSim architecture, we focus on accelerating the simulation of energy efficiency of buildings. Sustainable building simulation and design involves evaluating multiple options, such as building envelope construction, Heating Ventilation, Air Conditioning and Refrigeration (HVAC/R) systems, power systems and control strategies. Each choice is modeled independently by specialists drawing upon many years of development, using different tools, each with their own strengths (Wetter 2011). For instance, the equation-oriented Modelica language (Elmqvist, Mattsson, and Otter 1999; Fritzson and Engelson 1998) allows modelers to express detailed multi-physics descriptions of thermo-fluid systems (Laughman 2014). Other tools, such as EnergyPlus, DOE-2, ESP-r, and TRNSYS, have all been compared in the literature (Sousa 2012; Wetter, Treeck, and Hensen 2013).

These models are often coupled and run concurrently to make use of results generated by other models at runtime (Nicolai and Paepcke 2017). For example, a building energy simulation model computing room air temperatures may require heating loads from an HVAC supply system, with the latter coming from a simulation model external to the building simulation tool. Thus, integration of these models into a common interface to make use of their different features, while challenging (Wetter, Treeck, and Hensen 2013), is an important task.

While the above challenge has been addressed by FMI, the resulting coupled simulation using FMUs is computationally expensive due to the underlying numerical stiffness (Robertson and Williams 1975) widely prevalent in many engineering models. These simulations require adaptive implicit integrators to step forward in time (Wanner and Hairer 1996). For example, building heat transfer dynamics has time constants in hours, whereas feedback controllers have time constants in seconds. Thus, surrogate models are often used in building simulation (Westermann and Evins 2019).

In the following sections, we describe surrogate generation of a complex Room Air Conditioner (RAC) model, which has been exported as an FMU. We then use the surrogate to find the optimal set of design parameters over which system performance is maximized, yielding two orders of magnitude speedup over using the full model. Finally, we discuss the deployment of the surrogate in a co-simulation loop coupled with another FMU.

3.1 Surrogates of Coupled RAC Models

We first consider surrogate generation of a Room Air Conditioner (RAC) model using JuliaSim, consisting of a coupled room model with a vapor compression cycle model, which removes heat from the room and dissipates it outside. This model was provided to us by a user as-is, and a maximum relative error tolerance of 5% was chosen. The vapor compression cycle itself consists of detailed physics-based component models of a compressor, an expansion valve and a finite volume, and a staggered-grid dynamic heat exchanger model (Laughman 2014). This equipment is run open-loop in this model to simplify the interactions between the equipment and the thermal zone. The room model is designed using components from the Modelica Buildings library (Wetter, Zuo, et al. 2014). The room is modeled as a volume of air with internal convective heat gain and heat conduction outside. The Chicago O'Hare TMY3 weather dataset⁹ is imported and is used to define the ambient temperature of the air outside. This coupled model is written and exported from Dymola 2020x as a co-simulation FMU.

The model is simulated with 100 sets of parameters sampled from a chosen parameter space using Latin hypercube sampling. The simulation timespan was a full day with a fixed step size of 5 seconds. The JuliaSim FMU simulation backend runs simulations for each parameter set in parallel and fits cubic splines to the resulting time series outputs to continuously sample points from parts of the trajectory. Then the CTESN algorithm computes projections from the reservoir time series to output time series at each parameter set. Finally, a radial basis function creates a nonlinear map between the chosen parameter space and the space of projections. Figure 2 and Table 2 show the relative errors when the surrogate is tested at a parameter set on which it has not been trained. To demonstrate the reliability of the surrogate through the chosen parameter space, 100 further test parameters were sampled from the space, and the errors for each test were compiled into a histogram, as shown in Figure 3. At any test point, the surrogate takes about 6.1 seconds to run, while the full model takes 35 minutes, resulting in a speedup of 344x.

This surrogate model can then be reliably deployed for design and optimization, which is outlined in the following section.

3.2 Accelerating Global Optimization

Building design optimization (Nguyen, Reiter, and Rigo 2014; Machairas, Tsangrassoulis, and Axarli 2014) has benefited from the use of surrogates by accelerating optimization through faster function evaluations and smooth-

9 https://bcl.nrel.gov/node/58958
Figure 4. Comparison of global optimization while using the full model and the surrogate. Loss is measured using the full model's objective function. (Left) Convergence of loss with number of function evaluations. (Right) Convergence of loss with wall clock time. The optimization using the surrogate converged well before the first function evaluation of the full model had completed, which is why the blue line appears translated horizontally in time.
ing objective functions with discontinuities (Westermann and Evins 2019; Wetter and Wright 2004).

The quantity to be maximized (or whose negative value is to be minimized) is the average coefficient of performance (COP) across the time span. We calculate this using output time series from the model by means of the following formula:

    COP(t) = Q_tot(t) / max(0.01, CSP(t))    (11)

    COP_avg = (1/N_t) ∑_{n=1}^{N_t} COP(t_n)    (12)

where COP refers to the coefficient of performance, COP_avg refers to the average coefficient of performance across the time interval (the quantity to optimize), Q_tot(t) is the total heat dissipation from the coupled model, CSP(t) is the compressor shaft power, and N_t represents the number of points in time sampled from the interval (720).

We use an adaptive differential evolution global optimization algorithm, which does not require the calculation of gradients or Hessians (Price, Storn, and Lampinen 2006). We chose this algorithm because of its ability to handle black-box objective functions. We use the differential optimizers in BlackBoxOptim.jl¹⁰ for this experiment.

Figure 4 shows that the surrogate produces a series of minimizers, which eventually converge to within 1% of the reference minimum value chosen, but two orders of magnitude faster. The surrogate does take more function evaluations to converge than the true model, but since each function value is relatively inexpensive, the impact on wall clock time is negligible.

3.3 Co-simulation with Surrogates

Next we examine a co-simulation loop with two coupled FMUs and replace one of the FMUs with a surrogate. Co-simulation is a form of coupled simulation where a master algorithm simulates and synchronizes time-dependent models at discrete time steps. An advantage of co-simulation over model exchange is that the individual FMUs can be shipped with their own solvers. These FMU solver calls are abstracted away from the master algorithm, which only pays heed to initialization and synchronization of the FMUs.

We examine a simplified example of an HVAC system providing cooling to a room from the Modelica Buildings library (Wetter, Bonvini, et al. 2015). Both the HVAC system and room models have been exported as FMUs, which are then imported into JuliaSim and then coupled via co-simulation. At each step of the co-simulation, the models are simulated for a fixed time step, and the values of the coupling variables are queried and then set as inputs to each other, before the models are simulated at the next time step.

JuliaSim then generates a surrogate of the HVAC system by training over the set of inputs received during the co-simulation loop. It is then deployed in a "plug and play" fashion, by coupling the outputs of the surrogate to the inputs of the room and vice versa. The resultant output from the coupled system is shown in Figure 5. The above co-simulation test has been conducted at the same set of design parameters as the original simulation.¹¹ While the individual models in this test are simplified,

10 https://github.com/robertfeldt/BlackBoxOptim.jl
11 We tried to simulate this coupled system at different design parameters, but were unable to, for reasons currently unknown to us, change certain parameters on Dymola 2020x. We were also not able
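The master algorithm described above can be sketched in plain Julia. This is a toy mock-up: `MockFMU`, its first-order dynamics, and the `step!`/`get_output`/`set_input!` names are invented for illustration and are not the FMI or JuliaSim API.

```julia
# Toy stand-in for an FMU: advances its own state with its "own solver".
mutable struct MockFMU
    state::Float64
    input::Float64
end
step!(m::MockFMU, dt) = (m.state += dt * (m.input - m.state) / 60.0)
get_output(m::MockFMU) = m.state
set_input!(m::MockFMU, u) = (m.input = u)

# Master loop: advance both models one fixed step, then exchange the
# coupling variables before the next synchronization point.
function cosimulate!(hvac::MockFMU, room::MockFMU; tf = 3600.0, dt = 5.0)
    t = 0.0
    while t < tf
        step!(hvac, dt); step!(room, dt)    # each model steps independently
        set_input!(room, get_output(hvac))  # query outputs and set them as
        set_input!(hvac, get_output(room))  # inputs to each other
        t += dt
    end
    return room.state
end

hvac, room = MockFMU(285.0, 295.0), MockFMU(300.0, 285.0)
cosimulate!(hvac, room)   # the two states relax toward a common value
```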
Figure 5. Coupled co-simulation of a surrogate and an FMU. The blue line represents the ground truth, which is the output from
the co-simulation of two coupled FMUs, and the red line represents the output from the coupled surrogate and an FMU. While
the prediction smooths over transients found in the ground truth, it does so at a relative error of less than 1.5%. This result also
empirically suggests that the output from the surrogate is bounded over the set of inputs it has received over co-simulation. The
surrogate was trained over a sample of 100 inputs received from the room model. The error over the transients can be reduced by
sampling more inputs from the co-simulation.
to change those parameters having exported the constituent models as co-simulation FMUs. We shall aim to debug this issue and complete this story in future work.

they serve as a proof of concept for a larger coupled simulation, either involving more FMUs or involving larger models, which may be prohibitively expensive (Wetter, Fuchs, and Nouidui 2015).

4 Conclusion

We demonstrate the capabilities of JuliaSim, a software for automated generation and deployment of surrogates for design, optimization and coupled simulation. Our surrogates can reproduce outputs from detailed multi-physics systems and can be used as stand-ins for global optimization and coupled simulation. Our results show the promise of blending machine learning surrogates in JuliaSim, and we believe that it can enable a machine learning-accelerated workflow for design and development of complex multi-physical systems.

There are many avenues for this work to continue. Further work to deploy these embedded surrogates as FMUs themselves is underway. This would allow JuliaSim to ship accelerated FMUs to other platforms. Other surrogate algorithms, such as proper orthogonal decomposition (Chatterjee 2000), neural ordinary differential equations (Chen et al. 2018; S. Kim et al. 2021), and dynamic mode decomposition (Schmid 2010) will be added in upcoming releases and rigorously tested on the full model library. Incorporating machine learning in other fashions, such as within symbolic simplification algorithms, is similarly being explored. But together, JuliaSim demonstrates that future modeling and simulation software does not need to, and should not, eschew all of the knowledge of the past equation-based systems in order to bring machine learning into the system.

Acknowledgements

The information, data, or work presented herein was funded in part by ARPA-E under award numbers DE-AR0001222 and DE-AR0001211, and NSF award number IIP-1938400. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

References

Anantharaman, Ranjan et al. (2021). "Accelerating Simulation of Stiff Nonlinear Systems using Continuous-Time Echo State Networks". In: Proceedings of the AAAI 2021 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physical Sciences.
Benner, Peter, Serkan Gugercin, and Karen Willcox (2015). "A survey of projection-based model reduction methods for parametric dynamical systems". In: SIAM review 57.4, pp. 483–531.
Bezanson, Jeff et al. (2017). "Julia: A fresh approach to numerical computing". In: SIAM review 59.1, pp. 65–98.
Blochwitz, Torsten et al. (2011). "The functional mockup interface for tool independent exchange of simulation models". In: Proceedings of the 8th International Modelica Conference. Linköping University Press, pp. 105–114.
Brück, Dag et al. (2002). "Dymola for multi-engineering modeling and simulation". In: Proceedings of Modelica. Vol. 2002. Citeseer.
Chatterjee, Anindya (2000). "An introduction to the proper orthogonal decomposition". In: Current Science, pp. 808–817.
Chen, Ricky TQ et al. (2018). "Neural ordinary differential equations". In: arXiv preprint arXiv:1806.07366.
Cohen, Scott D, Alan C Hindmarsh, and Paul F Dubois (1996). "CVODE, a stiff/nonstiff ODE solver in C". In: Computers in Physics 10.2, pp. 138–143.
Elmqvist, Hilding, Sven Erik Mattsson, and Martin Otter (1999). "Modelica - a language for physical system modeling, visualization and interaction". In: Proceedings of the 1999 IEEE International Symposium on Computer Aided Control System Design (Cat. No. 99TH8404). IEEE, pp. 630–639.
Fritzson, Peter, Peter Aronsson, et al. (2005). "The OpenModelica modeling, simulation, and development environment". In: 46th Conference on Simulation and Modelling of the Scandinavian Simulation Society (SIMS2005), Trondheim, Norway, October 13-14, 2005.
Fritzson, Peter and Vadim Engelson (1998). "Modelica - A unified object-oriented language for system modeling and simulation". In: European Conference on Object-Oriented Programming. Springer, pp. 67–90.
Gobbert, Matthias K (1996). "Robertson's example for stiff differential equations". In: Arizona State University, Technical Report.
Hosea, ME and LF Shampine (1996). "Analysis and implementation of TR-BDF2". In: Applied Numerical Mathematics 20.1-2, pp. 21–37.
Hu, Liwei et al. (2020). "Neural networks-based aerodynamic data modeling: A comprehensive review". In: IEEE Access 8, pp. 90805–90823.
Kim, Suyong et al. (2021). Stiff Neural Ordinary Differential Equations. arXiv: 2103.15341 [math.NA].
Kim, Youngkyu et al. (2020). "A fast and accurate physics-informed neural network reduced order model with shallow masked autoencoder". In: arXiv preprint arXiv:2009.11990.
Laughman, Christopher R (2014). "A Comparison of Transient Heat-Pump Cycle Simulations with Homogeneous and Heterogeneous Flow Models". In:
Lukoševičius, Mantas (2012). "A practical guide to applying echo state networks". In: Neural Networks: Tricks of the Trade. Springer, pp. 659–686.
Lukoševičius, Mantas and Herbert Jaeger (2009). "Reservoir computing approaches to recurrent neural network training". In: Computer Science Review 3.3, pp. 127–149.
Ma, Yingbo et al. (2021). "ModelingToolkit: A Composable Graph Transformation System For Equation-Based Modeling". In: arXiv preprint arXiv:2103.05244.
Machairas, Vasileios, Aris Tsangrassoulis, and Kleo Axarli (2014). "Algorithms for optimization of building design: A review". In: Renewable and Sustainable Energy Reviews 31, pp. 101–112.
Nguyen, Anh-Tuan, Sigrid Reiter, and Philippe Rigo (2014). "A review on simulation-based optimization methods applied to building performance analysis". In: Applied Energy 113, pp. 1043–1058.
Nicolai, Andreas and Anne Paepcke (2017). "Co-Simulation between detailed building energy performance simulation and Modelica HVAC component models". In: Proceedings of the 12th International Modelica Conference, Prague, Czech Republic, May 15-17, 2017. 132. Linköping University Electronic Press, pp. 63–72.
Otter, Martin and Hilding Elmqvist (2017). "Transformation of differential algebraic array equations to index one form". In: Proceedings of the 12th International Modelica Conference. Linköping University Electronic Press.
Pantelides, Constantinos C. (1988). "The Consistent Initialization of Differential-Algebraic Systems". In: SIAM Journal on Scientific and Statistical Computing 9.2, pp. 213–231. DOI: 10.1137/0909014.
Price, Kenneth, Rainer M Storn, and Jouni A Lampinen (2006). Differential Evolution: A Practical Approach to Global Optimization. Springer Science & Business Media.
Rackauckas, Christopher and Qing Nie (2017). "DifferentialEquations.jl - a performant and feature-rich ecosystem for solving differential equations in Julia". In: Journal of Open Research Software 5.1.
Raissi, Maziar, Paris Perdikaris, and George E Karniadakis (2019). "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations". In: Journal of Computational Physics 378, pp. 686–707.
Ratnaswamy, Vishagan et al. (2019). "Physics-informed Recurrent Neural Network Surrogates for E3SM Land Model". In: AGU Fall Meeting Abstracts. Vol. 2019, GC43D-1365.
Robertson, HH (1976). "Numerical integration of systems of stiff ordinary differential equations with special structure". In: IMA Journal of Applied Mathematics 18.2, pp. 249–263.
Robertson, HH and J Williams (1975). "Some properties of algorithms for stiff differential equations". In: IMA Journal of Applied Mathematics 16.1, pp. 23–34.
Schmid, Peter J (2010). "Dynamic mode decomposition of numerical and experimental data". In: Journal of Fluid Mechanics 656, pp. 5–28.
Sousa, Joana (2012). "Energy simulation software for buildings: review and comparison". In: International Workshop on Information Technology for Energy Applications (IT4Energy), Lisbon.
Wang, Sifan, Yujun Teng, and Paris Perdikaris (2020). "Understanding and mitigating gradient pathologies in physics-informed neural networks". In: arXiv preprint arXiv:2001.04536.
Wanner, Gerhard and Ernst Hairer (1996). Solving Ordinary Differential Equations II. Vol. 375. Springer Berlin Heidelberg.
Westermann, Paul and Ralph Evins (2019). "Surrogate modelling for sustainable building design - A review". In: Energy and Buildings 198, pp. 170–186.
Wetter, Michael (2011). A view on future building system modeling and simulation. Tech. rep. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States).
Wetter, Michael, Marco Bonvini, et al. (2015). "Modelica Buildings library 2.0". In: Proc. of the 14th International Conference of the International Building Performance Simulation Association (Building Simulation 2015), Hyderabad, India.
Wetter, Michael, Marcus Fuchs, and Thierry Nouidui (2015). "Design choices for thermofluid flow components and systems that are exported as Functional Mockup Units". In:
Wetter, Michael, Christoph van Treeck, and Jan Hensen (2013). "New generation computational tools for building and community energy systems". In: IEA EBC Annex 60.
Wetter, Michael and Jonathan Wright (2004). "A comparison of deterministic and probabilistic optimization algorithms for