Calibrating Building Energy Simulation Models
Calibrating Building Energy Simulation Models
Calibrating Building Energy Simulation Models
a r t i c l e i n f o a b s t r a c t
Article history: Building energy simulation (BES) plays a significant role in buildings with applications such as architec-
Received 3 August 2021 tural design, retrofit analysis, and optimizing building operation and controls. There is a recognized need
Revised 17 September 2021 for model calibration to improve the simulations’ credibility, especially with building data becoming
Accepted 28 September 2021
increasingly available and the promises that a digital twin brings. However, BES calibration remains chal-
Available online 04 October 2021
lenging due to the lack of clear guidelines and best practices. This study aims to provide the foundation
for future research through a detailed systematic review of the vital aspects of BES calibration.
Keywords:
Specifically, we conducted a meta-analysis and categorization of the simulation inputs and outputs, data
Building performance simulation
Building simulation
type and resolution, key calibration methods, and calibration performance evaluation. This study also
Calibration identified reproducible simulations as a critical issue and proposes an incremental approach to encourage
Reproducibility future research’s reproducibility.
Optimization Ó 2021 Elsevier B.V. All rights reserved.
Uncertainty
Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1. Calibration in building energy simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2. Calibration, validation, and verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3. Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4. Aim and objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.1. Search and eligibility criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.2. Study selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Study context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3.1. Simulation engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3.2. Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
4. Key calibration approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
4.1. Optimization-based calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
4.2. Calibration under uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4.2.1. Types and sources of uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4.2.2. Uncertainty quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.3. Analytical tools and techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.3.1. Analytical techniques by approach and application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.3.2. Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.4. Multi-stage calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5. Data requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.1. Inputs and outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.2. Most common observed outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
⇑ Corresponding author.
E-mail address: adrian.chong@nus.edu.sg (A. Chong).
https://doi.org/10.1016/j.enbuild.2021.111533
0378-7788/Ó 2021 Elsevier B.V. All rights reserved.
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
classes: (1) calibration based on manual, iterative, and pragmatic Criteria Description
intervention; (2) calibration based on a suite of informative graph- Keywords [‘‘calibration” OR ‘‘model calibration”] AND
ical comparative displays; (3) calibration based on specific tests [‘‘building performance simulation” OR ‘‘building energy model”
and analytical procedures; and (4) analytical/mathematical meth- OR ‘‘building energy modeling” OR ‘‘building energy simulation”
ods of calibration. In 2014, Coakley et al. [13] extended these clas- OR‘‘building simulation” OR ‘‘energy simulation” OR ‘‘whole
building energy model”]
sifications to include advancements in optimization techniques,
Database Scopus
Bayesian calibration, and alternative modeling techniques such Search 16 January 2021
as meta-modeling. Additionally, a broader definition of calibration date
approaches as either manual or automated was proposed. In the Limit to Year: 2015–2020
Language: English
following year, Fabrizio and Monetti [14] built upon the study by
Document type: Article
Coakley et al. [13] by discussing in further detail the issues affect- Source type: Journal
ing BES calibration. Subject areas: Engineering, Environmental Science, Energy
Journals: Energy and Buildings, Applied Energy, Automation in
Construction, Building and Environment, Solar Energy, Applied
thermal energy, Journal of Building Performance Simulation,
1.4. Aim and objectives Journal of Building Engineering, Building Simulation, HVAC and R
Research
Exclude Subject areas: Material science, Social Sciences, Chemical
Although there have been numerous BES calibration studies
engineering
over the past decade, most studies focused on applying a specific Total number of publications returned: 186
calibration methodology to specific case study buildings. Com-
bined with the lack of open code and data, BES calibration remains
challenging to replicate. Additionally, as described in the preceding 2.1. Search and eligibility criteria
paragraph, existing review articles focus on providing an overview
of current calibration methodologies. However, proper specifica- Table 1 presents the search strategy used to identify relevant
tion of model inputs and outputs is equally important. To date, publications from the Scopus database. The keywords ‘‘model cal-
there has been little quantitative analysis about model inputs ibration” and ‘‘building energy simulation” were used to identify
and outputs, calibration methods, and the criteria for evaluating an initial list of publications. To capture as many relevant publica-
calibration performance. The determination of all these details is tions as possible, synonyms that are interchangeable with ‘‘model
highly subjective, often requiring a high level of expertise, experi- calibration” and ‘‘building energy simulation” in the BES literature
ence, and domain knowledge. Despite its importance, there is little were included in the search string.
guidance on best practices to facilitate BES calibration. The initial search returned 2,762 publications. Limiting the
With the aim of enhancing reproducibility and enabling others search to English journal articles published between 2015 and
to build upon published work more easily, the objectives of this 2020 resulted in 781 publications. We limit the review to the
review article are to: immediate past six years to reflect recent trends and state-of-
the-art in BES. Additionally, the most recent review paper for BES
Synthesize relevant BES literature and the relationship between calibration was in the year 2014 [13] and 2015 [14].
various model inputs and outputs. Further refinements to the search criteria were made by includ-
Perform a detailed meta-analysis of the calibration methods ing relevant subject areas (Engineering, Environmental Science,
and measures of calibration performance currently utilized in and Energy) and explicitly excluding irrelevant subject areas
the existing literature. (Material science, Social Sciences, and Chemical engineering).
Provide recommendations to facilitate reproducibility in BES. These criteria excluded 338 studies and left 443 for the review.
The titles and abstracts of the 443 studies were subsequently
We believe that meeting these objectives will provide a solid screened to identify relevant publications and their corresponding
foundation and platform for future research to advance the current source journal, yielding 186 publications.
state of BES calibration. This is also the first systematic review on
the subject.
In Section 2 we describe the methodology of our systematic
review. Section 3 contextualizes the review with an overview of
2.2. Study selection
the simulation engine used, and the location of case studies. Sec-
tion 4 describes the state-of-the-art calibration approaches, includ-
The full papers of the 186 publications were subsequently
ing a comparison against the 2014 review by Coakley et al. [13].
screened for relevance to this review based on the following crite-
Section 5 analyzes the relationship between the inputs and outputs
ria: (1) the study involved the use of building energy simulation;
used for calibration. Section 6 summarizes the metrics commonly
(2) the study contains the application of calibration methods or
used to evaluate calibration performance. Section 7 discusses the
techniques; and (3) the study is not vague on the proposed calibra-
significant findings and identifies areas for future research. Sec-
tion approach and is not ambiguous on the model input(s) and out-
tion 8 concludes the paper.
put(s). Of the 186 studies, 107 were selected for the review.
2. Method
3. Study context
A systematic review was adopted to provide a comprehensive
and unbiased summary of evidence on calibration methods and This section provides the context to this review by summarizing
techniques, model inputs/outputs, and calibration performance the 107 selected papers regarding the simulation engine used and
metrics. the location of the calibration case studies.
3
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Fig. 1. Geographic distribution of case study buildings and the corresponding scale of the simulation (component/system, building, or urban) (top plot) and the distribution of
the case study buildings based on the köppen climate zones (bottom plot.
3.1. Simulation engine white box simulation engines (EnergyPlus, TRNSYS, and DOE-2), an
RC network is a gray-box model that combines simplified physical
A majority of the papers reviewed (60%) used EnergyPlus for a representations of the building with operation data that are used to
variety of reasons (Table 2). EnergyPlus is an open-source whole- identify the model’s coefficients [19]. The benefits of an RC net-
building energy simulation engine that has been and continues work lies in having physical descriptions of the building while
to be supported by the U.S. Department of Energy (DOE) [15]. being computationally more efficient than white-box models. Fur-
Moreover, EnergyPlus supports many application software [16]. ther easing implementation, the development of RC models may
15% and 7% of the papers reviewed use TRNSYS and DOE-2 respec- also follow the well-established international standard ISO
tively. TRNSYS [17] is a transient systems simulation program that 13790:2008 [20] that was subsequently revised by ISO 52016–
is designed to provide flexibility in conducting energy simulations 1:2017 [21].
through a modular structure and extensive add-on component
libraries. DOE-2 [18] is a building energy simulation program that
3.2. Location
performs hourly simulation given descriptions of the building lay-
out, constructions, operating schedules, HVAC systems, and utility
Fig. 1 shows the geographic distribution of the case study build-
rates.
ings extracted from the papers reviewed across various simulation
What stands out in Table 2 is that Resistance–Capacitance (RC)
scales (top map plot) and within each köppen climate zones (bot-
networks (also known as lumped parameter models) were used in
tom bar-plot). From the figure, it is apparent that a majority of case
10% of the studies reviewed. Unlike the other three commonly used
study applications are located in the U.S. or Europe, with several in
4
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Table 2 at the urban-scale and the remaining 8%1 for the calibration of a
Simulation engines used in the reviewed calibration applications (N ¼ 107). single building component or system. A further observation that
Simulation Engine Type Percentage of emerged from the data was that urban-scale case studies are located
Papers Reviewed only in the U.S. (54%), Europe (34%), and the Middle East (12%). None
EnergyPlus white-box 60% of the urban-scale studies were located in the tropics.
TRNSYS 15%
DOE-2 7%
Resistance–Capacitance (RC) Gray-box 10% 4. Key calibration approaches
Others NA 8%
In general, calibration approaches can be classified as either
manual or automated [13]. Automated approaches employ some
form of computerized processes to tune model parameters by max-
imizing the model’s fit to observations. In contrast, manual
China and South Korea. These applications are situated between approaches rely on iterative pragmatic intervention by the mod-
latitude 30 N and 65 N. Consequently, 96% of the studies belong eler. The number of papers utilizing an automated calibration
to an arid (dry), temperate (mild mid-latitude), or continental (cold approach has approximately tripled when comparing this review
mid-latitude) climate group. Only 4% of the applications were in to the review by Coakley et al. [13] in 2014 (Fig. 2).
the equatorial region characterized by a warm and humid climate In this review, a majority of the automated approaches employ
all year round. either mathematical optimization (58.5%) or Bayesian calibration
An inspection of the data in Fig. 1 reveals that over three- (33%), with several using sampling methods to select a subset of
quarters of the case studies are at the building scale (77%). 15% models with the best fit (8.5%). To aid future applications, Table 3
provides a list of packages, libraries, code repositories, and applica-
tions for sensitivity analysis, optimization and Bayesian
calibration.
Table 3
Applications, R packages, Python libraries and code repositories for performing sensitivity analysis, optimization, and Bayesian inference algorithms on building energy
simulation models.
Sensitivity Analysis
sensitivity CRAN R SRC, SRRC, PRCC, Morris, FAST, Sobol [22]
SALib PyPI/GitHub Python Morris, FAST, Sobol [23]
Optimization-based calibration
GenOpt Application Java GPSHJ, PSO [24]
jEPlus Application Java NSGA-II [25]
DEAP PyPI/GitHub Python NSGA-II, PSO [26]
ecr CRAN/GitHub R NSGA-II, PSO [27]
Bayesian calibration
SAVE CRAN/GitHub R Bayesian emulation, calibration, and validation following Bayarri et al. [28] with roots in Kennedy and [31]
O’Hagan [29] and Higdon et al. [30]
bc-stan GitHub R Bayesian emulation and calibration following Kennedy and O’Hagan [29] and Higdon et al. [30] [32]
pySIP GitHub Python Bayesian emulation and calibration for continuous time stochastic state-space (e.g. RC networks) [33]
Abbreviations: PyPI (Python Package Index); CRAN (Comprehensive R Archive Network); SRC (Standardized Regression Coefficients); PRCC (Partial Rank Correlation Coef-
ficient); SRRC (Standardized Rank Regression Correlation); FAST (Fourier Amplitude Sensitivity Testing); GPSHJ (Generalized Pattern Search Hooke Jeeves); PSO (Particle
Swarm Optimization); NSGA-II (Non-dominated sorting genetic algorithm II);
5
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
frameworks that minimize a cost function that needs to be evalu- 4.2. Calibration under uncertainty
ated by an external BES program. Additionally, population-based
metaheuristic algorithms such as PSO and GA initialize the opti- Uncertainty is an inevitable characteristic of BES models
mization with a population of randomly distributed points to because of the complexity and interactions between different
reduce the risk of converging to local minima. However, situations building systems. In many building energy applications, uncer-
of falling far from the pareto-optimal front can be hard to detect, tainty management is an important aspect when accounting for
and therefore defining a stopping criterion is difficult. Although risk in the decision-making process. It is somewhat surprising that
guidelines [59–61] specifying thresholds for accuracy metrics such only 32 of the 107 (30%) papers reviewed involved some form of
as CV(RMSE) and NMBE are often used, it has been shown that uncertainty quantification during model calibration.
these are not proper stopping criteria for optimization-based cali-
bration [38]. Nonetheless, minimizing CV(RMSE) was also found to 4.2.1. Types and sources of uncertainty
be the most robust cost function under different combinations of In general, uncertainty can be classified as either aleatory or
error metrics, calibration output, and calibration dataset time res- epistemic [70,71]. Aleatory uncertainty (or irreducible uncertainty)
olution [38]. Constraints in the form of a specified range for the is the uncertainty caused by inherent variations or randomness of
modeling parameters are often added to prevent unreasonable val- the building system or sub-system under investigation that cannot
ues [62]. be explained by the data collected. In contrast, epistemic uncer-
Optimization-based calibration has been widely applied in BES tainty (or reducible uncertainty) is the uncertainty that arises from
(Fig. 2). A prominent example is the Autotune project that aims to a lack of knowledge (or data). The distinction between aleatory and
replace manual calibration with a calibration method that lever- epistemic uncertainty has merit in guiding the uncertainties that
ages supercomputing, large databases of simulation results, and have the potential of being reduced [72]. However, developing
an evolutionary algorithm to automate the calibration process BES models involves a significant degree of subjectivity that
[63,64]. Sun et al. [65] proposed a pattern-based optimization depends on the data available that may also evolve throughout a
approach that determines the parameters to tune based on the building’s lifecycle. As a result, most uncertainties are often a com-
identified bias in monthly utility bills. Yang and Becerik-Gerber bination of both aleatory and epistemic uncertainty, making it dif-
[66] performed independent single objective optimization at the ficult to distinguish between the two.
component, zone, and building level. The union of the independent Related to the types of uncertainty is the identification and clas-
solution sets is then used for the subsequent multi-objective sification of uncertainty by their sources, which forms an impor-
optimization. tant part of a comprehensive uncertainty quantification
Table 4
Analytical tools and techniques that were used to support the calibration process applied in the papers reviewed. Adapted from [13].
6
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
framework [71,73]. In BES calibration, the sources of uncertainty any model inadequacy or bias (Eq. 2). There have been several
can be classified as follows: applications of KOH’s approach in the field of BES calibration
[79–87], including a detailed guideline for its application in the
Parameter uncertainty: Uncertainty associated with influential field [32].
model inputs that are not known with certainty.
Model form uncertainty: Model discrepancy (also called model yðxÞ ¼ gðx; tÞ þ dðxÞ þ ðxÞ ð2Þ
inadequacy) that results from all assumptions, conceptualiza-
tions, abstractions, and approximations of the real-world phys- where, yðxÞ is the observed field measurement, gðx; t Þ is the output
ical processes. of the BES given observable inputs x and calibration parameters
Observation uncertainty: Uncertainties that result from obser- t; dðxÞ is the model inadequacy, and ðxÞ is the observation errors.
vation errors. Other noteworthy approaches include Bayesian hierarchical
modeling for the calibrating of urban-scale building energy models
4.2.2. Uncertainty quantification [88,89] and sequential updating taking advantage of Bayes theo-
Uncertainty quantification in BES can be broadly categorized as rem to keep the model up to date without losing past knowledge
either forward or inverse. Forward approaches quantify uncer- [85,90].
tainty in the model output(s) by propagating them from uncertain- Given the complexity of BES models, posterior distributions
ties in the model parameters (Fig. 3). Statistical sampling often cannot be derived analytically. Consequently, Markov chain
techniques such as Monte Carlo simulation or Latin Hypercube Monte Carlo (MCMC) is often used in Bayesian calibration to sam-
Sampling are easy to apply and the most common in the field of ple from the posterior distributions because of its flexibility and
BES [74–78]. straightforward application to complex problems. However, it is
Inverse approaches involve quantifying various sources of well known that performing Bayesian inference via MCMC is com-
uncertainties given a set of observations from the building system putationally expensive, especially when likelihood evaluations
being modeled. In particular, the calibration paradigm known as involve computationally expensive models such as in the case of
Bayesian calibration has gained popularity in BES due to its ability b also known as the potential
BES. The Gelman-Rubin statistic ( RÞ,
to naturally incorporate uncertainty and combine prior informa- scale reduction factor, is often used to determine if convergence
tion with measured data to derive posterior estimates of the model to a stationary distribution has been achieved [88,91,92,81,32].
parameters (Eq. 1). To alleviate the high computation cost of Bayesian inference,
metamodels have been proposed as surrogates of the energy
pðtjyÞ / pð yjtÞ pðtÞ ð1Þ
model. Gaussian processes (GP) [79,80,32,91,81,82,93,85,86] and
A notable approach within the Bayesian calibration paradigm is linear regression [94,95,83,96,87] are the most popular with com-
the formulation proposed by Kennedy and O’Hagan (KOH) [29]. peting trade-offs between computation cost and accuracy. Addi-
KOH’s approach differs from traditional approaches by allowing tionally, more efficient MCMC sampling strategies such as
for various sources of uncertainty and attempting to correct for Hamiltonian Monte Carlo (HMC) [81,97] and Approximate Baye-
Fig. 4. Analytical techniques (for both manual and automated calibration approaches) used in this review (top plot) and the review by Coakley et al. (2014) [13].
7
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Fig. 5. Analytical techniques used in the papers reviewed grouped by the corresponding spatial scale (left plot) and calibration approach (right plot). Calibration processes
can involve more than one analytical technique. Therefore, the values do not add up to the total number of papers reviewed (N = 107).
sian Computation (ABC) methods [98] have also been proposed to monly applied in automated approaches. Comparatively, SA is
reduce computation cost. not as widely used in manual calibration approaches.
Moving on to model calibration at different spatial scales, it can
be observed that SA, high-resolution data, UQ, and building audits
4.3. Analytical tools and techniques
are prevalent at the building-scale. On the contrary, parameter
reduction and expert knowledge are the predominant analytical
Analytical tools and techniques are often applied to both man-
techniques for urban scale model calibration efforts. Parameter
ual or automated calibration approaches. Coakley et al. [13] list
reduction aims to reduce the number of model inputs by character-
these techniques with detailed explanations. Table 4 presents a
izing and grouping similar inputs to reduce the complexity of the
subset of the techniques [13] that is relevant to this review. As
model while preserving the final decision based on the full set of
can be seen from Table 4, We do not extend the classifications pro-
parameters. Well-known examples of parameter reduction
posed in [13] but augment their descriptions so that it encom-
techniques in BES are day-typing (grouping schedules with similar
passes the publications reviewed.
profile) and zone-typing (grouping similar thermal zones) [13]. At
the urban-scale, archetypes are commonly used to reduce the
4.3.1. Analytical techniques by approach and application number of model inputs and therefore the effort and cost of mod-
Fig. 4 provides an overview of the number of papers employing eling distinct buildings [95,100,89,87,92,101,102].
a certain analytical technique to assist or complete the calibration Archetype generation involves two steps, segmentation fol-
process. What stands out in the figure is that the application of lowed by characterization [103]. Segmentation divides buildings
sensitivity analysis (SA) and the use of high-resolution data (HIGH) with similar characteristics based on key parameters such as build-
have the highest frequency. By contrast, in the review by Coakley ing type, construction year or period, floor area, building height,
et al. [13], SA and the use of high-resolution data are not common and/or shape (if geometry data is not available) [95,100,89,87].
analytical techniques that form a part of the calibration process. What follows is the characterization of building construction and
The increase in the use of high-resolution data could be attrib- operation properties based on expert knowledge that involves
uted to the proliferation of IoT devices and sensor networks in deriving input values from existing databases, building codes and
buildings making hourly and sub-metered data more readily avail- standards, and representations of national building typologies
able for calibration. The increase in the use of SA could be associ- (also referred to as reference buildings). For example, several stud-
ated with the growth in the utilization of automated approaches ies [89,92,101] modeled construction properties (inferred from
(Fig. 2). This is further corroborated by Fig. 5, which shows the construction year) using information from the TABULA (2009–
breakdown of analytical techniques according to the calibration 2012) and EPISCOPE (2013–2016) projects [104,105] that were
approach (manual or automated) and the spatial scale (compo- aimed at providing national residential building typologies for var-
nent/system, building, or urban). The results, as shown in Fig. 5 ious European countries. Another example is the use of the U.S.
indicates that sensitivity analysis (SA), high-resolution data, uncer- Commercial Buildings Energy Consumption Survey (CBECS) and
tainty quantification (UQ), and building audits are the most com-
8
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Fig. 6. Types of sensitivity analysis used in building energy simulation split by automated vs manual calibration approaches.
the U.S. Residential Energy Consumption Survey (RECS) databases ple sizes for accurate approximations of the sensitivity indices.
to derive detailed information on the construction and operation Suppose that there are t uncertain parameters, the approximate
of the buildings (e.g. insulation levels, internal loads and schedules, number of model evaluations required is approximately t for per-
mechanical systems, and hot water consumption) [106,95]. Like- turbation local SA methods; 10 100t for screening methods;
wise, Chen, Deng and Hong [102] derived input values based on 100 1000t for regression and RSA methods; and > 1000t for
the minimum energy efficiency requirements in the California’s variance-based methods [125]. Consequently, metamodels, surro-
building energy efficiency standards Title 24 while Krayem et al. gates or emulators are typically used in place of computationally
[107] defined internal loads and schedules following the ASHRAE expensive simulation runs required by computationally demand-
90.1 Standard. ing SA methods [125]. Specific choices of metamodels may also
provide sensitivity measures that can be used to rank model
4.3.2. Sensitivity analysis parameters according to their influence on the output of interest.
The results of this review confirm the close association between Examples include the use of random forest variable importance
sensitivity analysis (SA) and automated model calibration pro- [96,91,94] and estimates of marginal posterior using Gaussian pro-
cesses (Fig. 6). Only about 20% of the papers utilizing manual cesses [82].
approaches employ SA in contrast with 65% of the papers for auto- Screening methods are popular due to their low computation
mated approaches. A possible explanation is that equifinality cost compared to other global SA methods, making it suitable for
issues are especially challenging for automated approaches since BES models that are typically non-linear with high-dimensional
objective functions are normally designed to minimize discrepan- parameter space. The method of Morris [126] is the most estab-
cies between simulated and observed responses. This might pro- lished and widely used screening method. Sampling for the Morris
duce a model with a higher prediction accuracy, but it might not method is carried out by randomly selecting r starting points that
inform the modeler about the true parameter values [108]. In con- are perturbed One-at-A-Time (OAT). The computation cost is
trast, manual approaches adjust the calibration parameters based therefore rðt þ 1Þ for t model parameters. A measure of global sen-
on heuristics that are based on the expertise of an experienced sitivity is commonly obtained using the r trajectories to compute
modeler. Of the studies that employed a manual approach without the mean l [126] or the modified mean l proposed by [127]. In
sensitivity analysis, the dominant (86%) analytical techniques general, most studies rely on graphical plots of l (or l ) and r
employed include conducting detailed audits [109–115], utilizing for better interpretability when screening out non-influential
expert knowledge or judgment [116–118,100,119,106,107], imple- parameters [79,88,81,32,85,52,128,36,66,37,49,39,50]. Others con-
menting an evidence-based approach[120–122,100,112,123,124], sider only l (or l ) to rank and identify dominant parameters
or using high-resolution data [120–122,100,112,124]. [84,86,101,47,78,45].
It is evident from Fig. 6 that global sensitivity analyses are more Perturbation methods are the simplest type of SA and involve
commonly used in automated approaches. A possible explanation varying (perturbing) the model inputs from their base or nominal
is the ability of global methods to provide an overall view of the values One-At-a-Time (OAT). Compared with global SA methods,
importance of different inputs while considering their interactions the advantages of perturbation methods include (1) its ease of
[125]. Specifically, screening methods (46%) are the most popular application and interpretation [129], and (2) requiring the least
followed by perturbation (23%), regression (13%), metamodel number of model evaluations [125]. The order of influence is often
(10%), variance (4%), and regional sensitivity analysis (RSA) (4%) used to identify a subset of parameters to be calibrated
(Fig. 6). [130,65,131–133,128,134,135]. Sun et al. [65] illustrate this clearly
In this review, variance-based SA methods are not common by using parametric perturbation to identify a priority list of 17
because they are computationally demanding requiring large sam- calibration parameters that would be adjusted within a pattern-
9
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
10
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Fig. 9. Most common observed output used for calibrating building energy simulation models split by the temporal resolution used for the calibration and the scale of the
energy model (i.e., calibration of Component/ System, Building, or Urban scale building energy models).
Fig. 10. Most common observed inputs used for calibrating building energy simulation models. The color indicates the class of the model parameter.
Fig. 11. Most common calibration parameters used for calibrating building energy simulation models split by sensitivity analysis and calibration approach.
currently to provide the boundary conditions of the simulation tion. In contrast, calibration against electricity and gas/steam
[79,149,150,43,110,134,120,36,45,76,151,37,97,141,152,50]. How- energy is generally carried out with monthly resolution data.
ever, what stands out in building scale studies is that calibration Turning now to urban-scale building energy models (UBEM),
against indoor dry bulb temperature [149,150,43,54,110,134, 137, about two-thirds of the studies used monthly or annual electricity,
35,131,120,55,45,122,76,37,97,38,77,78,50,124,90], HVAC energy gas/steam, total load/energy, and/or cooling load/energy for the
[135,139,121,66,48,153], and equipment electricity consumption calibration. The use of monthly or annual measurements for the
[53,134,143,139] is almost always carried out at an hourly resolu- calibration is not surprising because using higher resolution data
12
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Fig. 12. The magnitude of the relationship between the calibration parameters and their corresponding observed outputs for calibrating building energy simulation models.
might be computationally intractable at the urban-scale. Addition- a significant influence on a building’s energy use [155]. In this
ally, UBEM studies often utilize utility data that are only available review, infiltration rates are typically derived from airtightness
at a monthly resolution [95,40,107]. values that are obtained using the blower door test
[156,34,111,36,45,37,100,113].
5.3. Most common observed inputs
5.4. Mapping calibration parameters to outputs
Fig. 10 provides an overview of the most commonly used
observed inputs. What stands out from Fig. 10 is the obvious use Fig. 11 lists the model parameters most commonly adjusted to
of local weather data (dry bulb temperature, solar radiation, rela- match simulation output to the measurements faceted by the type
tive humidity, wind speed and direction) as observed inputs to of sensitivity analysis conducted and whether the calibration uti-
the model. If local site measurements are not available, an annual lized an automated or manual approach. The figure reveals two
meteorological year (AMY) weather file from the nearest weather interesting observations. First, SA, especially global SA, is less likely
station is used. These observations indicate the importance of to be used when the calibration involves using schedules (occu-
using actual weather data for the calibration since the weather file pant, equipment, lighting, and HVAC operation). Second, auto-
forms the energy simulation’s boundary conditions. The weather mated calibration approaches tend to calibrate parameters such
file’s importance was also demonstrated in previous research that as material properties, infiltration rate, and internal load densities
showed that the annual building energy consumption and the compared to schedules. By contrast, manual approaches are
monthly building loads could vary by 7% and 40%, respectively, equally likely to calibrate material properties and schedules.
based on the provided weather data [154]. Fig. 12 shows the magnitude of the relationship between the
Interestingly, several studies used measured indoor environ- most commonly used calibration parameters and their correspond-
mental conditions as inputs to the model to obtain a model that ing observed outputs. The mapping reveals that parameters con-
is better calibrated at the zone level. For instance, Mihai and cerning the building envelope (material properties and
Zmeureanu [137] showed that using measured indoor air temper- infiltration rate), internal gains (occupant, lighting, and equipment
atures in place of those from the technical specification led to more power density), and zone cooling and heating setpoints are often
accurate predictions of zone airflow rates. Yin, Kiliccote, and Piette adjusted when calibrating the energy model to building electricity
[139] used air temperature and airflow measurements to derive energy consumption. Closer inspection of the first column of
the zone thermostat setpoint and the VAV box minimum/maxi- Fig. 12 shows that HVAC component efficiency and zone outdoor
mum airflow respectively. Infiltration rate is sometimes derived air levels were also calibrated in a considerable number of papers.
from measurements because it is highly uncertain and can have Not surprisingly, hot water usage was also adjusted in several
13
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
Table 5 since BES models are often deemed ‘‘calibrated” if they meet the
Metrics used for the evaluation of calibration performance. Each paper may employ CV(RMSE) and NMBE limits (Table 6) specified by ASHRAE Guide-
more than one metric when assessing calibration performance. Therefore, the
cumulative sum for the column ‘‘No. of Papers” is greater than the total number of
line 14 [59], the International Performance Measurement and Ver-
papers reviewed (N = 107). ification Protocol (IPMVP) [60], or the Federal Energy Management
Program (FEMP) [61]. Interestingly, approximately half of the
Metric Acronym No. of
Papers
papers reviewed (51%) used two metrics in their evaluation. 24%
utilized one metric while 19% used three metrics simultaneously.
Coefficient of Variation of the Root Mean Square CV ðRMSEÞ 72
Error
The remaining 7% of the papers reviewed used either four or five
Normalized Mean Bias Error NMBE 59 metrics for the assessment.
Root Mean Square Error RMSE 20 CV(RMSE) (Eq. 3) provides an indication of how close the sim-
Coefficient of Determination R2 12 ulation predictions are to measured data while NMBE (Eq. 4)
Goodness of Fit GOF 11 serves as an indicator of overall bias in the simulation predic-
Annual Percentage Error APE 8
tions. However, NMBE suffers from cancellation between positive
Coefficient of Variation CV 4
Mean Absolute Percentage Error MAPE 4 and negative bias which can lead to misleading interpretations of
Mean Absolute Error MAE 3 predictive performance [160]. This review also confirms the find-
Gelman-Rubin statistic b
R 3 ings of Ruiz and Bandera [161] that the NMBE acronymn is often
Othersy – 18 erroneously referred to as MBE even though the formula is cor-
y Metrics with 6 2 counts. rect (i.e., MBE (%) = formula for NMBE). NMBE is MBE normalized
by the mean of the observed values so that they are comparable.
Several papers also utilize RMSE, which provides a measure of the
studies that were calibrating models of residential typologies variability of the residuals and is the non-normalized form of CV
[157,95,130,158]. About six to seven articles calibrated the internal (RMSE).
loads’ schedule [157,143,65,40,159,102,107]. sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Pn 2
Fig. 12 reveals several other interesting observations. First, the 1 i¼1 ðmi si Þ
CV ðRMSEÞ ¼ 100 ð3Þ
same calibration parameters were used when calibrating against
m np
both building electricity and gas/steam energy consumption. Sec-
ond, the parameters calibrated when matching simulation predic- Pn
1 i¼1 ðmi
si Þ
tions to total building load/energy are somewhat similar, except NMBEð%Þ ¼ 100 ð4Þ
m np
that equipment and lighting schedules are less likely to be
adjusted. where mi and si are the measured and simulated values respec-
Turning to zone dry-bulb temperature as the observed output, is the mean of the measured values, n is the number of
tively, m
parameters concerning envelope material properties are the most data points, and p is the number of adjustable model parameters.
commonly adjusted, followed by infiltration rate. A similar obser- Around 10% of the papers use GOF and R2 to assess calibration
vation can be made when the model is calibrated to the building’s performance. GOF (Eq. (5)) which was proposed by ASHRAE RP-
heating or cooling load/energy. Material properties and infiltration 1051 [162] incorporates both variance and bias errors through a
rate are selected because indoor temperature measurements are formulation that considers both CV(RMSE) and NMBE. Since GOF
often used to investigate the relative changes in the building envel- combines CV(RMSE) and NMBE into a single composite function,
ope performance with varying boundary conditions it has the advantage of being able to identify a single optimal solu-
[150,149,51,131]. Additionally, studies have found parameters tion and to some extent solve multi-objective optimization prob-
affiliated with infiltration rate and material properties to influence lems more effectively. Therefore, it has been used to define the
indoor air temperature [79,131,45,76,50]. cost function in several optimization-based calibrations
Finally and intuitively, Fig. 12 shows that HVAC component [40,149,36,35,40,49]. For similar reasons, studies that utilize sam-
capacity and efficiency are typically adjusted when the observed pling methods (such as Monte Carlo sampling [102] and Latin
output is the HVAC component’s energy consumption. Likewise, Hypercube sampling [75–77]) have also used GOF to rank and iden-
EPD and equipment schedules are adjusted when the model is cal- tify suitable solutions.
ibrated against equipment energy consumption. pffiffiffi qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
2
GOF ¼ CV ðRMSEÞ2 þ NMBE2 ð5Þ
2
6. Calibration performance evaluation
R2 (Eq. (6)) provides an indication of the variability in the
6.1. Current approaches dependent variable from the mean values that are explained by
the regression model. ASHRAE Guideline 14 [59] recommends
Table 5 ranks the metrics used to assess calibration perfor- the use of CV ðRMSEÞ and R2 to select the best whole-building
mance based on the number of occurrences in the papers reviewed. energy use regression models such as the algorithms of the ASH-
A large proportion of the papers use CV(RMSE) or NMBE to deter- RAE Inverse Model Toolkit (IMT), which was developed from RP-
mine if a BES model was calibrated. This result is not unexpected 1050 [163,164]. Although there is currently no prescribed mini-
Table 6
Error limits specified by various guidelines and protocols for a building energy simulation model to be deemed calibrated.
14
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
mum value for R, IPMVP [165] advised that an R2 value of 0.75 pro- trast to out-of-sample buildings, Hedegarrd et al. [92] calibrated
vides a reasonably good causal relationship between energy use 159 BES models using one month of hourly data, and evaluated
and the independent variables. Using EnergyPlus simulations, their predictions using the subsequent month. Wang et al. [101]
Chakraborty and Elzarka [160] demonstrated that R2 used in tan- calibrated 84 residential buildings using five years of monthly
dem with a range normalized RMSE (RN(RMSE)) (Eq. 7) would pro- data and evaluated their predictions using the subsequent two
vide a better representation of the predictive performance of years of data.
system-level energy models. At the building-scale, the out-of-sample dataset typically com-
prises either a randomly sampled subset of the time-series data
Pn
ðmi si Þ2 that was not used for the calibration [81,32], data from a period
R2 ¼ 1 Pni¼1 ð6Þ
2 after the model was calibrated [82,139,83,42,158,66], or a selected
i¼1 ðmi mÞ
period based on occupancy levels and season [124].
where mi and si are the measured and simulated values respec-
is the mean of the measured values, and n is the number
tively, m
of data points. 7. Discussion
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Pn 2 7.1. Inputs and outputs
1 i¼1 ðmi si Þ
RN ðRMSEÞ ¼ 100 ð7Þ
rangeðmÞ np
The most prominent finding from the meta-analysis is that
where mi and si are the measured and simulated values respec- monthly building electricity and hourly indoor dry bulb tempera-
is the difference between the maximum and minimum
tively, m ture measurements are most commonly used to calibrate BES
of the measured values, n is the number of data points, and p is models, especially at the building scale. A possible explanation is
the number of adjustable model parameters. that electricity and gas/steam data are often obtained from utility
providers who typically provide monthly data. In comparison,
6.2. Evaluating probabilistic predictions measurements of the other outputs such as HVAC energy, equip-
ment electricity, and indoor dry bulb temperature would involve
Calibration methods that involve uncertainty quantification installing sub-meters and/or accessing the building automation
often provide probabilistic predictions to support risk- system where data is usually available at sub-hourly resolution.
conscious decision-making. However, almost all of the evalua- Another finding is that material thermophysical properties,
tion methods in the literature evaluate probabilistic predictions infiltration rate, and internal load densities are frequently selected
in a deterministic manner. Specifically, central tendency mea- for calibration, especially in automated calibration frameworks. It
sures such as the mean or median are used to compute accu- is well known amongst researchers in BES calibration that these
racy metrics, some of which are CV (RMSE) parameters are the main model parameters used to describe a
[94,32,91,81,82,95,93,84,101,89], NMBE [32,81,93,89], APE building and often represent a significant source of uncertainty
[166,82,95,101], RMSE [91,84,87], and MAPE [87] (Table 5). when estimating building energy performance. One well-known
However, it has been shown that using a single value such as early study that is often cited for uncertainties in infiltration rates
the mean to represent the entire distribution may result in an is that of Persily [168]. Likewise, material properties have also been
optimistic bias of the model’s prediction accuracy [85]. There- shown to be uncertain due to various reasons such as poor detail-
fore, these metrics are often accompanied by graphical plots ing/workmanship and thermal bridges [2,169,170].
comparing probabilistic predictions (e.g. using box-plots or error Previous studies have demonstrated the importance of schedule
bars) to the observed values [32,88,82,84,89]. adjustment in model calibration [143,144]. Therefore, it is some-
Alternative assessment methods have also been proposed to what surprising that schedules are typically not considered in
more precisely evaluate probabilistic predictions. For example, automated calibration frameworks. This inconsistency might be
assessing performance by comparing CV (RMSE) and NMBE median due to the sharp increase in computation cost if every schedule
or mean values with their 95% confidence intervals [88,40,98]. The parameter were considered in the calibration. Another possible
Kolmogorov–Smirnov (KS) test has also been used to assess cali- explanation for not considering schedules is that it could result
bration performance by comparing the predicted and measured in identifiability issues if a comprehensive dataset is not available
EUI distributions [166,95]. to avoid overparameterization [32,171]. Consequently, schedule
In order to facilitate comparison between probabilistic predic- adjustment typically involves simplification, such as selecting from
tions and deterministic observations, Chong, Augenbroe, and Da a list of predefined discrete schedules that best fit the measured
[144] proposed using the coverage width-based criterion (CWC). data [157,40,107,102]. As data in the built environment becomes
Likewise, the continuous rank probability score (CRPS) was pro- more available and accessible, developing scalable calibration algo-
posed to measure the distance between the probabilistic predic- rithms that can consider multiple data sources might prove impor-
tions and their corresponding observations [83]. Both the CWC tant in future research.
and the CRPS are the only metrics that consider both correctness
and informativeness of the probabilistic predictions. For detailed 7.2. Calibrating urban-scale models
explanation and formulation of the CWC and the CRPS, the reader
is referred to [144,167] respectively. The result of this review indicates that approximately 15% of
the papers are at the urban-scale. Of the 15%, most are located in
6.3. Validation using out-of-sample data the U.S. (54%) and Europe (34%), with none in a tropical climate.
Since the urban context (inter-building effects and urban microcli-
63% of the studies reviewed did not evaluate the calibrated mate) is an important aspect that should be considered in UBEMs
model on an out-of-sample test dataset. The remaining 37% advo- [172,173], it would be interesting to evaluate the performance of
cated the use of an out-of-sample test dataset to avoid bias in the the UBEM calibration methodologies in the tropics and cities out-
evaluation process. For instance, out-of-sample test buildings side of the U.S. and Europe.
have been used to evaluate the robustness and homogeneity of This review also found that UBEMs are typically calibrated
urban-scale archetype predictive performance [88,95,89]. In con- using monthly or annual data (Fig. 9) and rely on expert knowledge
15
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
research in BES [187,188]. Specifically, Docker images and Docker- Declaration of Competing Interest
files are Docker concepts that help resolve the technical challenges
to reproducibility mentioned at the start of this paragraph [185]. The authors declare that they have no known competing finan-
To facilitate reproducibility, we created a GitHub repository (Sec- cial interests or personal relationships that could have appeared
tion 8) to demonstrate a Docker based approach for reproducing to influence the work reported in this paper.
BES research.
Acknowledgements
8. Conclusion
This research is supported by the National University of Singa-
pore, Singapore under its start-up grant (Project No. R-296-000-
Calibration remains a challenging task because there are no
190-133); the Republic of Singapore’s National Research Founda-
clear guidance and best practices on calibration procedures such
tion through a grant to the Berkeley Education Alliance for
as model inputs and outputs, calibration methods, calibration per-
Research in Singapore (BEARS) for the Singapore-Berkeley Building
formance evaluation, simulation reproducibility. As a result, BES
Efficiency and Sustainability in the Tropics (SinBerBEST) Program.
calibration has remained highly subjective, and perhaps even elu-
BEARS has been established by the University of California, Berke-
sive, and almost impossible to reproduce. Therefore, this study
ley as a center for intellectual excellence in research and education
contributes to existing knowledge of BES calibration by providing
in Singapore.
a coherent and detailed summary of the calibration methodology,
data requirements, performance evaluation criteria, and the cur-
rent state of knowledge. References
The findings indicate a significant increase in the use of auto-
mated calibration approaches. Amongst the automated calibration [1] J.L. Hensen, R. Lamberts, Building performance simulation for design and
operation, second ed., Routledge, 2019.
approaches, optimization and Bayesian calibration were the most [2] P. De Wilde, The gap between predicted and measured energy performance of
popular. In general, global sensitivity analysis is often applied buildings: A framework for investigation, Autom. Constr. 41 (2014) 40–49,
within automated approaches. In contrast, the dominant tech- https://doi.org/10.1016/j.autcon.2014.02.009.
[3] C. Turner, M. Frankel, et al., Energy performance of leed for new construction
niques used in manual approaches include using detailed audits, buildings, New Build. Inst. (2008) 1–42.
expert knowledge, and/or evidence-based procedures. High- [4] E. Mantesi, C.J. Hopfe, M.J. Cook, J. Glass, P. Strachan, The modelling gap:
resolution data is prevalent in both automated and manual Quantifying the discrepancy in the representation of thermal mass in building
simulation, Build. Environ. 131 (2018) 74–98, https://doi.org/10.1016/j.
approaches possibly due to increasing sensing capabilities and data
buildenv.2017.12.017.
availability in the built environment. [5] A.C. Menezes, A. Cripps, D. Bouchlaghem, R. Buswell, Predicted vs. actual
BES models are usually calibrated against one or two observed energy performance of non-domestic buildings: Using post-occupancy
outputs. The two most commonly used data sources for BES cali- evaluation data to reduce the performance gap, Appl. Energy 97 (2012)
355–364, https://doi.org/10.1016/j.apenergy.2011.11.075.
bration were monthly electricity consumption and hourly indoor [6] S. De Wit, G. Augenbroe, Analysis of uncertainty in building design
dry bulb temperature. Monthly electricity often stems from utility evaluations and its implications, Energy Build. 34 (9) (2002) 951–958,
bills and is often used to calibrate the building envelope’s thermo- https://doi.org/10.1016/S0378-7788(02)00070-1.
[7] H. Yoshino, T. Hong, N. Nord, Iea ebc annex 53: Total energy use in buildings-
physical parameters, infiltration rate, various internal gains densi- analysis and evaluation methods, Energy Build. 152 (2017) 124–136, https://
ties, and indoor setpoint temperatures. Hourly measurements of doi.org/10.1016/j.enbuild.2017.07.038.
indoor dry-bulb temperature during free-floating periods when [8] T.G. Trucano, L.P. Swiler, T. Igusa, W.L. Oberkampf, M. Pilch, Calibration,
validation, and sensitivity analysis: What’s what, Reliab. Eng. Syst. Saf. 91
the indoor temperatures are allowed to float during non- (10–11) (2006) 1331–1357, https://doi.org/10.1016/j.ress.2005.11.031.
operating hours are often used to calibrate thermophysical param- [9] N. Oreskes, K. Shrader-Frechette, K. Belitz, Verification, validation, and
eters of the building envelope and infiltration rate. confirmation of numerical models in the earth sciences, Science 263 (5147)
(1994) 641–646, https://doi.org/10.1126/science.263.5147.641.
The review indicates a lack of reproducibility due to the absence [10] L.F. Konikow, J.D. Bredehoeft, Ground-water models cannot be validated, Adv.
of clarity in reporting the modeling and data assumptions, calibra- Water Resour. 15 (1) (1992) 75–83, https://doi.org/10.1016/0309-1708(92)
tion parameters, observed inputs, and observed outputs. Therefore, 90033-X.
[11] A.I. of Aeronautics, Astronautics, AIAA guide for the verification and
an incremental approach to encourage reproducibility in BES
validation of computational fluid dynamics simulations, American Institute
research was proposed in this study, along with a fully repro- of aeronautics and astronautics, 1998. doi:https://doi.org/10.2514/
ducible example on GitHub (Section 8). 4.472855.001..
Taken together, the present study lays the groundwork that [12] T.A. Reddy, Literature review on calibration of building energy simulation
programs: uses, problems, procedures, uncertainty, and tools, ASHRAE Trans.
future calibration studies can build on. While it is clear that there 112 (2006) 226.
is a significant body of work available, the precise mechanism of [13] D. Coakley, P. Raftery, M. Keane, A review of methods to match building
BES calibration and the evaluation of model credibility remains energy simulation models to measured data, Renew. Sustain. Energy Rev. 37
(2014) 123–141, https://doi.org/10.1016/j.rser.2014.05.007.
to be elucidated. Incorporating multiple data sources within auto- [14] E. Fabrizio, V. Monetti, Methodologies and advancements in the calibration of
mated calibration algorithms would also be exciting for future building energy models, Energies 8 (4) (2015) 2548–2574, https://doi.org/
work with increasing data availability. We also believe that a cul- 10.3390/en8042548.
[15] US Department of Energy (DOE) Office of Energy Efficiency & Renewable
ture of reproducibility will significantly aid efforts in establishing a Energy, EnergyPlus. https://www.energy.gov/eere/buildings/downloads/
standardized calibration methodology. energyplus-0..
[16] IBPSA-USA, Building Energy Software Tools (BEST) directory. https://www.
buildingenergysoftwaretools.com/software-listing?keywords=EnergyPlus..
[17] S.A. Klein, TRNSYS 18: A transient simulation program. https://sel.me.wisc.
Data availability edu/trnsys..
[18] DOE-2.[link]. https://www.doe2.com..
[19] X. Li, J. Wen, Review of building energy modeling for control and operation,
The research compendium for this article can be found at
Renew. Sustain. Energy Rev. 37 (2014) 517–537, https://doi.org/10.1016/j.
https://github.com/ideas-lab-nus/calibrating-building-simulation- rser.2014.05.056.
review, hosted at GitHub. [20] Energy performance of buildings – Calculation of energy use for space heating
The simple example of reproducible building energy simulation and cooling, Standard, International Organization for Standardization,
Geneva, CH (2008)..
(Section 7.4) can be found at https://github.com/ideas-lab-nus/re- [21] Energy performance of buildings – Energy needs for heating and cooling,
producing-building-simulation, hosted at GitHub. internal temperatures and sensible and latent heat loads – Part 1: Calculation
17
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
procedures, Standard, International Organization for Standardization, [47] Q. Zhang, Z. Tian, Z. Ma, G. Li, Y. Lu, J. Niu, Development of the heating load
Geneva, CH (2017).. prediction model for the residential building of district heating based on
[22] B. Iooss, S.D. Veiga, A. Janon, G. Pujol, with contributions from Baptiste Broto, model calibration, Energy 205. doi:10.1016/j.energy.2020.117949..
K. Boumhaout, T. Delage, R.E. Amri, J. Fruth, L. Gilquin, J. Guillaume, M.I. [48] S.-W. Ha, S.-H. Park, J.-Y. Eom, M.-S. Oh, G.-Y. Cho, E.-J. Kim, Parameter
Idrissi, L. Le Gratiet, P. Lemaitre, A. Marrel, A. Meynaoui, B.L. Nelson, F. calibration for a trnsys bipv model using in situ test data, Energies 13 (18).
Monari, R. Oomen, O. Rakovec, B. Ramos, O. Roustant, E. Song, J. Staum, R. doi:10.3390/en13184935..
Sueur, T. Touati, F. Weber, sensitivity: Global Sensitivity Analysis of Model [49] G. Larochelle Martin, D. Monfet, H. Nouanegue, K. Lavigne, S. Sansregret,
Outputs, r package version 1.24.0 (2021). https://CRAN.R-project. Energy calibration of hvac sub-system model using sensitivity analysis and
org/package=sensitivity.. meta-heuristic optimization, Energy Build. 202. doi:10.1016/j.
[23] J. Herman, W. Usher, Salib: an open-source python library for sensitivity enbuild.2019.109382..
analysis, J. Open Source Software 2 (9) (2017) 97, https://doi.org/10.21105/ [50] M. Ferrara, C. Lisciandrello, A. Messina, M. Berta, Y. Zhang, E. Fabrizio,
joss.00097. Optimizing the transition between design and operation of zebs: Lessons
[24] M. Wetter, et al., Genopt-a generic optimization program, in: Seventh learnt from the solar decathlon china, scutxpolito prototype, Energy Build.
International IBPSA Conference, Rio de Janeiro, 2001, pp. 601–608.. 213 (2018), https://doi.org/10.1016/j.enbuild.2020.109824.
[25] Zhang, Yi, JEPlus – An parametric tool for EnergyPlus and TRNSYS. http:// [51] A. Cacabelos, P. Eguía, L. Febrero, E. Granada, Development of a new multi-
www.jeplus.org/wiki/doku.php.. stage building energy model calibration methodology and validation in a
[26] F.-A. Fortin, F.-M. De Rainville, M.-A.G. Gardner, M. Parizeau, C. Gagné, Deap: public library, Energy Build. 146 (2017) 182–199, https://doi.org/10.1016/j.
Evolutionary algorithms made easy, J. Mach. Learn. Res. 13 (1) (2012) 2171– enbuild.2017.04.071.
2175. [52] W. Li, Z. Tian, Y. Lu, F. Fu, Stepwise calibration for residential building thermal
[27] J. Bossek, ecr: Evolutionary Computation in R, r package version 2.1.0 (2017). performance model using hourly heat consumption data, Energy Build. 181
https://CRAN.R-project.org/package=ecr.. (2018) 10–25, https://doi.org/10.1016/j.enbuild.2018.10.001.
[28] M.J. Bayarri, J.O. Berger, R. Paulo, J. Sacks, J.A. Cafeo, J. Cavendish, C.-H. Lin, J. [53] A. Abdelalim, W. O’Brien, Z. Shi, Data visualization and analysis of energy flow
Tu, A framework for validation of computer models, Technometrics 49 (2) on a multi-zone building scale, Autom. Constr. 84 (2017) 258–273, https://
(2007) 138–154, https://doi.org/10.1198/004017007000000092. doi.org/10.1016/j.autcon.2017.09.012.
[29] M.C. Kennedy, A. O’Hagan, Bayesian calibration of computer models, J. R. Stat. [54] A. Ogando, N. Cid, M. Fernández, Energy modelling and automated
Soc.: Ser. B (Statistical Methodology) 63 (3) (2001) 425–464, https://doi.org/ calibrations of ancient building simulations: A case study of a school in the
10.1111/1467-9868.00294. northwest of spain, Energies 10 (6). doi:10.3390/en10060807..
[30] D. Higdon, M. Kennedy, J.C. Cavendish, J.A. Cafeo, R.D. Ryne, Combining field [55] E. Carlon, M. Schwarz, A. Prada, L. Golicza, V. Verma, M. Baratieri, A.
data and computer simulations for calibration and prediction, SIAM J. Gasparella, W. Haslinger, C. Schmidl, On-site monitoring and dynamic
Scientific Comput. 26 (2) (2004) 448–466, https://doi.org/10.1137/ simulation of a low energy house heated by a pellet boiler, Energy Build.
S1064827503426693. 116 (2016) 296–306, https://doi.org/10.1016/j.enbuild.2016.01.001.
[31] J. Palomo, R. Paulo, G. García-Donato, SAVE: an R package for the statistical [56] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective
analysis of computer models, J. Stat. Softw. 64 (13) (2015) 1–23, URL: http:// genetic algorithm: Nsga-ii, IEEE Trans. Evol. Comput. 6 (2) (2002) 182–197,
www.jstatsoft.org/v64/i13/. https://doi.org/10.1109/4235.996017.
[32] A. Chong, K. Menberg, Guidelines for the bayesian calibration of building [57] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proceedings of
energy models, Energy Build. 174 (2018) 527–547, https://doi.org/10.1016/j. ICNN’95-international conference on neural networks, vol. 4, IEEE, 1995, pp.
enbuild.2018.06.028. 1942–1948. doi:10.1109/ICNN.1995.488968..
[33] L. Raillon, S. Rouchier, S. Juricic, pysip: an open-source tool for bayesian [58] R. Hooke, T.A. Jeeves, ”direct search”solution of numerical and statistical
inference and prediction of heat transfer in buildings, in: Congres français de problems, J. ACM 8 (2) (1961) 212–229, https://doi.org/10.1145/
thermique, Nantes, 2019.. 321062.321069.
[34] C. Bandera, G. Ruiz, Towards a new generation of building envelope [59] ASHRAE, Guideline 14, measurement of energy and demand savings,
calibration, Energies 10 (12). doi:10.3390/en10122102.. American Society of Heating, Ventilating, and Air Conditioning Engineers,
[35] G. Ramos Ruiz, C. Fernández Bandera, Analysis of uncertainty indices used for Atlanta, Georgia..
building envelope calibration, Appl. Energy 185 (2017) 82–94, https://doi. [60] EVO, International performance measurement and verification protocol:
org/10.1016/j.apenergy.2016.10.054. Concepts and options for determining energy and water savings volume 1,
[36] G. Ramos Ruiz, C. Fernández Bandera, T. Gómez-Acebo Temes, A. Sánchez- Efficiency Valuation Organization..
Ostiz Gutierrez, Genetic algorithm for building envelope calibration, Appl. [61] US DOE FEMP, M&V guidelines: Measurement and verification for
Energy 168 (2016) 691–705, https://doi.org/10.1016/j.apenergy.2016.01.075. performance-based contracts, version 4.0, Energy Efficiency and Renewable
[37] S. Zuhaib, M. Hajdukiewicz, J. Goggins, Application of a staged automated Energy..
calibration methodology to a partially-retrofitted university building energy [62] S. Qiu, Z. Li, Z. Pang, W. Zhang, Z. Li, A quick auto-calibration approach based
model, J. Build. Eng. 26. doi:10.1016/j.jobe.2019.100866.. on normative energy models, Energy Build. 172 (2018) 35–46, https://doi.
[38] S. Martínez, P. Eguía, E. Granada, A. Moazami, M. Hamdy, A performance org/10.1016/j.enbuild.2018.04.053.
comparison of multi-objective optimization-based approaches for calibrating [63] G. Chaudhary, J. New, J. Sanyal, P. Im, Z. O’Neill, V. Garg, Evaluation of
white-box building energy models, Energy Build. 216. doi:10.1016/j. autotune calibration against manual calibration of building energy models,
enbuild.2020.109942.. Appl. Energy 182 (2016) 115–134, https://doi.org/10.1016/j.
[39] S. Martínez, E. Pórez, P. Egua, A. Erkoreka, E. Granada, Model calibration and apenergy.2016.08.073.
exergoeconomic optimization with nsga-ii applied to a residential [64] A. Garrett, J. New, Scalable tuning of building models to hourly data, Energy
cogeneration, Appl. Therm. Eng. 169. doi:10.1016/j. 84 (2015) 493–502, https://doi.org/10.1016/j.energy.2015.03.014.
applthermaleng.2020.114916.. [65] K. Sun, T. Hong, S. Taylor-Lange, M. Piette, A pattern-based automated
[40] S. Nagpal, J. Hanson, C. Reinhart, A framework for using calibrated campus- approach to building energy model calibration, Appl. Energy 165 (2016) 214–
wide building energy models for continuous planning and greenhouse gas 224, https://doi.org/10.1016/j.apenergy.2015.12.026.
emissions reduction tracking, Appl. Energy 241 (2019) 82–97, https://doi.org/ [66] Z. Yang, B. Becerik-Gerber, A model calibration framework for simultaneous
10.1016/j.apenergy.2019.03.010. multi-level building energy simulation, Appl. Energy 149 (2015) 415–431,
[41] S. Tian, S. Shao, B. Liu, Investigation on transient energy consumption of cold https://doi.org/10.1016/j.apenergy.2015.03.048.
storages: Modeling and a case study, Energy 180 (2019) 1–9, https://doi.org/ [67] H. Schreiber, F. Lanzerath, A. Bardow, Predicting performance of adsorption
10.1016/j.energy.2019.04.217. thermal energy storage: From experiments to validated dynamic models,
[42] J. Chen, X. Gao, Y. Hu, Z. Zeng, Y. Liu, A meta-model-based optimization Appl. Therm. Eng. 141 (2018) 548–557, https://doi.org/10.1016/j.
approach for fast and reliable calibration of building energy models, Energy applthermaleng.2018.05.094.
188. doi:10.1016/j.energy.2019.116046.. [68] L. Santos, A. Afshari, L. Norford, J. Mao, Evaluating approaches for district-
[43] C. Andrade-Cabrera, D. Burke, W. Turner, D. Finn, Ensemble calibration of wide energy model calibration considering the urban heat island effect, Appl.
lumped parameter retrofit building models using particle swarm Energy 215 (2018) 31–40, https://doi.org/10.1016/j.apenergy.2018.01.089.
optimization, Energy Build. 155 (2017) 513–532, https://doi.org/10.1016/j. [69] A. Zekar, S. Khatib, Development and assessment of simplified building
enbuild.2017.09.035. representations under the context of an urban energy model: Application to
[44] T. Yang, Y. Pan, J. Mao, Y. Wang, Z. Huang, An automated optimization method arid climate environment, Energy Build. 173 (2018) 461–469, https://doi.org/
for calibrating building energy simulation models with measured data: 10.1016/j.enbuild.2018.04.030.
Orientation and a case study, Appl. Energy 179 (2016) 1220–1231, https:// [70] W. Tian, Y. Heo, P. De Wilde, Z. Li, D. Yan, C.S. Park, X. Feng, G. Augenbroe, A
doi.org/10.1016/j.apenergy.2016.07.084. review of uncertainty analysis in building energy assessment, Renew. Sustain.
[45] F. Roberti, U. Oberegger, A. Gasparella, Calibrating historic building energy Energy Rev. 93 (2018) 285–301, https://doi.org/10.1016/j.rser.2018.05.029.
models to hourly indoor air and surface temperatures: Methodology and case [71] C.J. Roy, W.L. Oberkampf, A comprehensive framework for verification,
study, Energy Build. 108 (2015) 236–243, https://doi.org/10.1016/j. validation, and uncertainty quantification in scientific computing, Comput.
enbuild.2015.09.010. Methods Appl. Mech. Eng. 200 (25–28) (2011) 2131–2144, https://doi.org/
[46] C. Andrade-Cabrera, W. Turner, D. Finn, Augmented ensemble calibration of 10.1016/j.cma.2011.03.016.
lumped-parameter building models, Build. Simul. 12 (2) (2019) 207–230, [72] A. Der Kiureghian, O. Ditlevsen, Aleatory or epistemic? Does it matter?, Struct
https://doi.org/10.1007/s12273-018-0473-5. Saf. 31 (2) (2009) 105–112, https://doi.org/10.1016/j.strusafe.2008.06.020.
18
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
[73] Y. Sun, Closing the building energy performance gap by improving our [99] P. Raftery, M. Keane, A. Costa, Calibrating whole building energy models:
predictions, Ph.D. thesis, Georgia Institute of Technology (2014).. Detailed case study using hourly measured data, Energy Build. 43 (12) (2011)
[74] G. Yun, K. Song, Development of an automatic calibration method of a vrf 3666–3679, https://doi.org/10.1016/j.enbuild.2011.09.039.
energy model for the design of energy efficient buildings, Energy Build. 135 [100] J. Fernandez, L. del Portillo, I. Flores, A novel residential heating consumption
(2017) 156–165, https://doi.org/10.1016/j.enbuild.2016.11.060. characterisation approach at city level from available public data:
[75] L. Harmer, G. Henze, Using calibrated energy models for building Description and case study, Energy Build. 221. doi:10.1016/j.
commissioning and load prediction, Energy Build. 92 (2015) 204–215, enbuild.2020.110082..
https://doi.org/10.1016/j.enbuild.2014.10.078. [101] C.-K. Wang, S. Tindemans, C. Miller, G. Agugiaro, J. Stoter, Bayesian calibration
[76] J. Cipriano, G. Mor, D. Chemisana, D. Pérez, G. Gamboa, X. Cipriano, Evaluation at the urban scale: a case study on a large residential heating demand
of a multi-stage guided search approach for the calibration of building energy application in amsterdam, J. Build. Performance Simul. 13 (3) (2020) 347–
simulation models, Energy Build. 87 (2015) 370–385, https://doi.org/ 361, https://doi.org/10.1080/19401493.2020.1729862.
10.1016/j.enbuild.2014.08.052. [102] Y. Chen, Z. Deng, T. Hong, Automatic and rapid calibration of urban building
[77] N. Sakiyama, L. Mazzaferro, J. Carlo, T. Bejat, H. Garrecht, Natural ventilation energy models by learning from energy performance database, Appl. Energy
potential from weather analyses and building simulation, Energy Build. 277. doi:10.1016/j.apenergy.2020.115584..
doi:10.1016/j.enbuild.2020.110596.. [103] C.F. Reinhart, C.C. Davila, Urban building energy modeling–a review of a
[78] M. Giuliani, G. Henze, A. Florita, Modelling and calibration of a high-mass nascent field, Build. Environ. 97 (2016) 196–202, https://doi.org/10.1016/j.
historic building for reducing the prebound effect in energy assessment, buildenv.2015.12.001.
Energy Build. 116 (2016) 434–448, https://doi.org/10.1016/j. [104] I. Ballarini, S.P. Corgnati, V. Corrado, Use of reference buildings to assess the
enbuild.2016.01.034. energy saving potentials of the residential building stock: The experience of
[79] S. Martńnez, A. Erkoreka, P. Eguńa, E. Granada, L. Febrero, Energy tabula project, Energy policy 68 (2014) 273–284, https://doi.org/10.1016/j.
characterization of a paslink test cell with a gravel covered roof using a enpol.2014.01.027.
novel methodology: Sensitivity analysis and bayesian calibration, J. Build. [105] T. Loga, B. Stein, N. Diefenbach, Tabula building typologies in 20 european
Eng. 22 (2019) 1–11, https://doi.org/10.1016/j.jobe.2018.11.010. countries-making energy-related features of residential building stocks
[80] K. Menberg, Y. Heo, R. Choudhary, Influence of error terms in bayesian comparable, Energy Build. 132 (2016) 4–12, https://doi.org/10.1016/j.
calibration of energy system models, J. Build. Performance Simul. 12 (1) enbuild.2016.06.094.
(2019) 82–96, https://doi.org/10.1080/19401493.2018.1475506. [106] Z. Taylor, Y. Xie, C. Burleyson, N. Voisin, I. Kraucunas, A multi-scale
[81] A. Chong, K. Lam, M. Pozzi, J. Yang, Bayesian calibration of building energy calibration approach for process-oriented aggregated building energy
models with large datasets, Energy Build. 154 (2017) 343–355, https://doi. demand models, Energy Build. 191 (2019) 82–94, https://doi.org/10.1016/j.
org/10.1016/j.enbuild.2017.08.069. enbuild.2019.02.018.
[82] J. Yuan, V. Nian, B. Su, Q. Meng, A simultaneous calibration and parameter [107] A. Krayem, A. Al Bitar, A. Ahmad, G. Faour, J.-P. Gastellu-Etchegorry, I. Lakkis,
ranking method for building energy models, Appl. Energy 206 (2017) 657– J. Gerard, H. Zaraket, A. Yeretzian, S. Najem, Urban energy modeling and
666, https://doi.org/10.1016/j.apenergy.2017.08.220. calibration of a coastal mediterranean city: The case of beirut, Energy Build.
[83] Q. Li, G. Augenbroe, J. Brown, Assessment of linear emulators in lightweight 199 (2019) 223–234, https://doi.org/10.1016/j.enbuild.2019.06.050.
bayesian calibration of dynamic building energy models for parameter [108] K. Beven, A manifesto for the equifinality thesis, J. Hydrol. 320 (1–2) (2006)
estimation and performance prediction, Energy Build. 124 (2016) 194–202, 18–36, https://doi.org/10.1016/j.jhydrol.2005.07.007.
https://doi.org/10.1016/j.enbuild.2016.04.025. [109] G. Allesina, E. Mussatti, F. Ferrari, A. Muscio, A calibration methodology for
[84] Y. Heo, D. Graziano, L. Guzowski, R. Muehleisen, Evaluation of calibration building dynamic models based on data collected through survey and
efficacy under different levels of uncertainty, J. Build. Performance billings, Energy Build. 158 (2018) 406–416, https://doi.org/10.1016/j.
Simul. 8 (3) (2015) 135–144, https://doi.org/10.1080/19401493. enbuild.2017.09.089.
2014.896947. [110] Z. Mylona, M. Kolokotroni, S. Tassou, Frozen food retail: Measuring and
[85] A. Chong, W. Xu, S. Chao, N.-T. Ngo, Continuous-time bayesian calibration of modelling energy use and space environmental systems in an operational
energy models using bim and energy data, Energy Build. 194 (2019) 177–190, supermarket, Energy Build. 144 (2017) 129–143, https://doi.org/10.1016/j.
https://doi.org/10.1016/j.enbuild.2019.04.017. enbuild.2017.03.049.
[86] S. Chen, D. Friedrich, Z. Yu, J. Yu, District heating network demand prediction [111] J. Vesterberg, S. Andersson, T. Olofsson, Calibration of low-rise multifamily
using a physics-based energy model with a bayesian approach for parameter residential simulation models using regressed estimations of transmission
calibration, Energies 12(18). doi:10.3390/en12183408.. losses, J. Build. Performance Simul. 9 (3) (2016) 304–315, https://doi.org/
[87] G. Tardioli, A. Narayan, R. Kerrigan, M. Oates, J. O’Donnell, D. Finn, A 10.1080/19401493.2015.1067257.
methodology for calibration of building energy models at district scale using [112] D. Guyot, F. Giraud, F. Simon, D. Corgier, C. Marvillet, B. Tremeac, Building
clustering and surrogate techniques, Energy Build. 226. doi:10.1016/j. energy model calibration: A detailed case study using sub-hourly measured
enbuild.2020.110309.. data, Energy Build. 223 (2020), https://doi.org/10.1016/j.
[88] M. Kristensen, R. Hedegaard, S. Petersen, Hierarchical calibration of enbuild.2020.110189 110189.
archetypes for urban building energy modeling, Energy Build. 175 (2018) [113] R. Escandón, R. Suárez, J. Sendra, On the assessment of the energy
219–234, https://doi.org/10.1016/j.enbuild.2018.07.030. performance and environmental behaviour of social housing stock for the
[89] M. Kristensen, R. Hedegaard, S. Petersen, Long-term forecasting of hourly adjustment between simulated and measured data: The case of mild winters
district heating loads in urban areas using hierarchical archetype modeling, in the mediterranean climate of southern europe, Energy Build. 152 (2017)
Energy 201. doi:10.1016/j.energy.2020.117687.. 418–433, https://doi.org/10.1016/j.enbuild.2017.07.063.
[90] S. Rouchier, M. Jiménez, S. Castańo, Sequential monte carlo for on-line [114] F. Ascione, N. Bianco, R. De Masi, F. De’Rossi, G. Vanoli, Energy retrofit of an
parameter estimation of a lumped building energy model, Energy Build. 187 educational building in the ancient center of benevento. feasibility study of
(2019) 86–94, https://doi.org/10.1016/j.enbuild.2019.01.045. energy savings and respect of the historical value, Energy Build. 95 (2015)
[91] H. Lim, Z. Zhai, Comprehensive evaluation of the influence of meta-models on 172–183, https://doi.org/10.1016/j.enbuild.2014.10.072.
bayesian calibration, Energy Build. 155 (2017) 66–75, https://doi.org/ [115] V. Monetti, E. Fabrizio, M. Filippi, Impact of low investment strategies for
10.1016/j.enbuild.2017.09.009. space heating control: Application of thermostatic radiators valves to an old
[92] R. Hedegaard, M. Kristensen, T. Pedersen, A. Brun, S. Petersen, Bottom-up residential building, Energy Build. 95 (2015) 202–210, https://doi.org/
modelling methodology for urban-scale analysis of residential space heating 10.1016/j.enbuild.2015.01.001.
demand response, Appl. Energy 242 (2019) 181–204, https://doi.org/10.1016/ [116] G. Kazas, E. Fabrizio, M. Perino, Energy demand profile generation with
j.apenergy.2019.03.063. detailed time resolution at an urban district scale: A reference building
[93] Y.-J. Kim, C.-S. Park, Stepwise deterministic and stochastic calibration of an approach and case study, Appl. Energy 193 (2017) 243–262, https://doi.org/
energy simulation model for an existing building, Energy Build. 133 (2016) 10.1016/j.apenergy.2017.01.095.
455–468, https://doi.org/10.1016/j.enbuild.2016.10.009. [117] D. Jermyn, R. Richman, A process for developing deep energy retrofit
[94] H. Lim, Z. Zhai, Influences of energy data on bayesian calibration of building strategies for single-family housing typologies: Three toronto case studies,
energy model, Appl. Energy 231 (2018) 686–698, https://doi.org/10.1016/j. Energy Build. 116 (2016) 522–534, https://doi.org/10.1016/j.
apenergy.2018.09.156. enbuild.2016.01.022.
[95] J. Sokol, C. Cerezo Davila, C. Reinhart, Validation of a bayesian-based method [118] H. Samuelson, A. Ghorayshi, C. Reinhart, Analysis of a simplified calibration
for defining residential archetypes in urban building energy models, Energy procedure for 18 design-phase building energy models, J. Build.
Build. 134 (2017) 11–24, https://doi.org/10.1016/j.enbuild.2016.10.050. Performance Simul. 9 (1) (2016) 17–29, https://doi.org/10.1080/
[96] W. Tian, S. Yang, Z. Li, S. Wei, W. Pan, Y. Liu, Identifying informative energy 19401493.2014.988752.
data in bayesian calibration of building energy models, Energy Build. 119 [119] P. Beagon, F. Boland, M. Saffari, Closing the gap between simulation and
(2016) 363–376, https://doi.org/10.1016/j.enbuild.2016.03.042. measured energy use in home archetypes, Energy Build. 224. doi:10.1016/j.
[97] L. Lundström, J. Akander, Bayesian calibration with augmented stochastic enbuild.2020.110244..
state-space models of district-heated multifamily buildings, Energies 13(1). [120] M. Tokarik, R. Richman, Life cycle cost optimization of passive energy
doi:10.3390/en13010076.. efficiency improvements in a toronto house, Energy Build. 118 (2016) 160–
[98] C. Zhu, W. Tian, B. Yin, Z. Li, J. Shi, Uncertainty calibration of building energy 169, https://doi.org/10.1016/j.enbuild.2016.02.015.
models by combining approximate bayesian computation and machine [121] Y. Ji, P. Xu, A bottom-up and procedural calibration method for building
learning algorithms, Appl. Energy 268. doi:10.1016/j.apenergy.2020.115025.. energy simulation models based on hourly electricity submetering data,
Energy 93 (2015) 2337–2350, https://doi.org/10.1016/j.energy.2015.10.109.
19
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
[122] M. Royapoor, T. Roskilly, Building model calibration using energy and [146] C.A. Aumann, A methodology for developing simulation models of complex
environmental data, Energy Build. 94 (2015) 109–120, https://doi.org/ systems, Ecol. Model. 202 (3–4) (2007) 385–396, https://doi.org/10.1016/j.
10.1016/j.enbuild.2015.02.050. ecolmodel.2006.11.005.
[123] N. Jain, E. Burman, S. Stamp, D. Mumovic, M. Davies, Cross-sectoral [147] H.B. Gunay, W. O’Brien, I. Beausoleil-Morrison, Implementation and
assessment of the performance gap using calibrated building energy comparison of existing occupant behaviour models in energyplus, J. Build.
performance simulation, Energy Build. 224. doi:10.1016/j. Performance Simul. 9 (6) (2016) 567–588, https://doi.org/10.1080/
enbuild.2020.110271.. 19401493.2015.1102969.
[124] A. O’ Donovan, P. O’ Sullivan, M. Murphy, Predicting air temperatures in a [148] S. Kanteh Sakiliba, N. Bolton, M. Sooriyabandara, The energy performance and
naturally ventilated nearly zero energy building: Calibration, validation, techno-economic analysis of zero energy bill homes, Energy Build. 228.
analysis and approaches, Appl. Energy 250 (2019) 991–1010. doi:10.1016/j. doi:10.1016/j.enbuild.2020.110426..
apenergy.2019.04.082.. [149] A. Figueiredo, J. Kämpf, R. Vicente, R. Oliveira, T. Silva, Comparison between
[125] F. Pianosi, K. Beven, J. Freer, J.W. Hall, J. Rougier, D.B. Stephenson, T. Wagener, monitored and simulated data using evolutionary algorithms: Reducing the
Sensitivity analysis of environmental models: A systematic review with performance gap in dynamic building simulation, J. Build. Eng. 17 (2018) 96–
practical workflow, Environ. Model. Software 79 (2016) 214–232, https://doi. 106, https://doi.org/10.1016/j.jobe.2018.02.003.
org/10.1016/j.envsoft.2016.02.008. [150] J. Lee, S. Yoo, J. Kim, D. Song, H. Jeong, Improvements to the customer baseline
[126] M.D. Morris, Factorial sampling plans for preliminary computational load (cbl) using standard energy consumption considering energy efficiency
experiments, Technometrics 33 (2) (1991) 161–174. and demand response, Energy 144 (2018) 1052–1063, https://doi.org/
[127] F. Campolongo, J. Cariboni, A. Saltelli, An effective screening design for 10.1016/j.energy.2017.12.044.
sensitivity analysis of large models, Environ. Model. Software 22 (10) (2007) [151] M. De Rosa, M. Brennenstuhl, C. Cabrera, U. Eicker, D. Finn, An iterative
1509–1518. methodology for model complexity reduction in residential building
[128] K. Kim, J. Haberl, Development of a home energy audit methodology for simulation, Energies 12 (12). doi:10.3390/en12122448..
determining energy and cost efficient measures using an easy-to-use [152] C. Cornaro, S. Rossi, S. Cordiner, V. Mulone, L. Ramazzotti, Z. Rinaldi, Energy
simulation: Test results from single-family houses in texas, usa, Build. performance analysis of stile house at the solar decathlon, Lessons learned, J.
Simul. 9 (6) (2016) 617–628, https://doi.org/10.1007/s12273-016-0299-y. Build. Eng. 13 (2017) (2015) 11–27, https://doi.org/10.1016/j.
[129] W. Tian, A review of sensitivity analysis methods in building energy analysis, jobe.2017.06.015.
Renew. Sustain. Energy Rev. 20 (2013) 411–419, https://doi.org/10.1016/j. [153] E. Carlon, V. Verma, M. Schwarz, L. Golicza, A. Prada, M. Baratieri, W.
rser.2012.12.014. Haslinger, C. Schmidl, Experimental validation of a thermodynamic boiler
[130] J. Robertson, B. Polly, J. Collis, Reduced-order modeling and simulated model under steady state and dynamic conditions, Appl. Energy 138 (2015)
annealing optimization for efficient residential building utility bill 505–516, https://doi.org/10.1016/j.apenergy.2014.10.031.
calibration, Appl. Energy 148 (2015) 169–177, https://doi.org/10.1016/j. [154] M. Bhandari, S. Shrestha, J. New, Evaluation of weather datasets for building
apenergy.2015.03.049. energy simulation, Energy Build. 49 (2012) 109–118, https://doi.org/10.1016/
[131] R. Enríquez, M. Jiménez, M. Heras, Towards non-intrusive thermal load j.enbuild.2012.01.033.
monitoring of buildings: Bes calibration, Appl. Energy 191 (2017) 44–54, [155] A. Persily, A. Musser, S.J. Emmerich, Modeled infiltration rate distributions for
https://doi.org/10.1016/j.apenergy.2017.01.050. us housing, Indoor air 20 (6) (2010) 473–485, https://doi.org/10.1111/j.1600-
[132] F. Tüysüz, H. Sözer, Calibrating the building energy model with the short 0668.2010.00669.x.
term monitored data: A case study of a large-scale residential building, [156] D. Kim, S. Cox, H. Cho, P. Im, Model calibration of a variable refrigerant flow
Energy Build. 224. doi:10.1016/j.enbuild.2020.110207.. system with a dedicated outdoor air system: A case study, Energy Build. 158
[133] K. Kim, J. Haberl, Development of methodology for calibrated simulation in (2018) 884–896, https://doi.org/10.1016/j.enbuild.2017.10.049.
single-family residential buildings using three-parameter change-point [157] S. Nagpal, C. Mueller, A. Aijazi, C. Reinhart, A methodology for auto-
regression model, Energy and Buildings 99 (2015) 140–152, cited By 22. calibrating urban building energy models using surrogate modeling
doi:10.1016/j.enbuild.2015.04.032.. techniques, J. Build. Performance Simul. 12 (1) (2019) 1–16, https://doi.org/
[134] A. Elharidi, P. Tuohy, M. Teamah, A. Hanafy, Energy and indoor environmental 10.1080/19401493.2018.1457722.
performance of typical egyptian offices: Survey, baseline model and [158] M. Manfren, B. Nastasi, Parametric performance analysis and energy model
uncertainties, Energy Build. 135 (2017) 367–384, https://doi.org/10.1016/j. calibration workflow integration – a scalable approach for buildings, Energies
enbuild.2016.11.011. 13(3). doi:10.3390/en13030621..
[135] B. Glasgo, C. Hendrickson, I.L. Azevedo, Assessing the value of information in [159] S. Asadi, E. Mostavi, D. Boussaa, M. Indaganti, Building energy model
residential building simulation: Comparing simulated and actual building calibration using automated optimization-based algorithm, Energy Build.
loads at the circuit level, Appl. Energy 203 (2017) 348–363, https://doi.org/ 198 (2019) 106–114, https://doi.org/10.1016/j.enbuild.2019.06.001.
10.1016/j.apenergy.2017.05.164. [160] D. Chakraborty, H. Elzarka, Performance testing of energy models: are we
[136] I. Allard, T. Olofsson, G. Nair, Energy evaluation of residential buildings: using the right statistical metrics?, J Build. Performance Simul. 11 (4) (2018)
Performance gap analysis incorporating uncertainties in the evaluation 433–448, https://doi.org/10.1080/19401493.2017.1387607.
methods, Build. Simul. 11 (4) (2018) 725–737, https://doi.org/10.1007/ [161] G.R. Ruiz, C.F. Bandera, Validation of calibrated energy models: Common
s12273-018-0439-7. errors, Energies 10 (10) (2017) 1587, https://doi.org/10.3390/en10101587.
[137] A. Mihai, R. Zmeureanu, Bottom-up evidence-based calibration of the hvac [162] T.A. Reddy, I. Maor, C. Panjapornpon, Calibrating detailed building energy
air-side loop of a building energy model, J. Build. Performance Simul. 10 (1) simulation programs with measured data-part i: General methodology (rp-
(2017) 105–123, https://doi.org/10.1080/19401493.2016.1152302. 1051), Hvac&R Res. 13 (2) (2007) 221–241, https://doi.org/10.1080/
[138] F. Ascione, N. Bianco, T. Iovane, G. Mauro, D. Napolitano, A. Ruggiano, L. 10789669.2007.10390952.
Viscido, A real industrial building: Modeling, calibration and pareto [163] J.K. Kissock, J.S. Haberl, D.E. Claridge, Inverse modeling toolkit: numerical
optimization of energy retrofit, J. Build. Eng. 29. doi:10.1016/j. algorithms, ASHRAE Trans. 109 (2003) 425.
jobe.2020.101186.. [164] J.S. Haberl, A. Sreshthaputra, D.E. Claridge, J.K. Kissock, Inverse model toolkit:
[139] R. Yin, S. Kiliccote, M. Piette, Linking measurements and models in application and testing, ASHRAE Trans. 109 (2003) 435.
commercial buildings: A case study for model calibration and demand [165] EVO, Uncertainty assessment for ipmvp, international performance
response strategy evaluation, Energy Build. 124 (2016) 222–235, https://doi. measurement and verification protocol, Efficiency Valuation Organization..
org/10.1016/j.enbuild.2015.10.042. [166] C. Cerezo, J. Sokol, S. AlKhaled, C. Reinhart, A. Al-Mumin, A. Hajiah,
[140] N. Li, Z. Yang, B. Becerik-Gerber, C. Tang, N. Chen, Why is the reliability of Comparison of four building archetype characterization methods in urban
building simulation limited as a tool for evaluating energy conservation building energy modeling (ubem): A residential case study in kuwait city,
measures?, Appl Energy 159 (2015) 196–205, https://doi.org/10.1016/j. Energy Build. 154 (2017) 321–334, https://doi.org/10.1016/j.
apenergy.2015.09.001. enbuild.2017.08.029.
[141] C. Aparicio-Fernéndez, J.-L. Vivancos, P. Cosar-Jorda, R. Buswell, Energy [167] T. Gneiting, A.E. Raftery, Strictly proper scoring rules, prediction, and
modelling and calibration of building simulations: A case study of a domestic estimation, J. Am. Stat. Assoc. 102 (477) (2007) 359–378, https://doi.org/
building with natural ventilation, Energies 12 (17). doi:10.3390/ 10.1198/016214506000001437.
en12173360.. [168] A.K. Persily, Airtightness of commercial and institutional buildings: blowing
[142] D. Yan, W. O’Brien, T. Hong, X. Feng, H.B. Gunay, F. Tahmasebi, A. Mahdavi, holes in the myth of tight buildings..
Occupant behavior modeling for building performance simulation: Current [169] F.G.N. Li, A. Smith, P. Biddulph, I.G. Hamilton, R. Lowe, A. Mavrogianni, E.
state and future challenges, Energy Build. 107 (2015) 264–278, https://doi. Oikonomou, R. Raslan, S. Stamp, A. Stone, A. Summerfield, D. Veitch, V. Gori,
org/10.1016/j.enbuild.2015.08.032. T. Oreszczyn, Solid-wall u-values: heat flux measurements compared with
[143] Y.-S. Kim, M. Heidarinejad, M. Dahlhausen, J. Srebric, Building energy model standard assumptions, Build. Res. Inf. 43 (2) (2015) 238–252, https://doi.org/
calibration with schedules derived from electricity use data, Appl. Energy 190 10.1080/09613218.2014.967977.
(2017) 997–1007, https://doi.org/10.1016/j.apenergy.2016.12.167. [170] V. Gori, V. Marincioni, P. Biddulph, C.A. Elwell, Inferring the thermal
[144] A. Chong, G. Augenbroe, D. Yan, Occupancy data at different spatial resistance and effective thermal mass distribution of a wall from in situ
resolutions: Building energy performance and model calibration, Appl. measurements to characterise heat transfer at both the interior and exterior
Energy 286 (2021), https://doi.org/10.1016/j.apenergy.2021.116492 116492. surfaces, Energy Build. 135 (2017) 398–409, https://doi.org/10.1016/j.
[145] G. Augenbroe, The role of simulation in performance-based building, in: J.L. enbuild.2016.10.043.
Hensen, R. Lamberts (Eds.), Building performance simulation for design and [171] D.H. Yi, D.W. Kim, C.S. Park, Parameter identifiability in bayesian inference for
operation, Routledge, 2019, Ch. 10, p. 343.. building energy models, Energy Build. 198 (2019) 318–328, https://doi.org/
10.1016/j.enbuild.2019.06.012.
20
A. Chong, Y. Gu and H. Jia Energy & Buildings 253 (2021) 111533
[172] T. Hong, Y. Chen, X. Luo, N. Luo, S.H. Lee, Ten questions on urban building [180] I. Gaetani, P.-J. Hoes, J.L. Hensen, Occupant behavior in building energy
energy modeling, Build. Environ. 168 (2020), https://doi.org/10.1016/j. simulation: Towards a fit-for-purpose modeling strategy, Energy Build. 121
buildenv.2019.106508 106508. (2016) 188–204, https://doi.org/10.1016/j.enbuild.2016.03.038.
[173] C. Miller, D. Thomas, J. Kämpf, A. Schlueter, Urban and building multiscale co- [181] I. Gaetani, P.-J. Hoes, J.L. Hensen, A stepwise approach for assessing the
simulation: case study implementations on two university campuses, J. Build. appropriate occupant behaviour modelling in building performance
Performance Simul. 11 (3) (2018) 309–321, https://doi.org/10.1080/ simulation, J. Build. Performance Simul. 13 (3) (2020) 362–377, https://doi.
19401493.2017.1354070. org/10.1080/19401493.2020.1734660.
[174] Y. Chen, T. Hong, X. Luo, B. Hooper, Development of city buildings dataset for [182] S. Zhan, A. Chong, Data requirements and performance evaluation of model
urban building energy modeling, Energy Build. 183 (2019) 252–265, https:// predictive control in buildings: A modeling perspective, Renew. Sustain.
doi.org/10.1016/j.enbuild.2018.11.008. Energy Rev. (2021), https://doi.org/10.1016/j.rser.2021.110835 110835.
[175] Y.Q. Ang, Z.M. Berzolla, C.F. Reinhart, From concept to application: A review [183] S. Pfenninger, J. DeCarolis, L. Hirth, S. Quoilin, I. Staffell, The importance of
of use cases in urban building energy modeling, Appl. Energy 279 (2020), open data and software: Is energy research lagging behind?, Energy Policy
https://doi.org/10.1016/j.apenergy.2020.115738 115738. 101 (2017) 211–215, https://doiorg/10.1016/j.enpol.2016.11.046.
[176] F. Noardo, L. Harrie, K.A. Ohori, F. Biljecki, C. Ellul, T. Krijnen, H. Eriksson, D. [184] M. McNutt, J. Unite Reproducibility (2014), https://doi.org/10.1126/science.
Guler, D. Hintz, M.A. Jadidi, M. Pla, S. Sanchez, V.-P. Soini, R. Stouffs, J. aaa1724.
Tekavec, J. Stoter, Tools for BIM-GIS Integration (IFC Georeferencing and [185] C. Boettiger, An introduction to docker for reproducible research, ACM
Conversions): Results from the GeoBIM Benchmark 2019, ISPRS Int. J. Geo-Inf. SIGOPS Oper. Syst. Rev. 49 (1) (2015) 71–79, https://doi.org/10.1145/
9 (9) (2020) 502, https://doi.org/10.3390/ijgi9090502. 2723872.2723882.
[177] F. Biljecki, J. Lim, J. Crawford, D. Moraru, H. Tauscher, A. Konde, K. Adouane, S. [186] S. Pfenninger, L. Hirth, I. Schlecht, E. Schmid, F. Wiese, T. Brown, C. Davis, M.
Lawrence, P. Janssen, R. Stouffs, Extending CityGML for IFC-sourced 3D city Gidden, H. Heinrichs, C. Heuberger, et al., Opening the black box of energy
models, Autom. Constr. 121 (2021), https://doi.org/10.1016/j. modelling: Strategies and lessons learned, Energy Strategy Rev. 19 (2018)
autcon.2020.103440 103440. 63–71, https://doi.org/10.1016/j.esr.2017.12.002.
[178] E.J. Rykiel Jr, Testing ecological models: the meaning of validation, Ecol. [187] B.L. Ball, N. Long, K. Fleming, C. Balbach, P. Lopez, An open source analysis
Model. 90 (3) (1996) 229–244, https://doi.org/10.1016/0304-3800(95) framework for large-scale building energy modeling, J. Build. Performance
00152-2. Simul. 13 (5) (2020) 487–500, https://doi.org/10.1080/
[179] M. Trčka, J.L. Hensen, Overview of hvac system simulation, Autom. Constr. 19 19401493.2020.1778788.
(2) (2010) 93–99, https://doi.org/10.1016/j.autcon.2009.11.019. [188] H. Jia, A. Chong, eplusr, A framework for integrating building energy
simulation and data-driven analytics, Energy Build. (2021), https://doi.org/
10.1016/j.enbuild.2021.110757 110757.
21