Berkly AHUs qt8h08x3xd
LBL Publications
Title
Development and Implementation of Fault-Correction Algorithms in Fault Detection and
Diagnostics Tools
Permalink
https://escholarship.org/uc/item/8h08x3xd
Journal
Energies, 13(10)
ISSN
1996-1073
Authors
Lin, Guanjing
Pritoni, Marco
Chen, Yimin
et al.
Publication Date
2020
DOI
10.3390/en13102598
Peer reviewed
Abstract: A fault detection and diagnostics (FDD) tool is a type of energy management and
information system that continuously identifies the presence of faults and efficiency improvement
opportunities through a one-way interface to the building automation system and the application
of automated analytics. Building operators on the leading edge of technology adoption use FDD
tools to enable median whole-building portfolio savings of 8%. Although FDD tools can inform
operators of operational faults, currently an action is always required to correct the faults to generate
energy savings. A subset of faults, however, such as biased sensors, can be addressed automatically,
eliminating the need for staff intervention. Automating this fault “correction” can significantly
increase the savings generated by FDD tools and reduce the reliance on human intervention.
Doing so is expected to advance the usability and technical and economic performance of FDD
technologies. This paper presents the development of nine innovative fault auto-correction algorithms
for Heating, Ventilation, and Air Conditioning (HVAC) systems. When the auto-correction routine is
triggered, it overwrites control setpoints or other variables to implement the intended changes. It also
discusses the implementation of the auto-correction algorithms in commercial FDD software products,
the integration of these strategies with building automation systems and their preliminary testing.
Keywords: fault correction; fault detection and diagnostics; building operation; energy efficiency;
field testing
1. Introduction
Commercial buildings constitute 18% of the U.S. primary energy consumption [1] and account for
$149 billion in annual energy expenditures [2]. Much of this consumption is due to operational waste,
representing a tremendous potential for savings. The literature indicates that median whole-building
savings of 16% are achieved by commissioning existing buildings [3] and that 5–30% of commercial
building energy use is wasted due to problems associated with controls [4–9].
Commercially available fault detection and diagnostics (FDD) tools provide a means of
monitoring-based commissioning, through which instances of operational inefficiency can be
continuously identified, isolated, and surfaced for resolution by operations and maintenance staff.
Today’s FDD technology has been documented to enable whole building savings of 8% on average,
across users [10]. These technologies integrate with building automation systems (BASs) or can be
implemented as retrofit add-ons to existing equipment, and continuously analyze operational data
streams across many system types and configurations. This is in contrast to the historically typical
variants of FDD that are delivered as original equipment manufacturer-embedded equipment features
or handheld FDD devices that rely upon temporary field measurements.
Figure 1 represents an idealized architecture of a BAS, adapted from American Society of
Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) Guideline 13 [11]. Field devices
Figure 1. Schematic illustration of the integration of building automation system (BAS) data into
fault detection and diagnostics (FDD) products. (BACnet MS/TP - BACnet Master Slave Token
Passing protocol).
Although FDD tools are being used to enable cost-effective energy savings, there remains an
opportunity to advance the state of the technology. In practice, the need for human intervention to
fix faults once they are identified often results in delay or inaction, resulting in additional operations
and maintenance (O&M) costs or lost opportunities. Traditionally, FDD generates recommendations
Energies 2020, 13, 2598 3 of 20
and follow-up actions which are implemented by service technicians or other staff. An emerging
capability comprises the integration of FDD outputs with facility management “work order” or CMMSs
(computerized maintenance management systems). While this makes it possible to automatically
generate work orders from the FDD system, human intervention is still required to implement the
corrective action. Therefore, this work seeks to develop automated fault-correction approaches and
integrate them with commercial FDD technology offerings, thereby closing the loop between the
passive diagnostics and active control. This is possible by converting the one-way BAS interface into a
two-way interface, as is done with supervisory predictive control technologies that are emerging in
the market.
It is not possible to automate the correction of mechanical faults such as failed actuators; however,
there is nonetheless a compelling set of operational problems that are detectable in today’s FDD
offerings and are correctable through the software-based manipulation of the BAS parameters that
can be exposed to external applications via BACnet. For example, Fernandez et al. [6] assessed
control problems in commercial buildings, as well as their prevalence and whole-building energy
impact for key commercial building sectors. Among the most common faults, those related to biased
sensors, improper control parameter settings, and inefficient schedules have a significant impact and
high prevalence rates. Automating the correction of these types of faults can increase the savings
realized through the use of FDD tools and reduce the extent to which savings are dependent upon
human intervention.
The academic and technical literature has extensively covered the development of automated FDD
applied to HVAC and lighting systems [13,14]; however, very little has been published on the automatic
correction of the identified faults via the actual control system. One set of relevant papers stems from the
vast literature on rule-based FDD algorithms. Fernandez et al. [15,16] developed passive and proactive
fault auto-correction algorithms for various HVAC components and systems. The proposed methods
correct faults that include biased air-handler unit (AHU) mixed air (MA), outside air (OA),
and return air (RA) temperature and humidity sensors; damper control hunting; minimum outdoor air
damper too open/closed; and manual overrides in large HVAC systems. Using the same approach,
Brambley et al. [17] extended Fernandez et al. [15,16] by adding correction routines for the biased
AHU supply air (SA) temperature and flow rate sensors and the biased variable-air-volume (VAV)
box discharge air temperature and flow rate sensors. This project implemented and tested a subset of
these algorithms (sensor bias and minimum outdoor air damper position) in a laboratory experiment.
This research stopped short of validating the developed solutions in physical buildings or integrating
them with existing BAS and commercial FDD products.
Related to the concept of fault correction is a body of work in the building control literature
that focuses on fault tolerant control. The purpose of a fault tolerant controller is to maintain the
proper operation of a system despite the presence of faults [18,19]. These approaches have been widely
adopted in other industries for safety-critical systems such as nuclear power plants, spacecraft and
aircraft. In the context of buildings, Padilla et al. [20] developed a model-based strategy which aims to
replace defective sensors in AHUs with “virtual sensors.” The signal generated from these “virtual
sensors” can be used in the AHU control system when the actual physical sensors behave abnormally.
Supply air temperature and pressure sensor faults are effectively corrected by using the proposed
algorithms. Wang et al. [21] developed a supervisory control scheme that adapts to the presence of a
measurement error in an outdoor air flow rate. The method uses neural network models to estimate
the correct behavior of the faulty sensor and to maintain indoor air quality while minimizing energy
use. Hao et al. [22] employed principal component analysis to develop fault-tolerant control and
data recovery in the HVAC monitoring system. Bengea et al. [23] developed a fault-tolerant optimal
control strategy for an HVAC system integrating FDD and model predictive control. The output of the
FDD algorithm is used to continuously update the model’s predictive control algorithm parameters.
The approaches described in these papers offer innovations to the state of the art, yet they are not
readily implemented in today’s buildings control systems. This is because they comprise strategies
that are not supported by traditional BAS capabilities. Similarly, while the literature focuses on the
development of these advanced controllers, it does not explore their integration with existing FDD
technologies. An additional practical challenge is that a large volume of non-faulty data under various
operational conditions is typically needed to train the models employed in these solutions.
This paper complemented and extended previous work in three ways. (1) It developed a
comprehensive set of fault auto-correction algorithms designed to be integrated with commercial
FDD tools. These algorithms target incorrectly programmed schedules, manual control “lock out,”
sensor bias, control hunting, rogue zone, and suboptimal setpoints/setpoints setback. Typically,
commercial FDD tools are developed as a software layer on top of the existing BAS. There exists a
natural separation of roles in this arrangement, in which the BAS actively controls the building and the
FDD tool observes its operation and provides insights and recommendations to the building manager.
The new auto-correction algorithms afford the FDD technology a certain degree of control capability.
(2) It conducted preliminary testing and performance validation during which two auto-correction
routines were deployed in a commercial FDD tool and tested on two AHUs in a real building.
The enhanced FDD tool was able to correct faults successfully. (3) It presented the challenges of
the integration of developed auto-correction algorithms into commercial FDD tools along with the
solutions through work with three industry partners. New insights were gained by implementing the
pseudo-code developed by the research team in real systems and real buildings. Sections 2–4 present
the auto-correction algorithms, preliminary testing and the implementation changes and solutions,
respectively. Section 5 concludes the paper and describes future work.
Table 1. Cont.
For each of the faults in Table 1, correction algorithms were developed (for faults 1, 3 and 5–9) or
adapted from the existing literature (for faults 2 and 4). The auto-correctable faults in Table 1 were
divided into two categories: faults 1–5 were in the “Fault” category, which indicated problems that
violated the intended operation of the equipment (e.g., sensor bias); faults 6–9 were in the “Opportunity”
category, which indicated problems that represented potential to improve the current operation of
the equipment (e.g., improve a setpoint reset). This distinction was made to differentiate between the
intent of restoring operation to what was originally intended and the intent of optimizing control.
As the objective of this study was to develop automated fault-correction algorithms that could be
integrated with commercial FDD and BAS products, the auto-correction algorithms described in this
section were decoupled from the fault detection and diagnostics algorithms embedded in the FDD
tools. This permitted the applicability of the developed correction algorithms across a variety of FDD
technologies that employed different FDD rules and algorithms. Furthermore, it was assumed that
the FDD tools were able to detect the faults of focus, as they represented some of the more commonly
encountered faults in commercial buildings.
Figure 2 shows the flow chart of the general auto-correction process. In this process, after the
FDD algorithm generates a fault flag of a specific fault, the fault auto-correction algorithm is initiated
to correct this fault with the approval from the building operator. Control_variable_being_overwritten
is the key element in the auto-correction process. The algorithm overwrites this variable
(Control_variable_being_overwritten_current) with a new value (Control_variable_being_overwritten_new).
The control_variable_being_overwritten_current is the one identified in the FDD algorithm
to be associated with the problematic value (fault) or potential to improve (opportunity).
The control_variable_being_overwritten_new is the same variable that has the correct value (fault) or
optimized value (opportunity). All of the auto-correction algorithms developed in this work followed
this structure, with different control variables overwritten, and different ways to determine the correct
or improved value of the variable.
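The general process described above can be illustrated with a minimal Python sketch. It is not the implementation used in any of the commercial FDD tools: the names FaultFlag, auto_correct, bas_points, compute_new_value, and operator_approves are hypothetical, and the BAS is modeled as a plain dictionary of point values rather than a BACnet connection.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class FaultFlag:
    """Hypothetical record emitted by the FDD algorithm for one detected fault."""
    fault_name: str
    variable: str  # name of the control_variable_being_overwritten


def auto_correct(
    flag: FaultFlag,
    bas_points: Dict[str, float],
    compute_new_value: Callable[[float], float],
    operator_approves: Callable[[FaultFlag], bool],
) -> Optional[Tuple[float, float]]:
    """Overwrite the flagged control variable with a corrected or improved value.

    Returns (current_value, new_value) if the overwrite happened, otherwise None.
    """
    # The correction only proceeds with approval from the building operator.
    if not operator_approves(flag):
        return None
    current = bas_points[flag.variable]   # control_variable_being_overwritten_current
    new = compute_new_value(current)      # control_variable_being_overwritten_new
    bas_points[flag.variable] = new       # overwrite the value held by the BAS
    return current, new
```

Each algorithm in this section would differ only in which variable is named in the flag and in how compute_new_value derives the correct or improved value.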
Each auto-correction algorithm is presented and discussed in the following.
lockout temperature setpoints, which are the outside air temperatures above (below) which the
outside air damper will return to its minimum position.
Figure 3. Flowchart of the supply air temperature (SAT) sensor bias fault auto-correction algorithm.
(a) Approach 1: overwrite the SAT temperature value, and (b) Approach 2: overwrite the SAT setpoint.
2.4. Damper/Valve/Fan/Pump Control Hunting Due to Improper Proportion Gain
In contrast to the other algorithms, the auto-correction of control hunting due to improper
proportion gain employs a trial and error procedure [15]. The control_variable_being_overwritten is
the proportional–integral–derivative (PID) controller parameter proportion gain (Kp). In the
auto-correction process, the Kp is continually adjusted to find the appropriate value that eliminates
the hunting behavior. When the FDD algorithm flags an improper Kp causing the hunting fault, the
auto-correction algorithm is initiated. First, a maximum auto-correction duration threshold
(T_AC_thresh) is set to avoid an endless auto-correction process. Note that the settling time of an
actuator during the control response may vary due to different actuator control characteristics.
For example, for a VAV terminal unit damper, the settling time is typically in the order of one or two
minutes, but the settling time for a cooling coil valve may be several minutes. Then, the current value
of the Kp is compared to a Kp_threshold, in this case 0.2. This test is meant to avoid an unacceptably
long settling time under pure integral control. If the Kp value is above the Kp_threshold, the Kp is
decreased by 10% [15]. Then, the algorithm starts a proactive test scenario to see if the hunting issue
still persists, by changing the setpoint (T_set) of the damper/valve/fan/pump to trigger the
component’s movement. If the component is still hunting, the procedure is repeated; otherwise, the
procedure is terminated. If the Kp reaches the Kp_threshold and there is still a hunting fault, then it
is flagged as an error and the Kp is reset to the original value (Figure 4).
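The trial and error procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the published pseudo-code: correct_hunting, still_hunting, and trial_duration_s are hypothetical names, the proactive test is abstracted into a callback that reports whether hunting persists at a given Kp, and treating an expired T_AC_thresh like a failed correction (Kp reset to its original value) is an assumption, since the text only states that the threshold prevents an endless process.

```python
from typing import Callable, Tuple


def correct_hunting(
    kp_original: float,
    still_hunting: Callable[[float], bool],
    kp_threshold: float = 0.2,       # Kp_threshold from the text
    t_ac_thresh_s: float = 3600.0,   # T_AC_thresh; the value is an assumption
    trial_duration_s: float = 300.0, # time allotted per proactive test; assumption
) -> Tuple[float, str]:
    """Lower Kp in 10% steps until hunting stops, Kp hits the threshold, or time runs out."""
    kp = kp_original
    elapsed = 0.0
    while elapsed < t_ac_thresh_s:
        if kp <= kp_threshold:
            # Kp reached the threshold and the fault persists: flag an error and reset.
            return kp_original, "error"
        kp *= 0.9                      # decrease Kp by 10%
        elapsed += trial_duration_s    # proactive test: change the setpoint, wait to settle
        if not still_hunting(kp):
            return kp, "corrected"
    # Maximum auto-correction duration exceeded (handling assumed, see lead-in).
    return kp_original, "timeout"
```

In practice, trial_duration_s would be chosen per actuator type, since the text notes that a VAV damper settles in one or two minutes while a cooling coil valve may need several.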
Figure 4. Flowchart of Algorithm 2.4: control hunting due to improper proportion gain (T_AC_thresh -
a maximum auto-correction duration threshold, Kp - controller parameter proportion gain).
2.5. Rogue Zone
ASHRAE Guideline 36 [24] defines high-performance control sequences for AHU–VAV systems.
The “Trim and Respond” logic (see Sections 2.8 and 2.9) is adopted to reset the supply air temperature
and static pressure setpoints at an AHU. The adjustment of these setpoints depends on the number of
cooling “requests” generated by downstream zones that are served by the same AHU. For each time
step, the change value of setpoint (SPchange) is determined by Equations (1) and (2) below:
$$\mathrm{SP}_{change} = \mathrm{SP}_{res} \times (R - I) \qquad (1)$$
$$R = \sum_{i} \mathrm{IM}_{i} \, \mathrm{Request}_{i} \qquad (2)$$
where SPres is a unit respond amount (e.g., 0.06 inches for static pressure setpoint), R is the total
number of cooling requests from the downstream zones, I is the defined number of ignored requests,
i is the indicator of the downstream zone, IM is the importance multiplier that is used in the control
sequence to decide if the cooling requests from the zone level should be used to control the upstream
AHU, and Request is the cooling request from the zone. Therefore, if there is a rogue zone that
continuously sends cooling requests whenever its schedule is on, due to the zone-level equipment
problems, the parameter R will always include this request, and it keeps the setpoints in the control
loop to its high end. Excluding rogue zones from the corresponding reset control strategies improves
operation and increases energy savings. After the zone-level equipment problems that lead to the
rogue zone are fixed, the rogue zone is no longer rogue, and all the control variables that are
overwritten during the auto-correction process change back to their original value.
Two correction strategies were developed to eliminate the rogue zone impacts (i.e., to ignore the
cooling request from the rogue zone). The first is to overwrite I in Equation (1). The auto-correction
algorithm increases I by n for each currently identified rogue zone. The value of n is the same as the
number of cooling requests determined in the control sequence of that rogue zone. The second is to
overwrite the IM of the rogue zone in Equation (2). When the FDD tool flags the rogue zone fault,
the IM of the rogue zone is overwritten to be zero. Therefore, the cooling requests from the rogue zone
can be removed.
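The two correction strategies can be compared with a short Python sketch built directly on Equations (1) and (2). The function names and the dictionary-based zone representation are hypothetical and chosen only for illustration. The sketch also shows that, when the rogue zone has an importance multiplier of 1, both strategies yield the same setpoint change.

```python
from typing import Dict, Iterable


def total_requests(requests: Dict[str, int], importance: Dict[str, float]) -> float:
    """Equation (2): R is the sum over zones of IM_i * Request_i."""
    return sum(importance[zone] * requests[zone] for zone in requests)


def sp_change(sp_res: float, requests: Dict[str, int],
              importance: Dict[str, float], ignored: float) -> float:
    """Equation (1): SPchange = SPres * (R - I)."""
    return sp_res * (total_requests(requests, importance) - ignored)


def overwrite_ignored(ignored: float, requests: Dict[str, int],
                      rogue_zones: Iterable[str]) -> float:
    """Strategy 1: increase I by n for each rogue zone, n being that zone's requests."""
    return ignored + sum(requests[zone] for zone in rogue_zones)


def overwrite_importance(importance: Dict[str, float],
                         rogue_zones: Iterable[str]) -> Dict[str, float]:
    """Strategy 2: overwrite the IM of every rogue zone to zero."""
    rogue = set(rogue_zones)
    return {zone: (0.0 if zone in rogue else im) for zone, im in importance.items()}


# Illustrative values only: three zones, zone "z3" is rogue and sends 3 requests.
requests = {"z1": 1, "z2": 0, "z3": 3}
importance = {"z1": 1.0, "z2": 1.0, "z3": 1.0}
change_1 = sp_change(0.06, requests, importance,
                     overwrite_ignored(2, requests, ["z3"]))
change_2 = sp_change(0.06, requests,
                     overwrite_importance(importance, ["z3"]), 2)
```

Strategy 1 leaves the zone-level configuration untouched and changes a single AHU-level parameter, whereas strategy 2 requires write access to a per-zone parameter.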
functionality between the FDD and the BAS, (2) build an auto-correction interface to communicate with
the building operator, (3) translate the algorithms into the FDD programming environment, (4) modify
the BAS programming of the specific building to integrate the new control actions sent by the FDD
tool, and (5) commission and test the new system. Further details are presented in Lin et al. [26].
This section illustrates the test results of two auto-correction algorithms: “Rogue zone” and “Improve
AHU supply air temperature setpoint reset” for one implementation partner. Section 4 summarizes
the challenges that were faced by three partners during the implementation process, as well as the
solutions that were used by one or more project partners to mitigate them.
In the preliminary testing, the two routines were deployed in a commercial FDD product
(SkySpark® by SkyFoundry) and tested on two AHUs in a building in Berkeley, California, US, between
March 3 and April 5, 2020. The goal of this preliminary test was to determine whether the enhanced
FDD solutions were able to correct faults without adverse operational effects.
3.1. Description of the Testing Site and Equipment
Table 2 summarizes the test site and equipment information. AHU01 and AHU02 are structurally
identical. Figure 5 shows the BAS graphics (i.e., native dashboard) for one of the two AHUs.
Table 2. Test site information.
Building Type: Mixed laboratory and office space
Size (m2): 8919
Building Schedule: Labs: 24/7 operation; Offices: 4 a.m.–9 p.m., Monday–Sunday
HVAC Configuration: 3 chillers and 2 AHUs (AHU01 and AHU02), covering about 90% of the floor area,
and the connected zones (n = 83 and n = 80, respectively)
BAS Brand and Model: Johnson Controls (JCI) Metasys
Figure5.5. BAS
Figure BAS graphics
graphics for the AHU02 at the test site.
site. AHU01
AHU01has
hasaasimilar
similarstructure.
structure.
Both AHU01 and AHU02 were controlled by a control sequence implemented in the native BAS
control language and hosted on local controllers. Each AHU was controlled independently. The supply
air temperature cooling and heating setpoint was reset based on the algorithm highlighted below in
plain English:
• If the AHU is enabled (based on schedules, normally 24/7):
◦ Calculate the average cooling demand output for all the zones served by the AHU (cooling
demand output is the output calculated by the PI[D] loop based on the proportional, integral
[and derivative] component of the difference between zone cooling setpoint and zone
temperature);
◦ Constrain the results between min = 3% and max = 12%;
◦ Linearly map the average output to a calculated cooling setpoint between 18.3 °C and 12.8 °C.
The value of 3% average cooling output is mapped to 18.3 °C, 12% is mapped to 12.8 °C,
and all the values in between are calculated linearly;
◦ The heating setpoint is fixed to 12.8 °C, except for when the cooling setpoint reaches 12.8 °C.
In that case, the heating setpoint becomes 12.2 °C.
• The economizer damper and the chilled water valve are controlled to maintain the cooling supply
air temperature setpoint. The heating hot water valve is controlled to maintain the heating supply
air temperature setpoint. As a result, when the outside air temperature is between the heating
and the cooling setpoints, the air handling unit typically does not cool or heat the air.
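The plain-English reset above amounts to a clamped linear interpolation, which can be sketched in Python as shown below. The function name existing_sat_setpoints is hypothetical and the sketch is not the native BAS program; only the numeric limits are taken from the description.

```python
from typing import Tuple


def existing_sat_setpoints(avg_cooling_output_pct: float) -> Tuple[float, float]:
    """Return (cooling_setpoint_C, heating_setpoint_C) for the original reset logic."""
    out_min, out_max = 3.0, 12.0   # constraint on the average cooling demand output, %
    sat_max, sat_min = 18.3, 12.8  # cooling setpoint at 3% and at 12% output, deg C

    # Constrain the average output to the [3%, 12%] band.
    x = min(max(avg_cooling_output_pct, out_min), out_max)

    # Linear map: 3% -> 18.3 C, 12% -> 12.8 C.
    fraction = (x - out_min) / (out_max - out_min)
    cooling = sat_max - fraction * (sat_max - sat_min)

    # Heating setpoint is 12.8 C unless the cooling setpoint has reached 12.8 C,
    # which happens exactly when the constrained output is at its maximum.
    heating = 12.2 if x >= out_max else 12.8
    return cooling, heating
```

For example, an average output of 7.5% lies halfway through the band and maps to a cooling setpoint of about 15.55 °C with the heating setpoint at 12.8 °C.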
The existing SAT control strategy is relatively efficient, compared to common practice in the
industry (fixed setpoint or resets based on outdoor temperature or return temperature alone and no
deadband). However, the current strategy presents two limitations: (1) it responds to outlier zones or
rogue zones, although minimally, as the reset is based on the average cooling demand output from
all the zones; and (2) its calibration parameters (e.g., min and max average zone feedback of 3% and
12%, respectively) were established via trial and error and personal judgement. Given the limited
capabilities of the BAS zone controllers (i.e., field devices in Figure 1), the reset strategy was entirely
calculated within the AHU controllers.
The FDD tool connected to the BAS is a commercial product managed by a consultant and the
facility manager of the site. The tool allows for custom programming and bi-directional communication
to the BAS via the BACnet network. In contrast to the BAS, the FDD tool coding language is a modern
scripting language with the ability to use high-level functions that allow the portability of the code
between the buildings and equipment. The two auto-correction algorithms were coded using this
platform and tested on the two AHUs. In the FDD tool, a zone was identified as a rogue zone when
one or more disqualifying conditions were detected for that zone and the zone was sending a request
to the AHU. The zone requests are calculated based on zone PID loop output >95%. Disqualifying
conditions for cooling requests include:
• Leaky reheat valve (VAV box discharge air temperature (DAT) > AHU SAT + 2.8 °C);
• Supply airflow setpoint not met (<90% or >110% of setpoint and delta > 1.4 cubic meter per minute);
• Zone cooling setpoint too low (lower than 22.2 °C unless exempt).
where R’ is the number of net cooling requests from the downstream zones of an AHU; R is the number
of total cooling requests from the downstream zones; I_default is the default number of ignored cooling
requests (set by the user); I_rogue_zones is the number of ignored cooling requests from all the rogue zones;
I_total is the sum of the previous two variables; and I_rogue_zone_i is the number of cooling requests from
the ith identified rogue zone. R’ is calculated by subtracting the sum of all the rogue zones ignored
based on the conditions described above (I_rogue_zones) and a default minimum of the ignored zones
(I_default) from all the requests (R). If the equation leads to a negative result, R’ becomes zero. R’ is used
in the SAT reset calculation below.
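The rogue zone test and the net request calculation described in this subsection can be sketched in Python as follows. This is an illustration written from the verbal description only, not the code deployed in the FDD tool: the Zone fields and function names are hypothetical, each zone is assumed to contribute at most one request, and airflow is assumed to be expressed in cubic meters per minute.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Zone:
    """Hypothetical snapshot of one VAV zone served by the AHU."""
    pid_output_pct: float       # zone cooling PID loop output, %
    dat_c: float                # VAV box discharge air temperature, deg C
    airflow: float              # measured supply airflow, m3/min
    airflow_setpoint: float     # supply airflow setpoint, m3/min
    cooling_setpoint_c: float   # zone cooling setpoint, deg C
    exempt: bool = False        # exempt from the low-setpoint check


def sends_request(zone: Zone) -> bool:
    """A zone sends a cooling request when its PID loop output exceeds 95%."""
    return zone.pid_output_pct > 95.0


def is_disqualified(zone: Zone, ahu_sat_c: float) -> bool:
    """True if any of the three disqualifying conditions holds."""
    leaky_reheat = zone.dat_c > ahu_sat_c + 2.8
    out_of_band = (zone.airflow < 0.9 * zone.airflow_setpoint
                   or zone.airflow > 1.1 * zone.airflow_setpoint)
    airflow_not_met = out_of_band and abs(zone.airflow - zone.airflow_setpoint) > 1.4
    setpoint_too_low = zone.cooling_setpoint_c < 22.2 and not zone.exempt
    return leaky_reheat or airflow_not_met or setpoint_too_low


def net_requests(zones: Iterable[Zone], ahu_sat_c: float, i_default: int) -> int:
    """R' = max(0, R - (I_default + I_rogue_zones))."""
    zones = list(zones)
    r = sum(1 for z in zones if sends_request(z))
    i_rogue = sum(1 for z in zones
                  if sends_request(z) and is_disqualified(z, ahu_sat_c))
    return max(0, r - (i_default + i_rogue))
```

A zone counts as rogue only when it is both disqualified and currently sending a request, so a faulty zone that is not calling for cooling does not change the result.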
3.2.2. Code for “Improve AHU Supply Air Temperature Setpoint Reset”
The supply air temperature cooling setpoint (SAT_spt) is continually reset using “Trim and
Respond” logic between a minimum and maximum setpoint (SATmin = 12.8 °C and SATmax = 18.3 °C).
When the supply air fan is turned on, the initial setpoint is set to SAT0 = 18.3 °C and the reset logic
is active immediately. When active, for every time step t = 5 min, the net cooling request from the
downstream zones (R’) is calculated using Equations (3)–(5) above. If R’ is above zero, SAT_spt is
decreased by a defined respond amount (SATres = 0.06 °C for each request) until the SAT_spt reaches
SATmin; if R’ is equal to zero, the SAT_spt is increased by a fixed amount (SATtrim = 0.12 °C) until the
SAT_spt reaches SATmax.
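One time step of this reset can be sketched in Python as shown below. The function name next_sat_setpoint is hypothetical, and the sketch covers only the setpoint update, not the fan status check or the write to the BAS; the numeric defaults are the values stated above.

```python
def next_sat_setpoint(
    sat_spt: float,
    net_requests: int,
    sat_min: float = 12.8,   # SATmin, deg C
    sat_max: float = 18.3,   # SATmax, deg C
    sat_res: float = 0.06,   # SATres, deg C per net request
    sat_trim: float = 0.12,  # SATtrim, deg C
) -> float:
    """Return the SAT cooling setpoint after one 5-minute Trim and Respond step."""
    if net_requests > 0:
        # Respond: lower the setpoint in proportion to the net requests, floor at SATmin.
        return max(sat_min, sat_spt - sat_res * net_requests)
    # Trim: no net requests, so raise the setpoint slowly, cap at SATmax.
    return min(sat_max, sat_spt + sat_trim)


# Illustrative run starting from SAT0 = 18.3 C with a made-up request sequence.
setpoint = 18.3
history = []
for r_net in [0, 2, 2, 1, 0, 0]:
    setpoint = next_sat_setpoint(setpoint, r_net)
    history.append(round(setpoint, 2))
```

Because the trim step (0.12 °C) is twice the respond amount for a single request, one persistent net request lowers the setpoint more slowly than the logic recovers once requests clear.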
Figure 6. Zone requests per zone (upper left), the ignored requests per zone (upper right), the sum of
the requests, the total of ignored requests and the net requests for all the zones of AHU01 on 4 March
2020. The vertical line marks 11 a.m. on all the plots.
3.3.2. Test Results of “Improve AHU SAT Setpoint Reset” Algorithm
The auto-correction algorithm “Improve AHU SAT setpoint reset” successfully changed the SAT
setpoint of AHU01 and AHU02 in the BAS. As shown in Figure 7, the SAT setpoint changes followed
the routine described in Section 3.2. The supply fan was on for the whole time, since this AHU serves
laboratory areas. When R’ was larger than zero starting at 10:05 a.m., the algorithm slowly reduced
the SAT setpoint by 0.06 °C for each request every five minutes. Starting at 11:50 a.m., the R’ remained
at zero and the routine slowly increased the SAT setpoint by 0.12 °C every five minutes until it reached
SATmax (18.3 °C). The SAT setpoint remained at SATmax until R’ was larger than zero at 2:50 p.m.
Then, the SAT setpoint again slowly decreased when R’ was larger than zero and slowly increased
when R’ was zero.
Because both the original and corrected logic used feedback loops, a direct comparison of the
two was not possible without modeling the dynamic behavior of the system or collecting enough
data to perform a system-level evaluation. However, since the original controller was still active
(for backup purposes) we could qualitatively compare the time at which each algorithm would start
reducing the SAT setpoint in the morning. Figure 8 shows this comparison for two consecutive days
in AHU01. The red line represents the corrected setpoint that was calculated by the algorithm and
the blue line was the actual temperature that tracked the setpoint. The green line depicts the original
logic. As highlighted by the text, the original logic would try to reduce the temperature much earlier
than the corrective algorithm. This behavior was consistent across the testing period for both AHUs.
For AHU01, during 14 days out of the 34 days, the old logic started earlier, while during the other
20 days the system did not require cooling. Conversely, for AHU02, during 13 days the old logic
started earlier, during one day it started later, and the other days it did not require cooling (the test
was conducted during a mild spring). Overall, the preliminary test was successful. It showed the
uninterrupted operation of the two algorithms in two AHUs for more than a month. The SAT tracked
the new setpoint throughout the whole testing period. The new control sequence did not cause any
occupant complaints, and it worked more efficiently than the previous one, although a precise savings
estimate was beyond the scope of the test.
Figure 7. The SAT setpoint of AHU01 after the execution of the auto-correction algorithm (4 March 2020).
Figure 8. Comparison of the new and the old setpoint control strategies in AHU01 during
3–4 March 2020.
4.1. Develop a Secure Two-Way Communication Between the FDD and the BAS
Opening a two-way communication between the FDD system and the BAS was a challenge seen
across all project partners implementing fault auto-correction into their FDD product environment.
FDD tools typically read operational data from the BAS, run analytics and flag faults on the software
interface. Often, they do not have capabilities to write commands directly onto the BAS. As indicated
in Figure 1, the FDD tool commonly collects operational data using three pathways: (1) from the BAS
server database, (2) from a central BAS server via API and (3) directly via the BACnet IP network.
The first pathway prevents the FDD tool from writing back to the control system, therefore it cannot be
used to implement auto-correction procedures. The second one requires BAS-specific interfaces; thus,
implementers tend to avoid it. For this reason, when expanding the one-way interface to two-way
communication, all the partners selected the third pathway to write back directly via the BACnet
IP network.
The project partners mitigated the two-way communication challenge by upgrading their FDD
system infrastructure. Figure 9 illustrates the solutions for the two cloud-based FDD systems. The solid
line shows the original infrastructure, and the red dashed line shows the upgrade. In the first
cloud-based FDD case (Figure 9a), the BACnet stack (a software library that allows users to add a native
BACnet interface to talk to the devices or applications in the BACnet network) of the FDD data
acquisition device was updated to include a “write” function. The local data acquisition device was
also updated to make API requests to the cloud FDD platform to retrieve the auto-correction command
information. This enabled the FDD system to send the auto-corrective command to the local device
and then to the writable properties used to control the BAS. In a cloud-based FDD system of another
partner (Figure 9b), the current BACnet library already had writing capabilities. To enable secure
communication with the cloud, the system architecture was changed. The standard data acquisition
device was paired with a new field device (an auto-correction execution device) specifically designed
to execute the new routines and log the interaction with the BAS. The cloud FDD engine initiated the
auto-corrective command onto the auto-correction execution device. The device then executed the
commands onto the BAS BACnet network and reported back the results to the cloud FDD engine.
The BAS data were still acquired by the existing FDD data acquisition field device and delivered to the
cloud FDD engine. The third FDD system, which was installed on-premise, was already capable of
writing commands via BACnet. Only a setting in the BAS had to be changed to authorize the writes.
Figure 9. Two-way communication infrastructure for (a) the cloud-based FDD system 1, and (b) the
cloud-based FDD system 2.
4.2. Incorporate Operator Approval
The second challenge faced by the partners was incorporating operator approval. The new
auto-correction feature affords the FDD technology a certain degree of control capability. The building
operators may be hesitant to trust this new capability and feel a lack of control. To mitigate this
challenge, one project partner updated the existing interface to make sure the users were allowed to
actively start, interrupt and track the auto-correction activities. Auto-correction enable and disable
functionality was added to the user interface (UI), and the name of the control variables, their current
value, and the new proposed values were provided to increase the operators’ awareness. All the user
and system activities of auto-correction are stored in a history log that is available to the user. Figure 10
shows a simplified mockup of the new UI displaying the auto-correction enable/disable functionality,
action history and other details. Another partner also developed new interfaces for auto-correction
authentication and acknowledgement.
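The enable/disable switch and history log described above could be modeled along the following lines. This is a hedged sketch, not any partner's implementation: the class names, the log format and the point name AHU01.SAT_spt are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class ProposedChange:
    variable: str          # name of the control variable (hypothetical)
    current_value: float   # value currently in the BAS
    proposed_value: float  # value the auto-correction routine wants to write


@dataclass
class AutoCorrectionGate:
    enabled: bool = False                             # operator switch
    history: List[str] = field(default_factory=list)  # user and system log

    def _log(self, actor: str, message: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.history.append(f"{stamp} [{actor}] {message}")

    def set_enabled(self, enabled: bool, operator: str) -> None:
        """Record an operator enabling or disabling auto-correction."""
        self.enabled = enabled
        state = "enabled" if enabled else "disabled"
        self._log(operator, f"auto-correction {state}")

    def request(self, change: ProposedChange) -> bool:
        """Log the proposed change; return True only if writing is allowed."""
        verdict = "applied" if self.enabled else "blocked"
        self._log("system", f"{change.variable}: {change.current_value} -> "
                            f"{change.proposed_value} ({verdict})")
        return self.enabled


gate = AutoCorrectionGate()
change = ProposedChange("AHU01.SAT_spt", 18.3, 18.12)
first = gate.request(change)          # blocked: operator has not enabled it
gate.set_enabled(True, "operator_a")
second = gate.request(change)         # allowed after the operator enables it
print(first, second, len(gate.history))
```

Logging both blocked and applied requests keeps the operator informed of what the routine would have done, which is the awareness goal stated in the text.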
Figure 10. Mockup of the new user interface (UI) developed by one partner, displaying the
auto-correction enable/disable functionality, the action history and other details (ASO: Automatic
System Optimization).
4.3. Manage BAS and Site-Specific Customizations
The traditional separation of the roles between the FDD and the BAS allowed the FDD tools
to develop general algorithms that were independent from some of the details about the BAS and the
implementation of specific control programs. For instance, an algorithm that detects opportunities
to save energy by shortening the AHU schedules does not need to know how these schedules are
implemented in the BAS; it only needs to analyze the data produced by them. However, auto-correcting
the same schedule means overriding the operation of the BAS, so the developers must know many
more details about the specific implementation of the control logic to avoid unintended consequences.
The third challenge confronted was the lack of standardization in the BAS control logic, variables
and interfaces. The implementation partners reported issues in (1) deciphering the BAS control
sequences and identifying the exact control variables to override, (2) gaining access to these variables
and (3) gathering data with frequency and timeliness appropriate to the application. An example
of the first issue is the implementation of the “override manual control” algorithm described in
Section 2.2. Depending upon the BAS, the override can be accomplished via an “override” variable
Manual_override (whose value is 1 when the equipment is in manual control and 0 when it is in automatic
control) or by setting the priority level of the BACnet points (e.g., 8 for manual operator override,
16 for default automatic operation) [27].
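The priority-based variant can be illustrated with a small in-memory model of a commandable BACnet point. In BACnet command prioritization, a point holds 16 priority slots, the lowest-numbered occupied slot determines the present value, and writing a null value to a slot relinquishes it. The sketch below is not a BACnet library; the class, method names and sample values are hypothetical.

```python
from typing import List, Optional

MANUAL_OPERATOR_PRIORITY = 8  # manual operator override
DEFAULT_PRIORITY = 16         # default automatic operation, as in the text


class CommandablePoint:
    """Toy model of a commandable point with a 16-slot priority array."""

    def __init__(self, relinquish_default: float) -> None:
        self.relinquish_default = relinquish_default
        self.priority_array: List[Optional[float]] = [None] * 16

    def write(self, value: Optional[float], priority: int) -> None:
        """Write a value at a priority; None relinquishes that slot."""
        if not 1 <= priority <= 16:
            raise ValueError("priority must be between 1 and 16")
        self.priority_array[priority - 1] = value

    @property
    def present_value(self) -> float:
        # The lowest-numbered non-empty slot wins.
        for value in self.priority_array:
            if value is not None:
                return value
        return self.relinquish_default

    def release_manual_override(self) -> bool:
        """Relinquish priority 8; return True if an override was present."""
        had_override = self.priority_array[MANUAL_OPERATOR_PRIORITY - 1] is not None
        self.write(None, MANUAL_OPERATOR_PRIORITY)
        return had_override


damper = CommandablePoint(relinquish_default=0.0)
damper.write(35.0, DEFAULT_PRIORITY)           # automatic control output
damper.write(100.0, MANUAL_OPERATOR_PRIORITY)  # operator forces it open
print(damper.present_value)                    # 100.0
damper.release_manual_override()
print(damper.present_value)                    # 35.0
```

On a BAS that uses a Manual_override flag instead, the same correction would be a single write of 0 to that variable, which is why the routine has to be adapted per site.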
Accessing the proper control variable was another part of the challenge. The auto-correction
algorithms may require the FDD tools to be able to access the control variables that are not commonly
exposed to the outside by the BAS. An example is the PID-tuning parameters required by the “control
hunting” algorithm. One implementation partner reported being unable to retrieve these points via
BACnet for a site, since the BAS vendor used a proprietary solution. The third issue emerged when
a partner implemented the algorithms described in Section 3. These routines need real-time data
updated every few minutes, since the algorithms are reevaluated continuously, while the existing BAS
was storing it at 15-minute intervals and transmitting it to the FDD tool once a day (to save memory
and bandwidth).
To address the challenges, all the partners had to spend significant time to understand and modify
the BAS programming and setup, in addition to its interface with the FDD tool. The parameters of
the BAS controller, gateway or server were changed to expose the necessary variables, making sure
they could be modified when needed. Sampling frequency and data transfer rate were increased to
implement some of the algorithms. One partner created a virtual point in its code to accommodate
the different override settings of various BASs. The virtual point is a semicolon-delimited string of point
IDs, mapped to the points that need to be released from override in the BAS. A partner reported
that particular care had to be put into matching appropriate data types (e.g., binary, analog with
different precisions, arrays) used by the BACnet protocol, to avoid communication errors. All these
customizations varied by BAS vendor, hardware vintage and site configuration.
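As an illustration of the virtual point described above, the following sketch parses a semicolon-delimited string into a list of point IDs. The function name and the point IDs are hypothetical; the partner's actual format may include additional fields.

```python
from typing import List


def parse_virtual_point(value: str) -> List[str]:
    """Split a semicolon-delimited virtual point into clean point IDs."""
    # Strip surrounding whitespace and drop empty entries such as a
    # trailing semicolon or a doubled delimiter.
    return [part.strip() for part in value.split(";") if part.strip()]


# Hypothetical virtual point value listing three points to release.
point_ids = parse_virtual_point("AHU01.SF_CMD; AHU01.OAD_CMD;;AHU02.SF_CMD;")
print(point_ids)  # ['AHU01.SF_CMD', 'AHU01.OAD_CMD', 'AHU02.SF_CMD']
```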
4.4. Manage Control Conflicts between the BAS and the FDD Tool
The last challenge reported pertained to the conflicts between the BAS and the FDD control actions.
Algorithms that make one-time changes to the BAS operation (e.g., the incorrectly programmed
schedule in Section 2.1) may be overridden by operators or the BAS logic at a later date. It is unclear
whether or not the auto-correction procedure should periodically update these variables. Moreover,
algorithms that continuously change variables may also conflict with the existing BAS sequence of
operation. An example is improving the AHU static pressure setpoint reset (Section 2.8) on a BAS
that already has a reset strategy. There is a need to understand which one takes precedence and if the
existing control sequences should be turned off.
To address the first issue, one implementation partner used an existing feature of the FDD platform
to separately track the active schedule and the most efficient schedule and let the operator decide
which one to activate. In addition, it logged all the changes to that schedule to offer more information
to the user. For the second issue, another partner set up a fallback mechanism in the BAS, for use with
the new FDD auto-correction algorithm that continuously modified the control setpoint. A watchdog
was added in the BAS programming to make sure the FDD tool was online. If communication with the
FDD tool was lost, the BAS reverted to the setpoint generated by the original control logic.
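The fallback behavior can be sketched as a heartbeat watchdog. In practice this logic lives in the BAS programming rather than in Python, and the class name, the 600 s timeout and the setpoint values below are assumptions chosen only to make the example concrete.

```python
from typing import Optional


class SetpointWatchdog:
    """Select the FDD setpoint while the FDD tool is online, else fall back."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_heartbeat_s: Optional[float] = None
        self.fdd_setpoint: Optional[float] = None

    def heartbeat(self, now_s: float, fdd_setpoint: float) -> None:
        """Record a write from the FDD tool together with its timestamp."""
        self.last_heartbeat_s = now_s
        self.fdd_setpoint = fdd_setpoint

    def fdd_online(self, now_s: float) -> bool:
        if self.last_heartbeat_s is None:
            return False
        return (now_s - self.last_heartbeat_s) <= self.timeout_s

    def active_setpoint(self, now_s: float, original_setpoint: float) -> float:
        """Return the setpoint the BAS should use at time now_s."""
        if self.fdd_online(now_s) and self.fdd_setpoint is not None:
            return self.fdd_setpoint
        # Loss of communication: revert to the original control logic.
        return original_setpoint


# Assumed timeout of 600 s, i.e., two missed 5 min updates.
wd = SetpointWatchdog(timeout_s=600.0)
wd.heartbeat(now_s=0.0, fdd_setpoint=17.5)
print(wd.active_setpoint(now_s=300.0, original_setpoint=15.0))  # 17.5
print(wd.active_setpoint(now_s=900.0, original_setpoint=15.0))  # 15.0
```

Passing the time explicitly keeps the sketch deterministic; a real controller would read its own clock.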
Author Contributions: Conceptualization, J.G.; Formal analysis, G.L., M.P. and Y.C.; Methodology, G.L., M.P., Y.C.
and J.G.; Writing—original draft, G.L., M.P., Y.C. and J.G.; Writing Review & Editing, G.L. All authors have read
and agreed to the published version of the manuscript.
Funding: This research was funded by the Assistant Secretary for Energy Efficiency and Renewable Energy,
Building Technologies Office, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Acknowledgments: This work was supported by the Assistant Secretary for Energy Efficiency and Renewable
Energy, Building Technologies Office, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
The authors wish to acknowledge Harry Bergmann for his guidance and support of the research. We also thank
the fault detection and diagnostics technology and service providers who participated in this study.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Energy Information Administration. Monthly Energy Review April 2019; EIA: Washington, DC, USA, 2019.
2. Energy Information Administration. Commercial Buildings Energy Consumption Survey (CBECS);
EIA: Washington, DC, USA, 2016.
3. Mills, E. Building commissioning: A golden opportunity for reducing energy costs and greenhouse gas
emissions in the United States. Energy Effic. 2011, 4, 145–173. [CrossRef]
4. Katipamula, S.; Brambley, M.R. Methods for fault detection, diagnostics, and prognostics for building
systems—A review, part I. Hvac R Res. 2005, 11, 3–25. [CrossRef]
5. Roth, K.W.; Westphalen, D.; Feng, M.Y.; Llana, P.; Quartararo, L. Energy Impact of Commercial Building
Controls and Performance Diagnostics: Market Characterization, Energy Impact of Building Faults and Energy
Savings Potential; US Department of Energy: Washington, DC, USA, 2005.
6. Fernandez, N.E.; Katipamula, S.; Wang, W.; Xie, Y.; Zhao, M.; Corbin, C.D. Impacts of Commercial Building
Controls on Energy Savings and Peak Load Reduction; Pacific Northwest National Lab: Richland, WA, USA, 2017.
7. Deshmukh, S.; Glicksman, L.; Norford, L. Case study results: Fault detection in air-handling units in
buildings. Adv. Build. Energy Res. 2018, 1–17. [CrossRef]
8. Fernandes, S.; Granderson, J.; Singla, R.; Touzani, S. Corporate delivery of a global smart buildings program.
Energy Eng. 2018, 115, 7–25. [CrossRef]
9. Wall, J.; Ying, G. Evaluation of Next-Generation Automated Fault Detection & Diagnostics (FDD) Tools for
Commercial Building Energy Efficiency—Final Report Part. I: FDD Case Studies in Australia, RP1026; Low Carbon
Living CRC: Boca Raton, FL, USA, 2018.
10. Lin, G.; Kramer, H.; Granderson, J. Building fault detection and diagnostics: Achieved savings, and methods
to evaluate algorithm performance. Build. Environ. 2020, 168, 106505. [CrossRef]
11. ASHRAE. Guideline 13–2015—Specifying Building Automation Systems; ASHRAE: Akron, OH, USA, 2015.
12. Granderson, J.; Singla, R.; Mayhorn, E.; Ehrlich, P.; Vrabie, D.; Frank, S. Characterization and Survey of
Automated Fault Detection and Diagnostics Tools; Report Number LBNL-2001075; Lawrence Berkeley National
Laboratory: Washington, DC, USA, 2017.
13. Kim, K.; Rao, P.; Burnworth, J.A. Self-tuning of the PID controller for a digital excitation control system.
IEEE Trans. Ind. Appl. 2010, 46, 1518–1524.
14. Shi, Z.; O’Brien, W. Development and implementation of automated fault detection and diagnostics for
building systems: A review. Autom. Constr. 2019, 104, 215–229. [CrossRef]
15. Fernandez, N.; Brambley, M.; Katipamula, S. Self-Correcting HVAC Controls: Algorithms for Sensors and Dampers
in Air-Handling Units, PNNL-19104; Pacific Northwest National Laboratory: Richland, WA, USA, 2009.
16. Fernandez, N.; Brambley, M.; Katipamula, S.; Cho, H.; Goddard, J.; Dinh, L. Self Correcting HVAC Controls
Project Final Report PNNL-19074; Pacific Northwest National Laboratory: Richland, WA, USA, 2009.
17. Brambley, M.; Fernandez, N.; Wang, W.; Cort, K.A.; Cho, H.; Ngo, H.; Goddard, J.K. Final Project Report:
Self-Correcting Controls for VAV System Faults Filter/Fan/Coil and VAV Box Sections. No. PNNL-20452;
Pacific Northwest National Laboratory (PNNL): Richland, WA, USA, 2011; Volume 20.
18. Isermann, R. Fault-Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance; Springer Science
& Business Media: New York, NY, USA, 2006.
19. Zhang, Y.; Jiang, J. Bibliographical review on reconfigurable fault-tolerant control systems. Annu. Rev. Control
2008, 32, 229–252. [CrossRef]
20. Padilla, M.; Choinière, D.; Candanedo, J.A. A model-based strategy for self-correction of sensor faults in
variable air volume air handling units. Sci. Technol. Built Environ. 2015, 21, 1018–1032. [CrossRef]
21. Wang, S.; Chen, Y. Fault-tolerant control for outdoor ventilation air flow rate in buildings based on neural
network. Build. Environ. 2002, 37, 691–704. [CrossRef]
22. Hao, X.; Zhang, G.; Chen, Y. Fault-tolerant control and data recovery in HVAC monitoring system.
Energy Build. 2005, 37, 175–180. [CrossRef]
23. Bengea, S.C.; Li, P.; Sarkar, S.; Vichik, S.; Adetola, V.; Kang, K.; Lovett, T.; Leonardi, F.; Kelman, A.D.
Fault-tolerant optimal control of a building HVAC system. Sci. Technol. Built Environ. 2015, 21, 734–751.
[CrossRef]
24. ASHRAE. Guideline 36–2018. High-Performance Sequences of Operation for HVAC Systems; ASHRAE: Akron,
OH, USA, 2018.
25. ASHRAE. ASHRAE/IES Standard 90.1–2016. Energy Standard for Buildings Except Low-Rise Residential Buildings;
ASHRAE: Akron, OH, USA, 2016.
26. Lin, G.; Pritoni, M.; Chen, Y.; Granderson, J. Can We Fix It Automatically? Development of Fault
Auto-Correction Algorithms for HVAC and Lighting Systems. ACEEE 2020, in press.
27. BACnet® Primer. What is BACnet? Phoenix Controls: Acton, MA, USA. 2009. Available
online: https://www.phoenixcontrols.com/CatalogDocuments/Products/Network%20Integration/BACnet%
20Primer%20(MKT-0233).pdf (accessed on 19 May 2020).
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).