research-article

Open access

MB2C: Model-Based Deep Reinforcement Learning for Multi-zone Building Control

Authors:

Xianzhong Ding,

Wan Du, and

Alberto E. CerpaAuthors Info & Claims

BuildSys '20: Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation

November 2020

Pages 50 - 59

https://doi.org/10.1145/3408308.3427986

Published: 18 November 2020 Publication History

PDF eReader

Abstract

Reinforcement learning has been widely studied for controlling Heating, Ventilation, and Air conditioning (HVAC) systems. Most of the existing works are focused on Model-Free Reinforcement Learning (MFRL), which learns an agent by extensively trial-and-error interaction with a real building. However, one of the fundamental problems with MFRL is the very large amount of training data required to converge to acceptable performance. Although simulation models have been used to generate sufficient training data to accelerate the training process, MFRL needs a high-fidelity building model for simulation, which is also hard to calibrate. As a result, Model-Based Reinforcement Learning (MBRL) has been used for HVAC control. While MBRL schemes can achieve excellent sample efficiency (i.e. less training data), they often lag behind model-free approaches in terms of asymptotic control performance (i.e. high energy savings while meeting occupants' thermal comfort).

In this paper, we conduct a set of experiments to analyze the limitations of current MBRL-based HVAC control methods, in terms of model uncertainty and controller effectiveness. Using the lessons learned, we develop MB2C, a novel MBRL-based HVAC control system that can achieve high control performance with excellent sample efficiency. MB2C learns the building dynamics by employing an ensemble of environment-conditioned neural networks. It then applies a new control method, Model Predictive Path Integral (MPPI), for HVAC control. It produces candidate action sequences by using an importance sampling weighted algorithm that scales better to high state and action dimensions of multi-zone buildings. We evaluate MB2C using EnergyPlus simulations in a five-zone office building. The results show that MB2C can achieve 8.23% more energy savings compared to the state-of-the-art MBRL solution while maintaining similar thermal comfort. MB2C can reduce the training data set by an order of magnitude (10.52×) while achieving comparable performance to MFRL approaches.

References

[1]

Ltd. DR International.2012. 2011 building energy data book. https://openei.org/doe-opendata/dataset/buildings-energy-data-book.

Abstract

References

Cited By

Index Terms

Recommendations

Building HVAC Scheduling Using Reinforcement Learning via Neural Network Based Model Approximation

Gnu-RL: A Precocial Reinforcement Learning Solution for Building HVAC Control Using a Differentiable MPC Policy

OCTOPUS: Deep Reinforcement Learning for Holistic Smart Building Control

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Badges

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations