DOI: 10.1145/3408308.3427986
research-article
Open access

MB2C: Model-Based Deep Reinforcement Learning for Multi-zone Building Control

Published: 18 November 2020

    Abstract

    Reinforcement learning has been widely studied for controlling Heating, Ventilation, and Air Conditioning (HVAC) systems. Most existing work focuses on Model-Free Reinforcement Learning (MFRL), which trains an agent through extensive trial-and-error interaction with a real building. However, a fundamental problem with MFRL is the very large amount of training data required to converge to acceptable performance. Although simulation models have been used to generate sufficient training data and accelerate training, MFRL then needs a high-fidelity building model for simulation, which is itself hard to calibrate. As a result, Model-Based Reinforcement Learning (MBRL) has been used for HVAC control. While MBRL schemes can achieve excellent sample efficiency (i.e., they need less training data), they often lag behind model-free approaches in asymptotic control performance (i.e., high energy savings while meeting occupants' thermal comfort).
    In this paper, we conduct a set of experiments to analyze the limitations of current MBRL-based HVAC control methods in terms of model uncertainty and controller effectiveness. Using the lessons learned, we develop MB2C, a novel MBRL-based HVAC control system that achieves high control performance with excellent sample efficiency. MB2C learns the building dynamics with an ensemble of environment-conditioned neural networks. It then applies a new control method, Model Predictive Path Integral (MPPI), for HVAC control. MPPI produces candidate action sequences with an importance-sampling-weighted algorithm that scales better to the high state and action dimensions of multi-zone buildings. We evaluate MB2C using EnergyPlus simulations of a five-zone office building. The results show that MB2C achieves 8.23% more energy savings than the state-of-the-art MBRL solution while maintaining similar thermal comfort. MB2C also reduces the training data set by an order of magnitude (10.52×) while achieving performance comparable to MFRL approaches.
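
The MPPI step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear "ensemble" stands in for the learned neural networks, averaging the members' predictions is an assumption, and every name and constant here (`make_toy_ensemble`, `step_cost`, the 23 °C setpoint, the 30 °C outdoor temperature, the action bounds) is hypothetical. What it does show is the importance-sampling idea: sample noisy action sequences around a nominal plan, score each one with the model, and average them with weights that decay exponentially in cost.

```python
import numpy as np

def make_toy_ensemble(n_models=3, seed=0):
    """Hypothetical stand-in for learned dynamics: each member is a linear
    thermal model with slightly different coefficients."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_models):
        leak = 0.10 + 0.02 * rng.standard_normal()  # drift toward outdoor temperature
        gain = 1.00 + 0.10 * rng.standard_normal()  # effect of the HVAC action
        members.append((leak, gain))
    return members

def ensemble_step(members, temps, actions, outdoor=30.0):
    """Average the members' next-state predictions (an assumption).
    temps, actions: arrays of shape (n_samples, n_zones)."""
    preds = [temps + leak * (outdoor - temps) + gain * actions
             for leak, gain in members]
    return np.mean(preds, axis=0)

def step_cost(temps, actions, setpoint=23.0, energy_weight=0.1):
    """Comfort penalty plus an energy proxy, one value per sample."""
    comfort = np.sum((temps - setpoint) ** 2, axis=-1)
    energy = energy_weight * np.sum(np.abs(actions), axis=-1)
    return comfort + energy

def rollout_cost(members, temps0, sequences):
    """Total predicted cost of each sequence, shape (n_samples, horizon, n_zones)."""
    temps = np.tile(temps0, (sequences.shape[0], 1))
    total = np.zeros(sequences.shape[0])
    for t in range(sequences.shape[1]):
        temps = ensemble_step(members, temps, sequences[:, t])
        total += step_cost(temps, sequences[:, t])
    return total

def mppi_plan(members, temps0, nominal, n_samples=512, noise_std=0.5,
              temperature=1.0, rng=None):
    """One MPPI update of the nominal action sequence, shape (horizon, n_zones)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = noise_std * rng.standard_normal((n_samples,) + nominal.shape)
    candidates = np.clip(nominal[None] + noise, -2.0, 2.0)
    costs = rollout_cost(members, temps0, candidates)
    # Importance-sampling weights: softmax of negative cost (shifted for stability).
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    return np.einsum("k,kha->ha", weights, candidates)

members = make_toy_ensemble()
temps0 = np.full(5, 27.0)            # five zones, all starting at 27 °C
plan = mppi_plan(members, temps0, np.zeros((4, 5)),
                 rng=np.random.default_rng(1))
print(plan[0])                       # only the first action would be applied
```

In a receding-horizon loop, only `plan[0]` would be sent to the building; the remaining steps would seed the next call as the new nominal sequence. Because the update is a weighted average over all samples and does not search each action dimension separately, adding zones enlarges the arrays without changing the structure of the update.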

      Published In

      BuildSys '20: Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation
      November 2020
      361 pages
      ISBN:9781450380614
      DOI:10.1145/3408308
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Badges

      • Honorable Mention

      Author Tags

      1. HVAC Control
      2. Model Predictive Control
      3. Model-based Deep Reinforcement Learning

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      BuildSys '20
      Acceptance Rates

      BuildSys '20 Paper Acceptance Rate 38 of 139 submissions, 27%;
      Overall Acceptance Rate 148 of 500 submissions, 30%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 362
      • Downloads (Last 6 weeks): 22

      Cited By

      • (2024) A Low-Density Parity-Check Coding Scheme for LoRa Networking. ACM Transactions on Sensor Networks, 20:4, pp. 1-29. DOI: 10.1145/3665928. Online publication date: 8-Jul-2024
      • (2024) Optimizing Irrigation Efficiency using Deep Reinforcement Learning in the Field. ACM Transactions on Sensor Networks, 20:4, pp. 1-34. DOI: 10.1145/3662182. Online publication date: 8-Jul-2024
      • (2024) Exploring Deep Reinforcement Learning for Holistic Smart Building Control. ACM Transactions on Sensor Networks, 20:3, pp. 1-28. DOI: 10.1145/3656043. Online publication date: 2-Apr-2024
      • (2024) Orientation Estimation Piloted by Deep Reinforcement Learning. 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI), pp. 134-145. DOI: 10.1109/IoTDI61053.2024.00016. Online publication date: 13-May-2024
      • (2024) MaOC: Model-Assisted Optimal Control for Ductless-Split Cooling Systems in Building Environments. 2024 16th International Conference on COMmunication Systems & NETworkS (COMSNETS), pp. 790-797. DOI: 10.1109/COMSNETS59351.2024.10427219. Online publication date: 3-Jan-2024
      • (2024) Reinforcement learning-trained optimisers and Bayesian optimisation for online particle accelerator tuning. Scientific Reports, 14:1. DOI: 10.1038/s41598-024-66263-y. Online publication date: 8-Jul-2024
      • (2024) Expert-guided imitation learning for energy management: Evaluating GAIL’s performance in building control applications. Applied Energy, 372, article 123753. DOI: 10.1016/j.apenergy.2024.123753. Online publication date: Oct-2024
      • (2024) An experimental evaluation of deep reinforcement learning algorithms for HVAC control. Artificial Intelligence Review, 57:7. DOI: 10.1007/s10462-024-10819-x. Online publication date: 13-Jun-2024
      • (2023) HPC-GPT: Integrating Large Language Model for High-Performance Computing. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 951-960. DOI: 10.1145/3624062.3624172. Online publication date: 12-Nov-2023
      • (2023) Data Race Detection Using Large Language Models. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 215-223. DOI: 10.1145/3624062.3624088. Online publication date: 12-Nov-2023
