energies

Article
A Regression Framework for Energy Consumption in Smart
Cities with Encoder-Decoder Recurrent Neural Networks
Berny Carrera and Kwanho Kim *

Department of Industrial and Management Engineering, Incheon National University, Incheon 22012, Republic of Korea; berny@inu.ac.kr
* Correspondence: khokim@inu.ac.kr; Tel.: +82-32-835-8481

Abstract: Currently, a smart city should ideally be environmentally friendly and sustainable, and
energy management is one method to monitor sustainable use. This research project investigates the
potential for a “smart city” to improve energy management by enabling the adoption of various types
of intelligent technology to improve the energy sustainability of a city’s infrastructure and operational
efficiency. In addition, the South Korean smart city region of Songdo serves as the inspiration for
this case study. In the first module of the proposed framework, we place a strong emphasis on
the data capabilities necessary to generate energy statistics for each of the numerous structures. In
the second phase of the procedure, we employ the collected data to conduct a data analysis of the
energy behavior within the microcities, from which we derive characteristics. In the third module, we
construct baseline regressors to assess the proposed model’s varying degrees of efficacy. Finally, we
present a method for building an energy prediction model using a deep learning regression model to
solve the problem of 48-hour-ahead energy consumption forecasting. The recommended model is
preferable to other models in terms of R2, MAE, and RMSE, according to the study's findings.

Keywords: smart buildings; smart city; energy consumption; energy management; deep learning;
machine learning; data mining

Citation: Carrera, B.; Kim, K. A Regression Framework for Energy Consumption in Smart Cities with Encoder-Decoder Recurrent Neural Networks. Energies 2023, 16, 7508. https://doi.org/10.3390/en16227508

Academic Editor: Piotr Kosowski

Received: 27 September 2023; Revised: 31 October 2023; Accepted: 2 November 2023; Published: 9 November 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Modern smart buildings employ a vast sensor network to gather data that might be helpful for energy management. These sensors gather energy information that may be used to forecast the building's energy use, adjust thermostat settings, and even improve the building's resilience and safety. Controlling and evaluating the energy usage of many buildings, nevertheless, can be difficult, especially if the power consumption is very variable. Ordinary businesses and homes alone consume a large quantity of energy. The U.S. Department of Energy estimated that by the year 2020, Americans would consume more than 20 million megawatt hours (MWh) in the residential sector and more than 16 million MWh in the commercial sector, accounting for a combined 29% of all energy used in the country [1]. Around 60% of household energy is utilized for space cooling, space heating, and related electrical equipment, which makes up the largest component of total energy usage [2]. Cities can use energy-saving measures that make use of the built environment to solve this problem. These involve monitoring, managing, and improving how much energy is used in buildings as well as how other resources are used.

An essential part of managing the energy use of buildings is predicting their energy usage. The prediction of building energy demand is crucial to the operation and maintenance of buildings, as well as the deployment of energy-efficient building technologies. From the perspective of a building owner, estimating energy consumption can enable them to decide on the right course of action regarding operation and maintenance. Additionally, accurate estimation of energy demand from buildings is also important to building owners for deciding on suitable financial investments in buildings, such as refurbishment and

Energies 2023, 16, 7508. https://doi.org/10.3390/en16227508 https://www.mdpi.com/journal/energies



renovation, for example, for the replacement of major building components such as heat-
ing, ventilation, and air conditioning (HVAC), lighting, power generation, hot water, or
refrigeration systems. Building owners can also benefit from accurate energy demand
prediction of their buildings for energy forecasting, energy demand reduction, energy au-
dit, efficiency improvement, and demand response. Other applications of accurate energy
demand estimation include building certification/testing/inspection, building life-cycle
cost management, and energy policy and planning.
The ‘2020 Energy Consumption Statistics’ published by the Korea Energy Management
Corporation and the Korea Energy Agency have reported that energy consumption in
academia is growing steadily as technology advances. In Korea, in the year 2020, the amount of energy used in educational buildings was estimated at 10.2% of the total annual energy consumption, and the percentage is expected to increase because of further advances in technology. They reported that the annual energy consumption of educational buildings has increased by about 16.3% per year since the year 2000 as a share of total energy consumption [3].
In particular, the electricity consumption in laboratories is likely to increase in the future;
moreover, it has to be taken into account that the energy consumption may vary depending
on the nature of research activities, the type of experiments, and the research facility, but it
is mostly driven by laboratory activity.
Universities and research facilities must reduce energy waste, manage, and analyze
energy usage in order to maximize energy utilization and assess the success of energy-
saving initiatives. In this work, a framework for the energy forecast of a smart city is
proposed. The proposed framework, based on the integration of a cloud platform for
storage, processing, and communication, integrates and unifies the data collected from the
different elements and buildings inside the campus. Moreover, in addition to the storage
of historical consumption of energy, the platform also includes the monitoring, control,
and evaluation of energy consumption. Beyond the analysis within the university, we take
into account temporal and environmental variables such as temperature or humidity. This
research demonstrates how, using deep learning, we can turn raw data into meaningful
information that might assist university town decision-makers in becoming more conscious
of energy use in a smart city context. In addition, this study is based on the energy
consumption of a university city due to the characteristic features that include the student
residence, laboratories, gym, cafeterias, restaurants, and offices, representing a smart city
on a smaller scale.
The symbiosis between smart industry and smart cities, like the energy sector, is
a dynamic interdependence that employs cutting-edge technologies to optimize urban
development and industrial processes such as electric vehicles or batteries [4–6]. From a
smart city perspective, one of the main outcomes is its ability to create information and
produce big data using the Internet of Things and embedded systems, creating a lot of
opportunities to improve the actual energy systems for electricity grid operators or energy
companies [7]. Furthermore, we aim to provide an effective tool to support decision-making processes and the improvement of the academic, operational, and commercial performance of the university facilities regarding efficient energy management. To that end, this investigation seeks to address the following questions: (1) What can we expect from the energy consumption data set of smart buildings? (2) To what extent can we analyze a smart building through its energy data? (3) How can we enrich our data to get a better perspective on what is happening in the environment? (4) What regression model can we use? (5) How can we benefit from deep learning models? The study explores a
symbiotic relationship between advancements in smart industry and the development of
smart cities, specifically focusing on predicting energy consumption through the analysis
of individual buildings.
In this study, our first goal was to expose the possibilities offered by the energy data
on a campus-wide level for the smart energy management of a college town, transforming
the data into applicable information using deep learning for energy consumption and
prediction. We used a structure that is typical for most small cities around the world. We

wanted to show that energy data may be useful for energy management on a small scale.
By combining energy data collected by an energy management system between the end of 2019 and the beginning of 2021 with individual building energy consumption and related weather data, we sought to show that these three different but correlated data sets are related to each other.
Most previous literature on energy prediction is based on several regression models
with a few parameters [8–12], although some of these studies analyze the relationship
between energy consumption and climate change [13–17]. Other studies analyze the en-
ergy consumption in Korean buildings [18–21]. Previous research in the energy area of
smart cities studied the energy consumption of individual buildings but not as a whole.
Basically, they can be classified into two categories, studies that analyze residential and
non-residential buildings. Among the studies that focus on residential buildings, it should
be noted that they focus on a greater number of buildings [22–26]. The electricity consump-
tion in an average home in the United States alone ranges from 800 kWh to 1000 kWh
per month [2]. This number can vary according to different factors, such as the geographical
area, whose climate, according to the EIA (US Energy Information Administration), causes
the highest peaks of electricity consumption in a home to occur both in summer and winter
due to the use of electricity by heating and air conditioning. Additionally, after the start of
the pandemic by COVID-19, this consumption increased due to remote work and the lack
of need for users to leave their homes to consume in restaurants, etc. In such a way, we
can note that the climate and the seasons of the year are important because they indicate
the increase or decrease in energy consumption. For this reason, in this study, these vari-
ables were considered. In the second category, studies based on non-residential buildings
are usually based on offices, hospitals, or universities in the academic area [27–31]. Both
categories of study help us understand how energy consumption behaves.
We have built our research on the significant foundation provided by both types of
studies. For example, residential building studies have supported the idea that energy con-
sumption is affected by the number of occupants and weather, both of which considerably
impact energy consumption prediction. In addition, some researchers have conducted tests
and verification of their models by employing synthetic data created through the utilization
of Ecotect simulation software [23,24]. In their study, Dagdougui et al. [32] introduced
an innovative approach to energy load forecasting. This approach involved the analysis
of five buildings belonging to three distinct types, namely residential, commercial, and
educational/office buildings. Numerous scholarly investigations have focused on different
types of buildings; nevertheless, there is a lack of research that comprehensively explores a
complete urban area. The objective of our research is to examine the efficacy of forecasting
energy consumption requirements in smart urban areas through the analysis of a diverse
set of buildings situated across an educational campus.
Urban facilities comprise the different parts of a city, such as its office buildings, residential buildings, restaurants or shops, schools, etc., but it is the office buildings that make up most of the urban areas. In this study, we approach a smart city as a space that houses these kinds of facilities, which in turn exhibit many dynamic patterns. It is widely recognized that the primary aim of the smart city system is to enhance the efficiency of energy use and waste management. In this study, our objective is to analyze the environment of a smart city and understand the different patterns of energy consumption. This is complicated, however, by the many external variables that can affect this behavior. We therefore arrive at the question: how can the analysis of the individual energy consumption of each building help in predicting the energy consumption of the entire city?
In addition, the analysis of historical data is important since different patterns can be observed within energy consumption. There are also external factors, such as weather conditions, the seasons of the year, or the working hours of the city's inhabitants. Although it is very difficult to predict all the dynamic patterns, better energy management for the city can be obtained, and with it better public policy.

Finally, it is important to provide an analysis of the evolution of energy consumption, which can help predict it at an estimated time. For this reason, this study analyzes different algorithms and approaches and develops an encoder-decoder framework for the integration of the different data and a comparison of them. Therefore, forecasting energy, including external variables, and selecting important variables such as individual building consumption are important in generating meaningful information that helps electricity producers and the government accurately supply the correct amount of energy.
This study proposes an encoder-decoder framework for 48-hour-ahead energy prediction by applying different types of neural networks. The framework is based on understanding the individual energy consumption patterns of each building, creating a data structure to represent them, and combining them; in this way, it is possible to obtain the complete energy consumption of the entire city. Technically, predicting the energy consumption of a whole city can inform better policy to supply the right amount of energy and not waste it.
Furthermore, this study proposes an encoding-decoding model for energy consumption
that incorporates external factors, such as meteorological observations. The decoding
component of the model utilizes a multi-LSTM network, which is constructed using each
variable employed in this investigation. From a city perspective, this research tries to
understand the individual dynamics of each building, making it sustainable for the entire
city to reduce energy consumption. By analyzing each building’s energy consumption
and identifying the factors that affect it, we can develop effective energy-saving strategies
for the entire city. To validate our model, we used it on our campus and compared the
predicted energy consumption with the actual energy consumption. This comparison helps
us determine the accuracy of our model and identify areas where improvements can be
made. Additionally, we used statistical techniques to evaluate the performance of our
model and ensure that it meets the required standards for energy consumption prediction.
Following the introduction of our study, this document is structured as follows:
Section 2 provides an introduction to our framework, including a comprehensive dis-
cussion of the machine learning approaches that are compared in this study. Section 3
provides a comprehensive account of the data collection, preparation, and analytic proce-
dures employed in this study. In Section 4, the principal findings of our proposed model
are presented and compared with those of other regression models. Section 5 presents a further discussion of the findings from our study. Lastly, Section 6 presents the conclusions
and outlines potential avenues for further research.

2. Methodology
Sustainable and resilient smart cities must efficiently manage electricity usage in the
age of urbanization and rising energy demand [33,34]. Power scheduling, which optimizes
electrical device power consumption, is key to this effort. Power scheduling reduces en-
ergy demand fluctuations, promotes cost-effective electricity use, and improves renewable
energy integration. Advanced smart grid technologies enable time-of-use planning, load
balancing, and demand response in this method. Power scheduling is crucial to synchro-
nizing energy supply and demand as the globe strives to decrease carbon footprints and
strengthen energy infrastructures [35,36]. Here, we address one of the key aspects of power scheduling: the prediction of power consumption. Therefore, this section
outlines the suggested methodology for predicting energy usage on a smart campus. The
framework for predicting energy consumption has four primary modules. The first module
starts with data collection; the second module is data preprocessing; the third module is
training and validation; and the last module is the test module, as shown in Figure 1. The
Data Collection Module shown in Figure 1 showcases three separate subgraphs that serve
as examples of data collected over the course of a day. The first subgraph illustrates the
energy consumption pattern of a specific building, while the second subgraph depicts the
accompanying fluctuations in temperature throughout the course of the day. The third
subgraph provides a complete representation of the total energy consumption across all
buildings during the day.

Figure 1. Proposed regression framework for energy consumption in a smart campus.

The first module, data collection, uses input from four sources. The first two datasets come from Incheon National University (INU), a national university in Incheon, Republic of Korea. INU campuses are in distinct locations around Incheon and Seoul, but in our study, we focus directly on the Songdo Global Campus, founded in 2009. The INU Songdo campus is a newly established institution still under construction, so several of its buildings and appliances have only recently been completed. Accordingly, INU started to record information in November 2019. INU Songdo Campus is located at the coordinates 37.3751° N, 126.6328° E and possesses a plottage of 456,806 m2, a total area of major buildings of 216,732 m2, and an underground parking lot of 35,801 m2. INU provides an energy consumption collecting system that offers hourly data on energy consumption for all academic buildings, as well as comprehensive overall energy consumption for the whole campus, as seen in Figure 2. The system collects energy consumption at hourly intervals from most buildings. In Figure 2a, there are four buildings from which energy consumption is not collected: the “International Exchange Center”, “College of Urban Science”, “College of Business/School of Northeast Asian Studies”, and “College of Social Sciences/College of Global Law, Politics and Economics”.
The weather data was obtained from the Korea Meteorological Administration (KMA),
getting specific data for the Songdo area that includes atmospheric pressure, temperature,
dew point temperature, precipitation, wind speed, and sky condition. The temporal data
was obtained from the moment in which the data was recorded in such a way as to continue
to maintain the time series of the model.
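To give a concrete sense of how these hourly sources line up before modeling, the following stdlib-only sketch (the data values, field names, and join logic are our own illustration, not the authors' code) joins an energy reading and a weather record on their shared timestamp and derives the temporal features:

```python
from datetime import datetime

# Toy hourly records standing in for the INU meters and the KMA weather feed.
energy = {datetime(2019, 11, 30, 10): 124.5,   # kWh, hypothetical values
          datetime(2019, 11, 30, 11): 130.2}
weather = {datetime(2019, 11, 30, 10): {"temp_c": 3.1, "humidity": 62},
           datetime(2019, 11, 30, 11): {"temp_c": 3.8, "humidity": 60}}

# Join on the shared hourly timestamp and add temporal features (hour, weekday).
rows = [
    {"ts": ts, "energy_kwh": kwh, **weather[ts],
     "hour": ts.hour, "weekday": ts.weekday()}
    for ts, kwh in sorted(energy.items()) if ts in weather
]
print(len(rows), rows[0]["hour"])  # 2 10
```

Keeping the timestamp as the join key is what preserves the time-series structure mentioned above.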

Figure 2. Incheon National University energy collection system: (a) campus map of INU and (b) INU's College of Engineering (Building 8th) energy consumption.

Lastly, combining the data sources gives us one total energy consumption, nine weather variables, four temporal variables, and the energy of 17 academic buildings, including two dormitories, one sports center, and one gymnasium, from 30 November 2019 to 19 January 2021. The duration of the data collection spans around 14 months, resulting in a dataset including 32 columns and 9976 entries.

The second module, the data preprocessing module, is an explorative analysis, manipulation, and transformation of data to enhance the performance of the tested algorithms, which is explained in more detail in Section 3. To check the performance and hyperparameter optimization of the models, the input data are split into three segments: a training set with 70%, a validation set with 20%, and a test set with 10%. First, the data coming from the data preprocessing module to be used in the deep neural networks has to have numerical values in a bounded range [37]. Therefore, nominal categorical variables are converted into multiple binary variables, and ordinal categorical variables use numerical label encoding. Later, all variables are rescaled into a mean normalization range of [−1, 1], as observed in Equation (1) [38].

x′_i = (x_i − mean(x_i)) / (x_max − x_min),   (1)

A temporal segmentation is performed since the first layer of the proposed Encoder-Decoder Recurrent Neural Network uses an LSTM network where it takes the input data as a time-series vector over the previous 24 h, x^v_{t−24}, x^v_{t−23}, . . ., x^v_{t−1}, x^v_t, where v represents all the variables, and t the current time of the prediction. Therefore, an important design decision for an LSTM network is how much data should be used as delayed inputs for the network. In our experiments, we tried different lags, with 24 h being the most optimal.

The third module focuses on the process of choosing optimal hyperparameters for the proposed network. First, the train and validation modules use a grid search method across the training set and the validation set. To further validate the performance of the proposed Encoder-Decoder Recurrent Neural Network regression model, comparisons are made with other regression models. This study considers single regression methods such as Auto Regression (AR), auto ARIMA, SARIMA, auto SARIMA, PROPHET from Facebook, Linear Regression, and Decision Tree, and ensemble regression methods such as Extra Trees, Random Forest, Gradient Boosting, CatBoost, and LightGBM.
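The preprocessing steps of the second module can be sketched in a few lines of Python. This is an illustrative reconstruction under our own function names (not the authors' code): it applies the mean normalization of Equation (1), performs a chronological 70/20/10 split, and builds 24-h lagged input windows on a toy hourly series:

```python
import numpy as np

def mean_normalize(x):
    """Mean normalization as in Equation (1): (x - mean(x)) / (max(x) - min(x))."""
    return (x - x.mean()) / (x.max() - x.min())

def chronological_split(data, train=0.7, val=0.2):
    """Split a time-ordered array into train/validation/test segments (no shuffling)."""
    n = len(data)
    n_train, n_val = int(n * train), int(n * (train + val))
    return data[:n_train], data[n_train:n_val], data[n_val:]

def make_windows(series, lag=24):
    """Build (X, y) pairs where the previous `lag` hours predict the next value."""
    X = np.array([series[t - lag:t] for t in range(lag, len(series))])
    y = series[lag:]
    return X, y

# Toy hourly series standing in for campus energy consumption.
rng = np.random.default_rng(0)
energy = np.sin(np.linspace(0, 20, 1000)) + rng.normal(0, 0.1, 1000)
energy_n = mean_normalize(energy)
train, val, test = chronological_split(energy_n)
X_train, y_train = make_windows(train, lag=24)
print(X_train.shape, y_train.shape)  # (676, 24) (676,)
```

The split is chronological rather than random so that the validation and test segments always lie in the future relative to the training data, which is the usual requirement for time-series evaluation.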

In the fourth module, there is a detailed comparison of the performances of the selected models in the test set. Furthermore, the outcomes of each model represent the mean prediction for energy consumption 48 h in advance, based on the test data segment. Notably, the Encoder-Decoder Recurrent Neural Network regression model demonstrates the highest level of performance. Table 1 presents the evaluated hyper-parameters for the prediction algorithms.

Table 1. Prediction algorithms with evaluated hyper-parameters.

Single Regression Models:
- AR: none
- Auto ARIMA: none
- SARIMA: none
- Auto SARIMA: none
- PROPHET: none
- Linear regression: none
- Decision tree: max_depth = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Ensemble Models:
- Random forest: num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9}
- Extra trees: num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9}
- Gradient boosting: num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9}
- CatBoost: depth = {3, 6, 8, 10}, learning_rate = {0.0001, 0.001, 0.01, 0.1}, iterations = {30, 50, 100}
- LightGBM: num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9}

Deep Learning Model:
- Encoder-Decoder Recurrent Neural Network: activation = {'relu', 'tanh'}, recurrent_activation = {'relu', 'tanh'}, neurons = {15, . . ., 23, . . ., 50}, optimizer = {adam, rmsprop, nadam}

An Encoder-Decoder Recurrent Neural Network for Energy Consumption Prediction


The proposed Encoder-Decoder Recurrent Neural Network consists of five layers: input layer, hidden layer, encoder layer, decoder layer, and output layer, as shown in Figure 3. In the hidden layer, to build the proposed Encoder-Decoder Recurrent Neural Network [39], the most popular algorithms in deep learning were combined: artificial neural networks (ANNs), or dense networks when more hidden layers are connected, and recurrent neural networks (RNNs) [40].

z = wx + b,   (2)

z^l = a^{l−1} w^l + b^l,   (3)

d = a^l = Φ(z^l),   (4)

tanh(z) = (1 − e^{−2z}) / (1 + e^{−2z}),   (5)

ReLU(z) = z if z > 0, and 0 if z ≤ 0.   (6)
A single neuron represented as a linear regression is represented in Equation (2).
A dense network consisting of various neurons is represented in Equations (3) and (4).
Here, a = ( a1 , . . . , am ) is the input feature vector, where m represents the total input
variables, w = (w1 , . . . , wm ) represents the matrix of weights, b denotes the bias vector, and
z = (z1 , . . . , zm ) represents the hidden vector calculation over them. l represents the hidden
layers with l ∈ {1, 2, . . . , L}, while al is the neural nonlinear activation function applied
in the function Φ(zl ). Here, the initial input is all the available x data, which will help us
predict the total energy. In general, it is worth mentioning two types of activation functions,
denoted by the function Φ(z). In Equation (5), the rescaled logistic sigmoid function tanh is


defined, and the piecewise-linear function ReLU, which is the rectified linear activation
function, is defined in Equation (6).
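Equations (2)-(6) translate directly into code. The following numpy sketch (an illustration with our own function names, not the authors' implementation) computes one dense layer a^l = Φ(a^{l−1} W^l + b^l) under both activations:

```python
import numpy as np

def tanh_act(z):
    """Equation (5): rescaled logistic sigmoid, (1 - e^{-2z}) / (1 + e^{-2z})."""
    return (1.0 - np.exp(-2.0 * z)) / (1.0 + np.exp(-2.0 * z))

def relu(z):
    """Equation (6): ReLU(z) = z for z > 0, else 0."""
    return np.where(z > 0, z, 0.0)

def dense(a_prev, W, b, phi):
    """Equations (3)-(4): z^l = a^{l-1} W^l + b^l, then a^l = phi(z^l)."""
    return phi(a_prev @ W + b)

rng = np.random.default_rng(0)
a0 = rng.normal(size=(1, 4))                 # input feature vector (m = 4)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
print(dense(a0, W1, b1, tanh_act).shape)     # (1, 3)
```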

Figure 3. Proposed deep learning encoder-decoder regression model for energy consumption in a smart city.

To capture the temporal dependence variability of the input data, we use RNNs, where long short-term memory (LSTM) networks are well-known to provide better performance than the vanilla RNN [41,42]. Basically, the proposed model uses an RNN to infer an encoding of the input sequence of each variable by successively updating its hidden state. Thus, the LSTM architecture adopted in this work was proposed by Gers, Schraudolph, and Schmidhuber [41] and is defined in Equations (7)-(11).

i_t = Φ(W_id d_t + W_ia a_{t−1} + W_ic c_{t−1} + b_i),   (7)

f_t = Φ(W_fd d_t + W_fa a_{t−1} + W_fc c_{t−1} + b_f),   (8)

c_t = f_t c_{t−1} + i_t Φ(W_cd d_t + W_ca a_{t−1} + b_c),   (9)

o_t = Φ(W_od d_t + W_oa a_{t−1} + W_oc c_t + b_o),   (10)

a_t = o_t Φ(c_t).   (11)
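One step of the LSTM cell of Equations (7)-(11) can be written out in numpy as below. This is our own illustrative sketch, not the authors' code: the paper writes a generic activation Φ, and we assume the common choice of a logistic sigmoid for the gates and tanh elsewhere; all names and sizes are ours:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(d_t, a_prev, c_prev, P):
    """One peephole-LSTM step per Equations (7)-(11).
    d_t: dense-layer output; a_prev, c_prev: previous hidden and cell state;
    P: dict of weight matrices W.. and biases b.. (our naming convention)."""
    i = sigmoid(P["Wid"] @ d_t + P["Wia"] @ a_prev + P["Wic"] @ c_prev + P["bi"])  # (7)
    f = sigmoid(P["Wfd"] @ d_t + P["Wfa"] @ a_prev + P["Wfc"] @ c_prev + P["bf"])  # (8)
    c = f * c_prev + i * np.tanh(P["Wcd"] @ d_t + P["Wca"] @ a_prev + P["bc"])     # (9)
    o = sigmoid(P["Wod"] @ d_t + P["Woa"] @ a_prev + P["Woc"] @ c + P["bo"])       # (10)
    a = o * np.tanh(c)                                                             # (11)
    return a, c

rng = np.random.default_rng(1)
n, m = 3, 4  # hidden size, input size (toy values)
P = {k: rng.normal(scale=0.1, size=(n, m) if k.endswith("d") else (n, n))
     for k in ["Wid", "Wia", "Wic", "Wfd", "Wfa", "Wfc",
               "Wcd", "Wca", "Wod", "Woa", "Woc"]}
P.update({b: np.zeros(n) for b in ["bi", "bf", "bc", "bo"]})
a, c = lstm_step(rng.normal(size=m), np.zeros(n), np.zeros(n), P)
print(a.shape, c.shape)  # (3,) (3,)
```

Note that the W_ic, W_fc, and W_oc terms are the peephole connections to the cell state that distinguish this variant from the basic LSTM.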
The LSTM network includes an input gate i_t, a memory cell c_t, a forget gate f_t, and an output gate o_t at time t. The input gate accepts the output from the dense layer. The weight matrices W represent the connections between the network components: W_•d describes the weights from the input gate, W_•a are the weights associated with the LSTM network's hidden layers, and W_•c are the weights associated with the memory cell activations. Here, the output gate produces the input encoding vector after the last LSTM in the network is processed. The input encoding vector encapsulates the outputs obtained from the LSTM network to serve as an input for the decoder section of the model. In the decoding section,
Energies 2023, 16, 7508 9 of 24

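As a concrete illustration, a single LSTM step following Equations (8)–(11) can be sketched in NumPy. This is a minimal sketch with hypothetical sizes and randomly initialized weights, not the paper's implementation; the gates use the logistic sigmoid and the cell update uses tanh, the standard readings of the nonlinearity Φ, and the input gate it follows the same pattern as the forget gate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(d_t, a_prev, c_prev, W, b):
    """One LSTM step: forget gate, input gate, cell update, output gate, activation."""
    f_t = sigmoid(W["fd"] @ d_t + W["fa"] @ a_prev + W["fc"] @ c_prev + b["f"])    # Eq. (8)
    i_t = sigmoid(W["id"] @ d_t + W["ia"] @ a_prev + W["ic"] @ c_prev + b["i"])    # input gate
    c_t = f_t * c_prev + i_t * np.tanh(W["cd"] @ d_t + W["ca"] @ a_prev + b["c"])  # Eq. (9)
    o_t = sigmoid(W["od"] @ d_t + W["oa"] @ a_prev + W["oc"] @ c_t + b["o"])       # Eq. (10)
    a_t = o_t * np.tanh(c_t)                                                       # Eq. (11)
    return a_t, c_t

# Hypothetical sizes: 3 dense-layer outputs feeding 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in if k.endswith("d") else n_hid))
     for k in ["fd", "fa", "fc", "id", "ia", "ic", "cd", "ca", "od", "oa", "oc"]}
b = {k: np.zeros(n_hid) for k in "fico"}
a_t, c_t = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```

Because ot lies in (0, 1) and tanh(ct) in (−1, 1), the hidden activation at is always bounded in (−1, 1).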
In the decoding section, we use two neural networks to improve the accuracy of our energy regression. The first is a Deep Feed-Forward (DFF) network of stacked dense layers that we progressively reduce until only a single hidden layer remains, whose output is the total energy. The second uses an auto-encoder, which compresses the features of all the energy results obtained from the encoding vector and, in the end, generates the best neuron by looking for common patterns in the energy consumed. Finally, both results are combined in the last neuron obtained in our experiments to determine the best prediction of energy consumption.
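As a shape-level sketch of the decoding section just described (with hypothetical layer sizes and random, untrained weights — not the exact architecture or dimensions of the paper), the two branches and their final combination can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical 16-dimensional encoding vector produced by the LSTM encoder.
encoding = rng.normal(size=16)

# Branch 1: Deep Feed-Forward head, stacked dense layers shrinking to one neuron.
W1, W2, W3 = rng.normal(size=(8, 16)), rng.normal(size=(4, 8)), rng.normal(size=(1, 4))
dff_out = W3 @ relu(W2 @ relu(W1 @ encoding))

# Branch 2: auto-encoder-style compression to a single bottleneck neuron.
W_enc, W_bot = rng.normal(size=(4, 16)), rng.normal(size=(1, 4))
ae_out = W_bot @ relu(W_enc @ encoding)

# Final neuron combining both branch outputs into the energy prediction.
w_combine = rng.normal(size=2)
prediction = w_combine @ np.concatenate([dff_out, ae_out])
```

In practice both branches and the combining neuron would be trained jointly; the sketch only shows how the two one-neuron outputs are merged.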

3. Data Preparation and Analysis


This section provides an overview of the data pretreatment and analysis techniques
that were employed on the energy dataset obtained from Incheon National University in
Songdo, which is situated in the Incheon Metropolitan Region of Republic of Korea. The
dataset has a time span from 30 November 2019, at 10:00 am to 17 January 2021, resulting
in a total of 9975 records. Furthermore, it should be noted that all variables and models
employ an input data frame with a temporal resolution of one hour. Section 3.1 describes
the dependent variable, which is energy power consumption, as well as the time variables.
The factors pertaining to the energy consumption of each building and climatic information
are discussed in Sections 3.2 and 3.3, respectively.

3.1. Energy and Time Variables


The campus in Songdo collects hourly energy consumption data from energy meters installed throughout the campus. In this sense, the campus acts as a microcity: the same approach could capture the total energy consumption of an entire smart city from a macro perspective, while the campus data remain more granular and therefore more specific. Table 2 shows the dependent variable of energy consumption per hour as well as the time variables with their respective preprocessing. The raw time variable is expressed in terms of years, days, and hours. This representation is not suitable for our model; for instance, the schedule begins at 00:00 and terminates at 23:00, and these two hours should be near each other even though they are numerically far apart. To perform this feature extraction, we convert these variables to their sine and cosine forms, using the function timeseconds, which returns the measurement time in exact seconds.
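As a sketch of this preprocessing (with hypothetical timestamps standing in for the campus meter readings), the daily and yearly sine/cosine features of Table 2 can be computed as:

```python
import numpy as np
import pandas as pd

# Hypothetical hourly timestamps standing in for the meter readings.
idx = pd.date_range("2019-11-30 10:00", periods=48, freq="h")
df = pd.DataFrame(index=idx)

# timeseconds: each measurement expressed in seconds since the epoch.
timeseconds = ((idx - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")).to_numpy()

day = 24 * 60 * 60   # seconds per day
year = 365 * day     # seconds per year

df["Day_sin"] = np.sin(timeseconds * (2 * np.pi / day))
df["Day_cos"] = np.cos(timeseconds * (2 * np.pi / day))
df["Year_sin"] = np.sin(timeseconds * (2 * np.pi / year))
df["Year_cos"] = np.cos(timeseconds * (2 * np.pi / year))
```

Each (sin, cos) pair places the hours on a circle, so 23:00 and 00:00 end up adjacent even though their numeric encodings are far apart.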
Figure 4 presents the time series, the decomposition trend, and residual plots for the energy consumption at INU. The trend component captures the low-frequency variations, giving a clearer view of the energy behavior. Moreover, the trend
component shows an increase in energy consumption in December 2019. Additionally,
we can observe that starting year 2020, since the COVID-19 pandemic began, there was
a general decrease in energy consumption. We can observe a decrease in energy at the
end of the year 2019, which is when the academic cycle ends. During the year 2020, energy consumption increases from the end of January until mid-February, when heating is used heavily, and decreases in spring between the end of February and May. Furthermore, a slight increase in energy consumption is evident between June and September,
corresponding to the summer season, due to the utilization of air conditioning. Conversely,
energy consumption declines during the fall season. Hence, the period of greatest energy
consumption is observed during the winter season due to the heightened utilization of
heating systems.
Figure 5 shows a scatter plot illustrating the regression relationship between current
energy consumption and the 48-hours-ahead target. The density of both variables is
displayed at the top and right sides of the graph, respectively. The maximum density of
kilowatts per hour is often recorded within the range of around 250–600 kWh. Additionally,
the red line illustrates a significant association between energy use 48 h in advance. The
correlation coefficient (ρ) between the variables is 46%, and its associated p-value is 0.00,
as seen in the upper right corner. Figure 6 shows the hourly aggregate energy use. As
anticipated, there is a noticeable increase in energy use during the hours of 9 a.m. to 7 p.m.,
which corresponds to typical working hours. However, it is noteworthy that there are many
instances of outlier values seen between the hours of 7 p.m. and 9 a.m. Notably, there is a
distinct solitary occurrence at 11 p.m., which may be attributed to an experiment conducted
in building number four of the computer and information department.
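The 48-hours-ahead regression target can be constructed by shifting the hourly series, as in the following sketch on synthetic data (the real series is reported above to have ρ ≈ 0.46 with its shifted counterpart; random data will not reproduce that):

```python
import numpy as np
import pandas as pd

# Synthetic hourly series standing in for Energy_kW-hour (not the real data).
rng = np.random.default_rng(1)
energy = pd.Series(rng.normal(450, 120, size=500).clip(min=0), name="Energy_kW-hour")

# Supervised target: the same series shifted 48 steps into the future.
target = energy.shift(-48).rename("Electricity(kWh)_48-h-ahead")
pair = pd.concat([energy, target], axis=1).dropna()  # last 48 rows lack a target

rho = pair.corr().iloc[0, 1]  # Pearson correlation between t and t + 48 h
```

Dropping the trailing rows without a target keeps the supervised frame aligned, at the cost of the final 48 observations.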

Table 2. Energy and time variables.

| Category | Variable Name | Classification | Description (Unit) |
|---|---|---|---|
| Total Energy Consumption | Energy_kW-hour | Continuous | Total energy consumed in an hour by all buildings, in kilowatts/hour. The target variable is Electricity(kWh)_48-h-ahead. |
| Time | Date | Categorical | Date of energy consumption in format YYYYMMDD. |
| Time | Year | Categorical | Measurement year YYYY |
| Time | Month | Categorical | Measurement month MM |
| Time | Day | Categorical | Measurement day DD |
| Time | Hour | Categorical | Measurement hour (00:00 to 23:00) |
| Time | Week_Day | Categorical | Day of the week (1–7) |
| Time | Num_Week | Categorical | Number of the week with respect to the year (1–52) |
| Time | Year sin | Continuous | Year_sin = sin(timeseconds × 2π / (365 × 24 × 60 × 60)) |
| Time | Year cos | Continuous | Year_cos = cos(timeseconds × 2π / (365 × 24 × 60 × 60)) |
| Time | Day sin | Continuous | Day_sin = sin(timeseconds × 2π / (24 × 60 × 60)) |
| Time | Day cos | Continuous | Day_cos = cos(timeseconds × 2π / (24 × 60 × 60)) |
| Time | Hour sin | Continuous | Hour_sin = sin(timeseconds × 2π / (60 × 60)) |
| Time | Hour cos | Continuous | Hour_cos = cos(timeseconds × 2π / (60 × 60)) |

Figure 4. Decomposition plot of time series energy consumption.


Figure 5. Comparative scatter plot analysis of actual and 48-hour-ahead total energy consumption in Songdo campus from 30 November 2019 to 19 January 2021.

Figure 6. Boxplot of the total energy consumption per hour in Songdo Campus from 30 November 2019 to 19 January 2021.

3.2. Building Energy Consumption Variables

Data on energy use at an hourly interval is gathered from sensors installed in the designated buildings. This section presents an exploratory analysis of these data. Table 3 presents the building energy consumption variables with their descriptions, making a total of seventeen continuous variables. Figure A1 shows the total time series plot for each building, where we can note the behavior of the energy outflow of each of the variables. As mentioned above, we can notice the outlier from variable 04. Information_Computing (kWh). Moreover, the 07. Information_Technology (kWh), 08. College_Engineering (kWh), and 10. GuestHouse (kWh) variables present several outliers in their behavior between September and December of 2020. A decrease in energy consumption can be seen for the Haksan library building from mid-February to June 2020, when it remained closed due to measures to reduce COVID infections. Additionally, constant energy consumption is observed at the university headquarters, the faculty office, the central laboratory department, the College of Arts and Physical Education, and the student center. Furthermore, a decrease in energy consumption can be seen in the College of Natural Science and the student dormitory, because more classes began to be held online.
Table 3. Building Energy Consumption variables.

| Variable Name | Classification | Description (Unit) |
|---|---|---|
| 01. University_Headquarters (kWh) | Continuous | Energy consumed in kilowatt-hours by the University headquarters, building number one. |
| 02. Faculty_Hall (kWh) | Continuous | Energy consumed in kilowatt-hours by the faculty office, building number two. |
| 04. Information_Computing (kWh) | Continuous | Energy consumed in kilowatt-hours by the computer and information department, building number four. |
| 05. Natural_Science (kWh) | Continuous | Energy consumed in kilowatt-hours by the colleges of natural sciences, life science and bioengineering, building number five. |
| 06. Library (kWh) | Continuous | Energy consumed in kilowatt-hours by Haksan library, building number six. |
| 07. Information_Technology (kWh) | Continuous | Energy consumed in kilowatt-hours by the College of Information Technology, building number seven. |
| 08. College_Engineering (kWh) | Continuous | Energy consumed in kilowatt-hours by the College of Engineering, building number eight. |
| 09. Joint_Experiment (kWh) | Continuous | Energy consumed in kilowatt-hours by the central laboratory department, building number nine. |
| 10. GuestHouse (kWh) | Continuous | Energy consumed in kilowatt-hours by the guest house, building number ten. |
| 11. Welfare_Hall (kWh) | Continuous | Energy consumed in kilowatt-hours by the welfare and service center, including the cafeteria, building number eleven. |
| 12. Convention (kWh) | Continuous | Energy consumed in kilowatt-hours by the convention center, building number twelve. |
| 15. College_Humanities (kWh) | Continuous | Energy consumed in kilowatt-hours by the College of Humanities, building number fifteen. |
| 16. Art_Sports (kWh) | Continuous | Energy consumed in kilowatt-hours by the College of Arts and Physical Education, building number sixteen. |
| 17. Student_Hall (kWh) | Continuous | Energy consumed in kilowatt-hours by the student center, building number seventeen. |
| 18-1. Dormitory (kWh) | Continuous | Energy consumed in kilowatt-hours by the student dormitory #1, building number eighteen dash one. |
| 20. Sport_Center (kWh) | Continuous | Energy consumed in kilowatt-hours by the sport center and golf practice center, building number twenty. |
| 21. Gym (kWh) | Continuous | Energy consumed in kilowatt-hours by the gymnasium, building number twenty-one. |

3.3. Weather Variables


This study uses weather observation data for forecasting, as we focus on the possibility
of exploiting the information content of weather observations that weather forecasts may
not contain. This study collects eight weather-related variables from the city of Songdo, as
shown in Table 4. The choice of weather observations over weather forecasts is based on differences in their characteristics; for example, weather agencies such as the KMA report weather observations at one-hour intervals, whereas weather forecasts are issued at three-hour intervals. Figures A2 and A3 present the relationship between the eight weather variables and energy consumption. Figure A2 has two sections. The upper section shows the correlation between each pair of variables, where each red dot represents a data point and the black lines show the trend. The lower section shows the density of points, where reddish colors mark the highest density and lighter colors mark areas with little density of
values. Although all variables exhibit similar patterns, Pressure has the highest correlation; roughly half of the variables show a positive trend with it, and the other half a negative trend.
Feature extraction was performed for the Wind_Speed and Wind_Direction variables, which
were transformed into their cosine and sine forms, Wx and Wy, respectively, as shown in
Table 4. Figure 7 shows all normalized variables, where 04. Information_Computing (kWh),
07. Information_Technology (kWh), 08. College_Engineering (kWh), and 10. GuestHouse (kWh)
variables present several outliers as observed previously.
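The Wx/Wy feature extraction can be sketched as follows; the readings are hypothetical stand-ins for the observed values:

```python
import numpy as np
import pandas as pd

# Hypothetical wind readings standing in for the Songdo observations.
weather = pd.DataFrame({
    "Wind_Speed": [5.0, 12.3, 7.0],             # mph
    "Wind_Direction(deg)": [0.0, 90.0, 270.0],  # degrees
})

# Project the polar wind measurements onto x/y components.
rad = weather["Wind_Direction(deg)"] * np.pi / 180.0
weather["Wx"] = weather["Wind_Speed"] * np.cos(rad)
weather["Wy"] = weather["Wind_Speed"] * np.sin(rad)
```

This removes the artificial discontinuity between 360° and 0° and lets wind speed and direction enter the model as a single vector.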

Table 4. Weather variables related to Songdo, Incheon.

| Category | Variable Name | Classification | Description (Unit) |
|---|---|---|---|
| Weather variables | Dew_Point | Continuous | (°C) Dew point temperature |
| Weather variables | Humidity | Continuous | (%) Humidity |
| Weather variables | Precipitation | Continuous | (%) Precipitation |
| Weather variables | Pressure | Continuous | (hPa) Pressure |
| Weather variables | Sky Condition | Categorical | Clear, Dangerously Windy, Dangerously Windy and Partly Cloudy, Foggy, Heavy Rain, Humid, Light Rain, Mostly Cloudy, Overcast, Possible Drizzle, Possible Light (Rain, Snow), Rain, Snow, Windy |
| Weather variables | Temperature | Continuous | (°C) Temperature |
| Weather variables | Wind_Speed | Continuous | (mph) Wind speed |
| Weather variables | Wind_Direction(deg) | Continuous | (°) Wind direction in degrees (0°–360°) |
| Weather variables | Wx | Continuous | Wx = Wind_Speed × cos(Wind_Direction(deg) × π / 180) |
| Weather variables | Wy | Continuous | Wy = Wind_Speed × sin(Wind_Direction(deg) × π / 180) |

Figure 7. All variables used in this study already normalized.
4. Results
In this section, we compare the suggested model’s prediction performance against the
results of evaluated forecasting methods. Initially, an outline is provided of the performance
metrics employed for the comparison and evaluation of the resultant models. Next, we give the hyperparameters that were assessed for each method, highlighting the best result obtained for each of them. Then, the performance of each model is assessed by utilizing the
test set. Finally, a comprehensive study is conducted to compare our suggested forecast
model with the other models that were tested.
In order to evaluate the efficacy of the experiment, we employed three metrics denoted
as Equations (12)–(14). These metrics encompass the root mean square error (RMSE), mean
absolute error (MAE), and R2 . The RMSE is a performance indicator that imposes penalties
on significant and large errors, thereby measuring the extent to which predictions deviate
from measured energy consumption. The MAE quantifies the average discrepancy between
the predicted and observed values of energy consumption. The R2 statistic quantifies the de-
gree of relationship between predicted and actual energy usage by calculating the squared
correlation coefficient. The performance metrics are delineated in the following manner:
RMSE = √( ∑_{i=1}^{N} (yi − ŷi)² / N ), (12)

MAE = ∑_{i=1}^{N} |yi − ŷi| / N, (13)

R² = ∑_{i=1}^{N} (ŷi − ȳ)² / ∑_{i=1}^{N} (yi − ȳ)², (14)
where yi is the measured energy consumption, ŷi is the predicted energy, ȳ is the mean of the measured energy consumption, and N is the number of samples.
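A direct implementation of Equations (12)–(14) in NumPy; note that Eq. (14) is the explained-variance form of R², which can differ from the 1 − SSres/SStot definition used by scikit-learn's r2_score, though both equal 1 for a perfect prediction:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """RMSE, MAE, and R2 exactly as written in Eqs. (12)-(14)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))   # Eq. (12)
    mae = np.mean(np.abs(y_true - y_pred))            # Eq. (13)
    y_bar = y_true.mean()
    r2 = np.sum((y_pred - y_bar) ** 2) / np.sum((y_true - y_bar) ** 2)  # Eq. (14)
    return rmse, mae, r2
```

RMSE squares the deviations before averaging, so it penalizes large errors more heavily than MAE, matching the description above.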
The data collecting period spans over a duration of more than one year, namely from
30 November 2019 to 17 January 2021, resulting in a total data gathering time of 14 months
with a dataset size of 2 MB. The objective of this project is to generate forecasts for energy
usage with a lead time of 48 h. The training dataset was utilized to train the model for
each method. The train set encompasses the time period spanning from 30 November
2019 at 10:00 to 14 September 2020 at 22:00. The validation set is utilized to fine-tune the hyperparameters; it encompasses the data collected between 14 September 2020, 23:00 and 6 December 2020, 15:00. The test set is utilized to evaluate and compare the performance of different
models. It encompasses the data collected from 6 December 2020, at 16:00, to 17 January
2021, at 00:00, as shown in Figure 8. The top plot of Figure 8 displays the complete time
series data set from 30 November 2019 to 17 January 2021. The purple highlighted section
spans from January 2020 to the end of February 2020 and is shown in detail in the bottom plot of Figure 8.
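The fixed chronological partition described above can be expressed with label-based slicing on a DatetimeIndex; the frame below holds dummy values in place of the real measurements:

```python
import pandas as pd

# Hourly frame over the full collection period (dummy values).
idx = pd.date_range("2019-11-30 10:00", "2021-01-17 00:00", freq="h")
df = pd.DataFrame({"Energy_kW-hour": 0.0}, index=idx)

# Chronological, non-overlapping partition (no shuffling for time series).
train = df.loc["2019-11-30 10:00":"2020-09-14 22:00"]
valid = df.loc["2020-09-14 23:00":"2020-12-06 15:00"]
test = df.loc["2020-12-06 16:00":"2021-01-17 00:00"]
```

Because `.loc` slices on a DatetimeIndex are inclusive at both ends, the three blocks tile the full period exactly once, preserving temporal order.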
The three-fold cross-validation method was employed as a resampling procedure
during the training phase. Additionally, a grid search procedure was used to determine
the optimal values for the hyperparameters of each model. In order to evaluate the
efficacy of our encoder-decoder regression model, a series of experiments were undertaken.
The experiments were conducted using an Intel Core i9-9900KF (Intel, Santa Clara, CA,
USA) central processing unit (CPU) paired with 16 gigabytes of DDR4 random access
memory (RAM). The tests were conducted using the scikit-learn library in Python 3.7, which
provides implementations of several machine-learning methods. The hyper-parameter
tuning procedure was standardized across all data sources.
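The grid search with three-fold cross-validation can be sketched with scikit-learn. The data below are synthetic stand-ins, and the grid mirrors the Extra Trees row of Table 5 (Table 5 writes num_estimators, while the corresponding scikit-learn parameter is n_estimators):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the preprocessed feature frame.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 6))
y = X @ rng.normal(size=6) + rng.normal(scale=0.1, size=300)

# Candidate grid for the Extra Trees model, scored with 3-fold CV.
param_grid = {"n_estimators": [5, 10, 15, 20, 40, 80], "max_depth": [2, 5, 7, 9]}
search = GridSearchCV(
    ExtraTreesRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
best = search.best_params_
```

For strictly ordered data, `cv=TimeSeriesSplit(n_splits=3)` is a common alternative that keeps each validation fold after its training fold.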
Figure 8. Time series fixed partition for train, validation, and test sets.

The hyper-parameter candidates evaluated in this study are shown in Table 5. The analysis encompassed a comprehensive examination of seven single regression methods, two bagging ensemble methods, three boosting ensemble algorithms, and one proposed deep learning algorithm. It should be noted that the time series and linear regression algorithms expose no hyperparameters to evaluate, as they automatically calculate the best values for the time series. The results indicate that the hyper-parameters significantly affect the performance of the models. For the Decision Tree model, increasing max_depth improved its performance, but only up to a certain level; after a certain depth, the model started to overfit. Among the several single regression models evaluated on the validation set, the Decision Tree model achieved the lowest RMSE and the highest R2 value when the maximum depth of the tree was set to 8. For the Random Forest, Extra Trees, Gradient Boosting, and LightGBM models, increasing both max_depth and num_estimators improved performance; however, the best performance was achieved with a moderate number of trees and depth. For the CatBoost model, the depth and number of iterations had a significant impact on performance, with a deeper tree and more iterations improving accuracy. Moreover, the learning rate had a considerable effect, with a small learning rate leading to better results. Furthermore, among the ensemble models, the Extra Trees model obtained the best RMSE, MAE, and R2 with max_depth = 9 and num_estimators = 80. For the Encoder-Decoder Recurrent Neural Network, the number of neurons, the activation functions, and the optimizer significantly affected the model's performance; a small number of neurons resulted in better performance, and the choice of activation functions and optimizer also affected accuracy.
Considering the acquisition of the most optimal models using a three-fold cross-
validation process, we proceeded to assess the efficacy of each model. Typically, in order
to assess the efficacy of the suggested model, we conduct evaluations on both the valida-
tion and test datasets, employing several performance measures like RMSE, MAE, and
R2 . Table 6 presents the results of the evaluated prediction models for the INU energy
consumption validation dataset.
Table 5. Hyper-parameter tuning using grid search for the algorithms.

| Category | Prediction Algorithms | Evaluated Hyper-Parameters |
|---|---|---|
| Single Regression Models | AR | - |
| Single Regression Models | Auto ARIMA | - |
| Single Regression Models | SARIMA | - |
| Single Regression Models | Auto SARIMA | - |
| Single Regression Models | PROPHET | - |
| Single Regression Models | Linear regression | - |
| Single Regression Models | Decision tree | max_depth = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} |
| Ensemble Models | Random forest | num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9} |
| Ensemble Models | Extra trees | num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9} |
| Ensemble Models | Gradient boosting | num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9} |
| Ensemble Models | CatBoost | learning_rate = {0.0001, 0.001, 0.01, 0.1}, depth = {3, 6, 8, 10}, iterations = {30, 50, 100} |
| Ensemble Models | LightGBM | num_estimators = {5, 10, 15, 20, 40, 80}, max_depth = {2, 5, 7, 9} |
| Deep Learning Model | Encoder-Decoder Recurrent Neural Network | activation = {'relu', 'tanh'}, recurrent_activation = {'relu', 'tanh'}, neurons = {15, . . ., 23, . . ., 50}, optimizer = {adam, rmsprop, nadam} |

The optimal values are emphasized by the use of bold and underline formatting.

Table 6. Results of Energy Power Consumption Prediction over validation dataset.

| Category | Prediction Models | Validation RMSE | Validation MAE | Validation R2 |
|---|---|---|---|---|
| Single Regression Models | AR | 157.64 | 130.05 | −0.03 |
| Single Regression Models | Auto ARIMA | 167.04 | 141.80 | −0.15 |
| Single Regression Models | SARIMA | 188.85 | 159.29 | −0.43 |
| Single Regression Models | Auto SARIMA | 167.38 | 139.27 | −0.16 |
| Single Regression Models | PROPHET | 91.70 | 69.62 | 0.70 |
| Single Regression Models | Linear regression | 144.98 | 108.44 | 0.15 |
| Single Regression Models | Decision tree | 102.86 | 67.36 | 0.57 |
| Ensemble Models | Random forest | 98.84 | 65.87 | 0.61 |
| Ensemble Models | Extra trees | 85.60 | 58.24 | 0.71 |
| Ensemble Models | Gradient boosting | 91.68 | 64.35 | 0.66 |
| Ensemble Models | CatBoost | 94.69 | 70.56 | 0.64 |
| Ensemble Models | LightGBM | 101.24 | 82.26 | 0.59 |
| Deep Learning Model | Encoder-Decoder Recurrent Neural Network | 83.66 | 59.78 | 0.71 |

The optimal values are emphasized by the use of bold and underline formatting.

Based on the outcomes derived from the validation data, it is evident that among
the single regression models, SARIMA exhibits the highest RMSE of 188.85. This finding
suggests that SARIMA has relatively inferior prediction accuracy in comparison to the other
models. The best results from single regression models were obtained from the algorithm
PROPHET, with the lowest RMSE of 91.70. Among the ensemble models examined, it is
observed that Extra Trees and Gradient Boosting show the lowest RMSE values, specifically
85.60 and 91.68, respectively. However, the Encoder-Decoder Recurrent Neural Network,
which is suggested in this study, achieves the lowest RMSE value of 83.66. This outcome
suggests that the Encoder-Decoder Recurrent Neural Network possesses greater accuracy
in predicting energy consumption. In the context of MAE for the validation set, the single
regression models reveal that Decision Tree exhibits the lowest MAE with 67.36, indicating
superior performance in accurately capturing the absolute disparities between predicted
and observed values. Within the ensemble models, it is observed that Extra Trees and
Gradient Boosting provide the lowest MAE values, specifically 58.24 and 64.35, respectively.
This outcome suggests that both models excel at effectively reducing absolute prediction
errors, but it should be mentioned that Extra Trees also obtained the minimum error
compared to the other models evaluated in the validation set. Among single regression
models, PROPHET demonstrates the greatest validation R2 with 69.67%, suggesting a
robust association between predicted and observed values. In the context of ensemble
models, the Extra Trees and Encoder-Decoder Recurrent Neural Network exhibit notable
R2 values of 70.05% and 71.08%, respectively. The results indicate their strong prediction
skills and suitability for the given dataset.
Figure 9 presents the performance results of RMSE and MAE over test data. Based
on the single regression model category, it can be concluded that the PROPHET model
outperformed all other models in terms of both RMSE and MAE. The next best-performing
model was the Decision Tree model with relatively low RMSE and MAE values. On the
other hand, the SARIMA model showed the highest RMSE and MAE values among all the
models evaluated, indicating that it did not perform well on the test set. The AR, Auto
ARIMA, and Auto SARIMA models also showed relatively high RMSE and MAE values,
suggesting that their performance was not satisfactory compared to the other models.
Among the ensemble models, the LightGBM and CatBoost models showed slightly higher
RMSE and MAE values compared to the Random Forest and Gradient Boosting but still
performed better than most of the single regression models. Meanwhile, the linear regression and decision tree models had relatively higher RMSE and MAE values, indicating that their performance was not as good as some of the other models.
Finally, the Encoder-Decoder Recurrent Neural Network achieved the best performance in terms of RMSE and MAE scores among all models in the test set.

Figure 9. Performance results RMSE and MAE over test data. Predicted energy consumption versus measured energy consumption. The optimal values are emphasized using blue and underline formatting.

Figure 10 presents the R2 performance result on the test data set. It can be concluded that the AR, Auto ARIMA, SARIMA, and Auto SARIMA models did not perform well on the test set, as they have negative R2 values; for this reason, they are not presented in Figure 10. This indicates that these models did not fit the data well and performed worse than a horizontal line. The PROPHET model performed relatively well, with an R2 value of 0.6967, indicating that it was able to explain a significant portion of the variance in the data. The Linear Regression model had an R2 value of 17.81%, which indicates that it was not able to explain much of the variance in the dataset. The tree-based models, including Decision Tree, Extra Trees, Random Forest, Gradient Boosting, and CatBoost, performed better than the other models, with R2 values ranging from 60.77% to 75.74%; these models were able to capture the nonlinear relationships in the data, which improved their performance. Moreover, the proposed Encoder-Decoder Recurrent Neural Network performed the best among all the models, with an R2 value of 75.93%, indicating that it was able to capture the complex temporal dependencies in the data and predict the values accurately.

Figure 10. Performance results for R2 over test data. Predicted energy consumption versus measured energy consumption. The optimal values are emphasized using blue and underline formatting.

5. Discussion

The analysis of the energy consumption dataset from Incheon National University
reveals that both ensemble models and the Encoder-Decoder Recurrent Neural Network
produce promising results for precise predictions, with the deep learning model exhibiting
the highest predictive accuracy. The evaluated hyperparameters have a considerable effect
on model performance, highlighting the need for meticulous tuning to achieve optimal
results. Notably, the optimal combination of hyperparameters varies between models,
necessitating a systematic search and experimental approach for their determination.
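The systematic search described above can be sketched with scikit-learn's grid search over a time-series-aware split; the grid below is illustrative only and does not reproduce the hyperparameter ranges actually evaluated in the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.random((200, 8))                          # stand-in weather + temporal features
y = X @ rng.random(8) + rng.normal(0, 0.1, 200)   # stand-in consumption target

# Illustrative grid; the study's actual grids are not reproduced here.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8]}

# TimeSeriesSplit keeps each validation fold strictly after its training fold,
# which matches a forecasting setting and avoids look-ahead leakage.
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid,
                      cv=TimeSeriesSplit(n_splits=3), scoring="r2")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The same pattern applies to each of the other regressors: only the estimator and its parameter grid change.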
In terms of ensemble models, both validation and test set-evaluated performance
metrics demonstrate favorable performance, with RMSE and MAE scores lying within
reasonable ranges. The Encoder-Decoder Recurrent Neural Network and Extra Trees
models have the lowest RMSE values, indicating superior performance in predicting the
target variable on the validation set. In addition, the Random Forest, Gradient Boosting,
and Encoder-Decoder Recurrent Neural Network models have the highest R2 scores,
indicating their suitability for predicting the target variable on unobserved data in the
test set. While Extra Trees performs exceptionally well on the validation set in terms of
MAE, its performance decreases on the test set, indicating possible overfitting during
training. Conversely, although LightGBM has the highest RMSE and MAE scores among
the ensemble models on the validation set, its performance improves on the test set. In
conclusion, over the test set, Random Forest, Gradient Boosting, and the Encoder-Decoder
Recurrent Neural Network emerge as the top-performing models, boasting comparatively
high R2 scores and low RMSE scores over the validation set, indicative of their robust
predictive capabilities.

In terms of training time, Linear Regression emerged as the fastest in our experiments.
This result is attributed to the model training exclusively on the energy data without
evaluating other hyperparameters; however, this approach also introduces a limitation.
By contrast, the Encoder-Decoder Recurrent Neural Network, while robust in performance,
incurred a training time of approximately 4 to 8 min per model. Despite this, post-training,
both algorithms demonstrated efficient speed in making predictions.
Therefore, the adoption of the Encoder-Decoder Recurrent Neural Network is motivated
by its ability to model sequential data effectively and capture complex temporal
dependencies in energy consumption patterns. The recurrent architecture enables the
model to remember previous observations, allowing it to make more accurate predictions.
The comparison with other current models demonstrates the accuracy and predictive per-
formance superiority of the Encoder-Decoder Recurrent Neural Network, which justifies
its selection as the preferred model for energy consumption prediction in this study.
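For concreteness, the encoder-decoder architecture can be sketched in Keras as follows; the window lengths, layer sizes, and synthetic data below are illustrative assumptions, not the configuration reported in the study:

```python
import numpy as np
from tensorflow.keras import layers, Model

n_past, n_future, n_feat = 72, 48, 6  # illustrative lookback, 48 h horizon, feature count

# Encoder: compress the input window into a fixed-size context vector.
enc_in = layers.Input(shape=(n_past, n_feat))
context = layers.LSTM(32)(enc_in)

# Decoder: repeat the context for each of the 48 forecast steps and unroll an LSTM.
dec = layers.RepeatVector(n_future)(context)
dec = layers.LSTM(32, return_sequences=True)(dec)
out = layers.TimeDistributed(layers.Dense(1))(dec)

model = Model(enc_in, out)
model.compile(optimizer="adam", loss="mse")

# Synthetic stand-in batch, just to show the expected tensor shapes.
X = np.random.rand(16, n_past, n_feat).astype("float32")
y = np.random.rand(16, n_future, 1).astype("float32")
model.fit(X, y, epochs=1, verbose=0)
print(model.predict(X, verbose=0).shape)  # (16, 48, 1)
```

The key property is that the decoder emits one value per future step, so the 48-hour-ahead horizon is produced in a single forward pass rather than by iterative one-step forecasting.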

6. Conclusions
The study presents a process for predicting energy consumption in a smart city by
analyzing individual buildings using a proposed Encoder-Decoder Recurrent Neural
Network. The aim is to improve energy management in smart cities, which should ideally
be sustainable and environmentally friendly. The proposed framework includes three
modules: gathering data to generate energy statistics for each building; conducting a
data analysis of energy behavior inside micro-cities to extract characteristics; and building
baseline regressors to evaluate the proposed model’s effectiveness.
Moreover, the study proposes a framework process for energy consumption predic-
tion for a smart campus, consisting of four modules: data collection, data preprocessing,
training and validation modules, and testing modules. Data are collected from Incheon
National University, including energy consumption, weather data, and temporal data. The
data are then preprocessed, including the conversion of categorical variables and
normalization. To prepare the data, we partitioned the energy consumption log into three
distinct subsets: the training set, the validation set, and the test set. With the training set,
we trained each algorithm and tested it over a validation set to optimize the evaluated
hyperparameters over 13 regression methods. Optimal hyperparameters for the proposed
regression models are chosen using grid search for predicting energy consumption 48 h
ahead. The recommended model, a neural network with Encoder-Decoder Recurrent
connections, achieves a prediction performance of 75.93%, as measured by the R2 value
over unseen data in the test set. The findings suggest that this model is superior to other
models in terms of R2 , MAE, and RMSE. Therefore, the proposed process has the potential
to improve energy sustainability and efficiency in smart cities.
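The chronological partitioning and normalization steps described above can be sketched as follows; the split ratios and synthetic series are illustrative assumptions, not the study's exact configuration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Stand-in hourly consumption log; the real data span late 2019 to early 2021.
values = np.arange(1000, dtype=float).reshape(-1, 1)

# Chronological partition (illustrative 70/15/15 split) keeps future hours
# out of the training set, as required for forecasting.
n = len(values)
train = values[: int(0.70 * n)]
val = values[int(0.70 * n): int(0.85 * n)]
test = values[int(0.85 * n):]

# Fit the scaler on the training set only, then apply it to all splits,
# so no statistics from the validation or test periods leak into training.
scaler = MinMaxScaler().fit(train)
train_s, val_s, test_s = (scaler.transform(s) for s in (train, val, test))
print(train_s.min(), train_s.max())  # 0.0 1.0
```

Fitting the scaler on the training subset alone is the standard way to keep the validation and test sets genuinely unseen.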
While our research has demonstrated that the proposed Encoder-Decoder Recurrent
Neural Network has the potential to improve energy consumption predictions for smart
cities, there are a number of limitations that must be considered. First, the current
availability of data for smart cities is limited, constraining the scope of our analysis. Although our study
utilizes a variety of data sources, including educational, residential, and recreational
buildings, it predominantly reflects the perspective of a microcity. Future research endeavors
should seek to acquire more extensive and diverse datasets representing complete smart
cities, enabling a broader and more comprehensive analysis.
In addition, the Encoder-Decoder Recurrent Neural Network, a specific form of deep
learning model, has been the focus of our research. In future research, it would be beneficial
to investigate alternative deep learning architectures, such as transformers or retentive
networks, to evaluate their efficacy in predicting energy consumption in smart cities.
Diversifying the models under consideration can result in a more nuanced comprehension
of their relative strengths and weaknesses.
The temporal scope of our dataset, which spans from late 2019 to early 2021, is another
limitation. Future research should aim to integrate longer temporal datasets in order to
strengthen the reliability of our findings and account for potential temporal variations. This
extension in the temporal domain would contribute to the development of more accurate
and adaptable prediction models by facilitating a more comprehensive understanding of
energy consumption patterns and trends.
Finally, while our study lays the groundwork for energy consumption prediction in
smart cities, addressing these limitations and conducting future research will contribute
to refining and expanding the applicability of deep learning predictive models, thereby
supporting more sustainable and efficient smart city infrastructures.

Author Contributions: Conceptualization, B.C.; Writing—original draft, B.C.; Supervision, K.K. All
authors have read and agreed to the published version of the manuscript.

Funding: This work was supported by the Incheon National University (International Cooperative)
Research Grant in 2018 and by the Korea Institute of Energy Technology Evaluation and Planning
(KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of
Korea (No. 20212020900090).

Data Availability Statement: The dataset analyzed during the current study is not publicly available
due to privacy and confidentiality concerns.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A

Figure A1. Energy consumption by building from years 2019 to 2021 in Incheon National University,
Songdo, Incheon.

Figure A2. Data visualization and analysis of weather variables versus energy consumption.

Figure A3. Heatmap analysis of weather variables versus energy consumption.


