Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Next Article in Journal
Identification of Efficient Sampling Techniques for Probabilistic Voltage Stability Analysis of Renewable-Rich Power Systems
Next Article in Special Issue
Thermodynamic Analysis of Negative CO2 Emission Power Plant Using Aspen Plus, Aspen Hysys, and Ebsilon Software
Previous Article in Journal
Methodology for Assessing the Impact of Aperiodic Phenomena on the Energy Balance of Propulsion Engines in Vehicle Electromobility Systems for Given Areas
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Forecasting of Heat Production in Combined Heat and Power Plants Using Generalized Additive Models

by
Maciej Bujalski
1,2 and
Paweł Madejski
2,*
1
PGE Energia Ciepła S.A., ul. Złota 59, 00-120 Warszawa, Poland
2
Department of Power Systems and Environmental Protection Facilities, Faculty of Mechanical Engineering and Robotics, AGH University of Science and Technology, Mickiewicz 30 Av., 30-059 Krakow, Poland
*
Author to whom correspondence should be addressed.
Energies 2021, 14(8), 2331; https://doi.org/10.3390/en14082331
Submission received: 10 March 2021 / Revised: 8 April 2021 / Accepted: 18 April 2021 / Published: 20 April 2021

Abstract

:
The paper presents a developed methodology of short-term forecasting for heat production in combined heat and power (CHP) plants using a big data-driven model. An accurate prediction of an hourly heat load in the day-ahead horizon allows a better planning and optimization of energy and heat production by cogeneration units. The method of training and testing the predictive model with the use of generalized additive model (GAM) was developed and presented. The weather data as an input variables of the model were discussed to show the impact of weather conditions on the quality of predicted heat load. The new approach focuses on an application of the moving window with the learning data set from the last several days in order to adaptively train the model. The influence of the training window size on the accuracy of forecasts was evaluated. Different versions of the model, depending on the set of input variables and GAM parameters were compared. The results presented in the paper were obtained using a data coming from the real district heating system of a European city. The accuracy of the methods during the different periods of heating season was performed by comparing heat demand forecasts with actual values, coming from a measuring system located in the case study CHP plant. As a result, a model with an averaged percentage error for the analyzed period (November–March) of less than 7% was obtained.

Graphical Abstract

1. Introduction

District heating systems (DHS) are common forms of heat distribution in large urban areas. The largest systems consist of a district heating network (DHN) supplied by a combined heat and power (CHP) plant. In recent years, combined-cycle gas turbine plants, as well as gas turbines with heat recovery units, have become popular due to their high flexibility, short start-up time, and lower environmental impact compared to coal plants [1]. In CHP plants, the basic product is useful heat, whose demand in a DHS must be covered at any time. This means that actual electricity production depends on the current heat demand. On the other hand, dynamic changes in electricity prices are observed. On the European energy markets, the price changes every hour, with maximum daily variations of 50% and more. An accurate prediction of an hourly heat load in the day-ahead horizon allows better planning and optimization of heat and electricity production by cogeneration units.
New solutions to optimize electricity generation and keep the produced heat load on the required level taking into account economic aspects are under constant development. Fang et al. [2] proposed an optimization model, where the heat demand and electricity price forecasts are used as an input to obtain a heat storage operation plan. Wand et al. [3] studied the flexibility of two different CHP units considering the day-ahead market and real-time wind power balancing. Nowadays, the progressive development of digitization and the use of advanced data analysis methods is a trend in the so-called 4th generation district heating [4,5]. The main element is short-term (up to several days) planning of energy production, based on the expected heat load profile [6]. An important issue is that estimated production on an hourly basis must be contracted in a day-ahead electricity market. Similarly fuel such as natural gas can also be purchased on the spot market. Moreover, inaccurate estimates of expected volumes may result in the need to switch on the peak units (e.g., heat only boilers), whose energy and environmental efficiency is lower.
Prediction of an hourly heat demand on a large urban scale is a complex problem. The heat for heating of buildings depends mainly on the weather data, while domestic hot water consumption is strongly related to consumers’ behavior over the day and week. An important aspect are transient DHN effects such as heat losses, thermal inertia of buildings, etc. [7]. Forecasting methods are based on a data-driven approach. It means that heat demand and its relation to predictor variables is found in historical data. Spoladore et al. [8] analyzed data of heat demand for town-level aggregation and developed a model of hourly gas consumption for heating purposes. Nigitz et al. [9] proposed a model, where changes in consumer behavior are covered by continuous adaptation by using historical data for the ambient temperature and the heat load. Mosavi et al. [10] and Bourdeau et al. [11] gave an overview of data-driven methods that can be applied to heat load forecasting. There are models on a scale of individual buildings as well as the entire district heating network. Generally, predictive models are based on the supervised learning technique and supplied with weather data such as temperature, wind speed, cloud cover, among others. Dotzdauer [12] developed a simple model based on stepwise regression of ambient temperature. Baltputnis et al. [13] used a multiple linear regression of meteorological parameters. Autoregressive integrated moving average ARIMA method of heat load and ambient temperature time series were used by Grosswindhager et al. [14] and Fang et al. [15]. Artificial Neural Networks ANN have been shown by Wojdyga [16] to be effective approach to analyze data from previous heating seasons. In recent years, there has been a clear increase in interest in the use of machine learning methods, such as support vector machines (SVM) [17], random forest [18], deep learning [19] and gradient boosting [20]. This results from the intensive development of IT tools used in large industrial installations, database capacity, and the availability of data acquisition systems. An important issue in the data-driven models is the selection of input variables and the way of training the model. Machine learning techniques enable analyzing many variables besides weather data, such as operating data from DHN or calendar data [21,22].
This paper focuses on the use of the generalized additive model (GAM) method to develop the heat demand model in a medium-sized heating system supplied from a CHP plant. In the GAM method, the forecast variable is estimated by smoothing the input variables with functional relationships [23]. It is useful extension of the generalized linear model (GLM), able to effectively map the seasonality and non-linearity which is normally presented in the heat load data. In the literature, there are many applications of the GAM method to forecast electricity load [24]. Kim et al. [25] decomposed the load into the components on different temporal scales, related to the annual, weekly and daily cycles. The non-linear impact of ambient air temperature on the load level was incorporated with the cubic spline by Sigauke [26]. Pathak et al. [27] used GAM to forecast gas usage for buildings taking into account weather data supplemented by features such as hour of the day, day of the week, month, etc. Khamma et al. [28] interpreted the relationships between predictors using GAM and evaluate their impacts on energy consumption in office buildings.
In this paper, the GAM method was applied to build an hourly heat demand model based on the weather data as ambient temperature, solar irradiation and wind speed. The presented method was extended by the use of variable representing the hour of the day. Application of day hour variable can improve accuracy of predicted heat load, taking into account DHN behavior changes during normal and weekend days. Additionally, the optimal size of the sliding window with data selected to the calibration layer was analyzed as a function days number. Particular attention was paid to parametrization and calibration of the model in order to obtain high accuracy in the day-ahead time horizon.

2. Case Study District Heating System

The case study DHS consists of a heating network with a total length of about 260 km. The network is supplied from a CHP plant with two identical gas turbines (total 106 MWe) equipped with heat recovery units (total 238 MWt). The flexibility of heat production in the summer season is provided by a 600 MWh heat accumulator. Four heat-only-boilers are used as the peak source. The share of individual units in total heat capacity of the plant is presented in Figure 1. The presented state refers to the period when the maximum value of produced heat in the CHP is delivered to the DHN. During the whole year, the diverse scenarios of heat production are realized and values presented in Figure 1 can rich different values. The operation strategy is strongly dependent on current heat demand as well as the overhaul time of the selected unit.
The dependence between the instantaneous heat load and the ambient air temperature during the entire heating season is depicted in Figure 2. The maximum heat load of the analyzed DHN does not exceed 250 MWt, when the minimal value of heat delivered to the final consumers during summertime is always above 20 MWt.
Obviously, the heat load increases as the temperature drops. A strong correlation can be seen, however there is a relatively large spread for the same temperature level. Produced heat can vary in the range of around 50 MW. It results from the influence of other factors such as a month, day of the week, hour of the day and other meteorological parameters. Therefore, it is required to include additional input data in order to increase accuracy of the heat forecasts.

2.1. Physical Model of the District Heating System

The heat is supplied to the district heating network, where it is distributed to the end-users through heat distribution centers. Additional installations in the system are heating substations that connect individual sections of the network and ensure heat distribution to specific areas of the city. The instantaneous heat output transferred to the heating network by the CHP plant depends on the mass flow rate of network water and the temperature difference between supply and return side, according to Equation (1):
  Q ˙ s ( t ) = m ˙ ( t )   c p     T s t     T R ( t )  
where Q ˙ s —thermal power, kW; c p —water specific heat, kJ/(kg·K); m ˙ —mass flow rate, kg/s; T s —supply temperature, K; T R —return temperature, K and t —time, s
The regulation of thermal power consists of changing both the supply temperature and the mass flow rate at the output [29]. The water temperature at the outlet from the CHP plant is in accordance with the regulatory table according to Equation (2). Supply temperature depends on the ambient temperature and a coefficient depending on the wind speed and solar radiation. For example, if the weather is windy and cloudy, the coefficient c takes values above 1, increasing the supply temperature. The maximum supply temperature in the analyzed system is 115 °C, at −20 °C ambient air temperature. During summertime, the supply side is maintained at 70 °C:
T s = c W s , I r ·   f T a  
where: T s —supply water temperature, K; c —coefficient; W s —wind speed, m/s; I r —solar radiation, W/m2; T a —ambient air temperature, K and f —individual system function based on the DHN regulatory table, K.

2.2. Operation of the District Heating System

During operation of the system, heat production in the source is continuously adapted to the real demand in the DHN. Due to the thermal behavior of buildings as well as transient operation of DHN there is a time delay between the current parameters at the supply and return side. The time course of parameters for selected periods during the heating season and the summer period is presented in Figure 3 and Figure 4, respectively. There is a strong correlation of the ambient air temperature with the generated thermal power, as well as the mass flow rate of water can be observed. In the summertime, dynamic fluctuations in the water mass flow rate and heat load are visible. Outside the heating season, heat demand results from the domestic hot water consumption. Peaks associated with increased hot water consumption in the evening and morning can be observed.

3. Heat Demand Model with the Use of GAM

Generalized Additive Model (GAM) is a class of statistical models in which the usual linear relationship between the response and predictors are replaced by several non-linear smooth functions [30]. The equation becomes as follow (Equation (3)):
y i = α 0   + f 1 x 1 ,   i + f 2 x 2 ,   i + + f p x p ,   i + ε i  
where: y i —dependent variable;   α 0 —intercept; x 1 , …, x p —independent variables; f 1 , …, f p —smoothing functions and ε i —random error
The GAM model is able to capture the non-linear effect of individual variables. The response variable is obtained as a summation of individual effects, represented with one or more terms. The smoothing function f consists of the base functions b and the corresponding regression coefficients β (see Equation (4)). The base function b can take the form of a linear or cubic spline, P-spline, and other [31]. The smoothing function can also include two input variables, according to Equation (5), where δ is a vector of regression coefficients:
f   x = i = 1 I β i   b i   x  
f   x 1 ,   x 2 = i = 1 I j = 1 J δ i j   b 1 i x 1   b 2 j x 2
where: I, J—the dimension of the spline basis; b(x)—the corresponding spline function; β —the corresponding regression coefficient and δ —the corresponding vector of regression coefficient.
One of the advantages of GAM models is their flexibility. The method summarizes the contribution of each predictor using smoothing terms. In addition, a GAM algorithm captures nonlinearity and interactions in a learning dataset. Predictive methods with more complex mathematical approach, such as artificial neural networks (ANN) are typical black-box models. In an ANN model, interactions with a forecasted variable are created implicitly when propagating through the hidden layers as each hidden unit is a non-linear combination of the input. Besides, in order to build an accurate model with black box models, many variables, especially over a long period of time, must be taken into account [16]. In the study presented in this paper, the investigated gas-fired power plant is relatively new and there are no data from the heating seasons of previous years. For this reason, the training window is relatively short (the last several days), due to the needs of permanent adaptation to the current operation of the system, and this can be achieved by the use of GAM methods.
The heat demand model for the case study DHS was built using the mcgv package [32] containing the implementation of libraries with the GAM method in R programming language. The forecasts generated by the model were compared with real values using the root mean square error RMSE and the mean absolute percentage error MAPE (Equations (6) and (7), respectively):
RMSE = 1 N t Q pred ,   t     Q real ,   t 2  
MAPE = 1 N t Q pred ,   t     Q real ,   t Q real ,   t
where: N—number of hours in the analyzed period; t—hour; Q pred   —predicted heat load, kW and Q real —real heat load, kW
RSS = i = 1 N y i α 0   j = 1 I β j   b j   x ij 2 + j = 1 I λ j f j t j 2 d t j
where: λ —penalty parameter.
The model is fitted by minimalizing a penalized residual sum of squares RSS presented in Equation (8) for one dimensional basis functions. The fitting involves finding all coefficients to minimize residual sum of square with the use of the general cross validation (GCV) criteria proposed by Craven et al. [33]. The degree of smoothness in a spline can be controlled by a penalty parameter λ in order to avoid overfitting. The iterative process of minimalization is stopped when the change of value of GCV between successive iterations are less than 0.01. The function minimizing RSS provides a compromise between a regression spline fit and a linear fit. When λ is near to zero the fit will be close to the data.

3.1. Input Variables

The weather data listed in Table 1 were taken into consideration as predictors. The table contains the calculated Pearson’s correlation coefficient, which is a simple statistic that measures the linear correlation between two variables. Coefficients were calculated using the data of the entire heating season. It has a value between +1 and −1, where 1 is a total positive linear correlation, and −1 is a total negative linear correlation. Correlations for forecasted and actual weather data are presented. There is less correlation with the forecasted values due to additional forecast errors. It should be noted that in a real application, the predictive model determines output based on the weather forecasts in a forthcoming day horizon.

3.2. Flow Diagram of the Model

In Figure 5 a flow chart that includes input and output of the predictive model is presented. The model generates an hourly heat load forecast in a day-ahead time horizon, starting from 00:00. This time horizon results from the conditions related to participation in the electricity market. The model operation on the timeline diagram is illustrated in Figure 6.
The model works as follows:
  • Before issuing the results, the model is calibrated with the real weather data and corresponding heat output from CHP within the moving time window.
  • When the weather forecast file comes at 7:00, the previously calibrated model generates load forecasts for the assumed time horizon.

3.3. Model Parametrization and Validation

Three variants of the model depending on the set of input data were considered (Equations (9)–(11)). For the first case, the model is supplied with ambient temperature, radiation and wind speed (Equation (9)). In addition to climatic parameters, variable representing the hour of the day was used, which takes values from 0 to 23. In the model M2, ambient temperate and hour of the day were implemented (Equation (10)). The model M3 only considers ambient air temperature (Equation (11)):
Q M 1 = α M 1 + f M 1 ( Temp ) + f M 1 ( Rad ) + f M 1 ( Hour ) + f M 1 ( Wind ,   Temp )  
Q M 2 =   α M 2   + f M 2 ( Temp ) + f M 2 ( Hour )
Q M 3 = α M 3 + f M 3 ( Temp )
where: Q M 1 , Q M 2 , Q M 3 —forecasted heat load from model M1, M2 and M3, respectively, MW; α M 1 ,   α M 2 ,   α M 3 —intercept in model M1, M2 and M3, respectively, MW; f M 1 ( Temp ) , f M 1 ( Rad ) , f M 1 ( Hour ) —one-dimensional additive functions of model M1, dependent on ambient air temperature, radiation, and hour of the day respectively, MW (see Figure 7a,b and Figure 8a); f M 1 ( Wind ,   Temp ) —two-dimensional additive function of model M1, dependent on wind speed and ambient air temperature, MW (see Figure 8b); f M 2 ( Temp ) , f M 2 ( Hour ) —one-dimensional additive functions of model M2, dependent on ambient air temperature and hour of the day respectively, MW; f M 3 ( Temp ) —one-dimensional additive function of model M3, dependent on ambient air temperature, MW; Temp—ambient air temperature (Temp = Ta), °C; Rad—solar radiation (Rad =   I r ), W/m2; Hour—hour of the day (Hour = h), h and Wind—wind speed (Wind = W s ), m/s
The detailed description of calculation Q M 3   using M3 model (Equation (11)) is presented in Equation (12). The calculated coefficients are can be found in Table 2, where value of heat demand obtained from the model was also presented for the selected ambient air temperature:
Q M 3 = α M 3 + f M 3 ( Temp ) = α M 3 + i = 1 I = 9 β T , M 3 , i · b T , M 3 , i Temp
where: I—dimension of the spline basis (number of knots, I = 9); β T , M 3 , i —the corresponding regression coefficient for M3 model (see Table 2) and b T , M 3 , i Temp —the corresponding cubic-spline regression function for M3 model
When calibrating the model (see Figure 5), input variables are fitted using smoothing terms. The signal from the heat meter on the output from the CHP to the DHN was used as the instantaneous heat load. Base functions are in the form of cubic spline functions. Other smooth functions such as thin plate regression spline or P-splines were examined. It was found that with a sufficiently high number of knots for splines, the type of function is of marginal importance. The additive term in M3 model, dependent on the ambient air temperature is shown in Equation (12). Other functions used in the remaining models (M2 and M1) are analogous. After the calibration stage, Equations (9)–(11) are used to find the forecasted heat load based on the new weather data.
In each model variant, a combination of input data and additive functions were selected to obtain the most accurate result in a day-ahead horizon. For example, it was observed that taking into account wind speed by using the two-dimensional additive function f(Wind, Temp) gives a better fit. Table 3 presents the most relevant parameters used (as an argument for calculations within the mgcv package) [32].
Figure 7 shows the impact of individual variables on the heat load for a sample training dataset in moving window (12 days) for the M1 model which contains four input variables. It can be clearly seen that as the ambient temperature together with solar irradiation increase, the values of the additive functions take lower values. For the variable representing hour, the daily variability of heat production is visible as a reduced power at night and increased in the evening (Figure 8a). The two-dimensional smoothing function, including the combined effect of wind speed and air temperature, can be seen in Figure 8b, where contour plot of additive term on the heat load is presented. The estimated value of the two dimensional additive function is marked on individual contour lines. One can notice that increasing the wind speed in the low temperature range leads to an increase in the heat load.
Table 4 contain some results from the fitting procedure of the M1 model, obtained for a selected training period. The table was generated as an output from mgcv package. It can be seen that the default maximum degrees of freedom for the smoothers used in the model are sufficient for all species, as the effective degrees of freedom (EDF) for all estimated smoothers are below their maximum possible value (k′). The p-value for the observed k-index is not significant. The k-index is a measure of the remaining pattern in the residuals, and the p-value is calculated based on the distribution of the k-index after randomizing the order of the residuals [34]. The data was fitted with GCV = 36.5 and R2 = 0.95.

3.4. Effect of the Moving Window Size with Learning Data

In order to determine the optimal size of the sliding window, the effect of the number of days in the window was investigated. The analysis was based on the data from the entire heating season, simulating the operation of the investigated models day by day for different window sizes. The RMSE error was calculated for the test dataset (in the next day horizon), for forecasted as well as actual weather data. The results are illustrated in Figure 9. The graph shows that the optimal window size with the smallest corresponding RMSE error is 12 days. This window size allows the model to be adapted to the current DHN demand and end-users’ behavior. The use of a time-shifting window potentially can lead to inaccurate forecasts in the next day’s horizon (e.g., when the ambient air temperature is expected to rapidly decrease outside the range in the learning window). In that case, the forecasts will be extrapolated from smoothing functions.

4. Results of the Heat Demand Model during the Heating Season

The obtained heat load models have been tested using the data coming from the DHS system during the heating season from November until March. The accuracy of the method was determined by means of a comparison of the forecasts with real values at the corresponding time. Figure 10 presents the time course of forecasts obtained by considered models along with the actual heat load for a selected period. It can be clearly seen that the M3 model is inaccurate compared with the M1 and M2 model. Including a variable representing the hour of the day significantly improves the results. This approach allows taking into account the daily profiles of heat production (e.g., morning and evening peaks).
From the point of view of optimizing the operation of the CHP plant, a high accuracy forecast for the forthcoming days is needed. It should be borne in mind that a model that fits well into the learning dataset may not give a good accuracy on the new data set. The Figure 11 shows the aggregated MAPE error for each model. The MAPE was calculated for the entire analyzed period (November–March). The metric was determined for the training dataset in a learning window (12 days) and test dataset (day-ahead horizon). In addition to weather data forecasts, simulation of model operation on real weather values was also included. This approach enables to assess the accuracy of the predictive model without weather forecast error. The M1 model gives a better fit to the training set because of including more input data (solar radiation and wind speed). A significant improvement can be observed while taking into account the hour of the day feature. This enables the daily seasonality to be taken into account.
The box plot of absolute percentage errors for test dataset is presented in Figure 12. In the M1 and M2 model, the vast majority of hourly percentage errors are between 2 and 12%. The maximum observed errors exceed 20%. It can be seen that the M1 model gives slightly better results, particularly outliers are on a lower level. The Table 5 presents the results of the MAPE and RMSE metrics aggregated into individual months of the heating season (including M1 and M2 model). In the period from December to February, the models have similar accuracy. During transition periods such as November and March, where the heat demand is lower, the M1 model gives more accurate results. In these months, due to the relatively large daily amplitude of changes in ambient temperature as well as dynamic changes of mass flow rate in DHN, there are additional difficulties in forecasting. Standard deviation of the heat demand in these periods was greater than in typically winter months. It also should be noted that in these periods the weather forecast error increases. For example, the RMSE error of forecasts in ambient air temperature in January and February was approximately 1 °C. In March, RMSE increased to 1.5 °C.
Including solar radiation in the learning data set allows to obtain better accuracy in March, when the influence of radiation in the thermal gains of buildings is greater. This can be observed in Figure 13. During periods of increased radiation over a day, the M1 model gives a better fit compared with the M2 model. The accuracy of two models is similar at night. The plot shows the hourly errors during several days from the end of March. During this period, the greatest instantaneous errors were observed. The time corresponds to the transient period between heating season and summer. Poor repeatability of heat production profile for similar climatic data was observed as a reason of inaccuracy. Therefore, the instantaneous errors exceed 20%, however the daily mean values are below 10%.

5. Conclusions

The paper presents the results of heat demand forecasting in the complex system supplied from a gas fired combined heat and power (CHP) plant with seven independent heat sources. The idea and results presented in the paper deal with the actual challenge of increasing energy efficiency during heat and electricity production. To maintain high production efficiency with reduced pollutants emission, high quality production forecasting is needed. For the purposes of optimizing heat production in cogeneration, hourly forecasts of the expected load demand in the forthcoming day are necessary. Accurate prediction enables efficient planning of heat production, taking into account the cooperation of the cogeneration heat sources with the district heating network. The main benefit comes from effective planning of electricity sales on the spot market, keeping the heat production at the required level. This is because the electricity production depends on the current heat load in the investigated system, which is a gas-fired combined heat and power plant. Electricity generation in gas-fired CHP plants allows reducing the emission of gaseous pollutants as CO2, SOx, and NOx and stopping the emission of dust. The CO2 emission level is around 50% lower in gas-fired power plants than coal-fired power plants. The main benefit of effective electricity generation planning in the gas-fired CHP plants is improving energy generation efficiency and reducing CO2 emission level.
The generalized additive model (GAM) method was successfully used to build a predictive model based on weather data and a variable representing an hour of the day. The heat load profile can change over time because of thermo-modernization of buildings, changing the regulation of heating nodes or new connections to the heating network. For this reason, it is important to calibrate the model properly and continuously. In presented paper the application of the adaptive training of the model using moving window was investigated. The results confirmed that the last 12 days give the opportunity to take into account the current conditions of DHN operation and heat consumption behavior by end users. The moving window was adopted in all three variants of the heat demand forecasting models. The model M1 is supplied with ambient temperature, radiation and wind speed. Model M2 was additionally supplied with variable representing the hour of the day, and model M3 only consider ambient air temperature. The results of model M1, including additional weather data such as irradiation and wind speed gives the best results, particularly between the winter and summer period when high fluctuations of radiation during a whole day exist. It should be noted that the model is also burdened with independent factors such as weather forecast error and uncertainty in measuring the thermal power at the DHN supply side (about 1.5–2%).

Author Contributions

Conceptualization, M.B. and P.M.; methodology, M.B. and P.M.; software, M.B.; validation, M.B., and P.M.; investigation, M.B. and P.M.; resources, M.B.; writing—original draft preparation, M.B. and P.M.; writing—review and editing, P.M.; visualization, M.B.; supervision, P.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by MEiN subsidy no. 16.16.130.942 (Faculty of Mechanical Engineering and Robotics, AGH UST).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the availability of these data. Data was obtained from PGE Energia Ciepła S.A. and are available from the authors with the permission of PGE Energia Ciepła S.A.

Acknowledgments

The research was carried out thanks to the possibility of participating in the Implementation Doctorate Programme at the AGH Doctoral School.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Żymełka, P.; Szega, M.; Madejski, P. Techno-Economic Optimization of Electricity and Heat Production in Gas-Fired CHP Plant with Heat Accumulator. J. Energy Resour. Technol. 2020, 142, 022101. [Google Scholar] [CrossRef]
  2. Fang, T.; Lahdelma, R. Optimization of combined heat and power production with heat storage based on sliding time window method. Appl. Energy 2016, 162, 723–732. [Google Scholar] [CrossRef]
  3. Wang, J.; You, S.; Zong, Y.; Cai, H.; Træholt, C.; Dong, Z.Y. Investigation of real-time flexibility of combined heat and power plants in district heating applications. Appl. Energy 2019, 237, 196–209. [Google Scholar] [CrossRef]
  4. Sameti, M.; Haghighat, F. Optimization of 4th generation distributed district heating system: Design and planning of combined heat and power. Renew. Energy 2019, 130, 371–387. [Google Scholar] [CrossRef]
  5. Schweiger, G.; Kuttin, F.; Posch, A. District Heating Systems: An Analysis of Strengths, Weaknesses, Opportunities, and Threats of the 4GDH. Energies 2019, 12, 4748. [Google Scholar] [CrossRef] [Green Version]
  6. Madejski, P.; Żymełka, P.; Węzik, R.; Kubiczek, H. Gas fired plant modeling for monitoring and optimization of electricity and heat production. J. Power Technol. 2017, 97, 455–462. [Google Scholar]
  7. Idowu, S.; Saguna, S.; Åhlund, C.; Schelén, O. Applied machine learning: Forecasting heat load in district heating system. Energy Build. 2016, 133, 478–488. [Google Scholar] [CrossRef]
  8. Spoladore, A.; Borelli, D.; Devia, F.; Mora, F.; Schenone, C. Model for forecasting residential heat demand based on natural gas consumption and energy performance indicators. Appl. Energy 2016, 182, 488–499. [Google Scholar] [CrossRef]
  9. Nigitza, T.; Gölles, M. A generally applicable, simple and adaptive forecasting method for the short-term heat load of consumers. Appl. Energy 2019, 241, 73–81. [Google Scholar] [CrossRef]
  10. Mosavi, A.; Bahmani, A. Energy Consumption Prediction Using Machine Learning. Electr. Eng. 2019, 2019030131. [Google Scholar]
  11. Bourdeau, M.; Zhai, X.Q.; Nefzaoui, E.; Guo, X.; Chatellier, P. Modeling and forecasting building energy consumption: A review of data-driven techniques. Sustain. Cities Soc. 2019, 48, 101533. [Google Scholar] [CrossRef]
  12. Dotzauer, E. Simple model for prediction of loads in district-heating systems. Appl. Energy 2002, 73, 277–284. [Google Scholar] [CrossRef]
  13. Baltputnis, K.; Petrichenko, R.; Sobolevsky, D. Heating Demand Forecasting with Multiple Regression: Model Setup and Case Study. In Proceedings of the IEEE 6th Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE), Vilnius, Lithuania, 8–10 November 2018; pp. 1–5. [Google Scholar] [CrossRef]
  14. Grosswindhager, S.; Voigt, A.; Kozek, M. Online Short-Term Forecast of System Heat Load in District Heating Networks. In Proceedings of the 31st International Symposium on Forecasting, Prague, Czech Republic, 26–29 June 2011. [Google Scholar]
  15. Fang, T.; Lahdelma, R. Evaluation of a multiple linear regression model and SARIMA model in forecasting heat demand for district heating system. Appl. Energy 2016, 179, 544–552. [Google Scholar] [CrossRef]
  16. Wojdyga, K. Predicting Heat Demand for a District Heating Systems. Int. J. Energy Power Eng. 2014, 3, 237. [Google Scholar] [CrossRef] [Green Version]
  17. Saloux, E.; Candanedo, J.A. Forecasting District Heating Demand using Machine Learning Algorithms. Energy Procedia 2018, 149, 59–68. [Google Scholar] [CrossRef]
  18. Bandyopadhyay, S.; Hazra, J.; Kalyanaraman, S. A machine learning based heating and cooling load forecasting approach for DHC networks. In Proceedings of the 2018 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 19–22 February 2018; pp. 1–5. [Google Scholar]
  19. Suryanarayana, G.; Lago, J.; Geysen, D.; Aleksiejuk, P.; Johansson, C. Thermal load forecasting in district heating networks using deep learning and advanced feature selection methods. Energy 2018, 157, 141–149. [Google Scholar] [CrossRef]
  20. Le, L.T.; Nguyen, H.; Zhou, J.; Dou, J.; Moayedi, H. Estimating the Heating Load of Buildings for Smart City Planning Using a Novel Artificial Intelligence Technique PSO-XGBoost. Appl. Sci. 2019, 9, 2714. [Google Scholar] [CrossRef] [Green Version]
  21. Dahl, M.; Brun, A.; Kirsebom, O.S.; Andresen, G.B. Improving Short-Term Heat Load Forecasts with Calendar and Holiday Data. Energies 2018, 11, 1678. [Google Scholar] [CrossRef] [Green Version]
  22. Geysen, D.; De Somer, O.; Johansson, C.; Brage, J.; Vanhoudt, D. Operational thermal load forecasting in district heating networks using machine learning and expert advice. Energy Build. 2018, 162, 144–153. [Google Scholar] [CrossRef]
  23. Wood, S. Generalized Additive Models: An Introduction with R; Chapman & Hall: London, UK, 2006. [Google Scholar]
  24. Krstonijević, S. Generalized Additive Model for Electricity Load Prediction in R. Zenodo 2020. [Google Scholar] [CrossRef]
  25. Kim, H.; Kim, J. Short term electricity demand forecasting in south korea with generalized additive models. J. Korean Data Inf. Sci. Soc. 2018, 29, 1299–1307. [Google Scholar] [CrossRef]
  26. Sigauke, C. Forecasting medium-term electricity demand in a South African electric power supply system. J. Energy South. Afr. 2017, 28, 54. [Google Scholar] [CrossRef]
  27. Pathak, N.; Ba, A.; Ploennigs, J.; Roy, N. Forecasting Gas Usage for Big Buildings Using Generalized Additive Models and Deep Learning. In Proceedings of the 2018 IEEE International Conference on Smart Computing (SMARTCOMP), Taormina, Italy, 18–20 June 2018; pp. 203–210. [Google Scholar]
  28. Khamma, T.R.; Zhang, Y.; Guerrier, S.; Boubekri, M. Generalized additive models: An efficient method for short-term energy prediction in office buildings. Energy 2020, 213, 118834. [Google Scholar] [CrossRef]
  29. Sandou, G.; Font, S.; Tebbani, S.; Hiret, A.; Mondon, C. Optimization and control of a district heating network. IFAC Proc. Vol. 2005, 38, 397–402. [Google Scholar] [CrossRef]
  30. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman & Hall/CRC: Boca Raton, FL, USA, 1990. [Google Scholar]
  31. Perperoglou, A.; Sauerbrei, W.; Abrahamowicz, M.; Schmid, M. A review of spline function procedures in R. BMC Med Res. Methodol. 2019, 19, 1–16. [Google Scholar] [CrossRef] [Green Version]
  32. Wood, S. mgcv: GAMs and Generalized Ridge Regression for R. R News 2001, 1, 20–25. [Google Scholar]
  33. Craven, P.; Wahba, G. Smoothing noisy data with spline functions. Numer. Math. 1978, 31, 377–403. [Google Scholar] [CrossRef]
  34. Pedersen, E.J.; Miller, D.L.; Simpson, G.L.; Ross, N. Hierarchical generalized additive models in ecology: An introduction with mgcv. PeerJ 2019, 7, e6876. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. The structure of thermal power of the analyzed combined heat and power (CHP).
Figure 1. The structure of thermal power of the analyzed combined heat and power (CHP).
Energies 14 02331 g001
Figure 2. Heat demand variation with ambient air temperature.
Figure 2. Heat demand variation with ambient air temperature.
Energies 14 02331 g002
Figure 3. District Heating Network (DHN) parameters for the heating period.
Figure 3. District Heating Network (DHN) parameters for the heating period.
Energies 14 02331 g003
Figure 4. District heating network (DHN) parameters for summer period.
Figure 4. District heating network (DHN) parameters for summer period.
Energies 14 02331 g004
Figure 5. Flow chart of the model with calibration and prediction layer.
Figure 5. Flow chart of the model with calibration and prediction layer.
Energies 14 02331 g005
Figure 6. Timeline diagram of the heat demand model.
Figure 6. Timeline diagram of the heat demand model.
Energies 14 02331 g006
Figure 7. The additive effect on the heat load in the M1 model. The dashed line marks the 95% confidence interval (a) the air temperature effect f(Temp) (b) solar radiation effect f(Rad).
Figure 7. The additive effect on the heat load in the M1 model. The dashed line marks the 95% confidence interval (a) the air temperature effect f(Temp) (b) solar radiation effect f(Rad).
Energies 14 02331 g007
Figure 8. The additive effect on the heat load in the M1 model. The dashed line marks the 95% confidence interval (a) the hour effect f(Hour) (b) two-dimensional air temperature and wind speed effect f(Wind, Temp).
Figure 8. The additive effect on the heat load in the M1 model. The dashed line marks the 95% confidence interval (a) the hour effect f(Hour) (b) two-dimensional air temperature and wind speed effect f(Wind, Temp).
Energies 14 02331 g008
Figure 9. RMSE(root mean square error) depending on the size of the training window for (a) M1 model (b) M2 model (c) M3 model.
Figure 9. RMSE(root mean square error) depending on the size of the training window for (a) M1 model (b) M2 model (c) M3 model.
Energies 14 02331 g009
Figure 10. Actual and forecasted heat load over a several days.
Figure 10. Actual and forecasted heat load over a several days.
Energies 14 02331 g010
Figure 11. MAPE (mean absolute percentage error) for training and test dataset.
Figure 11. MAPE (mean absolute percentage error) for training and test dataset.
Energies 14 02331 g011
Figure 12. The boxplot of error for the test dataset (with forecasted weather).
Figure 12. The boxplot of error for the test dataset (with forecasted weather).
Energies 14 02331 g012
Figure 13. Solar radiation and absolute percentage error of the M1 and M2 model (end of March).
Figure 13. Solar radiation and absolute percentage error of the M1 and M2 model (end of March).
Energies 14 02331 g013
Table 1. Weather variables with the correlation coefficient with heat demand.
Table 1. Weather variables with the correlation coefficient with heat demand.
VariableSymbolUnitPearson’s Coefficient
–Real Data
Pearson’s Coefficient
–Forecasted Data
Ambient temperatureTemp°C−0.81−0.76
Solar irradiationRadW/m2−0.42−0.41
Wind speedWindm/s−0.20−0.15
Table 2. The corresponding regression coefficients together with intercept value in model M3 (for selected training dataset).
Table 2. The corresponding regression coefficients together with intercept value in model M3 (for selected training dataset).
β T , M 3 , i   α M 3 , MW Q M 3 , MW
(Temp = −0.7 °C)
i = 1i = 2i = 3i = 4i = 5i = 6i = 7i = 8i = 9
8.55−14.102.71−6.031.464.732.0327.86−11.7997.23130.80
Table 3. Input parameters for mgcv package used for building the predictive model.
Table 3. Input parameters for mgcv package used for building the predictive model.
ParameterDescriptionValue
Family The family object specifying the distribution and link to use in fitting.Gaussian
MethodThe smoothing parameter estimation method.GCV (generalized cross validation)
OptimizerThe numerical method to optimize the smoothing parameter estimation criterion.Newton
Smoothing functionsIndicating the smoothing basis to use.Cubic regression splines
Table 4. Results of the fitting the M1 model.
Table 4. Results of the fitting the M1 model.
Model Termk’EDFk-Indexp-Value
f(Temp)9.05.6211.871.02 × 10−10
f(Rad)9.02.878.6764.16 × 10−6
f(Wind, Temp)28.012.794.4084.28 × 10−8
f(Hour)9.08.2769.992<2 × 10−16
EDF—effective degrees of freedom.
Table 5. Statistical metrics in individual months of the M1 and M2 model.
Table 5. Statistical metrics in individual months of the M1 and M2 model.
MonthMean Heat Load, MWMAPE, % RMSE, MW
Model M1Model M2Model M1Model M2
November110.1 (SD 18.7)6.36.99.19.8
December118.7 (SD 16.9)5.25.47.67.9
January118.3 (SD 16.1)6.16.49.39.8
February108.3 (SD 17.3)5.96.07.97.9
March96.5 (SD 22.4)7.58.88.49.5
SD—Standard Deviation; MAPE—mean absolute percentage error; RMSE—root mean square error.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Bujalski, M.; Madejski, P. Forecasting of Heat Production in Combined Heat and Power Plants Using Generalized Additive Models. Energies 2021, 14, 2331. https://doi.org/10.3390/en14082331

AMA Style

Bujalski M, Madejski P. Forecasting of Heat Production in Combined Heat and Power Plants Using Generalized Additive Models. Energies. 2021; 14(8):2331. https://doi.org/10.3390/en14082331

Chicago/Turabian Style

Bujalski, Maciej, and Paweł Madejski. 2021. "Forecasting of Heat Production in Combined Heat and Power Plants Using Generalized Additive Models" Energies 14, no. 8: 2331. https://doi.org/10.3390/en14082331

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop