Article

A Hybrid Convolutional–Long Short-Term Memory–Attention Framework for Short-Term Photovoltaic Power Forecasting, Incorporating Data from Neighboring Stations

School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(12), 5189; https://doi.org/10.3390/app14125189
Submission received: 1 May 2024 / Revised: 7 June 2024 / Accepted: 11 June 2024 / Published: 14 June 2024

Abstract

To enhance the safety of grid operations, this paper proposes a high-precision short-term photovoltaic (PV) power forecasting method that integrates information from surrounding PV stations and deep learning prediction models. The proposed method utilizes numerical weather prediction (NWP) data of the target PV station and highly correlated features from nearby stations as inputs. This study first analyzes the correlation between irradiance and power sequences and calculates a comprehensive similarity index based on distance factors. Stations with high-similarity indices are selected as data sources. Subsequently, Bayesian optimization is employed to determine the optimal data fusion ratio. The selected data are then used to model power predictions through the convolutional long short-term memory with attention (Conv-LSTM-ATT) deep neural network. Experimental results show that the proposed model significantly outperforms three classical models in terms of forecasting accuracy. The data fusion strategy determined by Bayesian optimization reduces the root mean square error (RMSE) of the test set by 20.04%, 28.24%, and 30.94% under sunny, cloudy, and rainy conditions, respectively.

1. Introduction

Solar energy, as a renewable energy source that is inexhaustible and sustainable, holds a significant position in long-term energy strategies due to its cleanliness, resource abundance, low maintenance requirements, and potential economic benefits [1,2]. Among various solar energy generation methods, photovoltaic power has garnered widespread attention in recent years [3]. With the increasing installed capacity and share, the stochastic and fluctuating characteristics of distributed photovoltaic systems have become impossible to overlook in terms of their impact on grid security dispatch and field operation management [4]. An accurate prediction of photovoltaic power generation can provide conventional power plants with sufficient time to start up and maintain proper reserves [5], thereby ensuring the safe and stable operation of the power grid and reducing operational costs.

Related Work on PV Generation Forecasting

Photovoltaic power prediction has been a hot topic in recent years. Many researchers have conducted extensive work in this area. Forecast periods in photovoltaic power prediction are categorized into four distinct time windows: ultra-short, short, medium, and long [6]. In these windows, ultra-short-term forecasts encompass predictions within an hour, crucial for immediate operational adjustments. Short-term forecasts, ranging from one hour to a full day, cater to daily management and responsiveness. The medium-term window, extending from a day up to several weeks or months, is vital for scheduling maintenance and preparatory operations. Long-term forecasts, projecting several months to years ahead, are pivotal for strategic planning and participating in energy markets [7].
Regarding prediction techniques, forecast methods can be broadly categorized into three types: physics-based methods, statistical methods, and hybrid methods. Physics-based methods [8] rely on meteorological data, geographical information, and detailed physical models of photovoltaic cells to simulate the power generation process: they first predict solar radiation intensity and then derive the power generation from it. This type of method typically does not require historical data. However, because accurate physical model data for photovoltaic cells are difficult to obtain and geographical information data have limited resolution, the accuracy of physics-based prediction methods may not be ideal. Statistical methods [9,10,11] analyze a large amount of historical data and establish inherent mapping relationships to directly predict photovoltaic power. However, due to the stochastic and fluctuating nature of photovoltaic power, the generalization ability of statistical methods may be reduced. Hybrid methods [12,13] combine both physics-based and statistical approaches to leverage the strengths of each and address their weaknesses. These methods aim to improve prediction accuracy and robustness by integrating physical understanding with data-driven insights.
Recently, deep learning [14,15,16] has attracted a great deal of attention. A study [17] proposed an RNN (recurrent neural network) model for solving complex nonlinear mapping problems. However, RNNs often struggle with long-term data dependencies due to vanishing gradients. Another study [18] employed a long short-term memory network (LSTM), successfully addressing the gradient issues inherent in traditional RNNs. Despite its effectiveness, LSTM has its own set of limitations [9,19]. A combination of a bi-directional long short-term memory (BiLSTM) network and a copula sampling method has been utilized to create representative scenarios for photovoltaic (PV) power production, as noted in references [20,21]. Earlier, in [22], a generative adversarial network (GAN) was initially used for the generation of these PV power scenarios. To enhance the understanding of the temporal correlation in renewable energy, the GAN's generator incorporated LSTM units [23]. Numerous articles [24,25,26,27,28,29] have focused on predicting renewable energy, highlighting the benefits of Big Data analysis and sophisticated feature extraction. Methods based on deep learning techniques are particularly effective in exploring the attributes of higher dimensional data, bypassing the need for complex pre-existing knowledge. However, the above methods predict PV power scenarios based only on the historical PV power data of the target PV station; the coupling relationship between the target site and neighboring sites is ignored, which may miss valid representative scenarios of PV power.
To address the aforementioned issue, this paper proposes a data-driven framework that considers spatial and temporal information from a large number of neighboring sites to develop a short-term photovoltaic power prediction model for the target site. By introducing the convolutional long short-term memory with attention (Conv-LSTM-ATT) model, which combines the convolutional long short-term memory (Conv-LSTM) module with the attention mechanism, the model adaptively allocates different levels of attention to the photovoltaic power time series at different time points, allowing it to focus on crucial time series and improve prediction accuracy. The main contributions of this paper are listed below.
(1)
High-Precision Short-Term Photovoltaic Power Prediction Method: This method integrates numerical weather prediction (NWP) data from the target photovoltaic station and highly correlated features from surrounding photovoltaic stations using a deep learning model, significantly enhancing the safety of grid operations.
(2)
Calculation and Application of Composite Similarity Index: This study first analyzes the correlation between irradiance and power sequences and their relationship with distance factors, calculates a composite similarity index between the target site and other regional photovoltaic stations, and selects data sources based on similarity, providing more precise data input for model training.
(3)
Application of Bayesian Optimization Techniques: Optimal data fusion ratios are determined through Bayesian optimization techniques, effectively balancing exploration and exploitation, enhancing the model’s predictive accuracy and stability.
(4)
Development and Application of the Conv-LSTM-ATT Model: A hybrid deep learning model combining convolutional long short-term memory (Conv-LSTM) with the attention mechanism (ATT) is developed, which better handles the spatiotemporal features in time series data, improving the accuracy of crucial time series predictions.
(5)
Experimental Validation: Tests on real-world datasets validate the superiority of the proposed model over three other classical models in short-term photovoltaic power prediction.
The structure of this paper is outlined as follows. In Section 1, we provide an overview of the related work in the field. In Section 2, we present the problem formulations. Our novel deep learning approach for PV power prediction is introduced in Section 3. A real-world dataset is used for conducting experiments in Section 4, where we compare the prediction performance with several existing methods. Finally, in Section 5, we conclude the paper.

2. Problem Formulation

Actual observational data indicate that photovoltaic (PV) power outputs from geographically close locations exhibit high similarity due to similar random factors, such as solar radiation intensity and weather variations. Therefore, the spatial correlation of PV power generation can be described using output spatial correlation. Specifically, output spatial correlation refers to the degree of similarity between PV power output sequences in different geographical regions. These similarities decrease as the distance between two locations increases. In the latitude direction, as latitude increases, solar radiation intensity gradually decreases, leading to higher output spatial correlation between neighboring regions. In the longitude direction, the phase difference between PV power output sequences in two locations increases with the time difference, thereby affecting the output spatial correlation.
In this paper, we propose a data-driven framework aimed at leveraging spatiotemporal correlations and periodic characteristics for short-term photovoltaic (PV) power prediction. This framework, considering the information from neighboring sites, is depicted in Figure 1 and primarily consists of the following steps:
(1)
Distributed PV power data are collected to form a spatiotemporal dataset.
(2)
PV power time series are detrended to exclude the impacts of the diurnal cycle.
(3)
Detrended solar data from multiple sites are fused to form the input for data-driven forecasting models.
(4)
Data-driven forecasting models are developed based on the fused data.
This framework illustrates how spatiotemporal datasets are fused using historical data from neighboring power stations. The historical PV power of station $p$ at time $t$ can be represented as $f_t^p$. The historical data of station $p$ from time $t-n$ to $t$ are described as $X_t^p = [f_{t-n}^p, f_{t-(n-1)}^p, \ldots, f_t^p]^T$. Then, we combine the historical PV power from its neighboring stations to construct a spatiotemporal PV power matrix, as follows:

$$X_t^s = \left[ X_{t-n}^s \;\; X_{t-(n-1)}^s \;\; \cdots \;\; X_t^s \right]^T = \begin{bmatrix} f_{t-n}^1 & f_{t-(n-1)}^1 & \cdots & f_t^1 \\ f_{t-n}^2 & f_{t-(n-1)}^2 & \cdots & f_t^2 \\ \vdots & \vdots & \ddots & \vdots \\ f_{t-n}^m & f_{t-(n-1)}^m & \cdots & f_t^m \end{bmatrix}$$

The overall timeline is defined as the union of the historical timesteps $T_h$ and the future timesteps $T_f$, i.e., $T = T_h \cup T_f = \{t_1, t_2, \ldots, t_h\} \cup \{t_{h+1}, \ldots, t_{h+f}\}$.

The objective is to forecast the PV generation $\hat{Y}_{T_f}^P$ based on the provided numerical weather prediction (NWP) $X_{NWP}$ and the fused data $X_t^s$. This task can be framed as an optimization problem, where the aim is to determine the sequence conditional on the future $f$ timesteps:

$$F\left(X_{NWP}; X_t^s\right) \rightarrow \hat{Y}_{T_f}^P$$

where $F$ represents the complex nonlinear mapping.
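To make the data assembly concrete, the following minimal Python sketch (not the authors' code) builds the spatiotemporal matrix of the equation above from per-station history arrays; the dictionary layout, station ids, and array lengths are illustrative assumptions.

```python
# Minimal sketch: assembling the spatiotemporal PV power matrix X_t^s from
# per-station histories sampled on the same time grid.
import numpy as np

def build_spatiotemporal_matrix(histories, t, n):
    """Return an (m, n+1) matrix whose row j holds station j's power
    from time step t-n up to t, matching the matrix X_t^s above."""
    rows = []
    for station_id in sorted(histories):
        series = np.asarray(histories[station_id], dtype=float)
        rows.append(series[t - n : t + 1])       # f_{t-n}^j, ..., f_t^j
    return np.stack(rows, axis=0)                # shape (m, n+1)

# Example with 3 hypothetical stations and 12 lagged steps
rng = np.random.default_rng(0)
histories = {p: rng.random(500) for p in (1, 2, 3)}
X_ts = build_spatiotemporal_matrix(histories, t=400, n=12)
print(X_ts.shape)  # (3, 13)
```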

3. Materials and Methods

3.1. Overview of the Proposed Model

This section proposes a novel hybrid deep architecture for short-term photovoltaic (PV) power forecasting. The proposed model consists of a Conv-LSTM module and two Bi-LSTM modules. Figure 2 illustrates the overall architecture of the proposed model. The Conv-LSTM module comprises a convolutional neural network (CNN) and an LSTM network, where the CNN is utilized to extract spatial features of PV power, which are then connected to the LSTM network to capture short-term temporal features of PV power. Simultaneously, the Bi-LSTM modules are employed to extract auxiliary information features, such as global irradiation, direct irradiation, temperature, and humidity. The spatiotemporal features and auxiliary information features are fused into a feature vector through the Feature Fusion (FF) layer. Finally, two fully connected layers (FC layers) are applied as regression layers for prediction. Additionally, an attention mechanism is incorporated into the Conv-LSTM module to automatically explore varying levels of time series importance at different time points. In the subsequent subsections, a detailed description of each module will be provided.

3.2. Selection of Similar Neighboring PV Plants Based on Composite Similarity Index

To enhance the accuracy of photovoltaic (PV) power output forecasts, it is vital to integrate information from adjacent PV plants as input features. This integration is predicated on the premise that these neighboring plants must exhibit a significant similarity with the target PV plant. A sophisticated approach involves constructing a composite correlation index that encapsulates both the irradiance and power sequence correlations across various PV plants, reflecting the degree of similarity in PV plant data over different time scales. The mathematical formulation used to calculate this composite correlation is given by:
$$\phi_i = \sqrt{\phi_{R_g,i}^2 + \phi_{R_d,i}^2 + \phi_{P,i}^2}$$
where $\phi_{R_g,i}$, $\phi_{R_d,i}$, and $\phi_{P,i}$ represent the correlations of the historical global irradiance $R_g$ component, the diffuse irradiance $R_d$ component, and the power sequence between the $i$th neighboring PV plant and the target plant, respectively. $\phi_i$ denotes the overall composite correlation for the $i$th neighboring PV plant relative to the target plant.
To ensure comprehensive similarity, it is also critical to evaluate the amplitude of irradiance and power output. Thus, a composite distance metric is employed:
$$d_i = \sqrt{d_{R_g,i}^2 + d_{R_d,i}^2 + d_{P,i}^2}$$
In this formula, $d_{R_g,i}$, $d_{R_d,i}$, and $d_{P,i}$ quantify the distances pertaining to the global irradiance component $R_g$, the diffuse irradiance component $R_d$, and the power sequence between the $i$th neighboring PV plant and the target plant, respectively. $d_i$ represents the composite distance for the $i$th plant, with smaller values indicating a closer match in irradiance and power profiles.
From these metrics, a composite similarity index $\Psi_i$ is calculated as follows:

$$\Psi_i = \frac{\phi_i}{d_i}$$
This index measures the relative similarity between each neighboring PV plant and the target plant. After determining $\Psi_i$ for all neighbors, they are ranked in descending order of similarity. The top $k$ plants are then selected to create a set of neighboring PV plants with the highest degrees of similarity to the target plant.
This structured methodology not only systematizes the selection of relevant input features from similar PV plants but also substantiates the inclusion of such features in enhancing the precision of PV power forecasts.
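As an illustration of this selection rule, the sketch below scores each neighboring plant and keeps the top k. Pearson correlation and Euclidean distance are assumed as the concrete correlation and distance measures, since the text does not pin down the exact definitions.

```python
# Hedged sketch of the neighbour-selection rule in Section 3.2.
import numpy as np

def composite_similarity(target, neighbor):
    """target/neighbor: dicts with equal-length 1-D arrays 'Rg', 'Rd', 'P'."""
    phi_sq, dist_sq = 0.0, 0.0
    for key in ("Rg", "Rd", "P"):
        x, y = np.asarray(target[key]), np.asarray(neighbor[key])
        phi_sq += np.corrcoef(x, y)[0, 1] ** 2     # correlation contribution
        dist_sq += np.sum((x - y) ** 2)            # distance contribution
    return np.sqrt(phi_sq) / np.sqrt(dist_sq)      # Psi_i = phi_i / d_i

def select_top_k(target, neighbors, k):
    """neighbors: dict {station_id: data dict}; returns the k most similar ids."""
    scores = {sid: composite_similarity(target, d) for sid, d in neighbors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```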

3.3. K-Means++ Approach

The K-means++ algorithm [30] is an improvement over the K-means algorithm, specifically addressing the issue of the dependency on the initial centroids. The process of the K-means++ algorithm is as follows:
(1)
From the given dataset samples $S = \{s_1, s_2, \ldots, s_p\}$, randomly select one sample as the initial cluster center $c_1$.
(2)
Calculate the Euclidean distance between each sample $s_i$ $(i = 1, 2, \ldots, p)$ in the dataset and the initialized cluster centers. Select the shortest distance and denote it as $D(s_i)$.
(3)
Calculate the probability $P(s_i)$ of each sample $s_i$ $(i = 1, 2, \ldots, p)$ being selected as the next cluster center using the expression given by Equation (5), and choose the sample with the highest probability as the new cluster center. Repeat the process until $K$ clusters are determined, with their corresponding cluster centers denoted as $C = \{C_1, C_2, \ldots, C_k\}$.
$$P(s_i) = \frac{D^2(s_i)}{\sum_{s_i \in S} D^2(s_i)}$$
where $D^2(s_i)$ represents the squared Euclidean distance between the sample $s_i$ and its nearest current cluster center.
(4)
Compute the Euclidean distance between each sample $s_i$ in the dataset and the $K$ cluster centers, and then assign each sample $s_i$ to the cluster corresponding to the closest cluster center.
(5)
For each cluster $C_i$, recompute the cluster center $c_i$ (i.e., the centroid of all samples belonging to that cluster) using the following formula:
$$c_i = \frac{1}{\left|C_i\right|} \sum_{s \in C_i} s$$
where $\left|C_i\right|$ represents the total number of samples in cluster $C_i$, and $s$ ranges over the samples in cluster $C_i$.
(6)
Repeat steps (4) and (5) until the positions of the cluster centers no longer change.
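In practice this clustering step can be reproduced with scikit-learn's built-in k-means++ initialization; the sketch below is a hedged example in which the daily feature matrix and random placeholder data are assumptions, while the three clusters correspond to the sunny/cloudy/rainy–snowy classes used later in Section 4.1.

```python
# Sketch of the weather-type clustering with scikit-learn's k-means++ init.
import numpy as np
from sklearn.cluster import KMeans

# daily_features: one row per day, e.g. [std, skewness, mean] of global
# radiation, humidity and temperature (shape: n_days x n_features)
rng = np.random.default_rng(42)
daily_features = rng.random((362, 9))                # placeholder data

kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
weather_labels = kmeans.fit_predict(daily_features)  # 0/1/2 = weather classes
print(np.bincount(weather_labels))
```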

3.4. CNN

CNN [31], as a widely used neural network in the field of deep learning, can be applied to learn local trends in time series data. The CNN network consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. The input layer reads the data, and the convolutional layers perform convolutional operations on the multi-dimensional feature grid data using local connections and parameter sharing, mapping local features to global features. The pooling layers perform dimensionality reduction and sampling by calculating the maximum or average values within a sliding window, progressively compressing the data and parameters while enhancing the robustness of the extracted features. The fully connected layers connect all the neurons and produce the output. The structure of the CNN network is shown in Figure 3.

3.5. LSTM

Photovoltaic (PV) power data are a set of time series data with the characteristic that later data points are related to previous ones. In comparison to traditional recurrent neural networks (RNNs), LSTM neural networks [32] offer specific advantages. The gated mechanism in LSTM allows for controlled information flow, enabling the selective transmission and forgetting of information across different time steps. This helps to overcome the vanishing gradient problem and enables the network to better learn and retain long-term memories. Additionally, LSTM’s memory cell state allows for long-term information retention, reducing the issue of information loss and facilitating the capture of important features within sequences. The structure of an LSTM network is depicted in Figure 4.
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$\tilde{c}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh\left(c_t\right)$$
In the above equations, $f_t$ represents the forget gate, $i_t$ the input gate, $o_t$ the output gate, $c_t$ the cell state, $\tilde{c}_t$ the cell state candidate value, and $h_t$ the hidden state value. $W$ and $b$ are the weight and bias parameters, respectively, and $\sigma$ denotes the sigmoid activation function. After obtaining the outputs of the three gates using Equations (7)–(9), the cell state $c_t$ and the final output $h_t$ of the cell can be further computed using Equations (10) and (11).
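For readers who prefer code, the following PyTorch sketch spells out one LSTM cell step following the gate equations above; in practice torch.nn.LSTM or torch.nn.LSTMCell implements the same computation with packed weight matrices. The dictionary-based weight layout here is an illustrative choice, not the paper's implementation.

```python
# Minimal sketch of one LSTM cell step (unbatched vectors for clarity).
import torch

def lstm_step(x_t, h_prev, c_prev, W, b):
    """W/b: dicts of weights and biases for gates 'f', 'i', 'o', 'c'.
    Each W[k] has shape (hidden, hidden + input)."""
    z = torch.cat([h_prev, x_t], dim=-1)              # [h_{t-1}, x_t]
    f_t = torch.sigmoid(W["f"] @ z + b["f"])          # forget gate
    i_t = torch.sigmoid(W["i"] @ z + b["i"])          # input gate
    o_t = torch.sigmoid(W["o"] @ z + b["o"])          # output gate
    c_tilde = torch.tanh(W["c"] @ z + b["c"])         # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde                # new cell state
    h_t = o_t * torch.tanh(c_t)                       # hidden state
    return h_t, c_t

# Example with hidden size 8 and input size 4
H, X = 8, 4
W = {k: torch.randn(H, H + X) for k in ("f", "i", "o", "c")}
b = {k: torch.zeros(H) for k in ("f", "i", "o", "c")}
h, c = lstm_step(torch.randn(X), torch.zeros(H), torch.zeros(H), W, b)
```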

3.6. Conv-LSTM

In this paper, the Conv-LSTM module [33] serves as the main component of our proposed model, aiming to extract spatiotemporal features from PV power generation data. This module is a fusion of a convolutional neural network (CNN) and an LSTM network, as depicted in Figure 5. The CNN part comprises two convolutional layers, while the LSTM part consists of two LSTM layers.
The Conv-LSTM model combines the respective strengths of CNN and LSTM, allowing it to effectively handle spatiotemporal sequence data, extract multi-layer features, model long-term dependencies, and capture spatial relationships. As a result of these advantages, the Conv-LSTM model demonstrates superior performance in various tasks, including image prediction, video analysis, PV power forecasting, and traffic flow prediction.
The Conv-LSTM module receives as input a spatiotemporal matrix, denoted $X_t^s$, as elucidated in Equation (1). This matrix embodies the historical PV power at the forecast target location and its proximate areas. Spatial features are extracted by executing a one-dimensional convolution operation across the data $X_t^s$ at each time step $t$: a one-dimensional convolution kernel filter slides across the data, capturing the local perceptual domain. The operation of the convolution kernel filter can be expressed mathematically as:
$$Y_t^s = \sigma\left(W_s * X_t^s + b_s\right)$$
In this expression, $W_s$ represents the filter's weights, $b_s$ signifies the bias, $X_t^s$ indicates the PV power input at temporal position $t$, the symbol $*$ denotes the convolution operation, $\sigma$ is the activation function, and $Y_t^s$ is the output of the convolutional layer. This methodology facilitates the extraction of spatial features from adjacent observation points.
The pooling layer is not applied after the convolutional layer in our model since the dimension of the spatial feature is not large. $G_t^s$ denotes the output of the second convolutional layer. After the spatial information is processed by the two convolutional layers, the output is connected to an LSTM network.
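A hedged sketch of such a Conv-LSTM backbone is given below: two 1-D convolutions slide over the neighboring-station axis at each time step to produce the spatial features G, and a two-layer LSTM then models the temporal dependence. Channel sizes, kernel width, and tensor layout are assumptions rather than the paper's exact configuration.

```python
# Illustrative Conv-LSTM backbone: spatial 1-D convolution per time step,
# followed by a two-layer LSTM over the resulting feature sequence.
import torch
import torch.nn as nn

class ConvLSTM(nn.Module):
    def __init__(self, n_stations, hidden=64, filters=10, kernel=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, filters, kernel, padding=kernel // 2), nn.ReLU(),
            nn.Conv1d(filters, filters, kernel, padding=kernel // 2), nn.ReLU(),
        )
        self.lstm = nn.LSTM(filters * n_stations, hidden,
                            num_layers=2, batch_first=True)

    def forward(self, x):                          # x: (batch, seq_len, n_stations)
        b, t, m = x.shape
        g = self.conv(x.reshape(b * t, 1, m))      # conv over the station axis
        g = g.reshape(b, t, -1)                    # per-step spatial feature G_t^s
        h, _ = self.lstm(g)                        # temporal features H_t^s
        return g, h

# Example: batch of 8 samples, 5 stations, 12 time steps
out_g, out_h = ConvLSTM(n_stations=5)(torch.randn(8, 5, 12).transpose(1, 2))
print(out_g.shape, out_h.shape)                    # (8, 12, 50) (8, 12, 64)
```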

3.7. Attention Mechanism

The attention mechanism [34,35] simulates how the human brain processes information, thereby enhancing the ability of neural networks to handle information. It has been widely applied in machine translation, speech recognition, image processing, and other related fields. Applying the attention mechanism to deep neural networks allows the network to adaptively focus on input features that are more relevant to the current output while reducing interference from other features. Using the LSTM hidden layer output vectors $H = [h_1, h_2, \ldots, h_t]$ as the input to the attention mechanism, the attention mechanism seeks attention weights $\alpha_i$ for each $h_i$, which can be obtained using Equations (14) and (15).
$$e_i = \tanh\left(W_h h_i + b_h\right), \quad e_i \in [-1, 1]$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{i=1}^{t} \exp(e_i)}, \quad \sum_{i=1}^{t} \alpha_i = 1$$
where $W_h$ is the weight matrix for $h_i$, and $b_h$ is the bias term. The values of $W_h$ and $b_h$ change during the model training process. The attention vector $H' = [h_1', h_2', \ldots, h_t']$ can be obtained using Equation (16).

$$h_i' = \alpha_i \cdot h_i$$
Figure 6 illustrates how the attention mechanism is applied to the Conv-LSTM module. As shown in Figure 6, the output of the Conv-LSTM at each time step $t$ is computed as the weighted sum of the LSTM network output $H_t^s$. The specific calculation is as follows:

$$H_t^a = \sum_{k=1}^{n+1} \beta_k H_{t-k+1}^s$$

where $n+1$ is the sequence length, and $\beta_k$ represents the attention value at time $t-k+1$. The attention $\beta_k$ can be calculated as follows:
$$\beta_k = \frac{\exp(s_k)}{\sum_{k=1}^{n+1} \exp(s_k)}$$
The vector $s = [s_1, s_2, \ldots, s_{n+1}]^T$ represents the importance of each component in the power time series and can be obtained as follows:

$$s_t = V_s^T \tanh\left(W_{xs} G_t^s + W_{hs} H_t^s\right)$$

where $V_s^T$, $W_{xs}$, and $W_{hs}$ are learnable parameters, and $H_t^s$ represents the hidden output from the Conv-LSTM network.
From Equations (18) and (19), it can be observed that the attention value $\beta$ at time $t$ depends on the current time step $t$ and its previous $n$ time steps of inputs $G_t^s$ and hidden variables $H_t^s$. The attention value $\beta$ can also be seen as the activation of a power selection gate, where a set of gates controls the amount of information flowing into the LSTM network at each time step. A higher activation value indicates that the power's contribution to the final prediction result is more significant.
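The attention computation can be sketched as follows: scores are formed from the convolutional features G and the hidden states H, normalized with a softmax into weights, and used to build a weighted summary of H. Layer names and dimensions are assumptions consistent with the Conv-LSTM sketch above.

```python
# Hedged sketch of the attention layer over the Conv-LSTM outputs.
import torch
import torch.nn as nn

class PowerAttention(nn.Module):
    def __init__(self, g_dim, h_dim, att_dim=32):
        super().__init__()
        self.W_x = nn.Linear(g_dim, att_dim, bias=False)
        self.W_h = nn.Linear(h_dim, att_dim, bias=False)
        self.v = nn.Linear(att_dim, 1, bias=False)

    def forward(self, G, H):                               # (batch, seq, g/h_dim)
        s = self.v(torch.tanh(self.W_x(G) + self.W_h(H)))  # scores, (batch, seq, 1)
        beta = torch.softmax(s, dim=1)                     # attention weights
        context = (beta * H).sum(dim=1)                    # weighted sum of H
        return context, beta.squeeze(-1)

# Example, reusing the shapes from the Conv-LSTM sketch
G, H = torch.randn(8, 12, 50), torch.randn(8, 12, 64)
context, weights = PowerAttention(g_dim=50, h_dim=64)(G, H)
print(context.shape, weights.shape)                        # (8, 64) (8, 12)
```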

3.8. Applying Bayesian Optimization to Optimize Data Fusion Ratios in Photovoltaic Power Forecasting

This study aims to determine the optimal data fusion ratio by minimizing the root mean square error (RMSE) of photovoltaic power prediction. To achieve this goal, Bayesian optimization, an advanced optimization method suitable for handling high-cost evaluation problems, was employed. In the photovoltaic power prediction model, this study specifically focuses on optimizing the data fusion ratio. The optimization process involves constructing a Gaussian process (GP) model of the objective function, which not only predicts the values of the objective function but also provides a measure of the uncertainty of these predictions, thereby helping to effectively balance the exploration and exploitation of the parameter space under the guidance of uncertainty.
To accommodate the data fusion needs under different weather conditions, the definition of the objective function f(θ) has been adjusted to a more general form. The specific expression of the function is:
$$f(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \sum_{j \in J} r_{S_j} \cdot f_{S_j}\left(X_{i,S_j}\right) \right)^2$$
where $\theta = \{r_{S_j}\}_{j \in J}$ represents the proportion of data from each station in the selected set of stations $J$ under specific weather conditions. This form of the objective function allows the model to adjust the number and proportion of integrated stations according to specific environmental conditions, optimizing prediction performance. Each component of the objective function is explained in detail as follows:
$\theta = \{r_{S_j}\}_{j \in J}$ are the model parameters, representing the fusion ratio of data from each station in the selected station set $J$ under given weather conditions. These fusion ratio coefficients $r_{S_j}$ need to be optimized to minimize the overall prediction error.
$n$ is the total number of data points, used to calculate the overall prediction error.
$y_i$ is the actual observed value at the $i$th data point.
$f_{S_j}(X_{i,S_j})$ is the predictive function output based on the input data $X_{i,S_j}$ from station $S_j$.
The summation $\sum_{j \in J} r_{S_j} f_{S_j}(X_{i,S_j})$ calculates the weighted sum of the predictive outputs from all selected stations, where the weights are their respective fusion ratios $r_{S_j}$, reflecting each station's contribution to the final prediction. The goal is to adjust these fusion ratios to find the parameter configuration that minimizes the overall prediction error.
Through the Bayesian optimization framework, this study effectively explores the optimal settings of these fusion ratio parameters. The Gaussian process (GP) model provides a method to quantify the uncertainty of the objective function predictions, while the acquisition function, such as expected improvement (EI), guides the search of the parameter space, prioritizing the exploration of parameter combinations that are likely to significantly enhance model performance. This approach not only improves the accuracy and reliability of the model under various environmental conditions but also ensures that the optimal data fusion ratio is achieved under different weather conditions, effectively enhancing the performance of the photovoltaic power prediction model.
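A minimal sketch of this search with scikit-optimize's gp_minimize (Gaussian-process surrogate, EI acquisition) is shown below; the paper does not name a specific library, so this choice, the evaluate_rmse placeholder, and the number of stations are assumptions.

```python
# Hedged sketch of the fusion-ratio search via Bayesian optimization.
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

N_STATIONS = 4                                    # selected neighbours (example)
space = [Real(0.0, 1.0, name=f"r_{j}") for j in range(N_STATIONS)]

def evaluate_rmse(ratios):
    """Placeholder: fuse neighbour data with the given ratios, train the
    Conv-LSTM-ATT model, and return its validation RMSE."""
    return float(np.sum((np.array(ratios) - 0.3) ** 2))   # dummy objective

result = gp_minimize(evaluate_rmse, space, n_calls=100,
                     acq_func="EI", random_state=0)
print("best ratios:", result.x, "best RMSE:", result.fun)
```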

4. Results

4.1. Data Source

The photovoltaic dataset used in this study was provided by the Desert Knowledge Australia Solar Centre (DKASC). This region hosts numerous photovoltaic power stations, each with its unique set of data records. The selected dataset contains measured power and various meteorological factors from January 2020 to December 2020, including global radiation, rainfall, humidity, ambient temperature, and wind direction. These data are crucial for understanding the relationship between photovoltaic power output and meteorological conditions. Specifically, the global radiation data reflect solar radiation, which is one of the primary factors influencing photovoltaic power generation. During the experiment, the dataset was divided into training, validation, and test sets at a ratio of 8:1:1. The original data have a resolution of 5 min, and since photovoltaic modules produce zero or near-zero power in the early morning and late evening, the prediction window was chosen as 7:00 a.m. to 6:00 p.m. each day, giving 133 sampling points per day as experimental samples. The annual distribution of output power data for the selected PV station is shown in Figure 7.
In this study, the K-means++ algorithm was employed to partition the historical photovoltaic dataset. To enhance the effectiveness of data partitioning, each day was treated as a sample, and for each sample the standard deviation $\sigma$, the skewness coefficient $k_{ur}$, and the mean value $\bar{x}$ of three meteorological features (global radiation, relative humidity, and temperature) were calculated. These computed values were then used to form a feature vector for clustering purposes. The historical photovoltaic data were categorized into three classes: sunny, cloudy, and rainy/snowy weather conditions. The results of the data partitioning using the K-means++ algorithm are presented in Figure 8. This chart clearly displays the data distribution under different weather conditions, providing intuitive visual support for our analysis. Among these, there are 136 days of sunny weather, 133 days of cloudy weather, and 93 days of rainy/snowy weather.

4.2. Photovoltaic Power Influencing Factors and Correlation Analysis

The Pearson correlation coefficient (PCC) analysis method is employed to calculate the correlation coefficients between each factor and PV power output. The results indicate that global radiation has the highest correlation coefficient, while direct radiation, humidity, and temperature have relatively lower correlation coefficients, and wind speed has the smallest correlation coefficient.
$$r_{x,y} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x}$ and $\bar{y}$ are the average values of the elements in $x$ and $y$, respectively. After computing all the data samples through Equation (21), we can obtain the variable correlation table, where Rg, Rd, H, T, W, and Wd represent global radiation, direct radiation, humidity, temperature, wind speed, and wind direction.
Table 1 lists the PCC values of the variables. The larger the absolute value of the PCC, the stronger the association. In this paper, meteorological variables whose absolute PCC with PV power exceeds 0.4 are screened as input variables for the CNN, reducing the redundancy of the inputs and laying the foundation for improving prediction accuracy.
From this table, it can be seen that there are four elements (global irradiation, direct irradiation, humidity, and temperature) that have a strong correlation with power. Therefore, the dimension of the input sequence is 4.
Because the input sequence contains the information of multiple moments before the prediction point, the computing time and memory consumption will increase dramatically if the length of input sequence is too long. Therefore, the length of the input sequence was of great significance for this experiment. In this paper, we used the autocorrelation coefficient to determine the length of the input sequence. The formula for the autocorrelation coefficient with delay h is as follows:
$$r_h = \frac{\sum_{i=1}^{n-h}(x_i - \bar{x})(x_{i+h} - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
In the formula, $x_i$ represents the historical power sequence, and $x_{i+h}$ represents the power sequence with a time lag of $h \times 5$ min.
According to Table 2, the correlation gradually decreases as the time delay $h$ increases. Based on this analysis, an input sequence length of 12 is suitable. Each input sample therefore consists of the 12 groups of four-dimensional data immediately preceding the power point to be predicted.
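The feature screening and lag selection described above can be reproduced with a few lines of pandas/NumPy; the column names and random placeholder data in this sketch are assumptions.

```python
# Sketch: Pearson-correlation feature screening (|PCC| > 0.4) and
# autocorrelation of the power series for choosing the input length.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1000, 7),
                  columns=["P", "Rg", "Rd", "H", "T", "W", "Wd"])  # placeholder

# Feature screening by Pearson correlation with power
pcc = df.corr()["P"].drop("P")
selected = pcc[pcc.abs() > 0.4].index.tolist()
print("selected features:", selected)

# Autocorrelation of the power series for lags h = 1..20 (5-min steps)
p = df["P"].to_numpy()
p_c = p - p.mean()
denom = np.sum(p_c ** 2)
acf = [np.sum(p_c[:-h] * p_c[h:]) / denom for h in range(1, 21)]
print("lag-12 autocorrelation:", round(acf[11], 3))
```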

4.3. Model Evaluation Metrics

Four metrics were introduced to evaluate the model performance: root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE), and coefficient of determination (R2), as expressed in Equations (23)–(26). For RMSE, MAPE, and MAE, a smaller value indicates better prediction results. On the other hand, for the coefficient of determination R2, a higher value indicates better prediction results.
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$
$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{\hat{y}_i} \right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \tilde{y}\right)^2}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
In the above formulas, $y_i$ and $\hat{y}_i$ are the true and predicted values of PV power at time $i$, respectively, $\tilde{y}$ represents the average of the true PV power values, and $n$ is the number of test samples.
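A small NumPy helper computing the four metrics, following the formulations above (including the paper's MAPE definition with the predicted value in the denominator), is sketched below.

```python
# Sketch of the four evaluation metrics.
import numpy as np

def metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_pred))
    mae = np.mean(np.abs(y_true - y_pred))
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"RMSE": rmse, "MAPE": mape, "MAE": mae, "R2": r2}

print(metrics([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```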

4.4. Training Configuration

During the training phase, a loss function was established to update the parameters within the model. This loss function encompassed the mean squared error ($MSE$) loss, L1 weight regularization, and L2 weight regularization. By minimizing the loss function, the model's parameters were updated using the backpropagation algorithm and an optimizer, gradually refining the model and enhancing its predictive performance. The loss function is defined as follows:
$$Loss = MSE + \lambda_1 \left\| w \right\|_1 + \lambda_2 \left\| w \right\|_2$$
The $MSE$ loss measures the average squared difference between the model's predicted values and the true values, serving as an indicator of the model's fitting ability and predictive accuracy. The L1 and L2 weight regularization terms are employed to control the complexity of the model and prevent overfitting. $\lambda_1$ and $\lambda_2$ are regularization parameters, while $w$ represents the weight coefficients, helping to balance the importance of different components within the loss function.
$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$
where $y_i$ represents the actual photovoltaic power data, $\hat{y}_i$ denotes the predicted photovoltaic power data, and $n$ represents the size of the dataset.
L1 regularization is included in the loss function to promote a sparse model and prevent the deep model from overfitting. L2 regularization, in turn, prevents excessively large parameter values within the model, thereby averting the dominance of a single feature over the predictive performance. L1 regularization and L2 regularization can be defined as follows:
$$\left\| w \right\|_1 = \sum_{i=1}^{n} \left| W_i \right|, \quad \left\| w \right\|_2 = \sum_{i=1}^{n} W_i^2$$
Then, the loss function can be rewritten as follows:
$$Loss = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 + \lambda_1 \sum_{i=1}^{n} \left| W_i \right| + \lambda_2 \sum_{i=1}^{n} W_i^2$$
In the proposed model, the Adam optimization algorithm [36], which adaptively adjusts the learning rate, was utilized to optimize the model’s parameters.
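The training objective can be assembled in PyTorch as in the hedged sketch below: an MSE term plus explicit L1 and L2 penalties over the model weights, minimized with Adam. The stand-in linear model and the lambda values are illustrative only.

```python
# Sketch of the regularized training objective and Adam update.
import torch
import torch.nn as nn

def regularized_loss(model, y_pred, y_true, lam1=1e-5, lam2=1e-4):
    mse = nn.functional.mse_loss(y_pred, y_true)
    l1 = sum(p.abs().sum() for p in model.parameters())      # L1 penalty
    l2 = sum((p ** 2).sum() for p in model.parameters())     # L2 penalty
    return mse + lam1 * l1 + lam2 * l2

model = nn.Linear(4, 1)                      # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

x, y = torch.randn(128, 4), torch.randn(128, 1)
loss = regularized_loss(model, model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```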

4.5. Model Setting

In the experiments, the convolutional layer had 10 filters with the size of each filter being 3. The stride of the sliding window for the input data was set to 1. The learning rate was set to 0.01, the batch size was 128, and the number of training iterations was set to 100. The rectified linear activation unit (ReLU) was adopted as the activation function.
All experimental platforms were built on a high-performance server equipped with an Intel Core i7-8700 CPU (Intel Corporation, Santa Clara, California, USA) and one Nvidia GeForce RTX 2080Ti Graphics card (Nvidia Corporation, Santa Clara, California, USA). The programming language was Python 3.7.0 with PyTorch 1.7.1.

4.6. Selection Results of Neighboring Photovoltaic Stations

In this study, the composite similarity index $\Psi_i$ was developed to effectively evaluate and select neighboring PV plants. This index reflects the similarity between neighboring and target PV stations in terms of irradiance and power sequence correlations. Table 3, Table 4 and Table 5 below display the normalized composite similarity index for neighboring PV stations numbered 2 to 7 under sunny, cloudy, and rainy weather.
To accurately determine the optimal integration parameter k value, this study employed a Conv-LSTM model, specifically focusing on the predictive performance for Station No. 1. This paper systematically observed the 12 h average prediction errors under three different meteorological conditions: sunny, cloudy, and snowy/rainy, detailed in Table 6, Table 7 and Table 8, respectively. The analysis showed that the errors decreased and then increased as the number of neighboring PV stations integrated increased in order of decreasing composite similarity. Notably, when no neighboring PV stations were integrated (k = 0), the prediction errors were higher since the model relied solely on data from a single station. Specifically, under sunny conditions, the lowest prediction error occurred when k was 4 (see Table 6); under cloudy conditions, the minimum error was achieved at k = 5 (see Table 7); and under snowy/rainy conditions, the optimal performance was observed when k was 3 (see Table 8). In this study, the data from neighboring PV stations were integrated in equal proportions based on the number of stations, ensuring uniform contributions from each station to the model. In summary, this study establishes that the best numbers of neighboring PV stations to integrate under varying weather conditions are 4, 5, and 3, respectively.

4.7. Comparison and Analysis of the Results

To validate the effectiveness of the Conv-LSTM-ATT model, this study selected one day each of sunny, cloudy, and rainy weather from the three types of weather clustered as the test set for prediction. At the same time, we introduced three deep learning models (LSTM, Bi-LSTM, and Conv-LSTM) as benchmarks for comparison. The model proposed in this study and several baseline models were compared in experiments under the same dataset, that is, only the historical data of the target site were used, and then their prediction results were analyzed and compared.
Figure 9, Figure 10 and Figure 11 show the prediction results of four different models under different weather conditions, and Table 9, Table 10 and Table 11 show the prediction errors of the compared models under different weather conditions. In comparing the data from the three tables, it is evident that the model proposed in this article achieved the lowest RMSE values for sunny, cloudy, and rainy weather conditions, which were 0.1636, 0.2358, and 0.2421, respectively, when compared to other models. Table 9 shows that in sunny conditions, the photovoltaic output power fluctuated slightly, and the power curve changed relatively smoothly. Several models could predict the trend of photovoltaic output power. The evaluation indicators R2 of the LSTM, Bi-LSTM, and Conv-LSTM prediction models were 0.933, 0.942, and 0.951, respectively, and the evaluation indicator R2 of the model proposed in this article was 0.973, which was higher than the other models, and the effect was the best. Table 10 shows that in cloudy conditions, the continuous movement of clouds caused the solar radiation intensity received by the photovoltaic components to change continuously, leading to large fluctuations in the fitting curve of the predicted and actual values of photovoltaic output power. Table 11 shows that in rainy and snowy weather, the RMSEs of the LSTM, Bi-LSTM, and Conv-LSTM prediction models were 0.3226, 0.3218, and 0.2886, respectively, and the RMSE of the model proposed in this article was 0.2421, which was lower than the other three prediction models. The above analysis indicates that the model proposed in this article has more outstanding prediction effects under the three types of weather conditions.
Compared to the LSTM model, the model proposed in this paper reduced RMSE by 18.28%, 27.99%, and 24.95% in sunny, cloudy, and rainy weather, respectively; MAPE was reduced by 36.76%, 45.26%, and 41.73%, respectively, and MAE was reduced by 24.97%, 27.10%, and 16.53%, respectively. Compared to the Bi-LSTM model, the proposed model reduced RMSE by 13.02%, 22.86%, and 24.76% in sunny, cloudy, and rainy weather, respectively; MAPE was reduced by 20.86%, 33.62%, and 34.28%, respectively, and MAE was reduced by 16.03%, 19.37%, and 12.14%, respectively. Compared to the Conv-LSTM model, the proposed model reduced RMSE by 10.84%, 14.28%, and 16.11% in sunny, cloudy, and rainy weather, respectively; MAPE was reduced by 8.76%, 7.23%, and 15.98%, respectively, and MAE was reduced by 13.07%, 15.26%, and 6.06%, respectively. The comparison results indicate that the model proposed in this paper effectively combined the advantages of both CNN and LSTM methods, and used the attention mechanism to compensate for the deficiency of the LSTM model in retaining key information when the input sequence was long, thereby effectively improving prediction accuracy.
The processing time is crucial for real-time applications, where faster predictions are often desirable. In our experiments, the Conv-LSTM-ATT model showed a slightly higher processing time compared to the other models. This increment in time can be attributed to the complexity of the model, especially due to the integration of the attention mechanism. While it does add to the prediction time, the improvement in prediction accuracy (as shown by the lower MAPE, MAE, and RMSE values) could justify this trade-off in contexts where prediction accuracy is more critical than the speed of computation.
In this study, Bayesian optimization was applied to adjust the data fusion ratios in a photovoltaic power prediction model. By setting 100 iterations, using the expected improvement (EI) acquisition function to balance exploration and exploitation, and setting the data fusion ratio parameter space from 0% to 100%, the research team comprehensively covered all configurations from no fusion to full fusion. The optimization results revealed the optimal data fusion ratios under different weather conditions as follows: under sunny conditions, 38.72%, 2.36%, 26.83%, and 14.50%; under cloudy conditions, 49.11%, 6.77%, 23.46%, 9.88%, and 17.68%; under snowy/rainy conditions, 30.18%, 12.05%, and 19.45%. These optimized fusion ratios were then applied to the training set data under corresponding weather conditions, followed by evaluation using the Conv-LSTM-ATT prediction model.
The experimental design involved comparing the impact of five different data fusion strategies on prediction performance, including the following: “No Fusion”, using only historical data from the target site; “Uniform Fusion”, evenly fusing data from all surrounding stations; “Similarity-Filtered Fusion”, evenly fusing data from nearby stations selected based on similarity; “Bayesian-Optimized Similarity Fusion”, determining the optimal fusion ratios for nearby stations based on similarity through Bayesian optimization; and “Actual Values” as a reference for model prediction accuracy. The experimental results showed that, compared to no fusion and uniform fusion strategies, the similarity-filtered fusion and Bayesian-optimized similarity fusion strategies significantly improved prediction accuracy, particularly the Bayesian-optimized similarity fusion, which performed better than other strategies under all test conditions.
These findings indicate that appropriate data fusion strategies can significantly enhance the performance of photovoltaic power prediction models, and Bayesian optimization serves as a powerful tool to effectively implement these strategies, especially in environments requiring high data diversity and complexity. Figure 12, Figure 13 and Figure 14 show the prediction results of the proposed model at different integration ratios, and Table 12, Table 13 and Table 14 show the prediction errors of the proposed model at different integration ratios. Through the experiment, we can draw the following conclusions.
(1)
Significant Reduction in Error Metrics: The introduction of more data from neighboring stations significantly reduced error metrics such as RMSE and MAE. By applying Bayesian optimization to determine the optimal fusion ratios of data from nearby stations based on similarity, RMSE decreased by 20.04%, 28.24%, and 30.94% under sunny, cloudy, and rainy conditions, respectively, and MAPE decreased by 30.30%, 18.83%, and 29.27%. Similarly, MAE also decreased by 23.07%, 17.58%, and 31.36% under these weather conditions. These reductions emphasize that the model’s ability to predict PV power output is enhanced when supported with more extensive spatial data.
(2)
Variability in Prediction Accuracy Across Weather Conditions: The improvement in prediction accuracy varied across different weather conditions. Particularly during rainy conditions, because more data from surrounding areas were integrated, compensating for the lack of historical data at the target site, the reduction in prediction error was the greatest, reaching 31.36%. This shows that the model especially benefits from additional data where there is a deficiency, enhancing its accuracy.
(3)
Improvement in R2 Value and the Trade-off with Time: As more data were integrated, the model’s R2 value improved, indicating a stronger correlation between predicted and actual values. However, this accuracy came at the cost of increased computational time, especially as the degree of data integration increased, leading to longer prediction times.
In predicting photovoltaic power, the Conv-LSTM-ATT model that integrates spatial data from surrounding stations exhibits excellent performance. This strategy effectively utilizes diverse data sources, enhancing the model’s predictive accuracy across various weather conditions, and proving its practical application potential in real-world PV power forecasting scenarios.

5. Conclusions

This study employed the correlation analysis to identify and refine input variables, aiming to reduce their dimensionality and simplify the computational process. A data-driven framework was introduced, which integrated spatial and temporal information. This framework effectively leveraged the advantages of both CNN and LSTM networks by developing the Conv-LSTM module, thereby enhancing the model’s ability to learn the long-term mapping relationship between photovoltaic power and meteorological data. By integrating attention mechanisms into the Conv-LSTM model, distinct weights were assigned to LSTM’s hidden layers, reducing the loss of historical information and intensifying the impact of crucial data. Under the same dataset conditions, the experimental results of the study indicated that compared to three classical models, the method proposed in this paper exhibited superior performance in terms of prediction. Moreover, the utilization of a fused dataset further amplified the model’s performance, showcasing its exceptional predictive capabilities.
Nevertheless, the model’s scalability and generalization capability across different geographical locations and varying environmental conditions need further investigation. It is important to assess how well the model performs when applied to data from different PV systems that were not part of the initial training dataset. To address this, future research could focus on employing advanced techniques like transfer learning and domain adaptation. This approach would enable the model to effectively adapt to diverse environmental variables and different geographic locations, ensuring robust performance across a variety of PV systems. Additionally, enriching the training dataset with a broader spectrum of climatic and geographical data could further enhance the model’s predictive accuracy and reliability in new settings.

Author Contributions

Conceptualization, F.H. and L.Z.; methodology, F.H.; software, F.H. and J.W.; validation, F.H., L.Z. and J.W.; formal analysis, F.H.; investigation, J.W.; resources, L.Z.; data curation, F.H.; writing—original draft preparation, F.H.; writing—review and editing, J.W.; visualization, F.H.; supervision, L.Z.; project administration, L.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62371253.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature of Variables and Abbreviations

Abbreviation      Full Term
PV                Photovoltaic
NWP               Numerical Weather Prediction
RMSE              Root Mean Square Error
CNN               Convolutional Neural Network
LSTM              Long Short-Term Memory
Conv-LSTM         Convolutional Long Short-Term Memory
MAPE              Mean Absolute Percentage Error
MAE               Mean Absolute Error
PCC               Pearson Correlation Coefficient
FF                Feature Fusion Layer
FC                Fully Connected Layer
ReLU              Rectified Linear Activation Unit
X                 Historical Power Data
ATT               Attention

References

1. Moreira, M.O.; Balestrassi, P.P.; Paiva, A.P.; Ribeiro, P.F.; Bonatto, B.D. Design of experiments using artificial neural network ensemble for photovoltaic generation forecasting. Renew. Sustain. Energy Rev. 2021, 135, 110450.
2. Agrawal, S.; Soni, R. Renewable energy: Sources, importance and prospects for sustainable future. In Energy: Crises, Challenges and Solutions; John Wiley & Sons: Hoboken, NJ, USA, 2021; pp. 131–150.
3. Rajesh, R.; Mabel, M.C. A comprehensive review of photovoltaic systems. Renew. Sustain. Energy Rev. 2015, 51, 231–248.
4. Hernández-Callejo, L.; Gallardo-Saavedra, S.; Alonso-Gómez, V. A review of photovoltaic systems: Design, operation and maintenance. Sol. Energy 2019, 188, 426–440.
5. Yang, D.; Kleissl, J.; Gueymard, C.A.; Pedro, H.T.; Coimbra, C.F. History and trends in solar irradiance and PV power forecasting: A preliminary assessment and review using text mining. Sol. Energy 2018, 168, 60–101.
6. Li, P.; Zhou, K.; Lu, X.; Yang, S. A hybrid deep learning model for short-term PV power forecasting. Appl. Energy 2020, 259, 114216.
7. Han, S.; Qiao, Y.H.; Yan, J.; Liu, Y.Q.; Li, L.; Wang, Z. Mid-to-long term wind and photovoltaic power generation prediction based on copula function and long short term memory network. Appl. Energy 2019, 239, 181–191.
8. Kumler, A.; Xie, Y.; Zhang, Y. A Physics-based Smart Persistence model for Intra-hour forecasting of solar radiation (PSPI) using GHI measurements and a cloud retrieval technique. Sol. Energy 2019, 177, 494–500.
9. Ahmed, R.; Sreeram, V.; Mishra, Y.; Arif, M.D. A review and evaluation of the state-of-the-art in PV solar power forecasting: Techniques and optimization. Renew. Sustain. Energy Rev. 2020, 124, 109792.
10. De Giorgi, M.G.; Congedo, P.M.; Malvoni, M. Photovoltaic power forecasting using statistical methods: Impact of weather data. IET Sci. Meas. Technol. 2014, 8, 90–97.
11. Han, Y.; Wang, N.; Ma, M.; Zhou, H.; Dai, S.; Zhu, H. A PV power interval forecasting based on seasonal model and nonparametric estimation algorithm. Sol. Energy 2019, 184, 515–526.
12. Wang, H.; Lei, Z.; Zhang, X.; Zhou, B.; Peng, J. A review of deep learning for renewable energy forecasting. Energy Convers. Manag. 2019, 198, 111799.
13. Li, G.; Xie, S.; Wang, B.; Xin, J.; Li, Y.; Du, S. Photovoltaic Power Forecasting With a Hybrid Deep Learning Approach. IEEE Access 2020, 8, 175871–175880.
14. Zhen, Z.; Liu, J.; Zhang, Z.; Wang, F.; Chai, H.; Yu, Y.; Lu, X.; Wang, T.; Lin, Y. Deep Learning Based Surface Irradiance Mapping Model for Solar PV Power Forecasting Using Sky Image. IEEE Trans. Ind. Appl. 2020, 56, 3385–3396.
15. Mishra, M.; Dash, P.B.; Nayak, J.; Naik, B.; Swain, S.K. Deep learning and wavelet transform integrated approach for short-term solar PV power prediction. Measurement 2020, 166, 108250.
16. Munawar, U.; Wang, Z. A Framework of Using Machine Learning Approaches for Short-Term Solar Power Forecasting. J. Electr. Eng. Technol. 2020, 15, 561–569.
17. Park, M.K.; Lee, J.M.; Kang, W.H.; Choi, J.M.; Lee, K.H. Predictive model for PV power generation using RNN (LSTM). J. Mech. Sci. Technol. 2021, 35, 795–803.
18. Abdel-Nasser, M.; Mahmoud, K. Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Comput. Appl. 2019, 31, 2727–2740.
19. Hong, T.; Pinson, P.; Wang, Y.; Weron, R.; Yang, D.; Zareipour, H. Energy forecasting: A review and outlook. IEEE Open Access J. Power Energy 2020, 7, 376–388.
20. Lin, W.; Zhang, B.; Li, H.; Lu, R. Multi-step prediction of photovoltaic power based on two-stage decomposition and BILSTM. Neurocomputing 2022, 504, 56–67.
21. Zhen, H.; Niu, D.; Wang, K.; Shi, Y.; Ji, Z.; Xu, X. Photovoltaic power forecasting based on GA improved Bi-LSTM in microgrid without meteorological information. Energy 2021, 231, 120908.
22. Son, Y.; Zhang, X.; Yoon, Y.; Cho, J.; Choi, S. LSTM–GAN based cloud movement prediction in satellite images for PV forecast. J. Ambient Intell. Humaniz. Comput. 2023, 14, 12373–12386.
23. Li, Q.; Zhang, D.; Yan, K. A Solar Irradiance Forecasting Framework Based on the CEE-WGAN-LSTM Model. Sensors 2023, 23, 2799.
24. Kumari, P.; Toshniwal, D. Extreme gradient boosting and deep neural network based ensemble learning approach to forecast hourly solar irradiance. J. Clean. Prod. 2021, 279, 123285.
25. Agga, A.; Abbou, A.; Labbadi, M.; El Houm, Y.; Ali, I.H.O. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr. Power Syst. Res. 2022, 208, 107908.
26. Tang, Y.; Yang, K.; Zhang, S.; Zhang, Z. Photovoltaic power forecasting: A hybrid deep learning model incorporating transfer learning strategy. Renew. Sustain. Energy Rev. 2022, 162, 112473.
27. Sanchez-Sutil, F.; Cano-Ortega, A.; Hernandez, J.C.; Rus-Casas, C. Development and calibration of an open source, low-cost power smart meter prototype for PV household-prosumers. Electronics 2019, 8, 878.
28. Hernández, J.C.; Ruiz-Rodriguez, F.J.; Jurado, F. Modelling and assessment of the combined technical impact of electric vehicles and photovoltaic generation in radial distribution systems. Energy 2017, 141, 316–332.
29. Sanabria-Villamizar, M.; Bueno-López, M.; Hernández, J.C.; Vera, D. Characterization of household consumption load profiles in the time and frequency domain. Int. J. Electr. Power Energy Syst. 2021, 37, 107756.
30. Kapoor, A.; Singhal, A. A comparative study of K-Means, K-Means++ and Fuzzy C-Means clustering algorithms. In Proceedings of the 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
31. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53.
32. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270.
33. Wang, Y.; Chen, Y.; Liu, H.; Ma, X.; Su, X.; Liu, Q. Day-ahead photovoltaic power forcasting using convolutional-LSTM networks. In Proceedings of the 2021 3rd Asia Energy and Electrical Engineering Symposium (AEEES), Chengdu, China, 26–29 March 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 917–921.
34. Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
35. Zhou, H.; Zhang, Y.; Yang, L.; Liu, Q.; Yan, K.; Du, Y. Short-Term Photovoltaic Power Forecasting Based on Long Short Term Memory Neural Network and Attention Mechanism. IEEE Access 2019, 7, 78063–78074.
36. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Figure 1. The spatiotemporal data-driven framework for photovoltaic (PV) power forecasting.
Figure 2. Short-term photovoltaic (PV) power forecasting model based on deep learning.
Figure 3. Structure diagram of CNN.
Figure 4. The LSTM neuron structure.
Figure 5. The attention-based Conv-LSTM module.
Figure 6. Attention and the Conv-LSTM network.
Figure 7. PV power generation data in three dimensions.
Figure 8. Distribution diagram of weather types based on K-Means++ clustering.
Figure 9. Forecast result under sunny weather.
Figure 10. Forecast result under cloudy weather.
Figure 11. Forecast result under rainy weather.
Figure 12. Forecast result under sunny weather.
Figure 13. Forecast result under cloudy weather.
Figure 14. Forecast result under rainy weather.
Table 1. Correlation analysis.

Factors       Rg     Rd     H       T      W      Wd
Correlation   0.98   0.80   −0.46   0.42   0.32   0.08
Table 2. Autocorrelation analysis.

Lag Time   Correlation   Lag Time   Correlation
1          0.988         11         0.633
2          0.965         12         0.576
3          0.959         13         0.499
4          0.925         14         0.427
5          0.908         15         0.358
6          0.889         16         0.294
7          0.836         17         0.231
8          0.789         18         0.155
9          0.742         19         0.081
10         0.695         20         0.031
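The correlations in Table 1 and the lag autocorrelations in Table 2 are standard Pearson estimates and can be reproduced with a few lines of code. The sketch below is a minimal illustration only: the column names ("power", "Rg", "H"), series length, and synthetic values are assumptions, not the dataset or preprocessing used in this paper.

```python
import numpy as np
import pandas as pd

# Illustrative series: names, length, and values are assumptions,
# not the PV dataset analyzed in the paper.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "power": rng.random(n),
    "Rg": rng.random(n),   # candidate meteorological factor (assumed name)
    "H": rng.random(n),    # candidate meteorological factor (assumed name)
})

# Pearson correlation between PV power and each candidate factor (cf. Table 1).
factor_corr = df.corr(method="pearson")["power"].drop("power")
print(factor_corr)

# Autocorrelation of the power series at lags 1-20 (cf. Table 2).
for lag in range(1, 21):
    print(f"lag {lag:2d}: r = {df['power'].autocorr(lag=lag):.3f}")
```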
Table 3. Composite similarity of neighboring PV stations under sunny weather.

Neighboring PV Station Number   Normalized Composite Similarity Ψ_i
2                               0.68
3                               0.90
4                               1.00
5                               0.75
6                               0.92
7                               0.85
Table 4. Composite similarity of neighboring PV stations under cloudy weather.

Neighboring PV Station Number   Normalized Composite Similarity Ψ_i
2                               0.88
3                               0.95
4                               0.99
5                               1.00
6                               0.92
7                               0.65
Table 5. Composite similarity of neighboring PV stations under rainy weather.

Neighboring PV Station Number   Normalized Composite Similarity Ψ_i
2                               0.77
3                               0.90
4                               0.93
5                               0.73
6                               1.00
7                               0.82
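The normalized composite similarity Ψ_i in Tables 3–5 combines how well a neighbor's sequences track the target station with a distance factor, and is scaled so that the best-matching station receives Ψ = 1. The exact weighting is defined in the methodology above; the sketch below shows only one plausible form, in which a Pearson correlation term is discounted by an assumed inverse-distance weight. The function name, the combination rule, the 10 km scale, and the synthetic inputs are all illustrative assumptions.

```python
import numpy as np

def composite_similarity(target, neighbors, distances_km):
    """Toy composite similarity: Pearson correlation damped by distance,
    then normalized to [0, 1]. The combination rule is an assumption for
    illustration, not the paper's exact formula."""
    scores = []
    for seq, d in zip(neighbors, distances_km):
        r = np.corrcoef(target, seq)[0, 1]    # sequence similarity
        w = 1.0 / (1.0 + d / 10.0)            # assumed distance discount
        scores.append(r * w)
    scores = np.asarray(scores)
    return scores / scores.max()              # Ψ_i, best station -> 1.0

# Example with synthetic 96-point daily sequences for stations 2-7.
rng = np.random.default_rng(1)
target = rng.random(96)
neighbors = [target + 0.1 * rng.standard_normal(96) for _ in range(6)]
psi = composite_similarity(target, neighbors, distances_km=[5, 12, 3, 20, 8, 15])
print(np.round(psi, 2))
```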
Table 6. Impact of the number of neighboring PV stations on prediction accuracy for sunny weather.

Neighboring PV Stations (K)   0        1        2        3        4        5        6
RMSE/W                        0.1636   0.1622   0.1611   0.1600   0.1583   0.1605   0.1629
Table 7. Impact of the number of neighboring PV stations on prediction accuracy for cloudy weather.

Neighboring PV Stations (K)   0        1        2        3        4        5        6
RMSE/W                        0.2341   0.2229   0.2114   0.2083   0.2011   0.1996   0.2157
Table 8. Impact of the number of neighboring PV stations on prediction accuracy for rainy weather.

Neighboring PV Stations (K)   0        1        2        3        4        5        6
RMSE/W                        0.2467   0.2400   0.2349   0.2269   0.2292   0.2307   0.2425
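Tables 6–8 indicate that the best number of fused neighboring stations depends on the weather type: the RMSE minimum lies at K = 4 for sunny, K = 5 for cloudy, and K = 3 for rainy conditions. The short sketch below simply reads that optimum off the tabulated values; the arrays restate the RMSE figures above and nothing else.

```python
import numpy as np

# RMSE/W versus number of fused neighboring stations K (values from Tables 6-8).
rmse_by_k = {
    "sunny":  [0.1636, 0.1622, 0.1611, 0.1600, 0.1583, 0.1605, 0.1629],
    "cloudy": [0.2341, 0.2229, 0.2114, 0.2083, 0.2011, 0.1996, 0.2157],
    "rainy":  [0.2467, 0.2400, 0.2349, 0.2269, 0.2292, 0.2307, 0.2425],
}

for weather, rmse in rmse_by_k.items():
    k_best = int(np.argmin(rmse))          # list index equals K (K starts at 0)
    print(f"{weather}: best K = {k_best}, RMSE = {rmse[k_best]:.4f} W")
```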
Table 9. Forecast errors of different forecasting models for sunny weather.

Indexes   LSTM     Bi-LSTM   Conv-LSTM   Conv-LSTM-ATT
R²        0.933    0.942     0.951       0.973
RMSE/W    0.2002   0.1881    0.1835      0.1636
MAE/W     0.1710   0.1528    0.1476      0.1283
MAPE/%    6.42     5.13      4.45        4.06
Time/s    542      623       691         755
Table 10. Forecast errors of different forecasting models for cloudy weather.

Indexes   LSTM     Bi-LSTM   Conv-LSTM   Conv-LSTM-ATT
R²        0.919    0.924     0.949       0.965
RMSE/W    0.3275   0.3057    0.2751      0.2358
MAE/W     0.2535   0.2292    0.2181      0.1848
MAPE/%    8.44     6.96      4.98        4.62
Time/s    559      635       689         739
Table 11. Forecast errors of different forecasting models for rainy weather.

Indexes   LSTM     Bi-LSTM   Conv-LSTM   Conv-LSTM-ATT
R²        0.892    0.901     0.918       0.934
RMSE/W    0.3226   0.3218    0.2886      0.2421
MAE/W     0.2487   0.2363    0.2210      0.2076
MAPE/%    11.55    10.24     8.01        6.73
Time/s    556      628       699         785
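The indexes reported in Tables 9–14 are the usual goodness-of-fit and error measures: R², RMSE, MAE, and MAPE. A minimal sketch of computing them from measured and predicted power is given below; the filtering of near-zero measured values before MAPE (via the eps threshold) is an added assumption to keep the percentage error well defined at night, not necessarily the authors' handling.

```python
import numpy as np

def forecast_metrics(y_true, y_pred, eps=1e-3):
    """R^2, RMSE, MAE, and MAPE for a PV power forecast.
    MAPE is evaluated only where measured power exceeds eps (assumed
    treatment of near-zero night-time samples)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    mask = y_true > eps
    mape = 100.0 * np.mean(np.abs(err[mask] / y_true[mask]))
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "MAPE": mape}

# Example with a synthetic daily power profile.
rng = np.random.default_rng(2)
y_true = np.clip(np.sin(np.linspace(0, np.pi, 96)), 0, None)
y_pred = y_true + 0.05 * rng.standard_normal(96)
print(forecast_metrics(y_true, y_pred))
```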
Table 12. Comparison of the prediction errors of sunny weather under different fusion ratios.

Indexes   No Fusion   Uniform Fusion   Similarity-Filtered Fusion   Bayesian-Optimized Similarity Fusion
R²        0.973       0.983            0.991                        0.996
RMSE/W    0.1636      0.1418           0.1345                       0.1308
MAE/W     0.1283      0.1122           0.1063                       0.0987
MAPE/%    4.06        3.82             3.25                         2.83
Time/s    755         794              824                          876
Table 13. Comparison of the prediction errors of cloudy weather under different fusion ratios.

Indexes   No Fusion   Uniform Fusion   Similarity-Filtered Fusion   Bayesian-Optimized Similarity Fusion
R²        0.965       0.972            0.980                        0.985
RMSE/W    0.2358      0.1939           0.1747                       0.1692
MAE/W     0.1848      0.1702           0.1675                       0.1523
MAPE/%    4.62        4.05             3.91                         3.75
Time/s    739         783              836                          853
Table 14. Comparison of the prediction errors of rainy weather under different fusion ratios.

Indexes   No Fusion   Uniform Fusion   Similarity-Filtered Fusion   Bayesian-Optimized Similarity Fusion
R²        0.934       0.953            0.966                        0.976
RMSE/W    0.2421      0.2179           0.1828                       0.1672
MAE/W     0.2076      0.1813           0.1652                       0.1425
MAPE/%    6.73        5.16             4.99                         4.76
Time/s    785         798              863                          898
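The "Bayesian-Optimized Similarity Fusion" columns in Tables 12–14 correspond to fusion ratios selected by Bayesian optimization, presumably by minimizing an error criterion such as validation RMSE. The sketch below, assuming scikit-optimize is installed, tunes a single fusion ratio alpha with gp_minimize; the single-ratio parameterization, the linear blend standing in for retraining the Conv-LSTM-ATT model, and the synthetic data are simplifying assumptions for illustration, not the authors' full search space or objective.

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

rng = np.random.default_rng(3)

# Synthetic stand-ins for validation-split features and targets (assumptions).
target_feat = rng.random(200)
neighbor_feat = rng.random(200)
y_val = 0.7 * target_feat + 0.3 * neighbor_feat + 0.02 * rng.standard_normal(200)

def validation_rmse(params):
    """Objective: validation RMSE for a given fusion ratio alpha.
    A linear blend replaces the full model retraining used in the paper."""
    alpha = params[0]
    y_hat = alpha * target_feat + (1.0 - alpha) * neighbor_feat
    return float(np.sqrt(np.mean((y_val - y_hat) ** 2)))

result = gp_minimize(
    validation_rmse,
    dimensions=[Real(0.0, 1.0, name="alpha")],  # search range for the fusion ratio
    n_calls=30,
    random_state=0,
)
print(f"best fusion ratio ≈ {result.x[0]:.3f}, validation RMSE ≈ {result.fun:.4f}")
```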
