1. Introduction
Bearings are a fundamental and vulnerable part of industrial rotating equipment. Numerous factors (e.g., running load, operating temperature, lubrication, installation, corrosion, and material defects) can lead to severe bearing faults and disrupt the normal operation of machines [1]. Thus, regular bearing maintenance is crucial for reducing machine downtime and improving productivity [2]. In recent years, predictive maintenance (PdM) has become an increasingly significant field in modern manufacturing, as it estimates the health status of machines to minimize the risk of sudden breakdowns [4]. One key technique of PdM is remaining useful life (RUL) prediction, where the RUL is the time left for a machine to operate normally before a serious bearing failure occurs [5]. Consequently, accurate RUL prediction of bearings contributes to minimizing machine downtime, reducing maintenance frequency, and maximizing the service life of bearings.
Generally, approaches for predicting the RUL of bearings can be roughly classified into two main groups: mechanism model methods [
12] and data-driven methods [
15]. Specifically, mechanism model methods typically rely on failure principles of bearings and try to establish accurate mathematical models to describe the degradation process of bearings [
16]. Hu et al. [
17] proposed an RUL prediction model for bearings based on the diffusion process; the model addresses the uncertainty in prediction results and enhances the accuracy of predictions. Gao et al. [
18] developed fatigue reliability models that consider the combined effects of fatigue damage accumulation and effective stress growth, resulting in an accurate performance degradation prediction of composite materials. In theory, mechanism model methods have the potential to adequately reflect the system nature by describing the mechanisms and characteristics of the bearing degradation process [
19]. Nevertheless, for complex mechanical systems, physical principles underlying bearing failures are not yet fully understood [
20], which leads to difficulties in developing precise and reliable mechanism models.
Recently, the development of artificial intelligence (AI) as well as big data technologies brings significant opportunities for data-driven methods to eliminate the need for complex control equations during the analysis process [
21]. By extracting features from enormous data and establishing specific relational models between feature patterns and the RUL of bearings, data-driven methods offer innovative solutions in the prediction of RUL for bearings [
22]. In the literature [23], using the open-access bearing dataset from the FEMTO-ST institute, principal component analysis (PCA) and least squares support vector regression (LS-SVR) are applied to extract features and predict the RUL, respectively. This method provides a soft computing technique to identify patterns among features, which improves the accuracy of RUL prediction. Additionally, Dong et al. [
24] proposed a novel method that integrates kernel PCA with a support vector machine (SVM). In this method, vibration signals are decomposed to obtain fault information. Following that, characteristic features extracted by a kernel PCA are inputted into the SVM to establish a classification model for operational status. Ultimately, this approach achieves effective recognition of bearing operating status.
Currently, deep learning (DL) is considered a major breakthrough among data-driven methods. With deep neural networks, DL is able to capture deep representations of a dataset and achieve better performance than other data-driven methods in fault diagnosis and RUL prediction [27]. Gao et al. [28] integrated fuzzy inference and neural networks to capture the nonlinear relationship between parameters and fatigue status. Additionally, introducing non-proportionality and phase differences enables the model to accurately predict the fatigue life of various materials. Zhu et al. [
29] combined time–frequency representations (TFRs) and multiscale convolutional neural network (MSCNN) for bearing RUL prediction, in which wavelet transform (WT) is utilized to obtain non-stationary property of TFRs and address the difficulty in applying CNN directly to raw time series. Xiao et al. [
30] proposed a fusion method that merges empirical mode decomposition with a gated recurrent unit (GRU) to effectively address the problem of accurately assessing bearing degradation. The key innovation of this method lies in decomposing the original signal and extracting the most sensitive trend features, which are then inputted into GRU to calculate the health index.
In practical industrial environments, the large data volume, high data dimension, strong interference noise, and coupling effects between parameters make it difficult to achieve high accuracy and strong robustness in the prediction of bearing RUL. Hence, a novel prediction framework called CNN-VAE-MBiLSTM is proposed for RUL prediction of bearings in this paper. The CNN-VAE part of the framework is obtained by fusing a symmetric CNN with a variational autoencoder (VAE); it captures accurate low-dimensional TFRs from the time–frequency spectrum of signals by combining the efficient image processing of the CNN with the ability of the VAE to learn a continuous data distribution. Then, the MBiLSTM is introduced to transform features from multi-axis signals into estimated RUL values. The MBiLSTM consists of two steps. In the first step, statistical variables and TFRs from each direction are input into a sub-model to encode temporal information. In the second step, the outputs of the sub-models are combined to extract the differences among multi-axis features.
Finally, the effectiveness of CNN-VAE-MBiLSTM is verified using experimental datasets of bearings. The key contributions of this study are as follows:
- (1)
The CNN-VAE part is an unsupervised model that can adaptively extract TFRs without relying on hand-designed labels, which avoids laborious work of feature construction, eliminates the influence of personal participation, and successfully applies the high-dimensional time–frequency spectrum to RUL prediction.
- (2)
Bi-directional long short-term memory (BiLSTM) is employed as the sub-model in MBiLSTM; it excels at capturing sequential characteristics of features and significantly improves the accuracy of RUL prediction. In addition, the two-step approach designed in MBiLSTM imitates the architecture of ensemble learning to enhance the accuracy and robustness of RUL prediction. Experimental results indicate that the MBiLSTM performs better than a single BiLSTM.
Subsequent sections of this paper are arranged as follows. The related works are reviewed in
Section 2. Subsequently, problem formulation and methodology are described in
Section 3. Then, the proposed approach is verified in
Section 4, which consists of dataset description, evaluation metrics, feature construction, as well as the RUL prediction and discussion. Lastly, the conclusion of the paper is offered in
Section 5.
2. Related Work
Severe bearing failures will result in equipment breakdowns, leading to substantial economic loss and threatening the health of operators [
31]. Therefore, timely analysis of operation conditions for bearings is of great research importance. Recent studies have demonstrated the effectiveness of using operating data to reflect the bearing degradation caused by material defects or other intricate factors [
32]. Therefore, data-driven approaches have become an essential strategy of the PdM for bearings.
Typically, the construction of RUL prediction models for bearings using data-driven approaches consists of two key phases: one is to utilize signal processing techniques (SPTs) to extract physical features, and the other is to employ machine learning models to learn the underlying correlation between these features and bearing degradation [
34]. For instance, Singleton et al. [
35] extracted the time–frequency domain features (TFDFs) from vibration signals and then tracked the TFDF to evaluate the RUL of bearings using curve fitting and extended Kalman filtering algorithms. Huang et al. [
36] integrated the attention mechanism into a neural network and utilized time domain features (TDFs) and frequency domain features (FDFs) as inputs, achieving good RUL prediction results for bearings.
Although physical features extracted using SPTs have proven effective in qualitative classification of bearing health status, these features still face challenges in fully capturing subtle changes during the degradation process for quantitative prediction [
37]. Moreover, due to the diverse and complex nature of degradation processes, determining appropriate features requires substantial expertise and human labor. Some researchers have therefore introduced end-to-end deep frameworks for RUL estimation of bearings [
39]. In the literature [
40], the LSTM network is combined with the CNN network to form an end-to-end deep framework. Within this framework, the convolutional layer directly extracts degradation features from sensor data, while LSTM layers are utilized for accurate quantitative prediction of the degradation process. In addition, Ye et al. [
41] adopted a multi-scale convolutional autoencoder (MSCAE) to automatically capture both global and local information from vibration signals. Health indicators (HI) were then constructed to replace time–frequency features as inputs of the prediction model (i.e., an LSTM network). Ultimately, the effectiveness of this approach was validated using an open-source dataset. Compared to machine learning models, end-to-end deep frameworks not only significantly improve prediction performance but also simplify the modeling process by skipping feature engineering. However, this simplification results in redundancy of hyper-parameters and poor model generalization.
To solve the aforementioned issues, there is increasing interest in introducing deep learning methods into feature engineering and model construction separately. Specifically, traditional SPTs are combined with deep neural networks to extract deep features. Then, these deep features are used in a deep neural network to establish an accurate relationship with the target values [
43]. Li et al. [
44] proposed an intelligent method for RUL prediction based on deep CNNs, where short-time Fourier transform (STFT) is employed to obtain the time–frequency spectrum of vibration signals. The time–frequency spectrum is then processed by CNN to extract and analyze multi-scale features, resulting in high-precision RUL prediction results. Saucedo-Dorantes et al. [
45] introduced a novel data-driven diagnosis methodology for identifying bearing faults, in which stacked auto-encoders (SAE) are used to extract fault-related deep features and a deep neural network is employed to fuse the information from different domains. The experiment showed that this approach achieves advantageous results in the fault diagnosis of different bearings.
As mentioned above, deep neural networks have been explored to predict bearing RUL in some research, but further improvements are needed to achieve more accurate and robust predictive performance.
4. Experiment and Analysis
The proposed RUL prediction approach for bearings was validated using an industrial case from a textile company located in Jinhua, Zhejiang Province, China. In this case, based on prior maintenance knowledge of the machines, bearing datasets were simultaneously collected from the crucial bearing installed in each of six weaving machines (Picanol GTMax-I 3.0, Picanol Group, Ieper, Belgium), and these weaving machines kept operating intermittently until they had to be shut down due to severe bearing failures. In addition, only vibration signals were recorded in the bearing datasets using sensors (CT1010L and PCIE-1803, Shenzhen Jilantin Intelligent Technology Co., Ltd., Shenzhen, China). The experimental platform is shown in
Figure 6.
4.1. Dataset Description
During the entire lifespan of the weaving machine, vibration signals of the main bearing were recorded along multiple axes, namely the X-axis, Y-axis, and Z-axis. In this case, the entire maintenance cycle of the bearings was about two years, indicating that the degradation rate of the bearings was slow. Thus, an intermittent sampling method was adopted. The sampling interval was set to 1 h, and the sampling duration and sensor acquisition frequency were set to 1 s and 5000 Hz, respectively. The details of the experimental datasets are presented in
Table 3.
Following the practical experience in numerous bearing studies, six bearing datasets were separated into a training set and a testing set to verify the effectiveness of the proposed approach. The training set accounted for about 70% and the testing set accounted for about 30%, which is also shown in
Table 3.
In practical industrial environments, the degradation process of bearings has complex behaviors. As shown in
Figure 7, the Bearing_1 dataset was divided into 10 equal segments, and box plots were used to summarize the vibration amplitudes within each segment. The plots reveal a notable variation in the amplitude distribution over time, as well as significant differences among the axes, which indicates the potential value of sequential characteristics and of fusing vibration signals from different directions for predicting the RUL.
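The segment-wise summary described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `segment_box_stats` and the synthetic signal are assumptions, and the quartiles are computed with `statistics.quantiles` from the Python standard library.

```python
import random
import statistics


def segment_box_stats(signal, n_segments=10):
    """Split `signal` into equal consecutive segments and return
    box-plot statistics (min, Q1, median, Q3, max, IQR) for each one."""
    seg_len = len(signal) // n_segments  # trailing remainder samples are dropped
    stats = []
    for k in range(n_segments):
        seg = signal[k * seg_len:(k + 1) * seg_len]
        q1, median, q3 = statistics.quantiles(seg, n=4)
        stats.append({
            "min": min(seg), "q1": q1, "median": median,
            "q3": q3, "max": max(seg), "iqr": q3 - q1,
        })
    return stats


# Synthetic stand-in for one axis of a bearing record (NOT real data):
# zero-mean noise whose spread grows over time, mimicking degradation.
random.seed(0)
signal = [random.gauss(0.0, 1.0 + 0.002 * i) for i in range(2000)]
box_stats = segment_box_stats(signal)
for k, s in enumerate(box_stats, start=1):
    print(f"segment {k:2d}: median={s['median']:+.3f}  IQR={s['iqr']:.3f}")
```

Running the same function on each axis separately would expose the kind of between-axis differences that the text attributes to Figure 7.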
4.2. Experiment Setup and Evaluation Metrics
Due to different degradation mechanisms, the maximum number of operation cycles usually differs significantly among bearing datasets. Meanwhile, keeping the inputs and outputs of a model on consistent scales is advantageous for its prediction performance. Therefore, the RUL values of the samples in this case were normalized during preprocessing.
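The text does not state the normalization scheme. One common choice, assumed here purely for illustration, is to label each sample with a linearly decreasing RUL scaled by the bearing's own lifetime, so that every run-to-failure dataset maps onto [0, 1] regardless of its length. The helper name `normalized_rul` is hypothetical.

```python
def normalized_rul(n_samples):
    """Return linearly decreasing RUL labels scaled to [0, 1].

    Assumption: the first sample has RUL 1.0 and the last sample
    (the failure point) has RUL 0.0, with equal steps in between.
    """
    if n_samples < 2:
        raise ValueError("at least two samples are needed")
    last = n_samples - 1
    return [(last - i) / last for i in range(n_samples)]


# Two bearings with different lifetimes share the same label range.
short_life = normalized_rul(5)
long_life = normalized_rul(9)
print(short_life)
print(long_life)
```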
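Furthermore, to calculate the prediction errors of the approaches, an improved score function from the IEEE PHM 2012 challenge [23], the mean absolute error (MAE), and the root mean squared error (RMSE) were selected. The MAE and RMSE are defined as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}$$

where $n$ is the number of samples, and $\hat{y}_i$ and $y_i$ are the $i$th predictive RUL value and actual RUL value, respectively.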
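As a minimal sketch, the error metrics can be computed as below. MAE and RMSE follow their standard definitions. The score function shown is the original IEEE PHM 2012 challenge score, included only as a reference point: the improved variant used in this paper is not reproduced here, and the function name `phm2012_score` is an assumption.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted RUL."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted RUL."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def phm2012_score(y_true, y_pred):
    """Original PHM 2012 challenge score (NOT the paper's improved variant).

    Each sample gets an accuracy in (0, 1]; overestimating the RUL
    (negative percent error) is penalized more heavily than
    underestimating it. Requires every actual RUL to be non-zero.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        er = 100.0 * (t - p) / t  # percent error of this prediction
        if er <= 0:
            total += math.exp(-math.log(0.5) * er / 5.0)
        else:
            total += math.exp(math.log(0.5) * er / 20.0)
    return total / len(y_true)


# Hypothetical normalized RUL values, for illustration only.
y_true = [0.8, 0.6, 0.4, 0.2]
y_pred = [0.7, 0.6, 0.5, 0.2]
print(mae(y_true, y_pred), rmse(y_true, y_pred), phm2012_score(y_true, y_pred))
```

Because the percent error divides by the actual RUL, samples at the failure point (actual RUL of zero) would have to be excluded or handled separately with this form of the score.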
4.3. Feature Construction
To obtain more comprehensive bearing degradation information, it is essential to construct multi-domain features (TDFs, FDFs, and TFDFs). In this case, the TDFs consist of 16 statistical parameters and the FDFs involve 12 statistical parameters. However, some statistical parameters are not sensitive to the degradation state of the bearing and need to be eliminated to prevent negative effects. Therefore, monotonicity (Mon), correlation (Corr), and robustness (Rob) were considered for screening features [68], and a linear combination of these criteria was utilized as a comprehensive evaluation metric (Cem) to fully evaluate the applicability of degradation features. In the formulas for these criteria, $n$ is the number of samples; $y_i$ and $\bar{y}$ are the $i$th actual value and the mean value of the RUL, respectively; $x_i$ and $\bar{x}$ are the $i$th value and the mean value of the feature, respectively; and $\tilde{x}_i$ denotes the $i$th smoothed value of the feature.
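The sketch below implements the forms of these criteria that are commonly used in the prognostics literature and that match the symbol definitions given above. They are an assumption, not a confirmed transcription of the paper's equations: Mon is the absolute difference between the counts of positive and negative increments divided by n − 1, Corr is the absolute Pearson correlation between the feature and the RUL, and Rob is the mean of exp(−|(x − x̃)/x|). The moving-average smoother and the equal weights in `cem` are likewise hypothetical choices.

```python
import math


def moving_average(x, window=5):
    """Centered moving average, used here as an assumed smoother."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def monotonicity(x):
    """|#positive increments - #negative increments| / (n - 1)."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    return abs(pos - neg) / (len(x) - 1)


def correlation(x, y):
    """Absolute Pearson correlation between feature x and RUL y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = abs(sum((a - mx) * (b - my) for a, b in zip(x, y)))
    den = math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )
    return num / den


def robustness(x, window=5):
    """Mean of exp(-|(x_i - smoothed_i) / x_i|); needs non-zero x_i."""
    smooth = moving_average(x, window)
    return sum(math.exp(-abs((a - s) / a)) for a, s in zip(x, smooth)) / len(x)


def cem(x, y, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Linear combination of the three criteria (weights are assumed)."""
    w_mon, w_corr, w_rob = weights
    return (w_mon * monotonicity(x)
            + w_corr * correlation(x, y)
            + w_rob * robustness(x))


# Toy feature that rises steadily while the normalized RUL falls.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
rul = [1.0, 0.75, 0.5, 0.25, 0.0]
print(monotonicity(feature), correlation(feature, rul), cem(feature, rul))
```

Each criterion lies in [0, 1], so with weights that sum to one the combined metric can be compared directly against a single threshold.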
Based on the training set, the results of the comprehensive evaluation are shown in
Figure 8, and the impact of the number of parameters on the proposed RUL prediction model is shown in
Table 4. The threshold value of Cem is highlighted as the red line in
Figure 8. Finally, the top eight statistical parameters of the TDFs and FDFs were selected, which included:
- (1)
Time domain parameters: RMS, mean, minimum, variance, clearance factor;
- (2)
Frequency domain parameters: spectral mean, spectral root mean square, gravity frequency.
Additionally, the proposed CNN-VAE model has the capability to dynamically compress effective degradation information from the time–frequency spectrum into TFDFs. The compression results of the CNN-VAE model for the bearing datasets in the X-axis are presented in
Figure 9.
As illustrated in
Figure 9, the high-dimensional time–frequency spectrum of the bearing datasets was efficiently compressed into nine TFDFs. The compression effectiveness of the CNN-VAE model is demonstrated in the following aspects:
- (1)
TFDFs curves of the Bearing_1 dataset in
X-axis were closely related to the trend of the amplitude distribution over time in
Figure 7. This means that the obtained TFDFs contained the important information of degradation process.
- (2)
In addition, all TFDFs in the training set exhibited good monotonicity and robustness, which further proves the high performance of the CNN-VAE model in capturing important degradation information.
- (3)
Moreover, it can be observed that the CNN-VAE model also had a successful performance in the testing set, and the extracted TFDFs had excellent continuity, indicating that the compression achieved by the CNN-VAE model demonstrates superior generalization.
4.4. RUL Prediction and Discussion
To validate the superiority of the proposed approach, this case sets up comparative experiments between the CNN-VAE-MBiLSTM model and four prediction models. These prediction models include linear support vector regression (LSVR), kernelized support vector regression (KSVR), DCNN, and BiLSTM. The main parameters and structures of these models are outlined below.
LSVR and KSVR: LSVR and KSVR are directly implemented using scikit-learn [
69]. The penalty coefficients used for LSVR were set as 5, 50, 500, and 5000, respectively. The kernel trick for KSVR uses a radial basis function (RBF). Additionally, LSVR was utilized as the baseline model for comparison with other models.
DCNN: DCNN combines a CNN and a fully connected layer. The CNN part adopts the same structure as the CNN in the encoding part of the CNN-VAE model, as shown in
Table 5. In addition, the fully connected layer is responsible for converting high-dimensional outputs from the CNN into prediction values of RUL.
BiLSTM: BiLSTM utilizes the same feature engineering as the proposed approach. However, in the prediction process, the features extracted from the multiple axes were directly combined and then input into a single BiLSTM. The single BiLSTM follows the same structure as the BiLSTM part in the CNN-VAE-MBiLSTM model, in which the number of hidden layers and the number of hidden neurons were set to 2 and 10, respectively.
Moreover, to ensure a fair and unbiased comparison, all models were evaluated using the same training set and testing set. All neural networks were trained with the Adam optimizer. The training epochs and the learning rate were set as 500 and 0.0005, respectively. To prevent contingency, the experiments were repeated three times, and those results were then averaged to obtain the final result. The experimental results are presented in
Table 6 and
Figure 10.
The detailed comparisons among these models are described as follows:
- (1)
Based on TDFs and FDFs, both the LSVR model and the KSVR model exhibited significant fluctuations in their prediction results in the testing set. However, the KSVR model clearly outperformed the LSVR model in the training set, which indicates that a better non-linear fitting ability is more conducive to establishing an effective mapping between features and bearing RUL.
- (2)
Additionally, compared to the KSVR model, the MAE value and the RMSE value of DCNN model in the testing set decreased to 0.055 and 0.0883, respectively. This reduction of prediction errors is due to the utilization of additional TFRs, which also further verifies the significant impact of TFDFs in feature engineering.
- (3)
Furthermore, the performance of the BiLSTM model was much better than the DCNN model, as shown in
Figure 10. The MAE value and the RMSE value of the BiLSTM model in the testing set were 0.0414 and 0.0784, respectively. Meanwhile, TFDFs were adopted in the BiLSTM model to avoid the influence of TFRs. Thus, the difference between the BiLSTM model and the DCNN model reflects important effects of sequential characteristics.
- (4)
Ultimately, the proposed CNN-VAE-MBiLSTM model integrates the extraction of TFRs and the obtainment of sequential characteristics. It is obvious that the proposed method achieves the best accuracy and robustness in RUL prediction. The values of MAE, RMSE, and Score in the testing set were 0.0281, 0.0401, and 0.7894, respectively, which means that the proposed approach can satisfy requirements of bearing maintenance in machines.
4.5. Robust Analysis
Generally, noise in real industrial environments degrades the predictive performance of AI algorithms. Hence, the anti-noise capacity of an algorithm plays a crucial role in determining its practicality. To evaluate the anti-noise capacity of the proposed approach, white Gaussian noise of different intensities was added to the raw signals used in the above comparative experiments. The intensity of the white Gaussian noise was determined by the signal-to-noise ratio (SNR), defined as:

$$\mathrm{SNR} = 10\log_{10}\left(\frac{\sum_{i=1}^{N} x_i^{2}}{\sum_{i=1}^{N} n_i^{2}}\right)$$

where $N$ is the signal length; $x_i$ and $n_i$ denote the $i$th amplitude value in the raw data and the white Gaussian noise, respectively; and the unit of SNR is the decibel (dB).
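A minimal sketch of the noise injection is given below, assuming the usual power-ratio definition SNR = 10 log10(signal power / noise power). The function names and the synthetic 50 Hz test signal are assumptions; only the 5000 Hz sampling rate and 1 s duration are taken from the dataset description. The noise is rescaled after sampling so that the realized SNR matches the target exactly instead of only in expectation.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, seed=None):
    """Return (noisy_signal, noise) with the noise scaled so that
    10 * log10(sum(signal^2) / sum(noise^2)) equals `snr_db`."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(s * s for s in signal)
    noise_power = sum(v * v for v in noise)
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    noise = [scale * v for v in noise]
    noisy = [s + v for s, v in zip(signal, noise)]
    return noisy, noise


def measure_snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from the two sample sequences."""
    return 10 * math.log10(
        sum(s * s for s in signal) / sum(v * v for v in noise)
    )


# Synthetic 1 s record at 5000 Hz (a 50 Hz sine, NOT real bearing data).
fs = 5000
clean = [math.sin(2 * math.pi * 50 * i / fs) for i in range(fs)]
for target in (10, 5, 2):
    noisy, noise = add_white_gaussian_noise(clean, target, seed=42)
    print(target, round(measure_snr_db(clean, noise), 6))
```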
In this section, the anti-noise capacity of the above algorithms was analyzed under different values of SNR that ranged from 10 dB to 2 dB. The lower SNR represents the higher intensity of the white Gaussian noise utilized. The experimental results are shown in
Table 7,
Table 8,
Table 9 and
Figure 11.
According to
Figure 11, as the intensity of the white noise increases, the accuracy of all models tends to decline. Specifically, when the SNR changed from 5 dB to 2 dB, there was a significant decrease in the accuracy of the LSVR and KSVR models: their Score values in the testing set fell from 0.4472 and 0.4127 to 0.2250 and 0.3202, respectively. In contrast, the accuracy of the DCNN, BiLSTM, and CNN-VAE-MBiLSTM models declined more slowly; the Scores of these models were 0.4636, 0.5437, and 0.6447, respectively, when the SNR was 2 dB. This suggests that the use of TFRs contributes to improving the robustness of models. Furthermore, when the SNR was 2 dB, the proposed approach outperformed the other models, achieving an 18.6% relative improvement in Score over the BiLSTM model.
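The 18.6% figure is a relative improvement, which can be checked directly from the two Score values reported at 2 dB. The short calculation below uses only numbers quoted in the text; the helper name `relative_change` is an assumption.

```python
def relative_change(new, old):
    """Relative change of `new` with respect to `old` (0.10 means +10%)."""
    return (new - old) / old


score_bilstm_2db = 0.5437    # BiLSTM Score at SNR = 2 dB
score_proposed_2db = 0.6447  # CNN-VAE-MBiLSTM Score at SNR = 2 dB

gain = relative_change(score_proposed_2db, score_bilstm_2db)
print(f"relative improvement over BiLSTM: {gain:.1%}")
```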
As discussed above, the proposed approach exhibited the best anti-noise capacity and the highest accuracy in a noisy environment.
5. Conclusions and Future Research
In this paper, a novel approach for bearing RUL prediction called CNN-VAE-MBiLSTM is proposed. This approach can be divided into two parts: the CNN-VAE model and the MBiLSTM model. The CNN-VAE model is capable of automatically compressing the high-dimensional time–frequency spectrum of raw data into low-dimensional TFRs, which avoids laborious feature construction and eliminates the influence of personal participation. The MBiLSTM model adopts a two-step strategy: in the first step, it extracts features from each acquisition direction of a signal and independently captures their sequential characteristics; in the second step, it further captures the differences among the multi-axis features. Ultimately, the proposed approach achieves accurate and robust RUL predictions.
In comparative experiments, the proposed CNN-VAE-MBiLSTM model was compared with four RUL prediction models (LSVR, KSVR, DCNN, and BiLSTM) to judge its prediction performance using three evaluation metrics. The comparison results confirmed the superiority of the proposed approach for RUL prediction. The MAE, RMSE, and Score of the proposed approach in the testing set were 0.0281, 0.0404, and 0.7894, respectively. In addition, the anti-noise capacity of the proposed approach was further analyzed by artificially adding different white Gaussian noises to the raw signals. As mentioned above, the proposed approach exhibited the best anti-noise capacity and the highest accuracy in a noisy environment.
In future research, the generalization of the proposed approach to different types of machines will be investigated, and the network architecture will be further optimized to achieve better prediction performance and lower computational complexity.