4.1. Laplacian Score for Feature Selection
Generally, not all of the high-dimensional RCMMFE values are closely related to fault information, and some of the entropies are redundant. It is therefore necessary to reduce the dimensionality of the fault features to improve the efficiency of fault diagnosis. Laplacian Score (LS) is a feature selection method that can be used in both supervised and unsupervised settings. It is founded mainly on Laplacian eigenmaps and locality preserving projection, and it evaluates features according to their locality-preserving power. Using LS, the fault features are sorted by their Laplacian scores from low to high, which corresponds to their importance; that is, the features with the lowest scores are the most important ones. In this paper, LS is employed to select, from the initial fault features, the most important ones for reflecting the fault information. After that, SVM, which is suitable for classification problems with small samples, is employed for fault mode classification. In addition, the particle swarm optimization (PSO) algorithm is used to optimize the parameters of the SVM; that is, the PSO-SVM method is used to automatically identify the fault category and severity.
The extracted MFE features at 20 scales are, in theory, able to identify the fault type. However, a high-dimensional feature vector makes fault diagnosis time-consuming and even inefficient in terms of information. Therefore, it is necessary to select, from the 20 features, those most closely related to the fault information, which avoids the curse of dimensionality and improves both the classification performance and the efficiency of automatic rolling bearing fault diagnosis.
Laplacian Score is mainly inspired by Laplacian eigenmaps and locality preserving projection. Its key idea is to evaluate features according to their locality-preserving power; a detailed description of LS can be found in [17,18]. The Laplacian score algorithm chooses the features with the lowest scores. Since LS has not been widely used for fault feature selection in rolling bearing fault diagnosis, in this paper it is employed to select, from the initial features, those most closely related to the fault information.
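For reference, the score can be written out as follows. This is a sketch of the standard Laplacian score formulation rather than a derivation from this paper; the symbols $f_r$, $S$, $D$, $L$ and $\mathbf{1}$ are notation introduced here for illustration. Let $f_r$ be the vector of the $r$-th feature over all $m$ samples, $S$ the symmetric weight matrix of the nearest-neighbor graph ($S_{ij} > 0$ if samples $i$ and $j$ are connected, and $0$ otherwise), $D = \mathrm{diag}(S\mathbf{1})$ the degree matrix and $L = D - S$ the graph Laplacian.

```latex
\tilde{f}_r = f_r - \frac{f_r^{T} D \mathbf{1}}{\mathbf{1}^{T} D \mathbf{1}}\,\mathbf{1},
\qquad
L_r = \frac{\tilde{f}_r^{T} L\, \tilde{f}_r}{\tilde{f}_r^{T} D\, \tilde{f}_r}
```

The numerator is small when a feature varies little between neighboring samples, and the denominator is large when the feature has a large weighted variance overall, so a low $L_r$ indicates a feature that preserves the local structure well.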
4.2. The Proposed Method
The proposed fault diagnosis method for rolling bearing is described as follows:
- (1)
RCMMFE is employed to extract fault-related complexity information from the vibration signals of the rolling bearing to construct the initial fault features. The initial fault features are divided into training and testing data sets.
- (2)
LS is utilized to sort the initial feature values of training data sets to construct sensitive fault features.
- (3)
The sensitive fault features of training data are used to train the PSO-SVM based multi-classifier.
- (4)
The sensitive fault features of the testing data are obtained by applying the feature order obtained by LS in step (2).
- (5)
The sensitive fault features of testing data sets are input to the trained PSO-SVM multi-classifier and the outputs are used to diagnose fault location and severity.
The flowchart of the proposed method is given in Figure 2.
Next, the experimental rolling bearing data provided by the Case Western Reserve University Bearing Data Center are used to verify the effectiveness of the proposed method. As shown in Figure 3, the test stand consists of a motor, a torque transducer/encoder (center), a dynamometer, and control electronics. Single-point faults were introduced into SKF bearings using electro-discharge machining, with local fault diameters of 0.1778 mm and 0.3556 mm and a fault depth of 0.2794 mm. Vibration data were collected using accelerometers attached to the housing with magnetic bases. Accelerometers were placed at the 12 o’clock position at both the drive end and the fan end of the motor housing. Digital data were collected at a sampling frequency of 12,000 Hz. Experiments were conducted for both fan-end and drive-end bearings, with outer raceway faults located at 6 o’clock (orthogonal to the load zone). The data used in this paper comprise seven classes: normal bearing (denoted Norm) and faults located in the inner raceway (IR), outer raceway (OR) and ball element (BE) with fault diameters of 0.1778 mm and 0.3556 mm (labeled IR1, IR2, OR1, OR2, BE1 and BE2, respectively). The label descriptions of the seven classes are given in the first and second columns of Table 1. In general, many scholars have used only single-channel data from the drive end or the fan end to verify their fault diagnosis methods. However, a method that synthesizes the data of all three channels collected at different locations for the same fault category, i.e., the drive end (DE), fan end (FE) and base accelerometers, may capture much more fault information and achieve a higher fault identification rate than methods based on single-channel data.
In the following, the proposed method and related ones are applied to the above experimental data. First, for comparison purposes, MSE-, MFE-, MMSE-, MMFE- and RCMMFE-based methods are used to extract fault-related complexity information from the vibration signals of the rolling bearings. For each class, 30 samples with a data length of 2048 are selected, i.e., a total of 210 samples belonging to seven classes are used in the experimental analysis. For the single-channel MSE and MFE methods, only the DE channel data are used. The entropy results of the above five methods are given in Figure 4a–e, respectively. From Figure 4 it can be seen that the MSE and MFE curves of the single-channel vibration signals are mixed together at most scales, while the MMSE, MMFE and RCMMFE curves of the different classes are clearly separated at most scales.
We first apply the proposed method to the experimental data. The RCMMFEs of the vibration signals of all 210 samples in the seven classes are computed. To perform the fault diagnosis of the rolling bearing automatically, 10 samples of each class are randomly selected as training samples and the remaining 20 are taken as testing samples. Correspondingly, the initial training data set has dimension 70 × 20 and the initial testing data set has dimension 140 × 20.
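The split described above can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `split_per_class`, the seed and the random placeholder features are hypothetical and stand in for the actual entropy values, which are not reproduced here.

```python
import numpy as np

def split_per_class(labels, n_train_per_class, seed=0):
    """Return (train_idx, test_idx) with a fixed number of training samples per class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        # Draw the training samples of this class without replacement.
        train_idx.extend(rng.choice(cls_idx, size=n_train_per_class, replace=False))
    train_idx = np.sort(np.array(train_idx))
    # Every sample not chosen for training becomes a testing sample.
    test_idx = np.setdiff1d(np.arange(labels.size), train_idx)
    return train_idx, test_idx

# Placeholder data with the shapes from the text: 7 classes x 30 samples, 20 scales.
features = np.random.default_rng(1).random((210, 20))  # stand-in for entropy values
labels = np.repeat(np.arange(7), 30)

train_idx, test_idx = split_per_class(labels, n_train_per_class=10)
X_train, X_test = features[train_idx], features[test_idx]
print(X_train.shape, X_test.shape)  # (70, 20) (140, 20)
```

Drawing a fixed count per class, rather than sampling 70 rows from the pooled data, keeps the training set balanced across the seven classes.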
Second, LS is applied to the training samples to sort the 20 entropy values according to their importance. The cosine of the angle between two vectors is used to evaluate the ‘closeness’ between them and to assign a weight to each edge in the graph, and the k-nearest-neighbor (KNN) method with k = 5 is used to construct the graph. The first five features are then selected to construct the sensitive fault training data set with dimension 70 × 5. Correspondingly, the features of the initial testing data set are reordered according to the feature order of the sensitive fault training data set, which yields the sensitive testing data set with dimension 140 × 5.
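A minimal sketch of this step is given below, assuming NumPy and the standard Laplacian score formulation with cosine edge weights and a symmetric 5-nearest-neighbor graph. The function name `laplacian_scores` and the random placeholder data are hypothetical; the sketch also assumes nonnegative feature values (so that the cosine weights are nonnegative) and no constant features (so that the denominator is nonzero). It is not necessarily identical to the implementation used for the reported results.

```python
import numpy as np

def laplacian_scores(X, k=5):
    """Laplacian score of each column of X; lower means better locality preservation."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    # Cosine similarity between every pair of samples (rows of X).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    cos = unit @ unit.T
    # kNN graph: connect i and j if either one is among the other's k nearest neighbours.
    np.fill_diagonal(cos, -np.inf)  # exclude each sample from its own neighbour list
    nn = np.argsort(-cos, axis=1)[:, :k]
    adj = np.zeros((m, m), dtype=bool)
    adj[np.repeat(np.arange(m), k), nn.ravel()] = True
    adj |= adj.T
    S = np.where(adj, cos, 0.0)     # edge weights = cosine similarity
    d = S.sum(axis=1)               # node degrees (diagonal of D)
    L = np.diag(d) - S              # graph Laplacian
    # Remove the degree-weighted mean from each feature.
    F = X - (d @ X) / d.sum()
    num = np.sum(F * (L @ F), axis=0)           # f~^T L f~ for each feature
    den = np.sum(F * (d[:, None] * F), axis=0)  # f~^T D f~ for each feature
    return num / den

# Placeholder data with the shapes from the text.
rng = np.random.default_rng(0)
X_train = rng.random((70, 20))
X_test = rng.random((140, 20))

scores = laplacian_scores(X_train, k=5)
order = np.argsort(scores)       # ascending: most important feature first
keep = order[:5]                 # the five lowest-scoring features
X_train_sel = X_train[:, keep]   # 70 x 5
X_test_sel = X_test[:, keep]     # 140 x 5, using the order learned on training data
```

Note that the feature order is computed from the training samples only and then applied unchanged to the testing samples, so no information from the testing set enters the selection.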
Next, the sensitive training data set is used to train the PSO-SVM-based multi-classifier (for seven classes). After that, the sensitive testing data set is used to test the trained multi-classifier. The outputs of the proposed method are given in Figure 5, from which it can be seen that all testing samples are correctly classified, i.e., the fault identification rate of the proposed method is 100%. This indicates the effectiveness of the proposed method.
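To illustrate the parameter search behind PSO-SVM, the following is a minimal global-best particle swarm optimizer, assuming NumPy. The text does not state which SVM parameters are tuned or which fitness function is used, so everything specific here is an assumption: the function name `pso_minimize`, the swarm settings, and the stand-in fitness, which is a simple quadratic bowl over two hypothetical coordinates (log10 of a penalty parameter C and log10 of a kernel parameter gamma). In an actual PSO-SVM the fitness would typically be a cross-validated classification error of the SVM trained with those parameters.

```python
import numpy as np

def pso_minimize(fitness, lower, upper, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `fitness` over a box using a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # best position of each particle
    pbest_val = np.array([fitness(p) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]     # best position of the swarm
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull towards personal best + pull towards global best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)           # keep particles inside the box
        val = np.array([fitness(p) for p in pos])
        improved = val < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = val[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

def stand_in_fitness(p):
    """Hypothetical stand-in for a cross-validated SVM error; minimum at (1, -2)."""
    log_c, log_gamma = p
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best, best_val = pso_minimize(stand_in_fitness, lower=[-2.0, -4.0], upper=[3.0, 1.0])
print(best, best_val)
```

Searching in logarithmic coordinates is a common design choice for such parameters because useful values often span several orders of magnitude.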
For comparison purposes, MSE, MMSE, MFE and MMFE are used to replace RCMMFE in the proposed method to extract the fault information from the vibration signals of the rolling bearings. As in the proposed method, LS is used for fault feature selection and PSO-SVM is used for fault mode classification, and the training and testing data sets are constructed in the same way as for the proposed fault diagnosis method. The outputs of the testing samples obtained with these four methods are shown in Figure 6a–d, respectively, and the corresponding identification rates are shown in Table 2. From Figure 6 and Table 2 it can be seen that all the testing samples are correctly classified by the MFE-, MMFE- and RCMMFE-based fault diagnosis methods, with 100% identification rates, while 40 and 10 testing samples are misclassified by the MSE- and MMSE-based methods, respectively. To sum up, the above analysis shows that the proposed method achieves a high identification rate on the classification problem covering both the fault category and the fault severity of rolling bearings. Most importantly, the fuzzy entropy-based nonlinear dynamic methods (MFE, MMFE and RCMMFE) achieve much higher fault identification rates than the sample entropy-based MSE and MMSE methods.
In the above case, the proposed method and the related ones are applied to distinguish the fault category and severity of rolling bearings. However, in some cases we only want to know the fault location, without considering the fault severity. Under this consideration, the above seven classes can be regrouped into the four-class problem described in Table 3, in which IR1 and IR2, OR1 and OR2, and BE1 and BE2 are merged into the single classes IR, OR and BE, respectively. Next, 70 samples (20 of each fault class and 10 of Norm) are randomly selected as training samples and input to LS to select the most important features, and the first five are used to construct the sensitive fault training data set. The selected feature order is also used to reorder the features of the testing data set. Finally, the sensitive fault features of the training data are used to train the PSO-SVM-based multi-classifier (for four classes), and those of the testing data are used to test it. The outputs of all testing data obtained by the compared methods are given in Table 4, from which it can be concluded that all testing samples are correctly classified by the MFE- and RCMMFE-based fault diagnosis methods combined with LS and PSO-SVM, with a fault identification rate of 100%, while one, six and one testing samples are misclassified by the MSE-, MMSE- and MMFE-based fault diagnosis methods, with corresponding identification rates of 99.29%, 95.71% and 99.29%, respectively. This indicates that the fuzzy entropy-based methods (MFE, MMFE and RCMMFE) have better distinguishing capability than the sample entropy-based ones for both the four-class and seven-class problems of rolling bearings. Also, the MMFE- and RCMMFE-based fault diagnosis methods using three-channel data have higher fault identification rates and stronger robustness than the single-channel data-based methods.
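The regrouping of labels and the quoted identification rates can be checked with a short sketch. The names `SEVEN_TO_FOUR`, `merge_labels` and `identification_rate` are hypothetical helpers introduced here; the testing-set size of 140 follows from the 210 samples in total minus the 70 training samples.

```python
# Map each seven-class label to its fault location (four-class label).
SEVEN_TO_FOUR = {
    "Norm": "Norm",
    "IR1": "IR", "IR2": "IR",
    "OR1": "OR", "OR2": "OR",
    "BE1": "BE", "BE2": "BE",
}

def merge_labels(labels):
    """Drop the severity suffix so that only the fault location remains."""
    return [SEVEN_TO_FOUR[label] for label in labels]

def identification_rate(n_test, n_misclassified):
    """Percentage of testing samples that are classified correctly."""
    return 100.0 * (n_test - n_misclassified) / n_test

seven_class_labels = ["Norm", "IR1", "IR2", "OR1", "OR2", "BE1", "BE2"]
print(sorted(set(merge_labels(seven_class_labels))))  # ['BE', 'IR', 'Norm', 'OR']

# 0, 1 and 6 misclassified samples out of 140 testing samples.
for missed in (0, 1, 6):
    print(missed, round(identification_rate(140, missed), 2))
# 0 100.0
# 1 99.29
# 6 95.71
```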
Finally, to verify the necessity of LS for feature selection, all 20 obtained entropy values are taken as the fault features and input to the PSO-SVM-based multi-classifier for training and testing. Both the seven-class and the four-class problems are considered, and the corresponding identification rates of the MSE, MFE, MMSE, MMFE and RCMMFE methods for these two classification problems are shown in Table 5. From Table 2, Table 4 and Table 5 it can be seen that the identification rates of most methods without LS feature selection are lower than those of the methods using LS, for both the seven-class and the four-class problems. It can also be seen that all the methods achieve higher identification rates for the four-class problem than for the seven-class problem, which means that distinguishing the fault severity is more difficult than distinguishing the fault class. Lastly, the comparison results indicate that the MFE-, MMFE- and RCMMFE-based methods reflect the fault information better and achieve higher identification rates than the MSE- and MMSE-based ones, regardless of whether LS is used for feature selection.