Hindawi
Research Article
A Stacked Autoencoder-Based Deep Neural Network for
Achieving Gearbox Fault Diagnosis
Copyright © 2018 Guifang Liu et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Machinery fault diagnosis is vital in modern manufacturing industry, since early detection can avoid dangerous situations. Among various diagnosis methods, data-driven approaches are gaining popularity with the widespread development of data analysis techniques. In this research, an effective deep learning method known as stacked autoencoders (SAEs) is proposed for gearbox fault diagnosis. The proposed method can directly extract salient features from frequency-domain signals and eliminates the exhaustive use of handcrafted features. Furthermore, to reduce overfitting in the training process and improve the performance for small training sets, the dropout technique and the ReLU activation function are introduced into SAEs. Two gearbox datasets are employed to confirm the effectiveness of the proposed method; the results indicate that the proposed method not only achieves significant improvement but also is superior to the raw SAEs and some other traditional methods.
machine (RBM) [13, 14], and convolutional neural networks (CNN) [15, 16]. Compared with traditional methods, deep learning methods do not need human labor and expert knowledge for feature extraction. All the parameters in the model training and pattern classification modules can be trained jointly. Therefore, deep learning can be employed to address machinery fault diagnosis in a very general way.

As one of the widely used deep learning techniques, stacked autoencoders (SAEs) have attracted considerable attention in fault diagnosis. They have been investigated as a common component of DNNs by Bengio et al. [17]. Jia et al. [18] proposed SAE-based DNNs for roller bearing and planetary gearbox fault diagnosis, using frequency spectra obtained by Fourier transform as input. Guo et al. [19] employed multidomain statistical features of the raw vibration signals as the input of SAEs, which can be viewed as a kind of feature fusion. Liu et al. [20] fed the normalized spectrograms created by STFT into SAEs for rolling bearing fault diagnosis. In the work presented in [21], the nonlinear soft threshold approach and digital wavelet frame were used to process the measured signal, which was then fed into SAEs for rotating machinery diagnosis. Jia et al. [22] constructed a local connection network based on a normalized sparse autoencoder, and the L1 norm was employed to find sparse features.

Inspired by prior research, a new framework based on SAEs is proposed for gearbox fault diagnosis. Furthermore, to mitigate overfitting in the training process and improve the performance for small training sets, the dropout technique [23] and the ReLU activation function are introduced into SAEs. The rest of the paper is organized as follows. Section 2 briefly introduces the algorithms of SAEs, dropout, and the ReLU activation function. Section 3 details the proposed method. In Section 4, the multifault gearbox dataset is adopted to validate the effectiveness of the proposed method. Furthermore, the superiority of the proposed method is shown by comparison with other traditional methods. Finally, some conclusions are drawn in Section 5.

2. Theoretical Background

2.1. Stacked Autoencoders. An autoencoder is an unsupervised learning structure with three layers: input layer, hidden layer, and output layer, as shown in Figure 1. Training an autoencoder involves two parts: encoder and decoder. The encoder maps the input data into a hidden representation, and the decoder reconstructs the input data from the hidden representation. Given the unlabeled input dataset $\{x_n\}_{n=1}^{N}$, where $x_n \in \mathbb{R}^{m \times 1}$, $h_n$ represents the hidden encoder vector calculated from $x_n$, and $\hat{x}_n$ is the decoder vector of the output layer. Hence the encoding process is as follows:

$$h_n = f(W_1 x_n + b_1) \qquad (1)$$

where $f$ is the encoding function, $W_1$ is the weight matrix of the encoder, and $b_1$ is the bias vector. The decoder process is defined as follows:

$$\hat{x}_n = g(W_2 h_n + b_2) \qquad (2)$$

where $g$ is the decoding function, $W_2$ is the weight matrix of the decoder, and $b_2$ is the bias vector.

The parameter sets of the autoencoder are optimized to minimize the reconstruction error:

$$\phi(\Theta) = \arg\min_{\theta, \theta'} \frac{1}{n} \sum_{i=1}^{n} L(x_i, \hat{x}_i) \qquad (3)$$

where $L$ represents a loss function, $L(x, \hat{x}) = \|x - \hat{x}\|^2$.

As shown in Figure 2, SAEs stack n autoencoders into n hidden layers by an unsupervised layer-wise learning algorithm, and the network is then fine-tuned by a supervised method. The SAE-based method can therefore be divided into three steps:
(1) Train the first autoencoder on the input data and obtain the learned feature vector.
(2) Use the feature vector of the former layer as the input for the next layer, and repeat this procedure until the training completes.
(3) After all the hidden layers are trained, use the backpropagation (BP) algorithm to minimize the cost function and update the weights with the labeled training set to achieve fine-tuning.

2.2. Dropout. Dropout is an effective strategy that has been shown to reduce overfitting in the training of neural networks. Overfitting often occurs when the training set is small, which results in low accuracy on the test set. Dropout randomly deactivates neurons of the hidden layer during training, as shown in Figure 3, but the weights of those neurons are preserved. The deactivated neurons can become active again when the next sample is input. Technically, dropout is achieved by setting the output of some hidden neurons to 0, so that these neurons do not take part in forward propagation. Many studies have tested the effect of dropout on reducing overfitting for small training sets [28], and this paper also employs it to enhance the feature extraction ability and classification accuracy of SAEs for multifault gearbox fault diagnosis.

2.3. ReLU. For traditional activation functions (sigmoid and hyperbolic tangent functions), the gradients decrease quickly as the training error propagates to the front layers. The rectified linear unit (ReLU) activation function has received extensive attention in recent years, since its gradient does not decrease as the independent variable increases. So a network with ReLU does not suffer from gradient diffusion or vanishing. The ReLU function is given in (4) and its shape is displayed in Figure 4.

$$f_r(x) = \max(0, x) \qquad (4)$$

3. Proposed Framework

This section details the proposed intelligent fault diagnosis method. In the method, SAEs are combined with dropout to achieve multifault gearbox fault diagnosis. The framework and illustration of the proposed method are displayed in Figure 5. The SAE model combined with dropout is applied to train
Mathematical Problems in Engineering 3
[Figure 1: autoencoder structure; surviving labels: input layer, encoder, hidden layer, decoder, output layer.]
[Figure 2: SAE structure; surviving labels: AE1, AE2, AE3, input layer, hidden layers 1 to 3, stack, classifier, fine-tuning.]
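Equations (1) to (3) and the stacking step of Section 2.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding function f is assumed to be ReLU, the decoding function g is assumed to be the identity, the layer sizes and random weights are hypothetical, and the gradient-based training loop is omitted.

```python
import random

def relu(v):
    # Eq. (4): element-wise max(0, x)
    return [max(0.0, x) for x in v]

def affine(W, x, b):
    # W x + b for a weight matrix W (list of rows), input x, and bias b
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def encode(x, W1, b1):
    # Eq. (1): h = f(W1 x + b1), with f assumed to be ReLU
    return relu(affine(W1, x, b1))

def decode(h, W2, b2):
    # Eq. (2): x_hat = g(W2 h + b2), with g assumed to be the identity
    return affine(W2, h, b2)

def reconstruction_error(xs, W1, b1, W2, b2):
    # Quantity minimized in Eq. (3): mean of ||x - x_hat||^2 over the samples
    total = 0.0
    for x in xs:
        x_hat = decode(encode(x, W1, b1), W2, b2)
        total += sum((a - r) ** 2 for a, r in zip(x, x_hat))
    return total / len(xs)

def stacked_encode(x, layers):
    # Step (2): the feature vector of one layer is the input of the next
    for W, b in layers:
        x = encode(x, W, b)
    return x

def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
m, k1, k2 = 8, 4, 2  # hypothetical input and hidden-layer sizes
W1, b1 = random_matrix(k1, m, rng), [0.0] * k1
W2, b2 = random_matrix(m, k1, rng), [0.0] * m
W3, b3 = random_matrix(k2, k1, rng), [0.0] * k2  # encoder of a second autoencoder

xs = [[rng.random() for _ in range(m)] for _ in range(5)]
loss = reconstruction_error(xs, W1, b1, W2, b2)
features = stacked_encode(xs[0], [(W1, b1), (W3, b3)])
```

In a full pipeline, each autoencoder would first be trained to reduce `reconstruction_error` on the output of the previous layer, and only the trained encoders would be kept for the stacked network.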
Figure 3: Dropout neural net model. Left: a standard neural net. Right: an example of a thinned net produced by applying dropout.
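The masking described in Section 2.2 can be sketched as follows. This is an illustrative assumption about the mechanics, not the authors' code: the function name `dropout` and its parameters are hypothetical, and no rescaling of the surviving activations is applied, although many implementations divide them by (1 − rate) during training.

```python
import random

def dropout(h, rate, rng, training=True):
    """Set each hidden activation to 0 with probability `rate` during training.

    The weights are untouched; a fresh random mask is drawn on every call,
    so a neuron dropped for one sample can be active for the next one.
    """
    if not training or rate == 0.0:
        return list(h)
    return [0.0 if rng.random() < rate else v for v in h]

rng = random.Random(0)
hidden = [0.5, 1.2, 0.0, 2.3, 0.7, 1.1]      # hypothetical hidden-layer outputs
thinned_a = dropout(hidden, 0.3, rng)         # mask for one sample
thinned_b = dropout(hidden, 0.3, rng)         # independent mask for the next sample
at_test = dropout(hidden, 0.3, rng, training=False)  # no neurons dropped
```

Because the mask is applied to activations only, the thinned network on the right of Figure 3 shares all of its weights with the standard network on the left.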
the weight matrix from the frequency spectra of vibration signals. Specifically, the procedure can be described as follows:
(1) The spectra of vibration signals compose the training set $\{X_i, l_i\}_{i=1}^{M}$, where $M$ is the number of samples, $X_i \in \mathbb{R}^{N \times 1}$ is the ith sample containing $N$ Fourier coefficients, and $l_i$ is the health label of $X_i$.
(2) Build the DNNs by SAEs, and then employ the unlabeled training set $\{X_i\}_{i=1}^{M}$ to pretrain the DNNs layer by layer.
(3) Utilize the BP algorithm to update the weights and fine-tune the parameters of the SAEs with the labeled training set $\{X_i, l_i\}_{i=1}^{M}$.
[Figure 4: ReLU activation function; horizontal axis from −3 to 3.]
[Figure 5: framework of the proposed method; surviving labels: data acquisition, FFT, AE1, AE2, dropout, classifier, BP.]
(4) The testing set is adopted to validate the effectiveness of the proposed method.

4. Experiments

4.1. Case 1: Fault Diagnosis of a Multifault Gearbox

4.1.1. Data Description. Gear faults, including distributed faults (wear) and localized faults (broken tooth, pit), as well as coupled faults in the power train, may cause catastrophic accidents. Therefore, early recognition of gear faults is critical for normal operation of a gearbox. This paper focuses on the multifault gearbox. In this section, a multifault gearbox experimental dataset is employed to validate the effectiveness of the proposed method [29]. The vibration signals were collected on a specially designed bench which consisted of a motor with one-phase input and three-phase output (the nominal power is 0.75 kW and the nominal rotation speed is 880 rpm), a gearbox, the shaft supporting seats, a flexible coupling, and a magnetic powder brake, as shown in Figure 6. The sensor is a piezoelectric accelerometer (DH131E) mounted on the flat surface of the gearbox, and the sampling frequency is 5120 Hz. The gearbox includes two gears (pinion and wheel gear), and the gear parameters are displayed in Table 1. There are six health conditions under three loads: normal, a single worn pinion, a single pit of wheel, a single broken tooth of wheel, coupled fault of broken wheel and worn pinion, and coupled fault of wheel pit and worn pinion. For brevity, the six health conditions are named Type-1, Type-2, Type-3, Type-4, Type-5, and Type-6, respectively. 100 data samples are collected from each type under each load in an overlapped manner, so a total of 1800 samples (6 types × 3 loads × 100 samples) are obtained from the designed bench, and each sample contains 1000 data points. Since the rotation speed of the shaft is 880 rpm and the sampling frequency is 5120 Hz, each period of rotation contains about 350 data points (5120 × 60/880 ≈ 349). To avoid the influence of speed fluctuation, each sample covers almost three periods of rotation (1000 data points). The frequency spectra are adopted as input data, and each sample contains 500 Fourier coefficients. The major reason for using frequency spectra is that they show the distribution of constitutive components with discrete frequencies and provide clearer information about the state of rotating machines [18]. Here we randomly select 4 samples from the normal type of gear and obtain their Fourier coefficients by FFT, as shown in Figure 7. It is easy to see that the time-domain features of
[Figure 6: test bench; surviving labels: three-phase motor, tooth-shaped belt.]
[Figure 7: four samples of the normal gear type (Sample 1 to Sample 4); left panels: time-domain signals over 1000 time points; right panels: 500 frequency coefficients.]
each sample are different, but the frequency spectra of the samples are consistent with each other. The layer sizes of the designed DNNs are 500, 200, 100, and 6, respectively.

4.1.2. Diagnosis Results. The dropout rate 𝛼 is varied from 0 to 0.7 with a step size of 0.1, and 15 trials are carried out for the experiment in order to reduce the effect of randomness. 10% of the samples are randomly selected to train the model, and the rest are used for testing. The diagnosis accuracies are shown in Figure 8. It is clearly seen that when 𝛼 is 0.3, the diagnosis accuracy is the highest and the standard deviation is the lowest, so 0.3 is chosen as the dropout rate in this experiment.

To classify the six health conditions of the gears, 10% of the samples are employed to train the proposed model and the rest are used for testing. The learning rate is 0.01 and the iteration number is 100. The training and testing accuracies of the 15 trials are displayed in Figure 9, and the average training and testing accuracies are 100% and 99.34% ± 0.25%, respectively, which indicates that the proposed model can distinguish the six health conditions of the gear with high accuracy. To illustrate the process concretely, the classification results of
[Figure 8: average accuracy (%) for the tested dropout rates.]
[Figure: health label (1 to 6) versus sample number (0 to 500); figure number not recoverable.]
Figure 11: Feature visualization map (Dimensions 1 to 3; Type-1 to Type-6).
Figure 9: Diagnosis result using the proposed method (training and testing accuracy (%) versus trial number).
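The evaluation protocol of Section 4.1.2 (a random 10% training split, repeated trials, and a mean ± standard deviation summary) can be sketched as below. The helper names are hypothetical, the accuracy values are made-up placeholders rather than results from the paper, and the use of the sample standard deviation is an assumption, since the paper does not state which estimator was used.

```python
import random
import statistics

def split_indices(n_samples, train_fraction, rng):
    # Shuffle sample indices and cut them into training and testing parts
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = int(round(n_samples * train_fraction))
    return idx[:n_train], idx[n_train:]

def summarize(accuracies):
    # Mean and sample standard deviation over repeated trials (assumed estimator)
    return statistics.mean(accuracies), statistics.stdev(accuracies)

rng = random.Random(0)
# 1800 samples with 10% for training, as described for Case 1
train_idx, test_idx = split_indices(1800, 0.1, rng)

# Placeholder accuracies (%) standing in for the outcomes of repeated trials
placeholder_accuracies = [99.0, 99.5, 100.0]
mean_acc, std_acc = summarize(placeholder_accuracies)
```

Drawing a new split for every trial, as done here with a shared random generator, is what makes the reported standard deviation reflect sensitivity to the choice of training samples.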
Figure 14: (a) Experimental setup (motor, gearbox, load, tachometer, shock absorber, data acquisition system); (b) worn teeth; (c) broken teeth; (d) accelerometer location; (e) schematic of the gearbox (24-tooth and 29-tooth gears, input, output, tested gear).
Figure 15: Vibration data (amplitude versus time, 0 to 0.6 s) and corresponding frequency spectra (0 to 8000 Hz) of the four different types of gear conditions: (a) NC; (b) SW; (c) MW; and (d) BT.
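Frequency spectra such as these are obtained by a Fourier transform of each time-domain sample. The sketch below uses a direct DFT from the standard library so that it is self-contained; the function name and the amplitude scaling are assumptions, and the test signal is synthetic. The sample length (1000 points), the number of retained coefficients (500), and the sampling frequency (5120 Hz) follow the description of Case 1; in practice an FFT routine would replace the direct sum.

```python
import cmath
import math

def amplitude_spectrum(signal):
    """One-sided amplitude spectrum: the first N/2 DFT magnitudes of a real signal."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        # Direct DFT sum for frequency bin k (O(N^2) overall, fine for a sketch)
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        scale = 1.0 if k == 0 else 2.0  # fold negative frequencies into positive ones
        spectrum.append(scale * abs(acc) / n)
    return spectrum

fs = 5120.0   # sampling frequency in Hz (Case 1)
n = 1000      # data points per sample (Case 1)
tone_hz = 20 * fs / n  # synthetic tone placed exactly on bin 20 (102.4 Hz)
sample = [math.sin(2 * math.pi * tone_hz * t / fs) for t in range(n)]

coeffs = amplitude_spectrum(sample)  # 500 Fourier coefficients
peak_bin = max(range(len(coeffs)), key=coeffs.__getitem__)
peak_hz = peak_bin * fs / n
```

Keeping only the first half of the bins is sufficient for real-valued signals, which is why a 1000-point sample yields 500 coefficients.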
Figure 17: Diagnosis result of the 4th trial (health label versus sample number).

References
[1] X. Jiang, S. Li, and Q. Wang, “A study on defect identification of planetary gearbox under large speed oscillation,” Mathematical Problems in Engineering, vol. 2016, Article ID 5289698, 18 pages, 2016.
[2] S. Yin, X. Li, H. Gao, and O. Kaynak, “Data-based techniques focused on modern industry: an overview,” IEEE Transactions on Industrial Electronics, vol. 62, no. 1, pp. 657–667, 2015.
[3] J. Gao, L. Wu, H. Wang, and Y. Guan, “Development of a method for selection of effective singular values in bearing fault signal de-noising,” Applied Sciences, vol. 6, no. 5, p. 154, 2016.
[4] R. Zhao, R. Yan, J. Wang, and K. Mao, “Learning to monitor machine health with convolutional Bi-directional LSTM networks,” Sensors, vol. 17, no. 2, article no. 273, 2017.
[5] V. Muralidharan and V. Sugumaran, “A comparative study of Naïve Bayes classifier and Bayes net classifier for fault diagnosis of monoblock centrifugal pump using wavelet analysis,” Applied
[13] P. Tamilselvan and P. Wang, “Failure diagnosis using deep belief learning based health state classification,” Reliability Engineering & System Safety, vol. 115, no. 7, pp. 124–135, 2013.
[14] M. Gan, C. Wang, and C. Zhu, “Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings,” Mechanical Systems and Signal Processing, vol. 72, no. 2, pp. 92–104, 2016.
[15] O. Janssens, V. Slavkovikj, B. Vervisch et al., “Convolutional neural network based fault detection for rotating machinery,” Journal of Sound and Vibration, vol. 377, pp. 331–345, 2016.
[16] X. Guo, L. Chen, and C. Shen, “Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis,” Measurement, vol. 93, pp. 490–502, 2016.
[17] Y. Bengio and Y. Lecun, “Scaling learning algorithms towards AI,” Large-Scale Kernel Machines, pp. 321–359, 2007.
[18] F. Jia, Y. Lei, J. Lin, X. Zhou, and N. Lu, “Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data,” Mechanical Systems and Signal Processing, vol. 72-73, pp. 303–315, 2016.
[19] L. Guo, H. Gao, H. Huang, X. He, and S. Li, “Multifeatures fusion and nonlinear dimension reduction for intelligent bearing condition monitoring,” Shock and Vibration, vol. 2016, Article ID 4632562, 10 pages, 2016.
[20] H. Liu, L. Li, and J. Ma, “Rolling bearing fault diagnosis based on STFT-deep learning and sound signals,” Shock and Vibration, vol. 2016, Article ID 6127479, 12 pages, 2016.
[21] J. Tan, W. Lu, J. An, and X. Wan, “Fault diagnosis method study in roller bearing based on wavelet transform and stacked auto-encoder,” in Proceedings of the 27th Chinese Control and Decision Conference, CCDC 2015, pp. 4608–4613, IEEE, May 2015.
[22] F. Jia, Y. Lei, L. Guo, J. Lin, and S. Xing, “A neural network constructed by deep learning technique and its application to intelligent fault diagnosis of machines,” Neurocomputing, vol. 272, pp. 619–628, 2018.
[23] G. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” 2012, https://arxiv.org/abs/1207.0580.
[24] W. Li, S. Zhang, and G. He, “Semisupervised distance-preserving self-organizing map for machine-defect detection and classification,” IEEE Transactions on Instrumentation and Measurement, vol. 62, no. 5, pp. 869–879, 2013.
[25] W. Du, J. Tao, Y. Li, and C. Liu, “Wavelet leaders multifractal features based fault diagnosis of rotating mechanism,” Mechanical Systems and Signal Processing, vol. 43, no. 1-2, pp. 57–75, 2014.
[26] J. S. Lin and Q. Chen, “Fault diagnosis of rolling bearings based on multifractal detrended fluctuation analysis and Mahalanobis distance criterion,” Mechanical Systems and Signal Processing, vol. 38, no. 2, pp. 515–533, 2013.
[27] X. Lou and K. Loparo, “Bearing fault diagnosis based on wavelet transform and fuzzy inference,” Mechanical Systems and Signal Processing, vol. 18, no. 5, pp. 1077–1095, 2004.
[28] S. Wang and C. Manning, “Fast dropout training,” in Proceedings of the International Conference on Machine Learning, JMLR.org, p. 118, 2013.
[29] X. Jiang, S. Li, and Y. Wang, “A novel method for self-adaptive feature extraction using scaling crossover characteristics of signals and combining with LS-SVM for multi-fault diagnosis of gearbox,” Journal of Vibroengineering, vol. 17, no. 4, pp. 1861–1878, 2015.
[30] J. Wang, S. Li, X. Jiang, and C. Cheng, “An automatic feature extraction method and its application in fault diagnosis,” Journal of Vibroengineering, vol. 19, no. 4, pp. 2521–2533, 2017.
[31] J. Wang, X. Jiang, S. Li, and Y. Xin, “A novel feature representation method based on deep neural networks for gear fault diagnosis,” in Proceedings of the 2017 Prognostics and System Health Management Conference (PHM-Harbin), pp. 1–6, IEEE, Harbin, China, July 2017.
[32] X. Jiang, S. Li, and Y. Wang, “Study on nature of crossover phenomena with application to gearbox fault diagnosis,” Mechanical Systems and Signal Processing, vol. 83, pp. 272–295, 2017.