
Deep Learning Models For Predictive Maintenance - A Survey, Comparison, Challenges and Prospect


Deep learning models for predictive maintenance: a survey, comparison, challenges and prospect

OSCAR SERRADILLA, Mondragon Unibertsitatea


EKHI ZUGASTI, Mondragon Unibertsitatea
URKO ZURUTUZA, Mondragon Unibertsitatea
Given the growing amount of industrial data spaces worldwide, deep learning solutions have become popular for predictive maintenance,
which monitor assets to optimise maintenance tasks. Choosing the most suitable architecture for each use-case is complex given the
number of examples found in the literature. This work aims to facilitate this task by reviewing state-of-the-art deep learning architectures
and how they integrate with predictive maintenance stages to meet industrial companies’ requirements (i.e. anomaly detection, root
cause analysis, remaining useful life estimation). They are categorised and compared in industrial applications, explaining how to fill
their gaps. Finally, open challenges and future research paths are presented.

arXiv:2010.03207v1 [cs.LG] 7 Oct 2020

CCS Concepts: • Applied computing → Engineering; • Computing methodologies → Neural networks; Machine learning
algorithms.

Additional Key Words and Phrases: Deep learning, predictive maintenance, data-driven, survey, review, Industry 4.0

ACM Reference Format:


Oscar Serradilla, Ekhi Zugasti, and Urko Zurutuza. 2020. Deep learning models for predictive maintenance: a survey, comparison,
challenges and prospect. 1, 1 (October 2020), 34 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION
In recent years, industry has paid growing attention to artificial intelligence and machine learning techniques due to their
capacity to create automatic models that handle the large amount of data currently collected, which is growing
exponentially. The research trend in machine learning has shifted to more complex models such as ensemble methods
and deep learning, given their higher accuracy when dealing with bigger datasets. These methods have evolved due to the
increase in computing power, the latter mainly due to the evolution of GPUs, with deep learning being one of the most
researched topics nowadays. These models achieve state-of-the-art results in many fields such as intrusion detection
systems, computer vision or language processing.
Maintenance is defined by the standard EN 13306 [168] as the combination of all technical, administrative and managerial
actions during the life cycle of an item intended to retain it in, or restore it to, a state in which it can perform the required
function. Moreover, it defines three types of maintenance: improvement maintenance improves machine reliability,
maintainability and safety while keeping the original function; preventive maintenance is performed before failures
occur, in either a periodical or a predictive way; and corrective maintenance replaces the defective or broken parts when
the machine stops working. Currently, most industrial companies rely on periodical and corrective maintenance strategies.

Authors’ addresses: Oscar Serradilla, Mondragon Unibertsitatea, Electronics and Computer Science, Loramendi 4, Mondragon, Spain, 20500,
oserradilla@mondragon.edu; Ekhi Zugasti, Mondragon Unibertsitatea, Electronics and Computer Science, Loramendi 4, Mondragon, Spain, 20500,
ezugasti@mondragon.edu; Urko Zurutuza, Mondragon Unibertsitatea, Electronics and Computer Science, Loramendi 4, Mondragon, Spain, 20500,
uzurutuza@mondragon.edu.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components
of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2020 Association for Computing Machinery.
Manuscript submitted to ACM

Nowadays, we are transitioning towards the fourth industrial revolution, denominated Industry 4.0 (I4.0), which is based on
cyber-physical systems and the industrial internet of things. It combines software, sensors and intelligent control units
to improve industrial processes and fulfill their requirements [109]. These techniques enable automated predictive
maintenance functions that analyse massive amounts of process and related data based on condition monitoring (CM).
Predictive maintenance (PdM) is the most cost-optimal maintenance type given its potential to achieve an overall
equipment effectiveness (OEE) [171] higher than 90% by anticipating maintenance requirements [37, 44], and it promises
a return on investment of up to 1000% [81]. Maintenance optimisation is a priority for industrial companies given that
effective maintenance can reduce their costs by up to 60% by correcting failures of machines, systems and people [42].
Concretely, PdM maximises components’ working life by taking advantage of their unexploited lifetime potential while
reducing downtime and replacement costs by replacing them before failures occur, thus preventing expensive breakdowns
and production time lost to unexpected stops.
The numerous research works on PdM can be classified into three approaches [98]: physical model-based, data-driven
and hybrid. Physical model-based methods use knowledge of the systems to build a mathematical description of their degradation
[18, 91, 94, 129, 169]. Their physical meaning is easy to understand, but they are difficult to implement in complex systems.
Data-driven methods predict systems’ state by monitoring their condition with solutions that learn from historical
data [16, 192, 201]. These are composed of statistical methods, reliability functions and artificial intelligence methods.
They are suitable for complex systems since they do not need to understand how these work. However, it is more difficult
to relate their output to physical meaning. Finally, the hybrid approach combines the two aforementioned approaches
[98, 204]. Data-driven and deep learning methods have gained popularity in industry in recent years due to the increase
in machine data collection, which enables the development of accurate PdM models for complex systems.
The review methodology of this survey on the application of deep learning models to predictive maintenance is
as follows. First, the context and applications of PdM are analysed. After that, different types of models
are researched. Then, data-driven models are analysed. Finally, deep learning models are thoroughly reviewed. This
methodology has made it possible to gain general insight into the scope and then focus on the specific research topics.
Furthermore, the state-of-the-art (SotA) analysis has enabled the comparison among methods and the discussion of
challenges and prospects of DL models for PdM. The analysis was conducted by querying search engines about the
aforementioned topics. Initially, the Scopus and Engineering Village search engines were used, since these contain more
specific and relevant articles of the field. However, when the research advanced to more specific topics, another search
engine was included: Google Scholar. This extends the search space to unindexed journals and preprints, covering
newer and unindexed published works. Many works belong, but are not limited, to the following
publishers: ACM Digital Library, ScienceDirect, IEEE-Xplore and SpringerLink.
Although several reviews on machine learning and deep learning models for predictive maintenance have been published,
this work provides these additional contributions to the state-of-the-art (SotA): (1) We review and explain the most
relevant data-driven techniques, focused on SotA deep learning architectures with application to PdM, providing an
extensive perspective on the available techniques in a simplified and structured way. (2) We discuss the suitability of
deep learning models for PdM and compare their benefits and drawbacks with statistical and classical machine learning
models. (3) We analyse current trends in PdM publications, define their gaps, present research challenges and identify
opportunities and prospects.


This paragraph describes the remaining content of this work. Section 2 reviews the background and stages of predictive
maintenance and provides an overview of traditional data-driven models used in the field, together with an overview of
deep learning techniques. Section 3 reviews and categorises the most relevant state-of-the-art deep learning works
for predictive maintenance by underlying technique, analysing them by PdM stages to enable comparison. Moreover,
related reviews are analysed. Section 4 reviews the publicly available reference datasets for PdM model application and
benchmarking. Section 5 discusses the suitability of deep learning models for predictive maintenance, evaluating their
benefits and drawbacks in comparison with other data-driven techniques. Finally, Section 6 concludes this work by
highlighting the most relevant aspects and gaps discovered during the review of referenced publications.

2 OVERVIEW OF PREDICTIVE MAINTENANCE AND DEEP LEARNING


2.1 Predictive maintenance background
Predictive maintenance solutions have to deal with many factors, peculiarities and challenges of industrial data. The
most relevant ones are discussed in the next paragraphs.
Venkatasubramanian et al. present in [169] the 10 desirable properties for a PdM system: quick detection and diagnosis,
isolability (distinguish among different failure types), robustness, novelty identifiability, classification error estimation,
adaptability, explanation facility, minimal modelling requirements, real-time computation and storage handling, multiple
fault identifiability.
Two main challenges of industrial use-cases are the variability of their behaviour and of their data. These occur even in assets
working under the same characteristics, given the mechanical tolerances, mount adjustments, variations in environmental and
operational conditions (EOC) and other factors. These factors make it difficult to reuse PdM models across machines and assets.
Other relevant challenges are gathering quality data and performing correct preprocessing and feature engineering to obtain a
representative dataset for the problem. In addition, each observation is related to previous ones and should therefore be analysed
together with them, which increases data dimensionality and modelling complexity. Failure data is also difficult to gather, given that
machines are designed and controlled to work correctly while preventing failures, which are therefore infrequent.
Some commonly monitored key components in PdM include, but are not limited to, bearings, blades, engines, valves, gears
and cutting tools [200]. Moreover, the most common failure types detected by CM are imbalance, cracks, fatigue, abrasive
and corrosion wear, rubbing, defects and leaks, among others. The publication by Li et al. [90] classifies the types
of failures that may exist in the system as component failure, environmental impact, human mistakes and procedure
handling.
The commonly used CM techniques are the following [166]: mechanical ultrasound [14], vibration analysis
[7, 8, 47, 51, 178], wear particle testing [49, 183], thermography, motor signal current analysis [45] and nondestructive
testing [46]. There are additional techniques such as torque, voltage and envelopes [117], acoustic emission [8], pressure
[205] or temperature monitoring [14, 205]. The articles [54, 154] also delve into these techniques and cover the types of
failures they can detect, together with their applications. They highlight that EOC information could complement these
CM techniques to perform a more robust PdM analysis, collecting data from different sources: physical, machine and
operating.
Environmental and operational conditions (EOC) are the conditions under which an industrial asset such as a machine
or component is working [165]. Environmental conditions refer to external conditions that affect the asset, like ambient
temperature or surrounding vibration perturbations. In contrast, operational conditions are the technical specifications
assigned to working processes, such as desired speed, force or positions. Additionally, sensor data comes from measurements
taken by machine sensors. This data, monitored over time, creates a dataset in the form of a time series. Its analysis
using condition monitoring techniques enables component and machine states to be determined by comparing patterns and
trends with historical data. Many works present component degradation patterns in a plot known as the P-F curve
[166], where health decreases from a healthy working condition until failure as time or machine cycles go by.

2.2 Data-driven predictive maintenance stages


The majority of deep learning models for PdM are based on the same principles as other machine learning and statistical
techniques. Precisely, most data-driven methods follow the incremental steps presented in the roadmap of Figure 1,
based on the articles [138, 177] and OSA-CBM standard [83]: 1st anomaly detection, 2nd diagnosis, 3rd prognosis
and lastly mitigation.

[Figure 1: pyramid chart of four stacked stages, from base to top: I Anomaly Detection, II Failure Diagnosis, III Prognosis,
IV Mitigation. A side axis is labelled Degradation.]

Fig. 1. Predictive maintenance roadmap represented by the stages of a pyramid chart.

Commonly, two additional steps are performed before the aforementioned ones to prepare the data for PdM, as the
general data analytics lifecycle, Khan et al. [75] and other PdM authors present. These steps are preprocessing and feature
engineering (FE), which, as stated above, are key to enhancing model accuracy in the PdM stages by creating a representative
dataset for the problem. All PdM stages have to be designed, adapted and implemented to fit the use-case’s requirements
and its data characteristics. In addition, PdM system development is incremental and, therefore, the techniques,
algorithms and decisions taken in each stage will influence the following ones. The next subsections give an overview of the most
common data-driven methods to address each PdM stage.

2.2.1 Preprocessing. This step consists of preparing the collected data for further stages. Each PdM model has different
requirements, and these must be taken into consideration when choosing adequate preprocessing techniques to boost
model performance. The most common techniques are briefly explained and referenced below: sensor data validation
[213] makes sure the collected data is correct; feature synchronisation [79] aligns signals sampled at different
timestamps to create time-series or cycle-based data that is easier to handle; data cleaning removes or interpolates
unavailable and missing values [30, 39]; oversampling [30, 133] is applied to handle imbalanced data, boosting
accuracy on the commonly scarce failure class, or to deal with small datasets; encoding [114] and discretisation [114]
change the type of features by projecting them to a new space where they are easier for the model to handle; segmentation splits
data into chunks to analyse big datasets and enable parallelisation [106]; feature scaling, like normalisation [150] or
standardisation [139], scales all features to the same or a similar range, which enables comparisons; noise handling [79]
facilitates the modelling of noisy data.
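
Three of the techniques listed above (data cleaning by interpolation, standardisation and segmentation) can be illustrated with a minimal sketch in plain Python. The function names, the window length and the toy signal are assumptions made for this illustration only; they are not taken from the referenced works, and a real pipeline would be adapted to the use-case's data.

```python
from statistics import mean, pstdev

def interpolate_missing(signal):
    """Data cleaning: fill None gaps by linear interpolation between valid neighbours."""
    filled = list(signal)
    valid = [i for i, v in enumerate(signal) if v is not None]
    if not valid:
        raise ValueError("signal has no valid samples")
    for i, v in enumerate(signal):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:            # gap at the start: copy the first valid value
            filled[i] = signal[right]
        elif right is None:         # gap at the end: copy the last valid value
            filled[i] = signal[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = signal[left] + w * (signal[right] - signal[left])
    return filled

def standardise(signal):
    """Feature scaling: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(signal), pstdev(signal)
    if sigma == 0:
        return [0.0 for _ in signal]
    return [(v - mu) / sigma for v in signal]

def segment(signal, window, step):
    """Segmentation: split the signal into fixed-length, possibly overlapping chunks."""
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, step)]

# Hypothetical sensor reading with missing samples (None).
raw = [1.0, None, 3.0, 4.0, None, None, 7.0, 8.0]
clean = interpolate_missing(raw)
scaled = standardise(clean)
windows = segment(scaled, window=4, step=2)
```

Each resulting window can then be processed independently, which is what enables the parallelisation mentioned above.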

2.2.2 Feature engineering. This step consists of extracting a relevant feature subset to be used as input for models
in further stages. It can boost the performance of statistical and machine learning models, although it is not compulsory
for deep learning models given that these can automatically extract new representative features that fit the problem. The
most common techniques can be grouped as follows: feature extraction, such as statistical features in the time [200] and
frequency [55, 200, 210] domains, extracts time and frequency relations of features; projection to a new space, like
principal component analysis [26, 45], reduces dimensionality while keeping relevant information; concatenation
and fusion methods [87] create new features by combining available ones; feature selection [155] reduces dimensionality
by discarding features that have low variance, are redundant or are uncorrelated with the target, given that these increase
complexity while not supplying additional information.
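
As an illustration of feature extraction in the time and frequency domains, the following sketch computes a few statistical features that are common in vibration analysis, plus the dominant frequency obtained from a naive discrete Fourier transform. The selection of features, the function names and the synthetic 8 Hz signal are assumptions made for this example; the surveyed works use their own feature sets.

```python
import cmath
import math

def time_features(x):
    """Statistical time-domain features of one signal window."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    return {
        "mean": mu,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        # Kurtosis (non-excess): fourth central moment over squared variance.
        "kurtosis": sum((v - mu) ** 4 for v in x) / n / var ** 2 if var > 0 else 0.0,
    }

def dominant_frequency(x, sample_rate):
    """Frequency (Hz) of the largest DFT bin, ignoring the DC component."""
    n = len(x)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n

# Synthetic example: one second of an 8 Hz sine sampled at 64 Hz.
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(sample_rate)]
features = time_features(signal)
features["dominant_frequency"] = dominant_frequency(signal, sample_rate)
```

The naive transform is quadratic in the window length and is used here only to keep the sketch free of dependencies; a fast Fourier transform would normally be preferred.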

2.2.3 Anomaly detection. It aims to detect whether the asset is working under normal condition or not. There are three
ways to address this step using data-driven models, classified by their underlying machine learning task: classification,
one-class classification and clustering. Respectively, these can be used when labeled data of different classes is available
in the training phase, when only one class data exist (commonly non-failure data) and when the data is unlabelled.
Failure modes and effects analysis (FMEA) [130] and its evolution with added criticality analysis, FMECA [22], are useful
to gain insight into the possible types of failures based on expert knowledge, which helps to design the data analysis
lifecycle by prioritising the failure types or anomalies to be detected.
Anomaly detection methods need preprocessed data, and some also depend on feature-engineered data, to work. Once
the features have been prepared, the next step is to select, train and optimise the right model for the use-case. The following PdM stages
will be influenced and constrained by the selected anomaly detection (AD) method and the use-case’s data. Table 1 classifies and summarises
the main data-driven anomaly detection techniques based on referenced SotA articles and the following review works
[28, 134, 172, 200]. Besides, two or more of these techniques can be combined to create an anomaly detection system
that compensates for the disadvantages of a single model.
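
The one-class setting, where only non-failure data is available for training, can be sketched with the 3σ rule, one of the statistical techniques used for anomaly detection. The class name, the threshold parameter `k` and the temperature-like readings below are hypothetical and serve only as an illustration under the assumption of a single, roughly Gaussian feature.

```python
from statistics import mean, pstdev

class ThreeSigmaDetector:
    """One-class detector: learns mean and spread from healthy data only."""

    def fit(self, healthy_values):
        self.mu = mean(healthy_values)
        self.sigma = pstdev(healthy_values)
        return self

    def score(self, value):
        """Deviation from the healthy mean, expressed in standard deviations."""
        if self.sigma == 0:
            return 0.0 if value == self.mu else float("inf")
        return abs(value - self.mu) / self.sigma

    def is_anomaly(self, value, k=3.0):
        """Flag observations further than k standard deviations from the mean."""
        return self.score(value) > k

# Hypothetical readings collected while the asset was known to be healthy.
healthy = [20.1, 19.8, 20.0, 20.3, 19.9, 20.2, 19.7, 20.0]
detector = ThreeSigmaDetector().fit(healthy)

# New observations: two close to normal behaviour and one clear deviation.
flags = [detector.is_anomaly(v) for v in [20.1, 19.6, 23.5]]
```

The anomaly score is kept separate from the decision so that the threshold can be tuned to the use-case's tolerance for false alarms.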

2.2.4 Diagnosis. Once an anomaly has been detected, the next stage consists of diagnosing whether this anomaly
belongs to a faulty working condition and can evolve into a future failure or, on the contrary, there is no risk of failure.
The latter case indicates that the anomaly detection model has not worked properly and therefore it may need to be
re-evaluated or retrained. The diagnosis is usually based on root cause analysis (RCA) techniques, which aim to identify
the true cause of a problem.
The diagnosis algorithm has to be suitable for the problem being addressed. There are several approaches to tackle
this step, which depend on the implemented AD method and training data characteristics: multi-class classification,
binary classification, one-class classification and clustering. Concretely, these are chosen if the dataset has multiple
failure types, failure and non-failure observations, observations of only one class or no labels, respectively. There is
another technique that commonly complements RCA: anomaly deviation quantification by health index (HI). It aims to
measure assets’ damage by comparing current working data with historical data in a supervised or unsupervised way.
It can either indicate a percentage of deviation with regard to normal working data, or show the degradation level on a
numerical scale, where the higher the value, the more damaged the component is: the minimum value means no
damage, the maximum means fully damaged or failed, and intermediate values indicate different degrees of degradation [119].
The diagnosis step is easier when there is more information about the dataset and its labels. The main statistical and
machine learning techniques for diagnosis are described in the following list, ordered by increasing difficulty. They are
divided according to the anomaly detection technique used in the previous stage, which depends on data characteristics.

Table 1. Summary of anomaly detection models classified by prevailing techniques. In the first column, Unsup refers to unsupervised,
All refers to supervised, semi-supervised and unsupervised, and Combination refers to a combination of models.

Based on and type: Density (Unsup)
What it analyses: Density in features space
Normal data: In high density dimension
Anomalies: In low density
Most common algorithms, categorised: K nearest neighbors (k-NN) [39, 116, 133, 163], local outlier factor (LOF) [30, 43], local
correlation integral (LOCI) (a), relative density factor, density-based outlier score, reliability functions [128, 194]

Based on and type: Distance (Unsup)
What it analyses: Distance among data-points
Normal data: Near from neighbors
Anomalies: Far from neighbors
Most common algorithms, categorised: Traditional threshold distance, Mahalanobis [150] or Euclidean [47], rank based detection
algorithm (RBDA), randomization and pruning based, data streams based

Based on and type: Statistics (All)
What it analyses: Relation to distribution models fit to training data
Normal data: Near to distribution models
Anomalies: Far from distribution models
Most common algorithms, categorised: Parametric: Gaussian mixture models (GMM) with expectation maximisation (EM) [7], control
charts as exponentially weighted moving average (EWMA) [29, 162]. Non-parametric: kernel density estimation (KDE): Gaussian or
KL-divergence [162, 178, 208], histogram-based outlier detection (HBOS) [120], boxplot analysis [30], 3𝜎 [1]. Entropy-based:
permutation entropy [51, 141], fuzzy entropy [27] and K-S test [19].

Based on and type: Clustering (Unsup)
What it analyses: Relation to clusters created by unsupervised ML models
Normal data: Belong to a large cluster or near one
Anomalies: Belong to a small cluster and far from large clusters
Most common algorithms, categorised: Partitioning clustering: partitioning around medoids (PAM), K-means [7, 43, 48]. Hierarchical
clustering: DB-Scan, agglomerative [30], attribute oriented induction (AOI) [52]. Grid-based: Dcluster. For high dimensional:
D-Stream, fuzzy-rules based [43]

Based on and type: Ensemble (Combination)
What it analyses: Combines dissimilar models. Robust
Normal data: Combination of models
Anomalies: Combination of models
Most common algorithms, categorised: Bagging or boosting based as random forest (RF) [26, 39, 45, 116], extreme gradient boosting
(XGBoost) [30], adaboost [116] and isolation forest (IF) [133], greedy ensemble, score normalization

Based on and type: Learning (All)
What it analyses: Relation to models learned with training data
Normal data: Near the known classes of the model
Anomalies: Far from the known classes of the model
Most common algorithms, categorised: Active learning. Transfer learning. Reinforcement learning. Projection-based: subspace and
compression reconstruction error measuring like PCA [7] and AE [35], correlation [205, 214] and tensor-based. State-space based
(hidden state of observed data and time evolution): Kalman filter [170], hidden Markov models (HMM) [133], Bayesian networks (BN)
[78] (dynamic BN, belief network), attention-based NN and RNN (GRU, LSTM). Graph-based: capture interdependencies. OCC: OCSVM
[210], BN. Prediction error-based regression: measure deviation (autoregressive integrated moving average (ARIMA) [150], RNN as
LSTM [191]). Classification: normal and abnormal data in training using interpretable models: linear regression [116], logistic
regression [39, 116], decision tree (DT) [39, 71]. ML classification techniques as SVM [39, 71, 163] and feedforward NN [140].
Generative methods: GAN [88], VAE [187].

(a) Methods that have been applied for AD in general but not specifically for PdM are mentioned but not referenced.

• After multi-class classification for anomaly detection: diagnosis is performed based on previous failure data
knowledge of the estimated class, so the link of data to failure type is directly obtained from model [14, 21].
Once the possible failure type has been detected, semi-quantitative and qualitative approaches can be used
by harnessing expert knowledge to evaluate its potential consequences, using tools such as FMEA [38] or
Ishikawa diagram [137]. In addition, interpreting directly explainable models [2, 9] or using explainability on
less interpretable models such as SVM [40] can also help to perform this task.
• After binary classification for anomaly detection: clustering with extracted features can be performed to group
data by similarity and try to differentiate unlabeled failure types [193]. These diagnosis techniques can also
be based on statistical performance analysis [122], supported by trend analysis and the definition of thresholds to
differentiate failure types by similarity or distance.
• After one-class classification or clustering for anomaly detection: these techniques use a threshold on the distance
to the known class or on cluster density, respectively, to categorise anomalies. Diagnosis for these models
usually consists of precomputing metrics from data, like the health index, and monitoring their evolution instead of
monitoring the evolution of the input data. The diagnosis can be performed using a clustering algorithm on these metrics
to analyse the intra-cluster and inter-cluster relations. Domain knowledge is essential to tie relations discovered
in an unsupervised way to the physical meaning of monitored assets. This new knowledge is useful for interpreting
unsupervised models’ output to discover novel failure types, using models such as K-means [3] or HMM with IF-ELSE
rules [184]. Log data can also be used for this clustering purpose and to tag maintenance data [159] to perform
RCA.
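
The health index described in this subsection can be sketched as a distance-based deviation from normal working data, mapped to a scale where 0 means no damage and 1 means failure. The Euclidean distance, the linear mapping and the `failure_distance` calibration value are assumptions chosen for this illustration; they are not prescribed by the referenced works.

```python
import math

def centroid(vectors):
    """Mean feature vector of the healthy reference observations."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def health_index(sample, healthy_centroid, failure_distance):
    """Map the distance to healthy behaviour onto a 0 (no damage) to 1 (failure) scale.

    failure_distance is an assumed calibration value: the distance at which
    the asset is considered fully damaged.
    """
    distance = math.dist(sample, healthy_centroid)
    return min(1.0, max(0.0, distance / failure_distance))

# Hypothetical two-feature observations recorded under normal working conditions.
healthy = [[1.0, 0.5], [1.2, 0.4], [0.8, 0.6]]
reference = centroid(healthy)

# Health index of three later observations, from healthy to heavily degraded.
observations = [[1.0, 0.5], [2.0, 0.5], [4.0, 4.5]]
hi_values = [health_index(obs, reference, failure_distance=2.0) for obs in observations]
```

Monitoring how such values evolve over time, rather than the raw inputs, is what the one-class and clustering-based diagnosis described above relies on.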

2.2.5 Prognosis. Once an anomaly has been detected and diagnosed, the evolution of degradation can be monitored based on the
working conditions and machine state at that moment, focusing on the features that are most influential for the AD and diagnosis
stages and that can track failures. This step is usually carried out by remaining useful life (RUL) models, which estimate the
remaining time or cycles until a failure occurs when there is enough historical data of that failure type. Conversely, if there is not enough
degradation data, the only way to estimate degradation is by tracking the evolution of the HI or the distance between novel
working states and the known good working states. Both aforementioned models can also provide a confidence bound.
The data-driven models for prognosis can be classified into four groups according to their underlying method. The following
list summarises the most common techniques to prognosticate degradation, categorised by group:

• Similarity-based: compare current behavior with past run-to-failure behavior for prognosis [3, 142].
• Statistical: rely on historical statistics to estimate degradation, for example monitoring life usage in combination
with mean-time-to-failure [16] or survival models [203] to estimate the expected duration.
• Time series analysis: ARIMA [3, 16, 73], based on previous values; Kalman filter, to model the hidden state of time-
related noisy data [170]; and Fourier and genetic programming, to generate a polynomial function by optimising a
fitness function [55].
• Learning:
– Classification: assign the data to a known failure type or to similar working data and then prognosticate the
degradation according to the historical data of this class. Although any classifier can be used for this purpose,
the following ones are widely used in the literature: feed-forward NN [140], SVM [140], BN [9, 85, 86], HMM
[201], fuzzy logic based [211] and RF [16, 62].
– Regression: directly estimate HI, anomaly deviation or RUL from the input data. Common SotA algorithms
are the following: a linear function is the simplest method [159]; nonlinear functions [96, 203] can model non-linear
relations; support vector regressor (SVR) [16, 17] works like SVM adapted for regression; relevance vector
regression (RVR) is based on Bayesian regression [3]; CNN models features’ time-based relationships [13];
Wiener processes model degradation by a real-valued continuous-time stochastic process [160]; recurrent
neural networks like LSTM and GRU [192] retain relevant past information for prognosis at each observation.
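
The simplest regression approach in the list above, a linear function, can be sketched as follows: a line is fitted to the health index by least squares and extrapolated to the failure level to estimate the remaining cycles. The function names, the failure level of 1.0 and the degradation history are hypothetical, and the sketch assumes an approximately linear degradation trend.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("at least two distinct cycle values are required")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_rul(cycles, hi_values, failure_hi=1.0):
    """Remaining cycles until the fitted health index reaches failure_hi.

    Returns None when no degradation trend (positive slope) is observed.
    """
    slope, intercept = fit_line(cycles, hi_values)
    if slope <= 0:
        return None
    failure_cycle = (failure_hi - intercept) / slope
    return max(0.0, failure_cycle - cycles[-1])

# Hypothetical degradation history: health index measured every 10 cycles.
cycles = [0, 10, 20, 30, 40]
hi_history = [0.10, 0.20, 0.30, 0.40, 0.50]
rul = estimate_rul(cycles, hi_history)  # about 50 cycles for this toy history
```

The non-linear and recurrent models listed above replace the fitted line with a more expressive function, while the extrapolation to a failure level remains the same idea.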

2.2.6 Mitigation. Once an anomaly has been detected, its cause diagnosed and its remaining life prognosticated, there is
enough information to perform maintenance actions that mitigate failures in early phases and thus prevent assets from
degrading into failure. This stage consists of designing and performing the steps necessary to restore assets to correct working
condition before failures occur, which also reduces implementation and downtime costs.
Mitigation is performed by maintenance technicians, who are in charge of creating and implementing a mitigation
plan as part of the maintenance management and manufacturing operation management processes. Data-driven PdM
models should generate assistance information, providing domain technicians with statistics [122] and prescriptions
[9]. Therefore, a more advanced mitigation is accomplished by combining domain knowledge with data-driven
information about assets’ health and expected degradation [105].

2.3 Deep learning techniques


This section presents deep learning background and introduces its most common architectures applied to the field of
PdM. Nowadays, deep learning models outperform statistical and traditional ML models in many fields, including PdM,
when enough historical data exists. The term deep learning (DL) refers to artificial neural networks (ANN), a machine
learning technique inspired by brain functioning, that go beyond shallow networks of one or two hidden layers [125].
ANNs are formed by neurons that compute weighted linear combinations of their inputs and then apply non-linear
activation functions such as sigmoid, rectified linear unit (ReLU) or tanh to produce outputs. The network’s parameters
are commonly initialised randomly and are then adjusted to map input data to output data given the training
dataset. This learning process takes place by running the gradient descent algorithm combined with the backpropagation
algorithm. Together, these calculate the adjustment of each neuron’s parameters with respect to the error produced by the network
in order to reduce it, where the error is calculated with a user-defined cost function. Hornik, in the article [103], argues
that ANNs of at least two hidden layers with enough training data are capable of modelling any function or behaviour,
which makes them universal approximators.
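
The learning process described above can be sketched with a very small network written in plain Python: one hidden layer with tanh activation, a sigmoid output, a mean squared error cost, and full-batch gradient descent with gradients obtained by backpropagation. The network size, learning rate, number of epochs, random seed and the logical AND toy dataset are all assumptions made for this illustration; practical models would rely on a DL framework.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Weighted sums followed by non-linear activations (tanh hidden, sigmoid output)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def cost(data, W1, b1, W2, b2):
    """User-defined cost function: mean squared error over the dataset."""
    return sum((forward(x, W1, b1, W2, b2)[1] - t) ** 2 for x, t in data) / len(data)

def train(data, hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n = len(data[0][0]), len(data)
    # Random initialisation of the weights; biases start at zero.
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    history = [cost(data, W1, b1, W2, b2)]
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(hidden)]
        gb1, gW2, gb2 = [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in data:
            h, y = forward(x, W1, b1, W2, b2)
            # Backpropagation: chain rule from the cost back to every parameter.
            d_out = 2.0 * (y - t) * y * (1.0 - y)            # through the sigmoid output
            gb2 += d_out
            for j in range(hidden):
                gW2[j] += d_out * h[j]
                d_hid = d_out * W2[j] * (1.0 - h[j] ** 2)    # through the tanh unit
                gb1[j] += d_hid
                for i in range(n_in):
                    gW1[j][i] += d_hid * x[i]
        # Gradient descent step with the gradient averaged over the batch.
        b2 -= lr * gb2 / n
        for j in range(hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i] / n
        history.append(cost(data, W1, b1, W2, b2))
    return (W1, b1, W2, b2), history

# Toy dataset: logical AND of two binary inputs.
dataset = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
params, history = train(dataset)
```

The recorded `history` holds the cost before training and after each epoch, which makes it easy to check that the error produced by the network is being reduced.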
The book by Goodfellow et al. [58] provides exhaustive background on DL and is considered a reference book
by many researchers in the field. Concretely, the book introduces the mathematical background of machine learning and
deep learning. Afterwards, it focuses on DL optimisation, regularisation, different types of architectures, their mathematical
definition and common applications. A simpler yet powerful overview of the field is given by Litjens et al. in the survey
of DL applied to medicine [102], which is complemented with a visual scheme collecting the main architectures.
Another survey specifically focused on DL architectures, applications, frameworks, SotA and historical works, trends
and challenges is the one by Pouyanfar et al. [135]. Additionally, a reference book of practical DL applications is
presented by Géron [56], which is based on the following tools: Scikit-Learn (https://scikit-learn.org), Keras
(https://keras.io) and TensorFlow (https://www.tensorflow.org).
The most common DL techniques related to the field of PdM are summarised in the following paragraphs. Most of
them are based on the feed-forward scheme, but each one has its own characteristics:

• Feed-forward/MLP [182] is the first, most common and simplest architecture. It is formed by neurons stacked
into layers, where all the neurons of a layer are connected to all the neurons of the next layer by feeding
their output to the others’ input. However, there are no connections to neurons of previous layers or among neurons
of the same layer. The nomenclature for layers is the following: an input layer, hidden layers and an output layer.
The neural network is fed with observations pairing input features and target features, which are used to learn
their relation by minimising the error produced by the network when mapping input data to output.
• Convolutional neural network (CNN) [84] is a type of feedforward network that maintains neurons’ neighborhood
by applying convolutional filters. It is inspired by the animal visual cortex and has applications in image and
signal recognition, recommendation systems and NLP among others. The convolutional layer is usually linear
and is followed by the application of an activation function to produce non-linear output. After that, a max or
average pooling layer can be used to reduce the dimension. Finally, most architectures have a flatten step to
obtain representative features of input data that can be used with other ML or DL networks to perform typical
ML tasks. The convolutions’ weights are shared, making them easier to train.
• Recurrent neural network (RNN) [146] models temporal data by saving the state derived from previous inputs of
the network. The backpropagation through time algorithm [181] is an adaptation of traditional backpropagation
for temporal data, used to propagate the network's error to previous time instances. However, this propagation can
result in the vanishing or exploding gradient problem [65], making these networks forget long-term relations. To
solve this problem, specific RNN architectures were created based on gating mechanisms such as forget gates, like long
short-term memory (LSTM) [66] and gated recurrent unit (GRU) [36].
• Deep belief network (DBN) [64] and restricted Boltzmann machine (RBM) [152]. RBM is a bipartite,
fully-connected, undirected graph consisting of a visible layer and a hidden layer. It is a type of stochastic ANN that
can learn a probability distribution over the data. It can be trained in supervised or unsupervised ways and its
main applications are dimensionality reduction and classification. Accordingly, DBN is an ANN where every
two consecutive layers are treated as an RBM. It is trained in an unsupervised way to reduce dimensionality. Then, it
can be retrained with labelled data to perform classification.
• Autoencoder (AE) [15] is based on the singular value decomposition concept [57] to extract the non-linear features
that best represent the input data in a smaller space. It consists of two parts: an encoder that maps input data to
the encoded, latent space, and a decoder that projects latent space data to the reconstructed space, which has
the same dimension as the input data. The network is trained to minimise the reconstruction error, which is the
loss between input and output. Autoencoders can be classified according to their latent space dimensionality
into undercomplete and overcomplete, which respectively correspond to a latent space smaller than, and bigger than or
equal to, the input dimension. These simple architectures are extended and adapted to fit different tasks and
problems. Vanilla autoencoders are the simplest autoencoders and belong to the undercomplete type. The
following variations are obtained by applying regularisation and modifying AE types. One of these adaptations
is the denoising autoencoder (DAE) [82], used for corrupt data reconstruction. It is a type of overcomplete AE
where learning is controlled to avoid learning the identity function. It is fed with data pairs of noisy input and its denoised
output and trained to reduce the loss between them. Another modification is the sparse autoencoder (SAE) [112],
an AE restricted in the learning phase by a sparsity penalty constraint, which is based on the concept of
KL-divergence. This algorithm aims to make each neuron sparse, discovering the structure of the
data more easily than a vanilla AE and being more useful for practical applications [161].
• Generative models: variational autoencoder (VAE) [76] and generative adversarial network (GAN) [59]. Both
models were designed to work in an unsupervised way. VAE is a generative, and therefore non-deterministic,
modification of the vanilla AE where the latent space is continuous. Usually, its latent space distribution is
Gaussian, from which the decoder reconstructs the original signal based on random sampling and interpolation.
It has applications in estimating the data distribution, learning a representation of data samples and generating
synthetic samples, among others. GAN is another type of generative neural network that consists of two parts:
a generator and a discriminator. The generator is trained to generate an output that belongs to a specific data
distribution, using a representation vector as input. The discriminator is trained to classify whether its input data
belongs to a specific data distribution or not. The generator's output is connected to the discriminator's input and
they are trained together, adversarially. The generator's objective is to fool the discriminator by generating outputs from
random input that the discriminator classifies as belonging to the trained distribution.
The role of the discriminator is to distinguish synthetic, generated data from non-synthetic, real data
of the trained distribution. They are trained together so that each part learns from the other, competing to fool the
other part, in a setting similar to game theory. GANs can be extended to other ML tasks such as supervised or reinforcement
learning.
• Self organising map (SOM) [77] is an ANN-based unsupervised way to organise the internal representations of
data. It uses competitive learning, in contrast to typical ANNs that use backpropagation and gradient descent, to
create a new space called a map, which is typically two-dimensional. It relies on neighborhood functions to preserve
the topological properties of the input space in the new space, which is represented in cells. It has applications in
clustering, among others.
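As an illustration of the CNN item in the list above, the following minimal sketch chains a convolution with shared weights, a non-linear activation, max pooling and a flatten step on a univariate signal. The function names, kernels and pooling size are hypothetical choices for the example; real architectures learn their kernels during training.

```python
def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Activation applied after the linear convolution."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def cnn_features(signal, kernels, pool_size=2):
    """Convolution -> activation -> pooling per kernel, then flatten."""
    features = []
    for kernel in kernels:
        feature_map = max_pool(relu(conv1d_valid(signal, kernel)), pool_size)
        features.extend(feature_map)  # flatten all feature maps into one vector
    return features
```

The flattened vector returned by `cnn_features` plays the role of the representative features that can then be passed to other ML or DL models.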

3 DEEP LEARNING FOR PREDICTIVE MAINTENANCE


This section collects, summarises, classifies and compares the reference DL techniques for PdM, analysing the most
relevant works and applications. It contains accurate DL models that achieve SotA results in the reviewed articles, surveys
and reviews of the field. Even though most articles combine several techniques and perform more than one PdM stage
in the same architecture, the first five subsections classify the works by the principal DL technique they use to
perform each stage of Section 2.2. Feature engineering, the stage that precedes them, is included; preprocessing is
excluded because its explanation in the previous section is also valid for DL. This classification enables the analysis and
comparison of DL techniques by stage. The sixth subsection presents works that successfully combine the aforementioned techniques
to create more complete architectures that fulfil one or more PdM stages, giving examples of the practically unlimited ways
in which techniques can be combined. Finally, the last subsection gathers the most relevant information contained in works
similar to this survey, discussing related reviews and surveys.
The SotA works can be classified according to their underlying ML task and the algorithms used to address it, which are
directly related to the use-case and its data requirements. Binary classification is used when training data contains
labelled failure and non-failure observations. Multi-class classification is used in the same case as binary classification,
but more than one type of failure is labelled and therefore there are at least three classes: one represents
non-failure and there is one more for each type of failure. One-class classification (OCC) is used when the training dataset only
contains non-failure data, which is usually collected in early working states of the machine or when technicians
assure the asset is working correctly. Finally, unsupervised techniques are used when the training dataset's observations
are unlabelled and therefore there is no knowledge of which observations belong to failure and non-failure classes.
Unsupervised techniques can also be used as one-class classifiers. Additionally, there are a few works on other machine
learning and deep learning topics such as active learning, reinforcement learning and transfer learning.
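The relation between the available labels and the ML task can be summarised in a short sketch. The function `select_pdm_task` and the label convention (`None` for unlabelled, `"normal"` for non-failure, any other string for a failure type) are hypothetical; the sketch also assumes that non-failure observations are present whenever failures are labelled.

```python
def select_pdm_task(labels):
    """Map the labels available in a training set to an ML task.

    `labels` holds one entry per observation: None when unlabelled,
    "normal" for non-failure, or any other string naming a failure type.
    """
    known = {label for label in labels if label is not None}
    if not known:
        # No observation is labelled at all.
        return "unsupervised"
    failure_types = known - {"normal"}
    if not failure_types:
        # Only non-failure data has been labelled.
        return "one-class classification"
    if len(failure_types) == 1:
        return "binary classification"
    # One class for non-failure plus one per failure type.
    return "multi-class classification"
```

For instance, a dataset labelled with `"normal"`, `"bearing"` and `"gear"` maps to multi-class classification with three classes.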

3.1 Feature engineering


The deep learning algorithms used in PdM are capable of performing feature engineering automatically, obtaining
a subset of derived features that fit the task specifically. Therefore, these techniques remove the dependence on a
manual feature engineering process. Table 2 shows common deep learning techniques used for feature engineering.
These techniques are integrated with machine learning and deep learning models to create architectures that perform
PdM stages.

Table 2. Deep learning techniques for automatic feature engineering and projection. They are based on input signals' relations and temporal context.

Algorithm | How it works | Strengths | Limitations | Applications | Ref
Feed-forward | Add deep layers with fewer dimensions | Reduce dimension to lower feature space. Simplest NN architecture | Does not model features by neighborhood or temporal relations | Engine health monitoring, vibration monitoring | [5, 140, 189]
RBM | Automatic feature extraction. Models data probability by minimising Contrastive Divergence. One-way training, reconstructing input from output | Keep spatial representation in new space. Not much training time | Not keeping data variance in new space. Difficulty on modelling complex data since only one layer | Bearing degradation, factory PLC sensors | [69, 97]
DBN | Automatic feature extraction using stacked RBMs with greedy training. Can be used for HI construction | Competitive SotA results. Can model time-dependencies using sliding windows | Very slow and inefficient training. Not modelling long-term dependencies | Bearings vibration, aviation engine, wind turbine | [41, 131, 158, 176, 188]
SOM | Data mapping to a specified dimension | Non-linear mapping of complex data to a lower dimension. Maintains feature distribution in the new space. Can be combined with other techniques for RCA (e.g. 5-whys [33]) | Difficult to link latent variables with physical meaning. More complex than other techniques. Fixed number of clusters | Turbofan, pneumatic actuator, thermal power plant, bearing degradation | [33, 80, 97, 136]
AEs | Dimensionality reduction in latent space keeping maximum input data variance. Non-linear FE and HI calculation | Automatic FE of raw sensor data achieves similar results to traditional features. Traditional features can also be input. No need of classification or failure data. Allows online CM | Extracted features not specific for the task. Needs more resources: computational and training data. Loses temporal relations if input data are raw sensor data. Can lead to overfitting | Bearing vibration, satellite data, PHM2012 Predictor Challenge | [4, 35, 67, 133, 149]
CNN | Automatic feature extraction. Univariate or multivariate convolutions of input. Models sequential data. Used with sliding windows. Combined with pooling methods to reduce dimension | Simple yet effective. Faster than traditional ML models in production. Takes advantage of neighborhoods. Less training time and data by weight-sharing. Can outperform LSTM. Dropout can prevent overfitting | Slower training due to high number of weights. Data analysis in chunks, not modelling long-term dependencies | Bearing, electric motor, turbofan | [13, 24, 61, 92, 93, 104, 121, 175]
RNNs | Regression. Model time-series and sequential data by propagating state information through time | Model temporal relationships of EOC data. Special architectures such as LSTM and GRU can model medium-term dependencies | RNNs suffer vanishing gradient problem, even special architectures cannot model very long-term dependencies. Need more resources | Aero engine, hydropower plant | [11, 23, 191, 192]
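Several techniques in Table 2, such as DBN and CNN, are applied over sliding windows of the input signals. A minimal sketch of that windowing step is given below; the function name and the `width` and `step` parameters are illustrative assumptions, since the surveyed works choose their own window lengths and overlaps.

```python
def sliding_windows(series, width, step=1):
    """Return windows of `width` samples taken every `step` samples.

    A series shorter than `width` yields no window.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    return [
        series[i:i + width]
        for i in range(0, len(series) - width + 1, step)
    ]
```

With `step` smaller than `width` the windows overlap, which gives the model more training examples at the cost of correlated inputs.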


3.2 Anomaly detection


The deep learning-based AD algorithms can be classified into three groups, as stated in the introduction of this section,
according to the characteristics of their training data. The main architectures are summarised in Figure 2.

[Figure 2: diagram. Traditional features (signal processing, statistical features, projected-space features) and deep learning features (feedforward, CNN, RBM & DBN, SOM, AE, RNN) provide the preprocessed and feature-engineered input of deep learning anomaly detection for time-series. The detection branches by training data: classified as correct and failure (binary and multi-class classification using a feedforward network with probability output), classified as correct (one-class classification using OCNN), and not classified or classified as correct (clustering in projected space using SOM or the AE latent space, compression with reconstruction residuals, regression with prediction error, and generative models with the discriminator as classifier; the techniques shown for these branches are AE, RNN, GAN and VAE).]

Fig. 2. Diagram of the main deep learning techniques for anomaly detection applied to predictive maintenance and time-series,
classified by machine learning task 4.

Those algorithms are summarised and compared, and their main applications are referenced, in the following tables. On
one hand, the anomaly detection algorithms based on binary and multi-class classification approaches [5, 140] rely on
training data classified as correct and failure. These commonly use feature extraction techniques, either traditional
or deep learning, followed by a flattening step and several fully-connected layers of decreasing dimension
until the output layer. The output layer usually uses the softmax activation function to output the probability of failure
and not failure. In the case of binary classification, there are one or two neurons indicating the probability of failure
and normal working condition. Similarly, in multi-class classification there are N+1 neurons, where
one neuron indicates the probability of not failure and each of the remaining N indicates the probability of one type of
failure. On the other hand, Table 3 contains algorithms that address the AD problem based on one-class classification or
unsupervised approaches, using only training data classified as correct or not classified.
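The N+1 output neurons of a multi-class classifier can be sketched as follows. The function names, the convention that index 0 represents not failure, and the failure names in the usage note are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax: outputs are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_condition(logits, failure_names):
    """Interpret N+1 output neurons: index 0 is 'no failure', then N failure types."""
    classes = ["no failure"] + list(failure_names)
    if len(logits) != len(classes):
        raise ValueError("expected one logit per class (N + 1 in total)")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], dict(zip(classes, probs))
```

For example, `predict_condition([0.1, 2.0, -1.0], ["bearing", "gear"])` returns the most probable class together with the probability assigned to each of the three classes.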

4 In this work, the term traditional features refers to handcrafted and automatic feature extraction techniques such as statistical or ML-based features,
excluding DL-based features.

Table 3. Anomaly detection methods that use training data classified as correct or not classified: one-class classification and unsupervised.

Algorithm | How it works | Strengths | Limitations | Applications | Ref
Autoencoders
Vanilla AE | Threshold in reconstruction error. Data is considered anomalous when it surpasses the threshold | Automatic feature engineering of raw sensor data or traditional features. Minimise variance loss in latent space. No need of classification or failure data. Allows online CM | Extracted features not specific for the task. Needs more resources: computational and training data. Loses temporal relations if input data are raw sensor data. Can lead to overfitting | Bearing vibration, satellite data, PHM2012 Predictor Challenge | [35, 67, 133, 144, 149]
Stacked AE | Stack more than one AE after another | Perform slightly better than vanilla AE | Needs more resources than vanilla AE | Bearing vibration | [53, 147, 164]
SAE | AE constrained in training with sparsity to keep neurons' activations low | Same as AE plus prevent overfitting by forcing all neurons to learn | More complex networks and need more resources than vanilla AE | Bearing vibration, turbine vibration | [4, 34, 53, 108]
DAE | AE designed for noisy data | Outperform vanilla AE with noisy data. Works slightly better stacking several DAEs | More complex networks and need more resources than vanilla AE; stacked DAE needs even more | Bearing vibration | [108, 185]
Generative
VAE | AE that maps input data to posterior distribution | Learns posterior distribution from noisy distribution, generating data non-deterministically | Difficulty on implementation. Loses temporal relations if input data are raw sensor data | Ball screw, electrostatic coalescer, web traffic | [111, 178, 187]
GAN | Used for data augmentation and AD in 2 ways: using discriminator and using residuals | Good data augmentation with small imbalance ratio. AD outperforms unsupervised SotA methods | Not working well with big imbalance ratio, complex and need more resources. Outperformed by simpler methods such as CNN [24] | Induction motor, bearing multisensor | [24, 88]
One-Class Classifiers
OC-NN | Train AE and freeze Encoder for OCC, similar to OC-SVM loss function | Automatic feature extraction | Slower than traditional OCCs. Extracted features are not focused on the problem | General AD | [32]
Recurrent Neural Networks
Vanilla RNN | Regression, AD tracking error between predicted and real behavior or HI difference | Model temporal relationships of time-series data. Self-learning | Suffers vanishing gradient problem; therefore cannot model medium and long-term dependencies. Need more resources than feedforward AE or CNN for training | Activity recognition | [10]
LSTM | Same as vanilla RNN but changing neurons architecture to LSTMs | Same as vanilla RNN, however these can model longer time dependencies than vanilla | Even if they handle the vanishing gradient problem better than vanilla, they have difficulty on modelling long-term dependencies. Long training and computational requirements | Aircraft data, activity recognition | [10, 60, 123]
GRU | Same as vanilla RNN but changing neurons architecture to GRUs | Same as LSTM plus easier to train | Same as LSTM but obtain slightly worse results | Aircraft data, activity recognition | [10, 123]
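The autoencoder rows of Table 3 detect anomalies by thresholding the reconstruction error. The sketch below assumes that the reconstructions come from an already trained AE and sets the threshold as the mean plus `k` standard deviations of the errors measured on correct data; this rule and the value of `k` are common choices assumed for illustration, not ones prescribed by the surveyed works.

```python
import statistics


def reconstruction_error(x, x_hat):
    """Mean squared error between an observation and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * standard deviation of errors on correct data."""
    mean = statistics.mean(normal_errors)
    spread = statistics.pstdev(normal_errors)
    return mean + k * spread


def is_anomalous(x, x_hat, threshold):
    """Flag an observation whose reconstruction error surpasses the threshold."""
    return reconstruction_error(x, x_hat) > threshold
```

The threshold is fitted once on the errors of non-failure observations and then applied to every new observation, which is what allows online condition monitoring.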

3.3 Diagnosis
The diagnosis step depends on the information and type of AD model used in the previous stage, given that PdM is an
incremental process where each stage is complemented by previous stages. In the case of a multi-class classifier, the type
of failure related to the detected anomaly is already known, which enables a straightforward diagnosis and comparison
with historical data [5, 140]. Nonetheless, most PdM architectures implement binary classifier, one-class classifier or
unsupervised models, which lack failure type information. Therefore, these can only perform diagnosis by grouping
the detected anomalies by similarity, which is done using clustering models [6, 12, 186, 212] and SOM
[63, 95, 148, 153]. The features used for this stage are similar to the ones for AD, which can be based either on traditional
or deep learning techniques.
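Grouping detected anomalies by similarity can be sketched with a plain k-means clustering over their feature vectors. K-means is used here only as a representative clustering model; the deterministic seeding with the first `k` points and the fixed number of iterations are simplifying assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each anomaly joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

Each resulting group can then be inspected by domain technicians to attach a failure type to it, since the clustering itself provides no physical meaning.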

3.4 Prognosis
The deep learning based models for PdM prognosis focus on fitting a regression model to predict either the
remaining useful life (RUL) of the diagnosed failure or the health degradation when there is no historical data of that
type. The RUL is commonly measured in time or number of cycles, and the health degradation is tracked by quantifying the
anomaly deviation with health indexes. The most common algorithms are summarised and compared in Table 4.
Their input can be the information generated in previous stages and traditional or deep learning features. There are
many other algorithms that use DL features or traditional features combined with a fully-connected network as last layer
to perform prognosis, but these are presented in the combination Section 3.6, while this section focuses on the most
common and simple SotA techniques that only use DL for prognosis.
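The relation between a health index and the RUL can be illustrated with a deliberately simple regression. The sketch assumes a health index that rises as the asset degrades, fits a straight line by least squares and extrapolates it to a failure threshold; the linear trend, the function names and the threshold are illustrative assumptions that stand in for the RNN-based regressors of this section.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def estimate_rul(times, health_index, failure_threshold):
    """Extrapolate a rising health index to the failure threshold.

    Returns the remaining cycles after the last observation, or None
    when the fitted trend is not degrading (slope <= 0).
    """
    slope, intercept = fit_line(times, health_index)
    if slope <= 0:
        return None
    crossing = (failure_threshold - intercept) / slope
    return max(0.0, crossing - times[-1])
```

With cycles 0 to 5, a health index growing by 0.1 per cycle and a failure threshold of 1.0, the extrapolated crossing happens at cycle 10, so the estimated RUL is about 5 cycles.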


Table 4. Summary of DL based prognosis works for PdM. Unsup and sup in algorithm column refer to unsupervised and supervised respectively.

Algorithm | How it works | Strengths | Limitations | Applications | Ref
Vanilla RNN (Unsup and sup) | Regression, predicting features' and HI's evolution or predicting remaining cycles or time | Model temporal relationships of time-series data. Possibility of self-learning | Suffers vanishing gradient problem; therefore cannot model medium and long-term dependencies. High training and computational requirements | Aero engine | [192]
LSTM (Unsup and sup) | Same as vanilla RNN but changing the neurons architecture to LSTMs | Same as vanilla RNN, however these can model longer time dependencies than vanilla. Outperform vanilla RNN | Even if they handle the vanishing gradient problem better than vanilla, they have difficulty on modelling long-term dependencies. High training and computational requirements | Aero engine, rolling bearing, lithium batteries | [126, 192, 195, 202]
GRU (Unsup and sup) | Same as vanilla RNN but changing the neurons architecture to GRUs | Same as LSTM plus easier to train | Same as LSTM but obtain slightly worse results | Aero engine, lithium batteries | [192]

3.5 Mitigation
The research methodology followed to create the current publication showed no DL-based mitigation publications.
Several possible reasons for this fact are described below. The majority of DL works are focused on optimising a single
performance metric for the ML task to be solved, like maximising accuracy or F1 score in classification, and minimising
errors like MAE or RMSE in regression. These works' solutions are usually compared on simulated reference datasets,
looking for the architecture that outperforms the rest on the aforementioned metrics. Nonetheless, deep learning models
are the hardest ML type to understand given their higher complexity, which is what makes them more accurate at modelling
high-dimensional complex data, and therefore they fail to meet the industrial requirement of explainability.
In order to address this problem, they should provide mitigation advice or at least explanations about the reasons
for making predictions, which could be supported by the emerging field of explainable artificial intelligence (XAI).
Furthermore, the final and most ambitious step in this PdM stage should be the automation of recommendations
for domain technicians to integrate PdM in the maintenance plan, by optimising the industrial maintenance process via
maintenance operation management. Finally, the reasons why few publications cover real applications are presented
below. Industrial companies avoid publishing their data or implementation details to protect their intellectual property
and know-how from competitors. Moreover, many data-driven research publications lack domain technician feedback,
so they tackle the problem relying only on data-driven techniques, without embracing domain knowledge.

3.6 Combination of models and remarkable works


The DL techniques already presented throughout the current section are the basic elements and architectures used for PdM.
It is worth highlighting that there are infinitely many possible architectures obtained by combining these techniques with each
other, or by using them together with other data-driven or expert-knowledge based techniques. The combination and adaptation
of models for the problem being addressed results in more accurate models that fulfil its requirements. Table 5 summarises how
these models are commonly combined in SotA architectures, presenting their strengths and limitations.

Table 5. Possible combination of deep learning techniques for PdM architectures.

Algorithm | How they work | Strengths | Limitations
Traditional and DL features combined with DL models
Traditional and DL-based FE with AE | Combine traditional and DL FE methods with already presented autoencoder architectures in the same model | Outperform traditional ML and simple DL architectures. No need of handcrafted features. Automatic FE. Can model time-series dependencies using CNN, LSTM and GRU by context extraction | Understanding deep features is not straightforward. Slower and more complex than simple ANN models
Traditional and DL-based FE with DBN | DL and traditional FE methods with DBN stacked to other models | Same as above | Same as above
Hybrid: combination of features and models
DL FE techniques combination | Combine CNN, LSTM, other DL FE techniques and traditional features to extract more complex features | Automatic dimension reduction. Outperform other FE techniques. Model temporal relations and neighbors. With bidirectional RNNs, future context is available | More complex and need more resources than traditional ML and simple DL models. Bidirectional RNN cannot be done online

Moreover, Table 6 contains relevant works of the aforementioned types, which merge traditional FE or deep learning
FE with traditional data-driven or deep learning models. This collection of works shows that combination of techniques
can address all PdM stages using supervised or unsupervised approaches.

Table 6. Combination of deep learning techniques for PdM: relevant works summary.

Architecture | How it works | Strengths | Limitations | Applications and refs
Autoencoders
AE with extreme learning machine (ELM) | Unsupervised AD tracking error of ELM for OCC, trained with normal data | Two steps training. Easy to train | Unable to model non-linear or complex relations in ELM | Power plant [118], machine lifetime estimation [20]
Stacked SAE | Unsupervised FE adding noise | No need of preprocessing. Robust to noise. Severity identification | Difficult optimisation of deep architecture | Rolling bearing [34]
Stacked CNN-based AE | Unsupervised FE modelling temporal relations in sliding window | Model temporality using neighbours | Only short temporal relations | Gearbox vibration [25]
AE with LSTM | Unsupervised FE modelling temporal relations | Model temporality | Higher computational requirements | Aviation [68], turbofan and milling machine [113], solar energy, electrocardiogram [132] and manufacturing [100]
VAE with RNN, GRU or LSTM | Unsupervised generative FE modelling temporal relations and reducing to latent Gaussian distribution | Model temporality. Regularised latent space | High computational requirements | Motor vibration [68], turbofan [190], sensors [196]
Restricted Boltzmann machines and deep belief networks
DBN | Unsupervised FE by hierarchical representations | Fault classification from frequency distribution | Need preprocessing. Tendency to overfitting. Not modelling temporal relations | Induction motors fault simulator [158]
Regularised RBM + SOM + RUL | Probability modelling, health assessment and RUL prognosis using distance | RBM regularisation improves FE for RUL | Single RBM, can be improved by multiple of these layers | Rotating systems [97]
Image generation + DBN + MLP/FDA/SOM | Supervised or unsupervised FE modelling from vibration image data | Model temporality in an image. Combine with image processing methods | Difficulty on extracting clusters' meaning, relying on domain knowledge | Journal bearing [127]
Hybrid: combination of features and models
Bidirectional LSTM | Unsupervised FE modelling temporal relations | Health estimation and then RUL mapping. More robust. Future context is available | Need all signal to be processed: no streaming. More complex than simple LSTM | Turbofan [50]
AE + Convolutional DBN + exponential moving average (EMA) | Unsupervised probability modelling by automatic FE, modelling temporal relations. Training in steps | Model temporality. More stable than traditional ML and simple DL. Each model complements the others' weaknesses | Each part trained independently, not for problem. EMA only models shorter term temporal relations | Electric locomotive bearing fault [156]
CNN and bidirectional LSTM based AE + fully connected + linear regression | Unsupervised FE modelling temporal relations | Raw sensor data modelling. Model long-term temporal dependencies | Sliding window needs complete window. Higher complexity combining DL techniques | Milling machine [207]
Traditional FE + bidirectional GRU combined with ML models | Unsupervised FE modelling temporal relations | Same as above | Same as above | Aviation bearing fault detection, gear fault diagnosis and tool wear prediction [174]

The rest of this subsection summarises the contributions and strengths of relevant analysed works. One interesting
article was published by Shao et al. [157], where a methodology of AE optimisation for rotating machinery fault
diagnosis is presented. Firstly, they create a new loss function based on maximum correntropy to enhance feature
learning. Secondly, they optimise the model's key parameters to adapt it to signal features. This model is applied to fault
diagnosis of gearbox and roller bearing. Another relevant publication is by Lu et al., who use growing SOM [107], an
extension of the SOM algorithm that does not need specification of the map dimension. It has been applied to simulated test
cases with application in PdM.
Guo et al. [60] propose a model based on LSTM and an EWMA control chart for change point detection that is suitable
for online training. An additional interesting work is presented by Lejon et al. [89], who use ML techniques to detect
anomalies in a hot stamping machine, aimed at non-ML experts. They aim to detect anomalous strokes, where the machine is
not working properly. They describe the problem that most of the collected data corresponds to press strokes of products
without defects and that all the data is unlabelled. This data comes from sensors that measure pressures, positions and
temperature. The algorithms they benchmarked are AE, OCSVM and IF, where AE outperforms the rest by achieving the
lowest number of false positives. As the authors conclude, the obtained results show the potential of ML in this field
for transient and non-stationary signals when fault characteristics are unknown, adding that AEs fulfil the requirements
of low implementation cost and close to real-time operation that will lead to more informed and effective decisions.
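An EWMA control chart such as the one mentioned for change point detection can be sketched over the residuals of any predictor, for instance the difference between an LSTM forecast and the measured signal. The sketch below is the generic textbook chart with time-varying limits, not the specific model of [60]; the smoothing factor `lam`, the limit `width` and the in-control `mean` and `std` are assumed to be estimated beforehand on correct data.

```python
import math


def ewma_alarms(residuals, mean, std, lam=0.2, width=3.0):
    """Return indices where the EWMA of the residuals leaves its control limits.

    `mean` and `std` describe the residuals under healthy operation.
    """
    alarms = []
    z = mean  # the EWMA statistic starts at the in-control mean
    for t, r in enumerate(residuals, start=1):
        z = lam * r + (1.0 - lam) * z
        # Time-varying limits that widen towards their asymptotic value.
        spread = std * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        if abs(z - mean) > width * spread:
            alarms.append(t - 1)
    return alarms
```

Because the statistic is updated with one multiplication and one addition per observation, the chart is cheap enough to run online next to the predictor.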
As previously mentioned in this article, the possibility of model combination is infinite. For instance, Li et al. in the
work [110] combine a GAN structure with LSTM neurons, two widely used DL techniques that achieve SotA results.
Additionally, DL techniques can be combined with other computing techniques as Unal et al. do in [167], combining a
feed forward network with Genetic Algorithms.
The last highlighted article that combines DL models is by Zhang et al. [199], one of the most complete unsupervised
PdM works. They build a model that takes the correlation of sensor signals, in the form of signature matrices, as input
to an AE that uses CNN and LSTM with attention for AD, partial RCA and RUL. The strengths of this work
are the following: they show that correlation is a good descriptor for time-series signals, that the attention mechanism
using LSTMs gives temporal context, and that using the anomaly score as HI is useful for RCA, mapping the detected failures to
the input sensors that originate them. However, their RCA is not complete since they only correlate failures with
input sensors and are not able to link them to physical meaning. Moreover, the lack of pooling layers together with the
combination of DL techniques results in a complex model that is computationally expensive, needs more time and data
for training, and whose decisions are hard to explain.
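To make the idea of using sensor correlation as a model input more concrete, the sketch below builds one matrix of pairwise correlations for a window of multivariate sensor data. It assumes Pearson correlation as the pairwise measure, which is an assumption made here for illustration; the cited work may define its signature matrices differently. The function names and sample signals are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0.0 or dev_y == 0.0:
        return 0.0  # constant signal: correlation undefined, 0 by convention here
    return cov / (dev_x * dev_y)


def correlation_matrix(window):
    """window: list of sensor signals (one list of samples per sensor)."""
    n = len(window)
    return [[pearson(window[i], window[j]) for j in range(n)] for i in range(n)]


# Hypothetical window with three sensors and four samples each.
window = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
matrix = correlation_matrix(window)
```

A sequence of such matrices, one per time window, could then be fed to a convolutional or recurrent model in place of the raw signals.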
The following publications combine DL models for PdM with other ML tasks and other DL techniques. Wen et
al. [179] use transfer learning with a SAE for motor vibration AD, outperforming DBNs. The article by Wen et al. [180]
proposes a transfer learning based framework inspired by U-Net that is pretrained with univariate synthetic time-series
data. The aim of this network is to be adaptable to other univariate or multivariate anomaly detection problems by
fine-tuning.
Martinez et al. [115] present a Bayesian and CNN based DL classifier for AD. They first use a small labelled dataset
to train the model. Then, the model is used to classify the remaining data, and uncertainty modelling is applied to
analyse the observations that cannot be correctly classified due to high entropy. Finally, the 100 observations with the
highest entropy are selected to query a domain technician, who is asked to label them so that the model can be retrained
with this new data. This procedure is repeated until the model reaches a good accuracy. This work is an example of how
to use two interesting techniques in the field of PdM to address the lack of labelled data by querying domain
technicians, showing them the instances from which the model can learn the most. Concretely, these
techniques are semi-supervised classification and active learning. Similarly, the review by Khan et al.
[75] mentions that expert knowledge can help in troubleshooting the model and, if domain technicians are available, the
model could learn from them using a ML training technique called active learning, in which the model queries them during the
learning stage. Moreover, Kateris et al. [74] use SOM as an OCC model for AD together with
active learning, to progressively learn different stages of faults.
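The query-selection step of the active learning loop described above can be sketched as follows. This is only an illustration: it ranks observations by the Shannon entropy of plain predicted class probabilities, whereas the cited work relies on Bayesian uncertainty modelling. The function names and probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(predicted_probs, k=100):
    """Return the indices of the k observations with the highest entropy."""
    ranked = sorted(
        range(len(predicted_probs)),
        key=lambda i: entropy(predicted_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabelled observations.
probs = [[0.99, 0.01], [0.5, 0.5], [0.7, 0.3], [0.9, 0.1]]
to_label = select_queries(probs, k=2)
print(to_label)
```

The selected observations would be shown to the domain technician, added to the labelled set, and the model retrained before the next round.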
Another interesting technique with PdM applications is deep reinforcement learning. The publication by Zhang et al.
[197] uses it for HI learning, outperforming feed-forward networks but underperforming CNN and LSTM for AD and
RUL. As for transfer learning, used in the works [179, 180] above, it consists of transferring the knowledge acquired
from one dataset to another. The procedure consists of reusing part or all of a pretrained model and adapting it to the
new task's requirements, which sometimes requires retraining the model, although with less data and time.

3.7 Related review works summary


This subsection summarises the most relevant information of the review works related to this survey, highlighting their
main contributions, detected challenges and gaps in the SotA works and their conclusions.
The survey by Chalapathy and Chawla [31] analyses the SotA DL approaches to address anomaly detection. The
work by Rieger et al. [145] presents a qualitative review of SotA fast DL models applied to PdM in industrial internet
of things (IIoT) environments. They argue that real-time processing is essential for IoT applications, since
a high-latency system can lead to unintentional reactive maintenance by leaving insufficient time to plan maintenance.
Moreover, they highlight how DL models can be optimised. They state that weight-sharing in RNNs enables parallel
learning, which can help train this type of network, which achieves SotA results in most PdM applications. Likewise,
they justify the use of max-pooling layers in CNNs to eliminate redundant processing and thus
optimise them. Two DL reviews from other fields can be extrapolated to PdM: DL models for time series
classification by Fawaz et al. [70] and DL to model sensor data [173].
The review by Zhao et al. [206] explains that some algorithms use traditional and hand-crafted features whereas
others use DL features for the problem, and presents the most common FE methods for DL based PdM systems. They
state that both kinds of features work properly in DL models, as supported by their SotA review. Many of these
works use techniques to boost model performance, such as data augmentation, model design and optimisation for the problem,
and adopting architectures that already work in the SotA. They also adapt the learning function, apply regularisation,
tweak the number of neurons and connections, apply transfer learning or stack models in order to enhance model
generalisation and prevent overfitting. The advantage of traditional and hand-crafted features is that they are not problem
specific, so they are applicable to other problems. Moreover, they are easy to understand for expert-knowledge technicians
given that they are based on mathematical equations. However, precisely because they are not problem specific, in some
cases DL-based FE techniques perform better, since these are learned specifically for the problem and directly from the data.
In turn, DL-based features are not as intuitive as the aforementioned ones, meaning that technicians can have problems
trying to understand how they work.
The article by Zhao et al. [206] also summarises the information already stated throughout this survey: DL models
can achieve SotA results, pre-training in AEs can boost their performance, denoising models are beneficial for PdM
because of the nature of sensor data, and CNN and LSTM variants can achieve SotA results in the field of PdM
using model optimisation, depending on the dataset's scale. In addition, domain knowledge can help in FE and model
optimisation. Conversely, DL models are difficult to understand, even with some visualisation techniques, because
they are black-box models. Transfer learning could be used when there is little training data, and PdM is an
imbalanced-class problem because faulty data is scarce or missing.
The survey by Zhang et al. [200] compares the accuracy obtained by ANN, Deep ANN and AE on different datasets.
However, because the models are applied to different datasets, these comparisons are not fair.
Nonetheless, they show high accuracy results, most of them between 95% and 100%, emphasising that
DL models can obtain promising results. They state that deeper models and higher dimensional feature vectors result in
more accurate models, provided that sufficient data is available. With the increase of computational power and data growth in the
field of PdM, research in this area tends to focus on data-driven techniques and specifically DL models. However, DL
models lack explainability and interpretability in the decisions they take.
The review by Khan et al. [75] states that the developed DL architectures are application or equipment specific and
therefore there is no clear way to select, design or implement those architectures; researchers tend not to justify
the decision of selecting one architecture over another that also works for the problem, for instance selecting CNN
versus LSTM for RUL. Its authors also argue that SotA algorithms such as the ones presented throughout this section
have all been shown to work correctly and do not differ from one another.
Even though this section has focused on DL models for PdM, we have seen that they are often integrated with
traditional models and/or traditionally engineered features, such as time and frequency domain features or feature
extraction based on expert knowledge or mathematical equations.
As Khan et al. state in [75], there is a lack of understanding of the problem when building DL models. They
also argue that VAE is ideal for modelling complex systems, achieving high prediction accuracy without health status
information. The algorithms that analyse the data while maintaining its time-series relationship, by analysing the variables
together at the same time, are the most successful, regardless of whether they use sliding window, CNN or LSTM techniques. Most
SotA algorithms focus on AD, although they can also be adapted to perform RUL by means of a regression or an RNN, where the
majority use LSTMs. Regressions commonly use features learned by the AD models, or even traditional and
hand-crafted features. Generative models like GAN do not work as well as expected. However, CNN works well while
needing less data and computing effort. This means that DL models can achieve similar accuracy using either traditional
features or deep features extracted from the data in an unsupervised way.
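The sliding window technique mentioned above, which keeps all variables of a time step together, can be illustrated with a minimal sketch. The function name, the `window` and `step` parameters and the sample readings are hypothetical choices made for this example.

```python
def sliding_windows(series, window, step=1):
    """Split a multivariate series into overlapping windows.

    series is a list of time steps, each holding one reading per sensor,
    so every window keeps all variables together for the same time span.
    """
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, step)
    ]


# Hypothetical readings: 5 time steps x 2 sensors.
readings = [[0.1, 20.0], [0.2, 20.5], [0.4, 21.0], [0.3, 21.5], [0.5, 22.0]]
windows = sliding_windows(readings, window=3)
print(len(windows))
```

Each window can then be used as one input sample for a model, so that temporal context is preserved even by architectures without recurrence.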

4 COMPARISON OF STATE-OF-THE-ART RESULTS


4.1 Benchmark datasets
The review by Khan et al. [75] states that one of the problems of PdM proposals is the lack of benchmarking
among them. Several public datasets within the scope of predictive maintenance are available among the prognosis
datasets released by NASA [124]; they are presented in the following paragraphs.
3. Milling dataset [124] gathers acoustic emission, vibration and current sensor data under different operating
conditions with the purpose of analysing the wear of the milling insert. Regarding PdM stages, it allows the application
of AD, RCA and RUL.
4. Bearing dataset [124] gathers vibration data from 4 accelerometers that monitor bearings under constant pressure
until failure, obtaining a run-to-failure dataset where all failures occur after exceeding their design life of 100 million
revolutions. Its possible PdM applications are AD and RUL estimation.
6. Turbofan engine degradation simulation dataset [124] contains run-to-failure data from engine sensors. Each
instance starts at a random point of engine life where it works correctly, and monitors its evolution until an anomaly
happens and afterwards reaches the failure state. The engines are working under different operational conditions and
develop different failure modes. Its possible PdM applications are AD, RCA and RUL.
10. Femto bearing dataset [124] is a bearing monitoring dataset from the Pronostia competition that contains
run-to-failure and sudden failure data. The sensors used are thermocouples gathering temperature data and accelerometers
that monitor vibrations in the horizontal and vertical axes. Its possible PdM applications are AD, RCA and RUL.
Industrial companies are reluctant to publish their own datasets because they tend to treat their data and
knowledge as trade secrets in order to protect themselves from their competitors. The dataset that most closely resembles
company data is the one published by the Semeion research center, named Steel plates faults dataset [99], where steel
plate faults are classified into 7 categories.

4.2 Data-driven technique’s results comparison


This subsection compares relevant data-driven PdM works on the turbofan dataset introduced in the
previous subsection, which is generated using the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS).
The reasons for choosing this dataset are that it is one of the reference datasets of PdM, it enables the application of all
PdM steps and it is one of the most used datasets for model ranking.
The dataset lacks the RUL label, which is the target column. Hence, many works assume RUL to be constant during
the initial period in which the system works correctly, and to degrade linearly after the
change point or initial anomalous point. The constant value of the initial period is a parameter denoted 𝑅𝑚𝑎𝑥, which
is set to values near 130 in many state-of-the-art works, enabling a fair comparison of their results.
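The piecewise-linear RUL target described above can be sketched as follows. The sketch assumes that the target for a run-to-failure engine is the number of cycles left until the last recorded cycle, clipped at 𝑅𝑚𝑎𝑥, so that the change point falls 𝑅𝑚𝑎𝑥 cycles before failure. This is one common way to build the label and not necessarily the procedure of every reviewed work; the function name and the example engine length are hypothetical.

```python
def piecewise_rul(n_cycles, r_max=130):
    """Build the RUL target for one run-to-failure engine.

    The raw RUL at cycle t (1-based) is n_cycles - t. Clipping it at
    r_max keeps the target constant while the engine is assumed healthy
    and lets it decrease linearly to 0 at the last cycle.
    """
    return [min(r_max, n_cycles - t) for t in range(1, n_cycles + 1)]


# Hypothetical engine that runs for 200 cycles before failing.
labels = piecewise_rul(n_cycles=200, r_max=130)
print(labels[0], labels[70], labels[-1])
```

Engines shorter than 𝑅𝑚𝑎𝑥 cycles never reach the constant region under this assumption, so their target is purely linear.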
The most common metrics to evaluate models' performance are the following [13]: the root mean square error
(RMSE) in Equation 1, and the score function in Equation 2, which penalises late predictions more heavily than early
ones and was used in the PHM 2008 data challenge [151]. In these equations, 𝑁 is the number of engines in the test set,
𝑆 is the computed score, and ℎ𝑖 = 𝐸𝑠𝑡𝑖𝑚𝑎𝑡𝑒𝑑𝑅𝑈𝐿𝑖 − 𝑇𝑟𝑢𝑒𝑅𝑈𝐿𝑖 is the prediction error for engine 𝑖. Table 7 gathers
state-of-the-art results from recent years on the four subsets of the dataset.
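\[
RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} h_i^{2}} \tag{1}
\]

\[
S = \begin{cases}
\sum_{i=1}^{N} \left( e^{-\frac{h_i}{13}} - 1 \right) & \text{for } h_i < 0 \\
\sum_{i=1}^{N} \left( e^{\frac{h_i}{10}} - 1 \right) & \text{for } h_i \geq 0
\end{cases} \tag{2}
\]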
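Equations 1 and 2 can be computed with a few lines of code. The sketch below assumes the error is defined per engine as estimated RUL minus true RUL, with the divisors 13 (early predictions) and 10 (late predictions) as read from Equation 2. The function names and sample values are hypothetical.

```python
import math


def rmse(estimated, true):
    """Root mean square error over all test engines (Equation 1)."""
    errors = [e - t for e, t in zip(estimated, true)]
    return math.sqrt(sum(h * h for h in errors) / len(errors))


def score(estimated, true):
    """Asymmetric score (Equation 2): late predictions (h >= 0) cost more."""
    total = 0.0
    for e, t in zip(estimated, true):
        h = e - t
        if h < 0:
            total += math.exp(-h / 13.0) - 1.0  # early prediction
        else:
            total += math.exp(h / 10.0) - 1.0   # late (or exact) prediction
    return total


# Hypothetical RUL values for three test engines (illustrative only).
true_rul = [100.0, 50.0, 20.0]
estimated_rul = [90.0, 50.0, 30.0]
print(rmse(estimated_rul, true_rul), score(estimated_rul, true_rul))
```

For errors of the same magnitude, RMSE treats early and late predictions alike, whereas the score grows faster for late ones, which is why both metrics are usually reported together.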


The comparison in Table 7 reflects not only each model's performance but also its combination of preprocessing
and feature engineering techniques. Therefore, the results show the performance of the whole data process applied to the
dataset up to prediction. Nonetheless, the table shows that deep learning based architectures are the ones that achieve
state-of-the-art results in recent years. Concretely, these architectures are composed of combinations of different DL
techniques.

Table 7. State-of-the-art results on four turbofan dataset subsets since 2014. The lower the metric, the better the model is considered
to perform on average. Best results are highlighted in bold.

Reference             Year  𝑅𝑚𝑎𝑥     Architecture     FD001  FD002  FD003  FD004  FD001   FD002    FD003   FD004
                                                      RMSE   RMSE   RMSE   RMSE   Score   Score    Score   Score
Ramasso et al. [143]  2014  135      RULCLIPPER       13.3   22.9   16.0   24.3   216     2796     317     3132
Babu et al. [13]      2016  130      MLP              37.6   80.0   37.4   77.4   17972   7802800  17409   5616600
Babu et al. [13]      2016  130      SVR              21.0   42.0   21.0   45.3   1381    589900   1598    371140
Babu et al. [13]      2016  130      RVR              23.8   31.3   22.4   34.3   1504    17423    1431    26509
Babu et al. [13]      2016  130      DCNN             18.4   30.3   19.8   29.2   1287    13570    1596    7886
Zhang et al. [198]    2017  130      MODBNE           15.0   25.1   12.5   28.7   334     5585     422     6558
Zheng et al. [209]    2017  130      LSTM + FFNN      16.1   24.5   16.2   28.2   338     4450     852     5550
Li et al. [93]        2018  125      CNN + FFNN       12.6   22.4   12.6   23.3   273     10412    284     12466
Ellefsen et al. [101] 2019  115-135  RBM + LSTM       12.6   22.7   12.1   22.7   231     3366     251     2840
Da Costa et al. [72]  2019  125      LSTM+attention   14.0   17.7   12.7   20.2   320     2102     223     3100

5 DISCUSSION
This section analyses deep learning architectures’ suitability in the field of PdM. It is the result of comparing reviewed
articles’ trends, results and conclusions with PdM data characteristics and industrial requirements.

5.1 Comparison and suitability of deep learning in predictive maintenance


Physical and knowledge-based models for PdM were widely used 15 years ago but they are less common nowadays due
to the difficulty or impossibility of modelling complex systems. In fact, data-driven statistical and machine learning
publications started to gain popularity in this field because they learn the system's behaviour directly from the data and
therefore need little domain knowledge. More recently, due to the emergence of I4.0, the increase in
computational power and the automation of machine and asset data collection, the data-driven publication trend has
moved towards deep learning based schemes.
There are several reasons why deep learning is a hot research topic in the predictive maintenance field. DL models usually
achieve higher accuracy than traditional data-driven techniques. They can dispense with expert-knowledge feature
engineering given their capacity to automatically extract features for the problem being addressed. In addition, they can
model time-series data using attention or time context. The application of DL models is also widely researched in other
fields such as image processing and seq2seq. Nonetheless, their two major drawbacks are high training data requirements
and poor model explainability. Moreover, these models must be modified and adapted to industrial and PdM
data characteristics and requirements.
Therefore, the model type for a PdM application should be chosen carefully, after analysing each use case's
needs. Its requirements may not be satisfied by the machine learning research trend of the moment, which is currently
deep learning, and other types of models may be more appropriate. Statistical, machine learning and deep
learning models each have their own peculiarities. They are all able to fulfil the following desirable PdM characteristics
from the list in [169] by creating specific architectures: quick detection and diagnosis, isolability, novelty identifiability,
classification error estimation, adaptability, real-time computation and multiple fault identifiability. However, the
main differences among these types of models are summarised in Table 8. Hence, choosing one group over the rest,
and even deciding the final architecture, requires a thorough analysis and comparison to determine the one that suits
both the use case and its data requirements.

Table 8. Differences of statistical, machine learning and deep learning architectures for predictive maintenance.

Characteristic               Statistical  Machine learning        Deep learning
Amount of data for training  Small        Medium                  High
Training time                Small        Medium                  High
Complexity                   Small        Medium                  High
Explanation facility         High         Medium (grey models),   Low
                                          Low (blackbox models)
Accuracy                     Low          Medium                  High

In the end, most deep learning architectures are either based on traditional data-driven concepts or are combined
with them in order to fill their gaps. Therefore, DL models could be one piece of a PdM architecture that combines them with
the other kinds of models presented in Table 8. Such a fusion could compensate for the drawbacks of some models with the
strengths of others and thus meet PdM needs better.

5.2 Automatic development of deep learning models for predictive maintenance


Even though deep learning models can achieve SotA results in PdM datasets, their design, development and
optimisation rely on publications, data scientists' prior knowledge and trial and error testing. These are some of their
biggest challenges: architecture type and structure choice, number of hidden layers and neurons, activation functions,
regularisation terms to prevent overfitting and learning parameter optimisation.
For the above-stated reasons, the whole process of DL model creation is not as automatic as believed. Moreover, in
order to obtain competitive results, many authors preprocess and feature engineer the raw EOC signals. This can boost
model performance but at the same time remove relevant information that could be learnt automatically using more
complex architectures. In addition, these steps are commonly performed by data scientists. Usually, domain knowledge
is not embedded, so models are expected to learn all the non-linear relations from the data. However, this information
could help to reduce the architecture's dimensionality, resulting in simpler, more accurate and, as a result, more explainable
models. Other byproduct benefits are lower training data requirements, shorter training time and better generalisation,
which helps avoid overfitting.

5.3 Application of deep learning research in industrial processes


There are many works that apply deep learning to predictive maintenance in the literature. Most SotA deep learning
techniques tackle PdM in an unsupervised way given the difficulty of obtaining failure data in industrial companies. This is the
reason why AEs, RBMs and generative models have had such an impact in the field. The following paragraphs
summarise common techniques and how they meet industrial requirements.
Regarding SotA, there are many DL proposals for AD and RUL. Most of them tend to combine different algorithms
to create a more complex model that inherits the advantages of the techniques that compose it. The most common
combination for unsupervised PdM sensor modelling is CNNs with LSTMs in an AE or derived architecture.
Similarly, supervised approaches usually use CNNs and LSTMs in an ANN that outputs the probability of failure types or
regressions. However, fusing techniques increases model complexity.
Regarding the diagnosis step, it is easy to perform RCA with supervised models given that, when the training
data contains the label (failure or not, or even the type of failure), the model can directly map new data to the
corresponding failure type automatically. However, companies that lack this type of data can only model
normality with OCC models or use an unsupervised approach to model unlabelled data. There is a gap in these latter
models since they are unable to perform complete RCA given the impossibility of classifying unspecified failure types.
One underlying reason could be the lack of collaboration between data scientists and expert-knowledge technicians.
Therefore, this gap could be filled by applying explainable artificial intelligence (XAI) techniques to facilitate the
communication, understanding and guidance of DL models. XAI is a promising emergent field with few publications in
the field of PdM.
Deep learning models also fail to propose mitigation actions since, as mentioned before, they should work together
with domain technicians' knowledge and many works do not, tackling the problem in a purely data-scientific way and
forgetting about the underlying knowledge of how the process works. For this reason, even if many models are accurate, they
cannot meet industrial and real PdM requirements. They present complex schemes with many hidden layers even though
Venkatasubramanian et al. [169] state that understandability is one desired characteristic for PdM models. Without it,
industrial companies may not deploy deep learning models to production, as domain technicians would be unable to
understand their predictions and therefore trust them. Once again, the application of XAI techniques together with
expert knowledge could overcome the problem by making it possible to understand the predictions, map detected failures to their real
physical root cause and even propose mitigation actions, giving data-driven advice to help in maintenance management
(MM) and manufacturing operation management (MOM) decision making.
The majority of reviewed works were created and tested in research environments but not transferred to or tested in
industrial companies. Even if there are some models trained with real industrial process data, the majority use reference
datasets that have been preprocessed and specifically prepared for the task, such as the ones presented in Section 4
that are generated in simulation or testing environments. However, these are unable to adapt to the industrial companies'
requirements presented by Venkatasubramanian et al. [169], which still prevail nowadays. Lejon et al.
[89] consolidate the aforementioned needs by stating that industrial data is unlabelled and mostly corresponds to
non-anomalous process conditions. With regard to PdM architectures, the work by Khan et al. [75] seems to be the one
that summarises and could best fit the requirements of companies, even though it lacks a specification of how to
address PdM in real companies.
All in all, we have seen that industrial companies need PdM models to be accurate, easy to understand, able to process
streaming data and adapted to process data characteristics. Their data is mostly collected in an unsupervised way, or only
non-failure data is available. Moreover, it is collected under different EOC. However, there is a gap in the published
data-driven models because available unsupervised and OCC proposals are unable to link novel detected failures to their
physical meaning. The main reason is that these models ignore expert knowledge. In addition, there are few research
publications on the application of XAI techniques in PdM, which could provide solutions for the main presented gaps.

6 CONCLUSIONS
The majority of industrial companies that rely on corrective and periodic maintenance strategies can optimise costs
by integrating automatic data-driven predictive maintenance models. These models monitor machine and component
states, and research on them has evolved from statistical to more complex machine learning techniques. Nowadays, the
main research focus is on deep learning models.
The main objective of this survey is to analyse the implementation of state-of-the-art deep learning techniques in the
field of predictive maintenance. For that purpose, several analyses and research works are reviewed throughout the work,
which are summarised in this paragraph. First, the most relevant factors and characteristics of industrial
and PdM datasets are presented. Secondly, the steps necessary to perform PdM are presented in a methodological way.
Afterwards, statistical and traditional machine learning techniques for PdM are reviewed, in order to gain knowledge
of the baseline models on which some deep learning implementations are based. Then, a thorough review of deep
learning state-of-the-art works is performed, classifying the works by their underlying technique and data typology and
comparing them with one another, which enables the methods to be compared in a structured way. Related reviews on DL for PdM
are also analysed, highlighting their main conclusions. Thereafter, a summary of the main public PdM datasets is
presented and SotA results are compared on the turbofan engine degradation simulation dataset. Moreover, the suitability
and impact of deep learning in the field of predictive maintenance is presented, together with a comparison with other
data-driven methods. In addition, the systematisation of deep learning model development for predictive maintenance
is discussed. Finally, the application of these models in real industrial use cases is discussed, analysing their applicability
beyond public benchmark datasets and research environments.
As stated before, industrial companies that want to optimise their maintenance operations should transition towards
predictive maintenance. However, this automation should progress from simpler to more complex models, always
choosing the ones that best fit their specific needs. Both domain experts and data scientists should collaborate in
the development and validation of a PdM structure. This hybrid model could benefit from the advantages of both domain
knowledge-based and data-driven approaches, resulting in an accurate yet interpretable model. Explainable machine
learning applied to deep learning could be an alternative to white-box and grey-box models, which are more interpretable
but less accurate. These new models may achieve a trade-off between accuracy and explainability, integrating with
domain knowledge technicians, who can use them as a tool to perform PdM and gain knowledge from the data while
contrasting it with theoretical background and domain expertise.
Industrial companies nowadays have collected much data by monitoring assets under normal working conditions
and little to no failure data. Therefore, research on unsupervised and one-class classification algorithms is relevant for
the predictive maintenance field. Concretely, architectures like autoencoders or deep belief networks with LSTMs or CNNs
are among the most researched architectures that enable unsupervised time-series data modelling. Nonetheless,
the design and optimisation of DL architectures is mainly guided by previous experience and trial and error.
To sum up, deep learning models have gained popularity in PdM due to their high accuracy, achieving state-of-the-art
results when trained with enough data. However, many works do not address other relevant aspects for PdM models
such as interpretability, real-time execution, novelty detection or uncertainty modelling, given that mainly laboratory
datasets have been used. These aspects are fundamental to transfer any machine learning model to real industrial use
cases and to run it in production.

REFERENCES
[1] Charles M. Able, Alan H. Baydush, Callistus Nguyen, Jacob Gersh, Alois Ndlovu, Igor Rebo, Jeremy Booth, Mario Perez, Benjamin Sintay, and
Michael T. Munley. 2016. A model for preemptive maintenance of medical linear accelerators-predictive maintenance. Radiation Oncology 11, 1
(2016), 36. https://doi.org/10.1186/s13014-016-0602-1
[2] Toyosi Toriola Ademujimi, Michael P. Brundage, and Vittaldas V. Prabhu. 2017. A Review of Current Machine Learning Techniques Used in
Manufacturing Diagnosis. In IFIP Advances in Information and Communication Technology, Hermann Lödding, Ralph Riedel, Klaus-Dieter Thoben,
Gregor von Cieminski, and Dimitris Kiritsis (Eds.), Vol. 513. Springer International Publishing, Cham, 407–415.
https://doi.org/10.1007/978-3-319-66923-6_48
[3] Partha Adhikari, Harsha Gururaja Rao, and Dipl.-Ing Matthias Buderath. 2018. Machine Learning based Data Driven Diagnostics & Prognostics
Framework for Aircraft Predictive Maintenance. 10th International Symposium on NDT in Aerospace, October 24-26, 2018, Dresden, Germany Ml
(2018), 1–15. https://www.ndt.net/article/aero2018/papers/We.5.B.3.pdf
[4] H. O.A. Ahmed, M. L.D. Wong, and A. K. Nandi. 2018. Intelligent condition monitoring method for bearing faults from highly compressed
measurements using sparse over-complete features. Mechanical Systems and Signal Processing 99 (2018), 459–477.
https://doi.org/10.1016/j.ymssp.2017.06.027
[5] Khalid F Al-Raheem and Waleed Abdul-Karem. 2011. Rolling bearing fault diagnostics using artificial neural networks based on Laplace wavelet
analysis. International Journal of Engineering, Science and Technology 2, 6 (2011). https://doi.org/10.4314/ijest.v2i6.63730
[6] Tsatsral Amarbayasgalan, Bilguun Jargalsaikhan, and Keun Ho Ryu. 2018. Unsupervised novelty detection using deep autoencoders with density
based clustering. Applied Sciences (Switzerland) 8, 9 (2018), 1468. https://doi.org/10.3390/app8091468
[7] Nagdev Amruthnath and Tarun Gupta. 2018. A research study on unsupervised machine learning algorithms for early fault detection in
predictive maintenance. In 2018 5th International Conference on Industrial Engineering and Applications, ICIEA 2018. IEEE, 355–361.
https://doi.org/10.1109/IEA.2018.8387124
[8] Vimal and Saxena. 2013. Assessment of Gearbox Fault Detection Using Vibration Signal Analysis and Acoustic Emission Technique. IOSR Journal
of Mechanical and Civil Engineering 7, 4 (2013), 52–60. https://doi.org/10.9790/1684-0745260
[9] Fazel Ansari, Robert Glawar, and Wilfried Sihn. 2020. Prescriptive Maintenance of CPPS by Integrating Multimodal Data with Dynamic Bayesian
Networks. In Machine Learning for Cyber Physical Systems, Jürgen Beyerer, Alexander Maier, and Oliver Niggemann (Eds.). Springer Berlin
Heidelberg, Berlin, Heidelberg, 1–8. https://doi.org/10.1007/978-3-662-59084-3_1
[10] Damla Arifoglu and Abdelhamid Bouchachia. 2017. Activity Recognition and Abnormal Behaviour Detection with Recurrent Neural Networks.
Procedia Computer Science 110 (2017), 86–93. https://doi.org/10.1016/j.procs.2017.06.121
[11] Olgun Aydin and Seren Guldamlasioglu. 2017. Using LSTM networks to predict engine condition on large scale data processing framework. In 2017
4th International Conference on Electrical and Electronics Engineering, ICEEE 2017. IEEE, 281–285. https://doi.org/10.1109/ICEEE2.2017.7935834
[12] Caglar Aytekin, Xingyang Ni, Francesco Cricri, and Emre Aksu. 2018. Clustering and Unsupervised Anomaly Detection with l2 Normalized
Deep Auto-Encoder Representations. In Proceedings of the International Joint Conference on Neural Networks, Vol. 2018-July. IEEE, 1–6.
https://doi.org/10.1109/IJCNN.2018.8489068 arXiv:1802.00187
[13] Giduthuri Sateesh Babu, Peilin Zhao, and Xiao Li Li. 2016. Deep convolutional neural network based regression approach for estimation of remaining
useful life. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
Shamkant B Navathe, Weili Wu, Shashi Shekhar, Xiaoyong Du, X Sean Wang, and Hui Xiong (Eds.), Vol. 9642. Springer International Publishing,
Cham, 214–228. https://doi.org/10.1007/978-3-319-32025-0_14
[14] A. H.A. Bakar, H. A. Illias, M. K. Othman, and H. Mokhlis. 2013. Identification of failure root causes using condition based monitoring data on a 33
kV switchgear. International Journal of Electrical Power and Energy Systems 47, 1 (2013), 305–312. https://doi.org/10.1016/j.ijepes.2012.11.007
[15] Dana H Ballard. 1987. Modular Learning in Neural Networks. In AAAI. 279–284.
[16] Marcia Baptista, Shankar Sankararaman, Ivo P. de Medeiros, Cairo Nascimento, Helmut Prendinger, and Elsa M.P. Henriques. 2018. Forecasting
fault events for predictive maintenance using data-driven techniques and ARMA modeling. Computers and Industrial Engineering 115 (2018), 41–53.
https://doi.org/10.1016/j.cie.2017.10.033
[17] T. Benkedjouh, K. Medjaher, N. Zerhouni, and S. Rechak. 2013. Remaining useful life estimation based on nonlinear feature reduction and support
vector regression. Engineering Applications of Artificial Intelligence 26, 7 (2013), 1751–1760. https://doi.org/10.1016/j.engappai.2013.02.006
[18] Olivier Blancke, Dragan Komljenovic, Antoine Tahan, Amélie Combette, Normand Amyot, Mélanie Lévesque, Claude Hudon, and Noureddine
Zerhouni. 2018. A Predictive Maintenance Approach for Complex Equipment Based on Petri Net Failure Mechanism Propagation Model. In
Proceedings of the European Conference of the PHM Society, Vol. 4. https://phmpapers.org/index.php/phme/article/view/434
[19] Najmeh Bolbolamiri, Maryam Setayesh Sanai, and Ahmad Mirabadi. 2012. Time-Domain Stator Current Condition Monitoring: Analyzing Point
Failures Detection by Kolmogorov-Smirnov (K-S) Test. International Journal of Electrical, Computer, Energetic, Electronic and Communication
Engineering 6, 6 (2012), 587–592.
[20] Sumon Kumar Bose, Bapi Kar, Mohendra Roy, Pradeep Kumar Gopalakrishnan, and Arindam Basu. 2019. AdepoS: Anomaly detection based power
saving for predictive maintenance using edge computing. In Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC.
ACM, 597–602. https://doi.org/10.1145/3287624.3287716 arXiv:1811.00873
[21] Tony Boutros and Ming Liang. 2011. Detection and diagnosis of bearing and cutting tool faults using hidden Markov models. Mechanical Systems
and Signal Processing 25, 6 (2011), 2102–2124. https://doi.org/10.1016/j.ymssp.2011.01.013
[22] R. C. Bromley and E. Bottomley. 1994. Failure modes, effects and criticality analysis (FMECA). IEE Colloquium (Digest) 52 (1994).
https://doi.org/10.1002/9781118312575.ch12
[23] Dario Bruneo and Fabrizio De Vita. 2019. On the use of LSTM networks for predictive maintenance in smart industries. In Proceedings - 2019 IEEE
International Conference on Smart Computing, SMARTCOMP 2019. IEEE, 241–248. https://doi.org/10.1109/SMARTCOMP.2019.00059
[24] Juan Pablo Cabezas Rodríguez. 2019. Generative Adversarial Network Based Model for Multi-Domain. Universidad de Chile (2019).

[25] Diego Cabrera, Fernando Sancho, Chuan Li, Mariela Cerrada, René Vinicio Sánchez, Fannia Pacheco, and José Valente de Oliveira. 2017. Automatic
feature extraction of time-series applied to fault severity assessment of helical gearbox in stationary and non-stationary speed operation. Applied
Soft Computing Journal 58 (2017), 53–64. https://doi.org/10.1016/j.asoc.2017.04.016
[26] Mikel Canizo, Enrique Onieva, Angel Conde, Santiago Charramendieta, and Salvador Trujillo. 2017. Real-time predictive maintenance for
wind turbines using Big Data frameworks. In 2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017. 70–77.
https://doi.org/10.1109/ICPHM.2017.7998308
[27] Christer Carlsson, Markku Heikkilä, and József Mezei. 2016. Fuzzy entropy used for predictive analytics. Vol. 341. Springer International Publishing,
Cham, 187–209. https://doi.org/10.1007/978-3-319-31093-0_9
[28] Thyago P. Carvalho, Fabrízzio A.A.M.N. Soares, Roberto Vita, Roberto da P. Francisco, João P. Basto, and Symone G.S. Alcalá. 2019. A systematic
literature review of machine learning methods applied to predictive maintenance. Computers and Industrial Engineering 137 (2019), 106024.
https://doi.org/10.1016/j.cie.2019.106024
[29] Philippe Castagliola, Giovanni Celano, and Stelios Psarakis. 2011. Monitoring the coefficient of variation using EWMA charts. Journal of Quality
Technology 43, 3 (2011), 249–265. https://doi.org/10.1080/00224065.2011.11917861
[30] Vítor Cerqueira, Fábio Pinto, Claudio Sá, and Carlos Soares. 2016. Combining boosted trees with metafeature engineering for predictive maintenance.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9897 LNCS (2016),
393–397. https://doi.org/10.1007/978-3-319-46349-0_35
[31] Raghavendra Chalapathy and Sanjay Chawla. 2019. Deep Learning for Anomaly Detection: A Survey. arXiv preprint (2019). arXiv:1901.03407
[32] Raghavendra Chalapathy, Aditya Krishna Menon, and Sanjay Chawla. 2018. Anomaly Detection using One-Class Neural Networks. arXiv preprint
(2018). arXiv:1802.06360
[33] Peter Chemweno, Ido Morag, Mohammad Sheikhalishahi, Liliane Pintelon, Peter Muchiri, and James Wakiru. 2016. Development of a novel
methodology for root cause analysis and selection of maintenance strategy for a thermal power plant: A data exploration approach. Engineering
Failure Analysis 66 (2016), 19–34. https://doi.org/10.1016/j.engfailanal.2016.04.001
[34] Renxiang Chen, Siyang Chen, Miao He, David He, and Baoping Tang. 2017. Rolling bearing fault severity identification using deep sparse
auto-encoder network with noise added sample expansion. Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and
Reliability 231, 6 (2017), 666–679. https://doi.org/10.1177/1748006X17726452
[35] Zhiqiang Chen, Shengcai Deng, Xudong Chen, Chuan Li, René Vinicio Sanchez, and Huafeng Qin. 2017. Deep neural networks-based rolling
bearing fault diagnosis. Microelectronics Reliability 75 (2017), 327–333. https://doi.org/10.1016/j.microrel.2017.03.006
[36] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the Properties of Neural Machine Translation:
Encoder–Decoder Approaches. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for
Computational Linguistics, Doha, Qatar, 103–111. https://doi.org/10.3115/v1/W14-4012
[37] Chris Coleman, Satish Damodaran, Mahesh Chandramouli, and Ed Deuel. 2017. Making maintenance smarter. Deloitte University Press (2017),
1–21.
[38] Hector Cortes, Joanna Daaboul, Julien Le Duigou, and Benoît Eynard. 2016. Strategic Lean Management: Integration of operational Performance
Indicators for strategic Lean management. IFAC-PapersOnLine 49, 12 (2016), 65–70. https://doi.org/10.1016/j.ifacol.2016.07.551
[39] Camila Ferreira Costa and Mario A. Nascimento. 2016. IDA 2016 industrial challenge: Using machine learning for predicting failures. In Lecture Notes
in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Henrik Boström, Arno Knobbe,
Carlos Soares, and Panagiotis Papapetrou (Eds.), Vol. 9897 LNCS. Springer International Publishing, Cham, 381–386.
https://doi.org/10.1007/978-3-319-46349-0_33
[40] M. Demetgul. 2013. Fault diagnosis on production systems with support vector machine and decision trees algorithms. International Journal of
Advanced Manufacturing Technology 67, 9-12 (2013), 2183–2194. https://doi.org/10.1007/s00170-012-4639-5
[41] Jason Deutsch and David He. 2017. Using Deep Learning-Based Approach to Predict Remaining Useful Life of Rotating Components. IEEE
Transactions on Systems, Man, and Cybernetics: Systems 48, 1 (2017), 11–20. https://doi.org/10.1109/TSMC.2017.2697842
[42] Balbir S Dhillon. 2002. Engineering maintenance: a modern approach. CRC Press. 1–224 pages.
[43] Alberto Diez-Olivan, Jose A. Pagan, Ricardo Sanz, and Basilio Sierra. 2017. Data-driven prognostics using a combination of constrained K-means
clustering, fuzzy modeling and LOF-based score. Neurocomputing 241 (2017), 97–107. https://doi.org/10.1016/j.neucom.2017.02.024
[44] Don Sanger. 2017. Reactive, Preventive & Predictive Maintenance | IVC Technologies.
https://ivctechnologies.com/2017/08/29/reactive-preventive-predictive-maintenance/
[45] Tiago Dos Santos, Fernando J.T.E. Ferreira, Joao Moura Pires, and Carlos Damasio. 2017. Stator winding short-circuit fault diagnosis in induction
motors using random forest. In 2017 IEEE International Electric Machines and Drives Conference, IEMDC 2017. 1–8.
https://doi.org/10.1109/IEMDC.2017.8002350
[46] Bruce W. Drinkwater, Jie Zhang, Katherine J. Kirk, Jocelyn Elgoyhen, and Rob S. Dwyer-Joyce. 2009. Ultrasonic measurement of rolling bearing
lubrication using piezoelectric thin films. Journal of Tribology 131, 1 (2009), 1–8. https://doi.org/10.1115/1.3002324
[47] Gopi Krishna Durbhaka and Barani Selvaraj. 2016. Predictive maintenance for wind turbine diagnostics using vibration signal analysis based on
collaborative recommendation approach. In 2016 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2016.
1839–1842. https://doi.org/10.1109/ICACCI.2016.7732316

[48] Samuel Eke, Thomas Aka-Ngnui, Guy Clerc, and Issouf Fofana. 2017. Characterization of the operating periods of a power transformer by clustering
the dissolved gas data. In Proceedings of the 2017 IEEE 11th International Symposium on Diagnostics for Electrical Machines, Power Electronics and
Drives, SDEMPED 2017, Vol. 2017-January. 298–303. https://doi.org/10.1109/DEMPED.2017.8062371
[49] I. Y. Elnasharty, A. K. Kassem, M. Sabsabi, and M. A. Harith. 2011. Diagnosis of lubricating oil by evaluating cyanide and carbon molecular emission
lines in laser induced breakdown spectra. Spectrochimica Acta - Part B Atomic Spectroscopy 66, 8 (2011), 588–593.
https://doi.org/10.1016/j.sab.2011.06.001
[50] Ahmed Elsheikh, Soumaya Yacout, and Mohamed Salah Ouali. 2019. Bidirectional handshaking LSTM for remaining useful life prediction.
Neurocomputing 323 (2019), 148–156. https://doi.org/10.1016/j.neucom.2018.09.076
[51] Fuzhou Feng, Guoqiang Rao, Pengcheng Jiang, and Aiwei Si. 2012. Research on early fault diagnosis for rolling bearing based on permutation
entropy algorithm. In Proceedings of IEEE 2012 Prognostics and System Health Management Conference, PHM-2012. 1–5.
https://doi.org/10.1109/PHM.2012.6228833
[52] Javier Fernandez-Anakabe, Ekhi Zugasti Uriguen, and Urko Zurutuza Ortega. 2019. An Attribute Oriented Induction based Methodology for Data
Driven Predictive Maintenance. arXiv preprint (2019). arXiv:1912.00662
[53] Grant S Galloway, Victoria M Catterson, Thomas Fay, Andrew Robb, and Craig Love. 2016. Diagnosis of Tidal Turbine Vibration Data through
Deep Neural Networks. Proceedings of the Third European Conference of the Prognostics and Health Management Society 2016 (2016), 172–180.
https://strathprints.strath.ac.uk/57127/
[54] Fausto Pedro García Márquez, Andrew Mark Tobias, Jesús María Pinar Pérez, and Mayorkinos Papaelias. 2012. Condition monitoring of wind
turbines: Techniques and methods. Renewable Energy 46 (2012), 169–178. https://doi.org/10.1016/j.renene.2012.03.003
[55] A. Garg, V. Vijayaraghavan, K. Tai, Pravin M. Singru, Vishal Jain, and Nikilesh Krishnakumar. 2015. Model development based on evolutionary
framework for condition monitoring of a lathe machine. Measurement: Journal of the International Measurement Confederation 73 (2015), 95–110.
https://doi.org/10.1016/j.measurement.2015.04.025
[56] Aurélien Géron. 2017. Hands-on machine learning with Scikit-Learn and TensorFlow: concepts, tools, and techniques to build intelligent systems.
O’Reilly Media. 572 pages.
[57] G. H. Golub and C. Reinsch. 1970. Singular value decomposition and least squares solutions. In Numerische Mathematik. Vol. 14. Springer, 403–420.
https://doi.org/10.1007/BF02163027
[58] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org.
[59] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014.
Generative adversarial nets. In Advances in Neural Information Processing Systems, Vol. 3. 2672–2680. https://doi.org/10.3156/jsoft.29.5_177_2
[60] Tian Guo, Zhao Xu, Xin Yao, Haifeng Chen, Karl Aberer, and Koichi Funaya. 2016. Robust online time series prediction with recurrent neural
networks. In Proceedings - 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016. IEEE, 816–825.
https://doi.org/10.1109/DSAA.2016.92
[61] Xiaojie Guo, Liang Chen, and Changqing Shen. 2016. Hierarchical adaptive deep convolution neural network and its application to bearing fault
diagnosis. Measurement: Journal of the International Measurement Confederation 93 (2016), 490–502.
https://doi.org/10.1016/j.measurement.2016.07.054
[62] Clemens Gutschi, Nikolaus Furian, Josef Suschnigg, Dietmar Neubacher, and Siegfried Voessner. 2019. Log-based predictive maintenance in discrete
parts manufacturing. Procedia CIRP 79 (2019), 528–533. https://doi.org/10.1016/j.procir.2019.02.098
[63] Liu Hao, Xiong Xin, Wang Xiaojing, Guo Jiayu, and Shen Jiexi. 2017. Health Assessment of Rolling Bearing based on Self-organizing Map and
Restricted Boltzmann Machine. Journal of Mechanical Transmission 6 (2017), 5.
[64] G. E. Hinton and R. R. Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504–507.
https://doi.org/10.1126/science.1127647
[65] Sepp Hochreiter. 1991. Untersuchungen zu dynamischen neuronalen Netzen. Master’s thesis, Institut für Informatik, Technische Universität, München
91, 1 (1991), 1–71. http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf
[66] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[67] Sheng Hong and Jiawei Yin. 2018. Remaining useful life prediction of bearing based on deep perceptron neural networks. ACM International
Conference Proceeding Series 48 (2018), 175–179. https://doi.org/10.1145/3289430.3289438
[68] Yang Huang, Chiun Hsun Chen, and Chi Jui Huang. 2019. Motor fault detection and feature extraction using RNN-based variational autoencoder.
IEEE Access 7 (2019), 139086–139096. https://doi.org/10.1109/ACCESS.2019.2940769
[69] Soonsung Hwang, Jongpil Jeong, and Youngbin Kang. 2018. SVM-RBM based predictive maintenance scheme for IoT-enabled smart factory. In 2018
13th International Conference on Digital Information Management, ICDIM 2018. IEEE, 162–167. https://doi.org/10.1109/ICDIM.2018.8847132
[70] Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, and Pierre Alain Muller. 2019. Deep learning for time series
classification: a review. Data Mining and Knowledge Discovery 33, 4 (2019), 917–963. https://doi.org/10.1007/s10618-019-00619-1
[71] R. Jegadeeshwaran and V. Sugumaran. 2015. Fault diagnosis of automobile hydraulic brake system using statistical features and support vector
machines. Mechanical Systems and Signal Processing 52-53, 1 (2015), 436–446. https://doi.org/10.1016/j.ymssp.2014.08.007
[72] Pallabi Kakati, Devendra Dandotiya, and Bhaskar Pal. 2019. Remaining useful life predictions for turbofan engine degradation using online long
short-term memory network. ASME 2019 Gas Turbine India Conference, GTINDIA 2019 2 (2019), 34. https://doi.org/10.1115/GTINDIA2019-2368

[73] Ameeth Kanawaday and Aditya Sane. 2018. Machine learning for predictive maintenance of industrial machines using IoT sensor data. In
Proceedings of the IEEE International Conference on Software Engineering and Service Sciences, ICSESS, Vol. 2017-November. 87–90.
https://doi.org/10.1109/ICSESS.2017.8342870
[74] Dimitrios Kateris, Dimitrios Moshou, Xanthoula Eirini Pantazi, Ioannis Gravalos, Nader Sawalhi, and Spiros Loutridis. 2014. A machine learning
approach for the condition monitoring of rotating machinery. Journal of Mechanical Science and Technology 28, 1 (2014), 61–71.
https://doi.org/10.1007/s12206-013-1102-y
[75] Samir Khan and Takehisa Yairi. 2018. A review on the application of deep learning in system health management. Mechanical Systems and Signal
Processing 107 (2018), 241–265. https://doi.org/10.1016/j.ymssp.2017.11.024
[76] Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational bayes. In 2nd International Conference on Learning Representations, ICLR
2014 - Conference Track Proceedings, Vol. 1. arXiv:1312.6114
[77] T Kohonen. 1990. The self-organizing map. Proc. IEEE 78, 9 (1990), 1464–1480.
[78] N. Kolokas, T. Vafeiadis, D. Ioannidis, and D. Tzovaras. 2018. Forecasting faults of industrial equipment using machine learning classifiers. In 2018 IEEE
(SMC) International Conference on Innovations in Intelligent Systems and Applications, INISTA 2018. 1–6. https://doi.org/10.1109/INISTA.2018.8466309
[79] Björn Kroll, David Schaffranek, Sebastian Schriegel, and Oliver Niggemann. 2014. System modeling based on machine learning for anomaly
detection and predictive maintenance in industrial plants. 19th IEEE International Conference on Emerging Technologies and Factory Automation,
ETFA 2014 (2014). https://doi.org/10.1109/ETFA.2014.7005202
[80] J Lacaille, A Gouby, W Bense, T Rabenoro, and M Abdel-Sayed. 2015. Turbofan engine monitoring with health state identification and remaining
useful life anticipation. International Journal of Condition Monitoring 5, 2 (2015), 8–16. https://doi.org/10.1784/204764215815848375
[81] Yuval Lavi. 2018. The Rewards and Challenges of Predictive Maintenance. InfoQ (Jul 2018).
https://www.infoq.com/articles/predictive-maintenance-industrial-iot/
[82] Y. Le Cun and Françoise Fogelman-Soulié. 1987. Modèles connexionnistes de l’apprentissage. Intellectica. Revue de l’Association pour la Recherche
Cognitive 2, 1 (1987), 114–143. https://doi.org/10.3406/intel.1987.1804
[83] Mitchell Lebold, Karl Reichard, Carl S Byington, and Rolf Orsagh. 2002. OSA-CBM architecture development with emphasis on XML implementations.
In Maintenance and Reliability Conference (MARCON). pp. 6–8.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.4066&rep=rep1&type=pdf
[84] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. 1989. Backpropagation Applied to Handwritten Zip
Code Recognition. Neural Computation 1, 4 (1989), 541–551. https://doi.org/10.1162/neco.1989.1.4.541
[85] Dongjin Lee. 2019. Evaluating reliability of complex systems for Predictive maintenance. arXiv preprint (2019). arXiv:1902.03495
[86] Dongjin Lee and Rong Pan. 2017. Predictive maintenance of complex system with multi-level reliability structure. International Journal of Production
Research 55, 16 (2017), 4785–4801. https://doi.org/10.1080/00207543.2017.1299947
[87] Jay Lee, Jun Ni, Dragan Djurdjanovic, Hai Qiu, and Haitao Liao. 2006. Intelligent prognostics tools and e-maintenance. Computers in Industry 57, 6
(2006), 476–489. https://doi.org/10.1016/j.compind.2006.02.014
[88] Yong Oh Lee, Jun Jo, and Jongwoon Hwang. 2017. Application of deep neural network and generative adversarial network to industrial maintenance:
A case study of induction motor fault detection. In Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017, Vol. 2018-January.
IEEE, 3248–3253. https://doi.org/10.1109/BigData.2017.8258307
[89] Erik Lejon, Petter Kyösti, and John Lindström. 2018. Machine learning for detection of anomalies in press-hardening: Selection of efficient methods.
Procedia CIRP 72 (2018), 1079–1083. https://doi.org/10.1016/j.procir.2018.03.221
[90] Dacheng Li and Jinji Gao. 2010. Study and application of Reliability-centered Maintenance considering Radical Maintenance. Journal of Loss
Prevention in the Process Industries 23, 5 (2010), 622–629. https://doi.org/10.1016/j.jlp.2010.06.008
[91] Ling Li, Min Liu, Weiming Shen, and Guoqing Cheng. 2017. An expert knowledge-based dynamic maintenance task assignment model using
discrete stress-strength interference theory. Knowledge-Based Systems 131 (2017), 135–148. https://doi.org/10.1016/j.knosys.2017.06.008
[92] Pin Li, Xiaodong Jia, Jianshe Feng, Feng Zhu, Marcella Miller, Liang Yu Chen, and Jay Lee. 2020. A novel scalable method for machine degradation
assessment using deep convolutional neural network. Measurement: Journal of the International Measurement Confederation 151 (2020), 107106.
https://doi.org/10.1016/j.measurement.2019.107106
[93] Xiang Li, Qian Ding, and Jian Qiao Sun. 2018. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliability
Engineering and System Safety 172 (2018), 1–11. https://doi.org/10.1016/j.ress.2017.11.021
[94] Y. Li, T. R. Kurfess, and S. Y. Liang. 2000. Stochastic prognostics for rolling element bearings. Mechanical Systems and Signal Processing 14, 5 (2000),
747–762. https://doi.org/10.1006/mssp.2000.1301
[95] Zefang Li, Huajing Fang, Ming Huang, Ying Wei, and Linlan Zhang. 2018. Data-driven bearing fault identification using improved hidden Markov
model and self-organizing map. Computers and Industrial Engineering 116 (2018), 37–46. https://doi.org/10.1016/j.cie.2017.12.002
[96] Haitao Liao, Wenbiao Zhao, and Huairui Guo. 2006. Predicting remaining useful life of an individual unit using proportional hazards model and
logistic regression model. In Proceedings - Annual Reliability and Maintainability Symposium. IEEE, 127–132.
https://doi.org/10.1109/RAMS.2006.1677362
[97] Linxia Liao, Wenjing Jin, and Radu Pavel. 2016. Enhanced Restricted Boltzmann Machine with Prognosability Regularization for Prognostics and
Health Assessment. IEEE Transactions on Industrial Electronics 63, 11 (2016), 7076–7083. https://doi.org/10.1109/TIE.2016.2586442

[98] Linxia Liao and Felix Köttig. 2016. A hybrid framework combining data-driven and model-based methods for system remaining useful life
prediction. Applied Soft Computing Journal 44 (2016), 191–199. https://doi.org/10.1016/j.asoc.2016.03.013
[99] M. Lichman. 2013. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
[100] Benjamin Lindemann, Fabian Fesenmayr, Nasser Jazdi, and Michael Weyrich. 2019. Anomaly detection in discrete manufacturing using self-learning
approaches. Procedia CIRP 79 (2019), 313–318. https://doi.org/10.1016/j.procir.2019.02.073
[101] André Listou Ellefsen, Emil Bjørlykhaug, Vilmar Æsøy, Sergey Ushakov, and Houxiang Zhang. 2019. Remaining useful life predictions for
turbofan engine degradation using semi-supervised deep architecture. Reliability Engineering and System Safety 183 (2019), 240–251.
https://doi.org/10.1016/j.ress.2018.11.027
[102] Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A.W.M. van der
Laak, Bram van Ginneken, and Clara I. Sánchez. 2017. A survey on deep learning in medical image analysis. Medical Image Analysis 42 (2017),
60–88. https://doi.org/10.1016/j.media.2017.07.005 arXiv:1702.05747
[103] Puyin Liu. 2001. Approximation capabilities of multilayer feedforward regular fuzzy neural networks. Applied Mathematics 16, 1 (2001), 45–57.
https://doi.org/10.1007/s11766-001-0036-9
[104] Ruonan Liu, Guotao Meng, Boyuan Yang, Chuang Sun, and Xuefeng Chen. 2017. Dislocated Time Series Convolutional Neural Architecture:
An Intelligent Fault Diagnosis Approach for Electric Machine. IEEE Transactions on Industrial Informatics 13, 3 (2017), 1310–1320.
https://doi.org/10.1109/TII.2016.2645238
[105] X. T. Liu, F. Z. Feng, and A. W. Si. 2012. Condition based monitoring, diagnosis and maintenance on operating equipments of a hydraulic generator
unit. In IOP Conference Series: Earth and Environmental Science, Vol. 15. IOP Publishing, 42014. https://doi.org/10.1088/1755-1315/15/4/042014
[106] Moustapha Lo, Nicolas Valot, Florence Maraninchi, and Pascal Raymond. 2018. Real-time on-Board Manycore Implementation of a Health
Monitoring System: Lessons Learnt.
[107] Bo Lu, John Stuber, and Thomas F. Edgar. 2018. Data-driven adaptive multiple model system utilizing growing self-organizing maps. Journal of
Process Control 67 (2018), 56–68. https://doi.org/10.1016/j.jprocont.2017.06.006
[108] Chen Lu, Zhen Ya Wang, Wei Li Qin, and Jian Ma. 2017. Fault diagnosis of rotary machinery components using a stacked denoising
autoencoder-based health state identification. Signal Processing 130 (2017), 377–388. https://doi.org/10.1016/j.sigpro.2016.07.028
[109] Dusko Lukac. 2016. The fourth ICT-based industrial revolution "industry 4.0" - HMI and the case of CAE/CAD innovation with EPLAN P8. In 2015
23rd Telecommunications Forum, TELFOR 2015. IEEE, 835–838. https://doi.org/10.1109/TELFOR.2015.7377595
[110] Yonghong Luo, Xiangrui Cai, Ying Zhang, Jun Xu, and Xiaojie Yuan. 2018. Multivariate Time Series Imputation with Generative Adversarial
Networks. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and
R. Garnett (Eds.). Curran Associates, Inc., 1596–1607.
http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks.pdf
[111] Sture Lygren, Marco Piantanida, and Alfonso Amendola. 2019. Unsupervised, deep learning-based detection of failures in industrial equipments:
The future of predictive maintenance. In Society of Petroleum Engineers - Abu Dhabi International Petroleum Exhibition and Conference 2019, ADIP
2019. Society of Petroleum Engineers. https://doi.org/10.2118/197629-ms
[112] Alireza Makhzani and Brendan Frey. 2013. K-sparse autoencoders. arXiv preprint (2013). arXiv:1312.5663
[113] Pankaj Malhotra, Vishnu TV, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, and Gautam Shroff. 2016. Multi-Sensor
Prognostics using an Unsupervised Health Index based on LSTM Encoder-Decoder. arXiv preprint (2016). arXiv:1608.06154
[114] Julio Martinez, Christianne Dennison, and Zhengyi Lian. [n.d.]. Sequence Based Classification for Predictive Maintenance. ([n. d.]).
[115] Giovanna Martínez-Arellano and Svetan Ratchev. 2019. Towards an active learning approach to tool condition monitoring with bayesian deep
learning. Proceedings - European Council for Modelling and Simulation, ECMS 33, 1 (2019), 223–229. https://doi.org/10.7148/2019-0223
[116] Vimala Mathew, Tom Toby, Vikram Singh, B. Maheswara Rao, and M. Goutham Kumar. 2018. Prediction of Remaining Useful Lifetime (RUL) of
turbofan engine using machine learning. In IEEE International Conference on Circuits and Systems, ICCS 2017, Vol. 2018-January. IEEE, 306–311.
https://doi.org/10.1109/ICCS1.2017.8326010
[117] Mohammad Rezazadeh Mehrjou, Norman Mariun, Mohammad Hamiruce Marhaban, and Norhisam Misron. 2011. Rotor fault condition monitoring
techniques for squirrel-cage induction machine - A review. Mechanical Systems and Signal Processing 25, 8 (2011), 2827–2848.
https://doi.org/10.1016/j.ymssp.2011.05.007
[118] Gabriel Michau, Yang Hu, Thomas Palmé, and Olga Fink. 2020. Feature learning for fault detection in high-dimensional condition monitoring
signals. Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability 234, 1 (2020), 104–115.
https://doi.org/10.1177/1748006X19868335
[119] Mike Sondalini. 2019. Plant and Equipment Health Measurement, Assessment and Management.
https://www.lifetime-reliability.com/cms/machinery-health-measurement/
[120] Mohsin Munir, Steffen Erkel, Andreas Dengel, and Sheraz Ahmed. 2017. Pattern-based contextual anomaly detection in HVAC systems. In IEEE
International Conference on Data Mining Workshops, ICDMW, Vol. 2017-November. 1066–1073. https://doi.org/10.1109/ICDMW.2017.150
[121] Mohsin Munir, Shoaib Ahmed Siddiqui, Andreas Dengel, and Sheraz Ahmed. 2019. DeepAnT: A Deep Learning Approach for Unsupervised
Anomaly Detection in Time Series. IEEE Access 7 (2019), 1991–2005. https://doi.org/10.1109/ACCESS.2018.2886457
[122] Raji Murugan and Raju Ramasamy. 2015. Failure analysis of power transformer for effective maintenance planning in electric utilities. Engineering
Failure Analysis 55 (2015), 182–192. https://doi.org/10.1016/j.engfailanal.2015.06.002

[123] Anvardh Nanduri and Lance Sherry. 2016. Anomaly detection in aircraft data using Recurrent Neural Networks (RNN). In ICNS 2016: Securing an
Integrated CNS System to Meet Future Challenges. IEEE, 5C2-1. https://doi.org/10.1109/ICNSURV.2016.7486356
[124] NASA. 2020. Prognostics Center - Data Repository. https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#femto
[125] Richard E. Neapolitan and Richard E. Neapolitan. 2018. Neural Networks and Deep Learning. Determination Press. 389–411 pages.
https://doi.org/10.1201/b22400-15
[126] Qiming Niu. 2017. Remaining useful life prediction of bearings based on health index recurrent neural network. Boletin Tecnico/Technical Bulletin
55, 16 (2017), 585–590.
[127] Hyunseok Oh, Joon Ha Jung, Byung Chul Jeon, and Byeng Dong Youn. 2018. Scalable and Unsupervised Feature Engineering Using
Vibration-Imaging and Deep Learning for Rotor System Diagnosis. IEEE Transactions on Industrial Electronics 65, 4 (2018), 3539–3549.
https://doi.org/10.1109/TIE.2017.2752151
[128] C. Okoh, R. Roy, and J. Mehnen. 2017. Predictive Maintenance Modelling for Through-Life Engineering Services. Procedia CIRP 59 (2017), 196–201.
https://doi.org/10.1016/j.procir.2016.09.033
[129] Charles H. Oppenheimer and Kenneth A. Loparo. 2002. Physically based diagnosis and prognosis of cracked rotor shafts. In Component and Systems
Diagnostics, Prognostics, and Health Management II, Vol. 4733. International Society for Optics and Photonics, 122–132.
https://doi.org/10.1117/12.475502
[130] Claudio Passarella. 2018. Failure modes and effects analysis. Control 31, 10 (2018), 72–73. https://doi.org/10.1201/9781420039603.ch9
[131] Kaixiang Peng, Ruihua Jiao, Jie Dong, and Yanting Pi. 2019. A deep belief network based health indicator construction and remaining useful life
prediction using improved particle filter. Neurocomputing 361 (2019), 19–28. https://doi.org/10.1016/j.neucom.2019.07.075
[132] João Pereira. 2018. Unsupervised anomaly detection in time series data using deep learning. Ph.D. Dissertation. Instituto Superior Técnico (IST),
University of Lisbon.
[133] Lorenzo Perini. 2019. Predictive Maintenance for off-road vehicles based on Hidden Markov Models and Autoencoders for trend Anomaly Detection.
Ph.D. Dissertation. Politecnico di Torino.
[134] Marco A.F. Pimentel, David A. Clifton, Lei Clifton, and Lionel Tarassenko. 2014. A review of novelty detection. Signal Processing 99 (2014), 215–249.
https://doi.org/10.1016/j.sigpro.2013.12.026
[135] Samira Pouyanfar, Saad Sadiq, Yilin Yan, Haiman Tian, Yudong Tao, Maria Presa Reyes, Mei Ling Shyu, Shu Ching Chen, and S. S. Iyengar. 2018. A
survey on deep learning: Algorithms, techniques, and applications. Comput. Surveys 51, 5 (2018), 92. https://doi.org/10.1145/3234150
[136] K Prabakaran, S Kaushik, and R Mouleeshuwarapprabu. 2014. Radial Basis Neural Networks Based Fault Detection and Isolation Scheme for
Pneumatic Actuator. Journal of Engineering Computers & Applied Sciences 3, 9 (2014), 50–55.
[137] Satyabrata Pradhan, Rajveer Singh, Komal Kachru, and Srinivas Narasimhamurthy. 2007. A Bayesian network based approach for root-cause-analysis
in manufacturing process. In Proceedings - 2007 International Conference on Computational Intelligence and Security, CIS 2007. IEEE, 10–14.
https://doi.org/10.1109/CIS.2007.7
[138] Ashok Prajapati, James Bechtel, and Subramaniam Ganesan. 2012. Condition based maintenance: A survey. Journal of Quality in Maintenance
Engineering 18, 4 (2012), 384–400. https://doi.org/10.1108/13552511211281552
[139] Mahardhika Pratama, Eric Dimla, Tegoeh Tjahjowidodo, Witold Pedrycz, and Edwin Lughofer. 2018. Online Tool Condition Monitoring Based on
Parsimonious Ensemble+. IEEE Transactions on Cybernetics (2018). https://doi.org/10.1109/TCYB.2018.2871120
[140] Mona Khatami Rad, Mohammadehsan Torabizadeh, and Amin Noshadi. 2011. Artificial neural network-based fault diagnostics of an electric motor
using vibration monitoring. In Proceedings 2011 International Conference on Transportation, Mechanical, and Electrical Engineering, TMEE 2011. IEEE,
1512–1516. https://doi.org/10.1109/TMEE.2011.6199495
[141] Srinivasan Radhakrishnan and Sagar Kamarthi. 2016. Complexity-entropy feature plane for gear fault detection. In Proceedings - 2016 IEEE
International Conference on Big Data, Big Data 2016. 2057–2061. https://doi.org/10.1109/BigData.2016.7840830
[142] Emmanuel Ramasso. 2014. Investigating computational geometry for failure prognostics in presence of imprecise health indicator: Results and
comparisons on C-MAPSS datasets. In 2nd European Conference of the Prognostics and Health Management Society, Vol. 5. 1–13.
https://archivesic.ccsd.cnrs.fr/UNIV-BM/hal-01144999
[143] Emmanuel Ramasso and Abhinav Saxena. 2014. Performance benchmarking and analysis of prognostic methods for CMAPSS datasets. International
Journal of Prognostics and Health Management 5, 2 (2014), 1–15. https://hal.archives-ouvertes.fr/hal-01324587
[144] Kishore K. Reddy, Soumalya Sarkar, Vivek Venugopalan, and Michael Giering. 2016. Anomaly detection and fault disambiguation in large flight
data: A multi-modal deep auto-encoder approach. In Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM,
Vol. 2016-October. 192–199.
[145] Thomas Rieger, Stefanie Regier, Ingo Stengel, and Nathan Clarke. 2019. Fast predictive maintenance in Industrial Internet of Things (IIoT) with
Deep Learning (DL): A review. In CEUR Workshop Proceedings, Vol. 2348. 69–79.
[146] A.J. Robinson and F. Fallside. 1987. The utility driven dynamic error propagation network. University of Cambridge Department of Engineering
Cambridge.
[147] Mohendra Roy, Sumon Kumar Bose, Bapi Kar, Pradeep Kumar Gopalakrishnan, and Arindam Basu. 2019. A Stacked Autoencoder Neural Network
based Automated Feature Extraction Method for Anomaly detection in On-line Condition Monitoring. In Proceedings of the 2018 IEEE Symposium
Series on Computational Intelligence, SSCI 2018. IEEE, 1501–1507. https://doi.org/10.1109/SSCI.2018.8628810


[148] Rabee Rustum and Shaun Forrest. 2018. Fault Detection in the Activated Sludge Process using the Kohonen Self-Organising Map. In 8th International
Conference on Urban Planning, Architecture, Civil and Environment Engineering. Dubai, UAE. https://doi.org/10.15242/heaig.h1217807
[149] Mayu Sakurada and Takehisa Yairi. 2014. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In ACM International
Conference Proceeding Series, Vol. 02-December-2014. ACM, 4–11. https://doi.org/10.1145/2689746.2689747
[150] Manassakan Sanayha and Peerapon Vateekul. 2017. Fault detection for circulating water pump using time series forecasting and outlier
detection. In 2017 9th International Conference on Knowledge and Smart Technology: Crunching Information of Everything, KST 2017. 193–198.
https://doi.org/10.1109/KST.2017.7886095
[151] Abhinav Saxena, Kai Goebel, Don Simon, and Neil Eklund. 2008. Damage propagation modeling for aircraft engine run-to-failure simulation. 2008
International Conference on Prognostics and Health Management, PHM 2008 (2008). https://doi.org/10.1109/PHM.2008.4711414
[152] Paul Smolensky. 1986. Information processing in dynamical systems: Foundations of harmony theory. In Parallel Distributed Processing: Explorations
in the Microstructure of Cognition, Vol. 1. MIT Press, Cambridge, MA, USA, 194–281.
https://pdfs.semanticscholar.org/3ceb/e856001031cfd22438b9f0c2cd6a29136b27.pdf
[153] Sébastien Schwartz, Juan José Montero Jimenez, Michel Salaün, and Rob Vingerhoeds. 2020. A fault mode identification methodology based on
self-organizing map. Neural Computing and Applications (2020), 1–19. https://doi.org/10.1007/s00521-019-04692-x
[154] Sule Selcuk. 2017. Predictive maintenance, its implementation and latest trends. Proceedings of the Institution of Mechanical Engineers, Part B:
Journal of Engineering Manufacture 231, 9 (2017), 1670–1679. https://doi.org/10.1177/0954405415601640
[155] Oscar Serradilla, Ekhi Zugasti, Carlos Cernuda, Aandoitz Aranburu, Julian Ramirez de Okariz, and Urko Zurutuza. 2020. Interpreting Remaining
Useful Life estimations combining Explainable Artificial Intelligence and domain knowledge in industrial machinery. In 2020 IEEE International
Conference on Fuzzy Systems, FUZZ-IEEE 2020.
[156] Haidong Shao, Hongkai Jiang, Haizhou Zhang, and Tianchen Liang. 2018. Electric Locomotive Bearing Fault Diagnosis Using a Novel Convolutional
Deep Belief Network. IEEE Transactions on Industrial Electronics 65, 3 (2018), 2727–2736. https://doi.org/10.1109/TIE.2017.2745473
[157] Haidong Shao, Hongkai Jiang, Huiwei Zhao, and Fuan Wang. 2017. A novel deep autoencoder feature learning method for rotating machinery
fault diagnosis. Mechanical Systems and Signal Processing 95 (2017), 187–204. https://doi.org/10.1016/j.ymssp.2017.03.034
[158] Si Yu Shao, Wen Jun Sun, Ru Qiang Yan, Peng Wang, and Robert X. Gao. 2017. A Deep Learning Approach for Fault Diagnosis of Induction Motors
in Manufacturing. Chinese Journal of Mechanical Engineering (English Edition) 30, 6 (2017), 1347–1356. https://doi.org/10.1007/s10033-017-0189-y
[159] Michael Sharp, Thurston Sexton, and Michael P Brundage. 2017. Toward semi-autonomous information. In IFIP International Conference on
Advances in Production Management Systems. Springer, 425–432.
[160] Xiao Sheng Si, Wenbin Wang, Chang Hua Hu, Mao Yin Chen, and Dong Hua Zhou. 2013. A Wiener-process-based degradation model with
a recursive filter algorithm for remaining useful life estimation. Mechanical Systems and Signal Processing 35, 1-2 (2013), 219–237.
https://doi.org/10.1016/j.ymssp.2012.08.016
[161] Jiedi Sun, Changhong Yan, and Jiangtao Wen. 2018. Intelligent bearing fault diagnosis method combining compressed data acquisition and deep
learning. IEEE Transactions on Instrumentation and Measurement 67, 1 (2018), 185–195. https://doi.org/10.1109/TIM.2017.2759418
[162] Gian Antonio Susto, Alessandro Beghi, and Cristina De Luca. 2012. A predictive maintenance system for epitaxy processes based on filtering and
prediction techniques. IEEE Transactions on Semiconductor Manufacturing 25, 4 (2012), 638–649. https://doi.org/10.1109/TSM.2012.2209131
[163] Gian Antonio Susto, Andrea Schirru, Simone Pampuri, Seán McLoone, and Alessandro Beghi. 2015. Machine learning for predictive maintenance:
A multiple classifier approach. IEEE Transactions on Industrial Informatics 11, 3 (2015), 812–820. https://doi.org/10.1109/TII.2014.2349359
[164] Siqin Tao, Tao Zhang, Jun Yang, Xueqian Wang, and Weining Lu. 2015. Bearing fault diagnosis method based on stacked autoencoder and softmax
regression. In Chinese Control Conference, CCC, Vol. 2015-September. IEEE, 6331–6335. https://doi.org/10.1109/ChiCC.2015.7260634
[165] Peter Tavner, Li Ran, Jim Penman, and Howard Sedding. 2008. Condition monitoring of rotating electrical machines. Condition Monitoring of
Rotating Electrical Machines (2008), 1–250. https://doi.org/10.1049/PBPO056E
[166] UESystems. 2019. Understanding the P-F curve and its impact on reliability centered maintenance.
http://www.uesystems.com/news/understanding-the-p-f-curve-and-its-impact-on-reliability-centered-maintenance
[167] Muhammet Unal, Mustafa Onat, Mustafa Demetgul, and Haluk Kucuk. 2014. Fault diagnosis of rolling bearings using a genetic algorithm optimized
neural network. Measurement: Journal of the International Measurement Confederation 58 (2014), 187–196. https://doi.org/10.1016/j.measurement.2014.08.041
[168] UNE-EN 13306. 2018. Maintenance. Maintenance terminology. Standard. Asociación Española de Normalización, Génova, Madrid.
[169] Venkat Venkatasubramanian, Raghunathan Rengaswamy, Kewen Yin, and Surya N. Kavuri. 2003. A review of process fault detection and diagnosis
part I: Quantitative model-based methods. Computers and Chemical Engineering 27, 3 (2003), 293–311. https://doi.org/10.1016/S0098-1354(02)00160-6
[170] Wlamir Olivares Loesch Vianna and Takashi Yoneyama. 2018. Predictive Maintenance Optimization for Aircraft Redundant Systems Subjected to
Multiple Wear Profiles. IEEE Systems Journal 12, 2 (2018), 1170–1181. https://doi.org/10.1109/JSYST.2017.2667232
[171] Vorne. 2019. What Is OEE (Overall Equipment Effectiveness). https://www.oee.com/
[172] Hongzhi Wang, Mohamed Jaward Bah, and Mohamed Hammad. 2019. Progress in Outlier Detection Techniques: A Survey. IEEE Access 7 (2019),
107964–108000. https://doi.org/10.1109/ACCESS.2019.2932769
[173] Jindong Wang, Yiqiang Chen, Shuji Hao, Xiaohui Peng, and Lisha Hu. 2019. Deep learning for sensor-based activity recognition: A survey. Pattern
Recognition Letters 119 (2019), 3–11.


[174] Jinjiang Wang, Rui Zhao, Dongzhe Wang, Ruqiang Yan, Kezhi Mao, and Fei Shen. 2017. Machine health monitoring using local feature-based gated
recurrent unit networks. IEEE Transactions on Industrial Electronics 65, 2 (2017), 1539–1548. https://doi.org/10.1109/TIE.2017.2733438
[175] Peng Wang, Ananya, Ruqiang Yan, and Robert X. Gao. 2017. Virtualization and deep recognition for system fault classification. Journal of
Manufacturing Systems 44 (2017), 310–316. https://doi.org/10.1016/j.jmsy.2017.04.012
[176] Xinqing Wang, Jie Huang, Guoting Ren, and Dong Wang. 2017. A hydraulic fault diagnosis method based on sliding-window spectrum feature and
deep belief network. Journal of Vibroengineering 19, 6 (2017), 4272–4284. https://doi.org/10.21595/jve.2017.18549
[177] Zachary Allen Welz. 2017. Integrating Disparate Nuclear Data Sources for Improved Predictive Maintenance Modeling: Maintenance-Based Prognostics
for Long-Term Equipment Operation. Ph.D. Dissertation. University of Tennessee. https://trace.tennessee.edu/utk_graddiss/4667
[178] Juan Wen and Hongli Gao. 2018. Degradation assessment for the ball screw with variational autoencoder and kernel density estimation. Advances
in Mechanical Engineering 10, 9 (2018). https://doi.org/10.1177/1687814018797261
[179] Long Wen, Liang Gao, and Xinyu Li. 2019. A new deep transfer learning based on sparse auto-encoder for fault diagnosis. IEEE Transactions on
Systems, Man, and Cybernetics: Systems 49, 1 (2019), 136–144. https://doi.org/10.1109/TSMC.2017.2754287
[180] Tailai Wen and Roy Keyes. 2019. Time Series Anomaly Detection Using Convolutional Neural Networks and Transfer Learning. arXiv preprint
(2019). arXiv:1905.13628
[181] Paul J. Werbos. 1988. Generalization of backpropagation with application to a recurrent gas market model. Neural Networks 1, 4 (1988), 339–356.
https://doi.org/10.1016/0893-6080(88)90007-X
[182] Paul J. Werbos. 2005. Applications of advances in nonlinear sensitivity analysis. In System Modeling and Optimization. Springer, 762–770.
https://doi.org/10.1007/bfb0006203
[183] M. Woldman, T. Tinga, E. Van Der Heide, and M. A. Masen. 2015. Abrasive wear based predictive maintenance for systems operating in sandy
conditions. Wear 338-339 (2015), 316–324. https://doi.org/10.1016/j.wear.2015.07.004
[184] Zhenyu Wu, Hao Luo, Yunong Yang, Xinning Zhu, and Xiaofeng Qiu. 2018. An unsupervised degradation estimation framework for diagnostics
and prognostics in cyber-physical system. In IEEE World Forum on Internet of Things, WF-IoT 2018 - Proceedings, Vol. 2018-January. 784–789.
https://doi.org/10.1109/WF-IoT.2018.8355191
[185] Min Xia, Teng Li, Lizhi Liu, Lin Xu, and Clarence W. de Silva. 2017. Intelligent fault diagnosis approach with unsupervised feature learning by
stacked denoising autoencoder. IET Science, Measurement and Technology 11, 6 (2017), 687–695. https://doi.org/10.1049/iet-smt.2016.0423
[186] Fan Xu, Wai tai Peter Tse, and Yiu Lun Tse. 2018. Roller bearing fault diagnosis using stacked denoising autoencoder in deep learning and
Gath-Geva clustering algorithm without principal component analysis and data label. Applied Soft Computing Journal 73 (2018), 898–913.
https://doi.org/10.1016/j.asoc.2018.09.037
[187] Haowen Xu, Yang Feng, Jie Chen, Zhaogang Wang, Honglin Qiao, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian
Zhao, and Dan Pei. 2018. Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications. In Proceedings of
the 2018 World Wide Web Conference. 187–196. https://doi.org/10.1145/3178876.3185996
[188] Wenguang Yang, Chao Liu, and Dongxiang Jiang. 2018. An unsupervised spatiotemporal graphical modeling approach for wind turbine condition
monitoring. Renewable Energy 127 (2018), 230–241. https://doi.org/10.1016/j.renene.2018.04.059
[189] Mustagime Tulin Yildirim and Bulent Kurt. 2016. Engine health monitoring in an aircraft by using Levenberg-Marquardt Feedforward Neural
Network and Radial Basis Function Network. In Proceedings of the 2016 International Symposium on INnovations in Intelligent SysTems and
Applications, INISTA 2016. IEEE, 1–5. https://doi.org/10.1109/INISTA.2016.7571847
[190] Andre S. Yoon, Taehoon Lee, Yongsub Lim, Deokwoo Jung, Philgyun Kang, Dongwon Kim, Keuntae Park, and Yongjin Choi. 2017. Semi-supervised
Learning with Deep Generative Models for Asset Failure Prediction. arXiv preprint (2017). arXiv:1709.00845
[191] Jin Yuan, Yi Wang, and Kesheng Wang. 2019. LSTM based prediction and time-temperature varying rate fusion for hydropower plant anomaly
detection: A case study. In Lecture Notes in Electrical Engineering, Vol. 484. Springer, 86–94. https://doi.org/10.1007/978-981-13-2375-1_13
[192] Mei Yuan, Yuting Wu, and Li Lin. 2016. Fault diagnosis and remaining useful life estimation of aero engine using LSTM neural network. In AUS
2016 - 2016 IEEE/CSAA International Conference on Aircraft Utility Systems. IEEE, 135–140. https://doi.org/10.1109/AUS.2016.7748035
[193] Mitchell Yuwono, Yong Qin, Jing Zhou, Ying Guo, Branko G. Celler, and Steven W. Su. 2016. Automatic bearing fault diagnosis using particle swarm
clustering and Hidden Markov Model. Engineering Applications of Artificial Intelligence 47 (2016), 88–100. https://doi.org/10.1016/j.engappai.2015.03.007
[194] Simon Zhai, Alexander Riess, and Gunther Reinhart. 2019. Formulation and solution for the predictive maintenance integrated job shop scheduling
problem. In 2019 IEEE International Conference on Prognostics and Health Management, ICPHM 2019. 1–8. https://doi.org/10.1109/ICPHM.2019.8819397
[195] Bin Zhang, Shaohui Zhang, and Weihua Li. 2019. Bearing performance degradation assessment using long short-term memory recurrent network.
Computers in Industry 106 (2019), 14–29. https://doi.org/10.1016/j.compind.2018.12.016
[196] Chunkai Zhang and Yingyang Chen. 2019. Time Series Anomaly Detection with Variational Autoencoders. arXiv preprint (2019). arXiv:1907.01702
[197] Chi Zhang, Chetan Gupta, Ahmed Farahat, Kosta Ristovski, and Dipanjan Ghosh. 2019. Equipment health indicator learning using deep reinforcement
learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11053
LNAI. Springer, 488–504. https://doi.org/10.1007/978-3-030-10997-4_30
[198] Chong Zhang, Pin Lim, A. K. Qin, and Kay Chen Tan. 2017. Multiobjective Deep Belief Networks Ensemble for Remaining Useful Life Estimation
in Prognostics. IEEE Transactions on Neural Networks and Learning Systems 28, 10 (2017), 2306–2318. https://doi.org/10.1109/TNNLS.2016.2582798


[199] Chuxu Zhang, Dongjin Song, Yuncong Chen, Xinyang Feng, Cristian Lumezanu, Wei Cheng, Jingchao Ni, Bo Zong, Haifeng Chen, and Nitesh V.
Chawla. 2019. A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data. In Proceedings of the
AAAI Conference on Artificial Intelligence, Vol. 33. 1409–1416. https://doi.org/10.1609/aaai.v33i01.33011409
[200] Weiting Zhang, Dong Yang, and Hongchao Wang. 2019. Data-Driven Methods for Predictive Maintenance of Industrial Equipment: A Survey. IEEE
Systems Journal 13, 3 (2019), 2213–2227. https://doi.org/10.1109/JSYST.2019.2905565
[201] Xiaodong Zhang, Roger Xu, Chiman Kwan, Steven Y. Liang, Qiulin Xie, and Leonard Haynes. 2005. An integrated approach to bearing fault
diagnostics and prognostics. In Proceedings of the American Control Conference, Vol. 4. IEEE, 2750–2755. https://doi.org/10.1109/acc.2005.1470385
[202] Yongzhi Zhang, Rui Xiong, Hongwen He, and Michael G. Pecht. 2018. Long short-term memory recurrent neural network for remaining useful life
prediction of lithium-ion batteries. IEEE Transactions on Vehicular Technology 67, 7 (2018), 5695–5705. https://doi.org/10.1109/TVT.2018.2805189
[203] Zhongju Zhang and Pengzhu Zhang. 2015. Seeing around the corner: an analytic approach for predictive maintenance using sensor data. Journal
of Management Analytics 2, 4 (2015), 333–350. https://doi.org/10.1080/23270012.2015.1086704
[204] Fuqiong Zhao, Zhigang Tian, and Yong Zeng. 2013. Uncertainty quantification in gear remaining useful life prediction through an integrated
prognostics method. IEEE Transactions on Reliability 62, 1 (2013), 146–159. https://doi.org/10.1109/TR.2013.2241216
[205] Pushe Zhao, Masaru Kurihara, Junichi Tanaka, Tojiro Noda, Shigeyoshi Chikuma, and Tadashi Suzuki. 2017. Advanced correlation-based anomaly
detection method for predictive maintenance. In 2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017. IEEE,
78–83. https://doi.org/10.1109/ICPHM.2017.7998309
[206] Rui Zhao, Ruqiang Yan, Zhenghua Chen, Kezhi Mao, Peng Wang, and Robert X. Gao. 2019. Deep learning and its applications to machine health
monitoring. Mechanical Systems and Signal Processing 115 (2019), 213–237. https://doi.org/10.1016/j.ymssp.2018.05.050
[207] Rui Zhao, Ruqiang Yan, Jinjiang Wang, and Kezhi Mao. 2017. Learning to monitor machine health with convolutional Bi-directional LSTM networks.
Sensors (Switzerland) 17, 2 (2017), 273. https://doi.org/10.3390/s17020273
[208] Zhen Zhao, Fu Li Wang, Ming Xing Jia, and Shu Wang. 2010. Predictive maintenance policy based on process data. Chemometrics and Intelligent
Laboratory Systems 103, 2 (2010), 137–143. https://doi.org/10.1016/j.chemolab.2010.06.009
[209] Shuai Zheng, Kosta Ristovski, Ahmed Farahat, and Chetan Gupta. 2017. Long Short-Term Memory Network for Remaining Useful Life estimation. In
2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017. IEEE, 88–95. https://doi.org/10.1109/ICPHM.2017.7998311
[210] Zhenghua Zhou, Jianwei Zhao, and Feilong Cao. 2014. A novel approach for fault diagnosis of induction motor with invariant character vectors.
Information Sciences 281 (2014), 496–506. https://doi.org/10.1016/j.ins.2014.05.046
[211] Enrico Zio and Francesco Di Maio. 2010. A data-driven fuzzy approach for predicting the remaining useful life in dynamic failure scenarios of a
nuclear system. Reliability Engineering & System Safety 95, 1 (2010), 49–57. https://doi.org/10.1016/j.ress.2009.08.001
[212] Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Daeki Cho, and Haifeng Chen. 2018. Deep autoencoding Gaussian mixture
model for unsupervised anomaly detection. 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings
(2018).
[213] E. Zugasti, P. Arrillaga, J. Anduaga, M. A. Arregui, and F. Martínez. 2012. Sensor fault identification on laboratory tower. Proceedings of the 6th
European Workshop - Structural Health Monitoring 2012, EWSHM 2012 2 (2012), 1093–1100.
[214] Ekhi Zugasti, Mikel Iturbe, Inaki Garitano, and Urko Zurutuza. 2018. Null is not always empty: Monitoring the null space for field-level anomaly
detection in industrial IoT environments. 2018 Global Internet of Things Summit, GIoTS 2018 (2018). https://doi.org/10.1109/GIOTS.2018.8534574
