
Machine learning methods for identification and classification of events in ϕ-OTDR systems a review


Review / Vol. 61, No. 11 / 10 April 2022 / Applied Optics 2975

Machine learning methods for identification and classification of events in φ-OTDR systems: a review
Deus F. Kandamali,1,2,3 Xiaomin Cao,1,2 Manling Tian,1,2 Zhiyan Jin,1,2 Hui Dong,4 and Kuanglu Yu1,2,*
1 Institute of Information Science, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
2 Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
3 Department of Mathematics, Informatics and Computational Sciences, Sokoine University of Agriculture, Morogoro, Tanzania
4 Signal Processing, RF & Optical Department, Institute for Infocomm Research, A*Star Research Entities, 138632 Singapore, Singapore
* Corresponding author: klyu@bjtu.edu.cn

Received 1 October 2021; revised 27 February 2022; accepted 1 March 2022; posted 2 March 2022; published 4 April 2022

The phase sensitive optical time-domain reflectometer (ϕ-OTDR), called distributed acoustic sensing
(DAS) in some applications, has become a widely used technology for long-distance monitoring of
vibrational signals in recent years. Since ϕ-OTDR systems usually operate in complicated and dynamic
environments, they encounter multiple intrusion event signals as well as numerous noise interferences,
which have been a major stumbling block to the system’s efficiency and effectiveness. Many studies
have proposed techniques to mitigate this problem, mainly through ϕ-OTDR setup upgrades and
improvements in data processing. Most recently, machine learning methods for event classification,
which help identify and categorize intrusion events, have become a hot topic. In this paper, we
review recent technologies, from conventional machine learning algorithms to deep neural networks,
for event classification aimed at increasing the recognition/classification accuracy and reducing
nuisance alarm rates (NARs) in ϕ-OTDR systems. We present a comparative analysis of the current
classification methods and evaluate their performance in terms of classification accuracy, NAR,
precision, recall, identification time, and other parameters.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

https://doi.org/10.1364/AO.444811

1. INTRODUCTION

The phase sensitive optical time-domain reflectometer (ϕ-OTDR) [1] is a distributed fiber optic
sensing technique based on the detection of Rayleigh backscattered signals (RBSs); depending on the
application, some researchers refer to it as distributed acoustic sensing (DAS) or distributed
vibration sensing (DVS) [1]. Hereafter in this paper, we use ϕ-OTDR to refer to this sensor. Capable
of monitoring acoustic signals over a long distance, the fiber optic acoustic system is well suited
to identifying various external disturbances [2]. Distributed fiber optic sensors use optical fibers
as their sensing unit and are capable of measuring hundreds of thousands of points simultaneously
[3]. Over the years, various technologies have been developed for fiber optic acoustic sensing,
ranging from quasi-distributed to distributed optical fiber sensing, including, among others, fiber
Bragg grating (FBG) [2,4–7], Michelson interferometry (MI) [4,6,8–11], Fabry–Perot interferometer
(FPI) [12–14], Mach–Zehnder interferometry (MZI) [15–17], Sagnac interference (SI) [11,18,19], and
ϕ-OTDR, each with its own merits in terms of implementation cost, complexity, and performance.
However, recent years have witnessed the rise of ϕ-OTDR as the most extensively used sensing
technology owing to its capability to achieve distributed monitoring relatively effectively over a
long distance [3]. ϕ-OTDR systems have also attracted interest for several other reasons, including
high sensitivity, high dynamic range, full distribution, and a relatively easy processing scheme
compared to most optical fiber sensors [20]. Over the past few years, interest has grown tremendously
among researchers in both academia and industry in applying various data processing methods to
ϕ-OTDR systems to ensure efficient and effective event recognition and classification; those methods
are reviewed and discussed further in this paper.

A. Significance of φ-OTDR Systems

Recently, there have been tremendous advancements of ϕ-OTDR systems in a number of applications
including perimeter

1559-128X/22/112975-23 Journal © 2022 Optica Publishing Group



security surveillance [21], seismic wave prediction [22], airport runway monitoring of aircraft
takeoff and landing in place of the common radar systems [23], oil and gas pipeline safety and
integrity [24–31,31–43], and monitoring of underground tunnels, submarine power cables [44,45],
engineering structures [46], railways [1,34,47–49], bridges [50], underwater seismic signals [14],
and others. According to Timofeev [51], ϕ-OTDR systems are useful in areas with high electromagnetic
interference (EMI) owing to their material and inherent properties. They are stable and can operate
in dynamic environments regardless of weather conditions such as rain, fog, wind, and snow [51].
Long-distance monitoring capability was demonstrated in a 2014 study [28], in which Peng et al.
reported an ultra-long, high-sensitivity ϕ-OTDR system spanning over 131.5 km, which can be used for
the security of military bases and national borders owing to its high sensitivity and relatively low
nuisance alarm rates (NARs).

ϕ-OTDR systems are cost effective considering the monitoring length, and they can be integrated and
used to detect cracks in civil structures, which makes them suitable for health monitoring of
engineering buildings [52]. They can also be used for detection of engine anomalies for disaster
prevention, railway safety monitoring, chemical leakage monitoring for oil-gas pipelines [53], and
intrusion detection for perimeter security [52], even in places with high EMI [54]. Along with many
other benefits, ϕ-OTDR systems have a simple deployment structure; they also offer higher position
accuracy as well as multipoint vibration detection compared to similar systems [55,56].

B. Challenges Facing φ-OTDR Systems

However, ϕ-OTDR techniques have not always performed smoothly owing to the dynamic nature of events
and the noise in the environments they operate in; thus, more robust and sophisticated approaches
must be deployed to improve the accuracy of event detection [32]. Numerous challenges hinder the
effectiveness and efficiency of ϕ-OTDR event detection, and better denoising techniques are needed
to reduce NAR, along with better classification methods to accurately identify events and separate
them from other events.

According to Federov et al. in 2016, the key challenges facing ϕ-OTDR are signal interference caused
by excessive phase noise during photodetection, attenuation as signals travel over longer distances,
and false positives (or false alarms) caused by incorrect event identification, all of which may lead
to inefficient allocation of resources or delays in processing time, which can in turn lead to
destruction of property and/or fatal accidents [57]. According to Wu [58], ϕ-OTDR is constantly
affected by dynamic environmental changes such as rapid air movements, laser frequency drifts,
transient acoustic reference, environmental noises, and so forth, which can result in high NARs. In
2019, Adeel et al. [59] argued that the frequency drift of the laser source and the laser linewidth
result in Rayleigh noise as well as non-coherent addition of RBS interference. These introduce a
random noise relation between the input and the fiber response, which is responsible for certain
noise levels in differential RBS signals [59]. In [52], the authors conceded that detection in
coherent ϕ-OTDR systems is highly affected by fading noise and that even some available techniques
such as signal averaging and differentiating may not be adequate when it comes to high frequency
event detection. Shao et al. [60] argued that most ϕ-OTDR systems suffer from a low signal-to-noise
ratio (SNR) and that improvements in SNR are vital in order to increase accuracy in identifying and
locating external intrusions. According to Wu et al. [61], most ϕ-OTDR systems are affected by
intrinsically weak backscattered signals. To mitigate this challenge, Wu et al. [61] proposed the
fabrication of ultra-weak fiber Bragg gratings (UWFBGs) in single-mode fibers through Ti-doped silica
outer cladding for use in ϕ-OTDR. Butov et al. [62] also proposed a high Rayleigh scattering fiber
(HRF) to further increase the sensitivity of ϕ-OTDR by introducing a nitrogen-doped single-mode
fiber with enhanced Rayleigh scattering properties; however, those techniques would shorten the
measurement length.

Toward achieving an optimal pattern recognition method in ϕ-OTDR systems, the main hindering factor
is the absence of a single best feature extraction method, owing to the dynamic nature of data in
different environments [55].

Generally, there have been several attempts to mitigate the aforementioned ϕ-OTDR challenges,
including hardware improvements such as parallel computing [56], which are extremely costly, and
other methods such as Canny edge detection [49], which are not based on artificial intelligence (AI),
machine learning (ML), or deep learning (DL). According to many studies, data processing methods
based on AI, ML, and DL are more advanced approaches, as they do not modify but can be integrated
into current ϕ-OTDR data acquisition systems. They can provide better accuracy and lower NAR,
provided that large sets of field data are available. If such systems are trained well, they can
adapt to varying environmental conditions. In this review paper, we therefore survey, to the best of
our knowledge, as many works as possible on the ML methods (DL included) used for classifying events
in ϕ-OTDR systems.

C. Brief Introduction to This Review

We have briefly introduced ϕ-OTDR, the significance of ϕ-OTDR systems, and the challenges facing
ϕ-OTDR systems in Section 1. Section 2 briefly presents the underlying working principle of ϕ-OTDR.
The main objective of this review is to explore and evaluate ML and DL algorithms used for event
classification in ϕ-OTDR systems across a wide range of application domains, as presented in
Sections 5 and 6. Since a classifier in a ϕ-OTDR system requires well-structured and denoised signals
as input feature vectors in order to be efficient and effective, we briefly introduce various signal
preprocessing methods used in ϕ-OTDR in Section 3. Evaluation metrics are briefly introduced in
Section 4, while the discussion and summary are presented in Section 7. Finally, in Section 8, we
provide a general conclusion and recommendations for possible future research directions. In total,
we covered over 100 papers, many of them very recent (2020 to date), on event classification methods
in ϕ-OTDR systems from different domain areas. In Section 7, we present pros and cons of each
Table 1. Comparative Analysis of the Events Classification Methods in ϕ-OTDR

Classification Method | No. of Events | Fiber Length | Application Field | Spatial Resolution | Preprocessing Method | Accuracy | Precision | Recall | f-score | NAR | IDT
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
ANN [40] | 3 | 65 km | Oil pipeline safety monitoring | – | WPD | 94.4% | – | – | – | 5.6% | –
ANN [40] | 3 | 65 km | Oil pipeline safety monitoring | – | WD | 91.1% | – | – | – | 8.9% | –
MLP [63] | 2 | 17 km | Pipeline safety monitoring | 10 m | Signal smoothing and filtering | 99.88% | 99.87% | 99.80% | 0.99 | < 1 per month | 0.55 µs
C-GAN [64] | 3 | 20 km | Seismic wave detection | 10.3 m | – | 80.2% | – | – | 89.5% | 45% | –
GAN [64] | 3 | 5 km | Seismic wave detection | 5.5 m | – | 83% | – | – | 87.72% | 54% | –
PNN [65] | 4 | – | Safety alarm detection | – | FFT, power spectrum, WD | 98% | – | – | – | 1.5% | –
KNN + ANN [66] | 3 | – | Vehicle detection | – | MFCC | 72.33% | – | – | – | – | –
CNN [32] | 5 | 1 km | Home-made perimeter security | 10 m | No preprocessing; temporal-spatial data matrix as CNN inputs | 96.67% | – | – | – | – | –
CNN [67] | 6 | 50 km | Threat detection | 10 m | WD, STFT, FFT | 93% | 98.10% | – | – | – | –
CNN [25] | 4 | 5 km | Pipeline safety monitoring | 20 m | – | 85% | – | – | – | – | –
CNN [68] | 7 | 50 km | Long perimeter monitoring | – | “Hand engineered” | 91% | 92.06% | – | 91.39% | – | –
CNN [42] | 6 | 8 km | Pipeline monitoring | 10 m | No preprocessing | 91% | – | – | – | – | –
CNN [69] | 4 | 1.5 km | High-speed railway track inspection | 10 m | – | 98.04% | – | – | – | – | –
1DCNN + SVM [37] | 5 | 34 (35) km | Oil pipeline safety monitoring | 8 (10) m | WPD | 97.5% | 97.95% | 97.16% | 97.52% | – | –
1DCNN + Softmax [37] | 5 | 34 (35) km | Oil pipeline monitoring | 8 (10) m | WPD | 95.7% | 95.19% | 95.10% | 95.03% | – | –
1DCNN + BiLSTM [27] | 4 | 48 km | Oil and gas safety pipeline monitoring | 20 m | – | 99.26% (500 Hz), 97.20% (100 Hz) | – | – | – | – | –
1DCNN + BiLSTM [15] | 5 | 40 km | Urban safety monitoring | 5 m | No preprocessing | 97% | 97.06% | 96.9% | 97.06% | – | –
1DCNN + XGB [37] | 5 | 34 (35) km | Pipeline monitoring | 8 (10) m | WPD | 96.68% | 97.61% | 96.92% | 97.25% | – | –
1DCNN + RF [37] | 5 | 34 (35) km | Pipeline monitoring | 8 (10) m | WPD | 98% | 96.98% | 95.39% | 96.13% | – | –
1DCNN [26] | 5 | – | Oil pipeline safety monitoring | – | WPD | 95.5% | 94.97% | 94.89% | 95.63% | – | –
1DCNN [15] | 5 | 40 km | Urban safety monitoring | 5 m | No preprocessing | 92.9% | 92.76% | 92.86% | 92.70% | – | –
2DCNN [26] | 5 | – | Oil distance safety pipeline monitoring | – | WPD | 89.12% | 87.96% | 85.76% | 86.66% | – | –
2DCNN [15] | 5 | 40 km | Urban safety monitoring | 5 m | No preprocessing | 94.90% | 94.66% | 94.90% | 94.76% | – | –
CLDNN [33] | 3 | 33 km | Oil pipeline monitoring | 8 m | No preprocessing | 97.2% | – | – | – | – | –
ATCN-BiLSTM [70] | 3 | – | – | – | No preprocessing | 99.6% | – | – | – | 0% | –
DPN [34] | 7 | – | Railway safety monitoring | 10 m | – | 97% | 99.29% | 99.28% | 99.27% | – | –
SVM [71] | 5 | 50 km | Perimeter security monitoring | – | FFT | 92.62% | 98.6% | 91.2% | – | – | –
SVM [72] | 3 | – | Vehicle detection | – | PCA | 88.9% | – | – | – | – | –
SVM [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | MFCC | 91.9% | – | – | – | – | –
SVM [73] | – | 40 km | Real-time train tracking | 10 m | PCA | 98% | – | – | – | – | –
SVM [74] | 4 | 40 km | Long perimeter monitoring | 20 m | Spectral subtraction | 93.3% | – | – | – | – | 0.6 s
LSVM [54] | 3 | – | – | – | WD, VMD | 79.5% | – | – | – | – | –
RVM [75] | 3 | 20 km | Pipeline safety monitoring | – | WPT | 97.8% | – | – | – | – | < 1 s
RVM [76] | 3 | 10 km | Near-ground military target detection | 20 m | Wavelet energy spectrum analysis | 88.6% | 88.6% | 88.9% | 88.7% | – | –
NC-SVM [97] | 5 | 25.05 km | Long-distance perimeter monitoring | 50 m | WPD | 94.3% | – | – | – | 5.62% | 0.55 s
Multiclass-SVM [51] | 7 | 5 (1.5) km | Seismic waves prediction | 3–10 m | Spectral subtraction, FFT | 98% | – | – | – | – | –
CNN-SVM [55] | 4 | 40 km | – | 20 m | Spectral subtraction, STFT | 93.3% | – | – | – | – | –
Gradient Boosting [51] | 7 | 5 (1.5) km | Seismic waves prediction | 3–10 m | Spectral subtraction, FFT | 98.67% | – | – | – | – | –
XGBoost [51] | 7 | 5 (1.5) km | Seismic waves prediction | 3–10 m | Spectral subtraction, FFT | 99% | – | – | – | – | –
XGBoost [77] | 5 | 25.05 km | Perimeter monitoring | 50 m | EMD energy analysis | 95.90% | 95.96% | 95.95% | 95.93% | 4.1% | 0.093 s
XGBoost [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | – | 93.7% | – | – | – | – | –
RF [78] | 4 | 25.05 km | – | 50 m | – | 96.58% | – | – | – | – | –
RF [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | – | 92.8% | – | – | – | – | –
RF [79] | 2 | – | Perimeter monitoring | – | Filter method | 98.67% | – | 95.5% | – | – | –
F-ELM [80] | 5 | 25.05 km | Airport safety surveillance | 50 m | Fisher score | 95% | – | – | – | 4.67% | < 0.1%
GMM [39] | 8 | 45 km | Pipeline safety monitoring | 5 m | ST-FFT | 68.11% | – | – | – | 55.6% | –
GMM [57] | 2 | few km | Perimeter monitoring | 10–50 m | MFCC | 90% | – | – | – | – | –
GMM [38] | 8 | 45 km | Pipeline safety monitoring | 5 m | Contextual feature extraction | 69.7% | – | – | – | 31.2% | –
mCNN + HMM [81] | 4 | 34 (18) km | Long-distance monitoring | 20 m | – | 98.1% | 98.07% | 98.07% | 98.05% | – | –
HMM [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | WPD | 98.2% | – | – | – | – | –
GMM-HMM [35] | 3 | 45 km | Pipeline safety monitoring | 5 m | ST-FFT | 91% | – | – | – | 53.7% | –
DT [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | – | 89.2% | – | – | – | – | –
BN [30] | 5 | 37.5 (34) km | Pipeline safety monitoring | 10 (8) m | – | 78.3% | – | – | – | – | –
LSTM [82] | 5 | 50 km | Long-distance monitoring | 20 m | Spectral subtraction | 90.6% | – | – | – | – | 0.87 s
ALSTM [82] | 5 | 50 km | Long-distance monitoring | 20 m | Spectral subtraction | 94.3% | – | – | – | – | 0.91 s
ConvLSTM [47] | 3 | 40 km | High-speed railway | 10 m | – | 85.6% | – | 69.3% | 85.7% | 8% | 8.25 s
SSAE [24] | 4 | 85 km | Long-distance pipeline safety monitoring | 20 m | – | 94.47% (100 Hz), 97.06% (500 Hz) | – | – | – | – | 0.68 ms, 1.73 ms

discussed ML/DL method used for event identification in ϕ-OTDR under different circumstances. A
summary table (Table 1) is also provided, which shows the event classification methods, the number
of events, the length of the fiber optic cable, the signal processing methods used for feature
extraction, and the performance results; the performance of the methods is evaluated based on
average classification accuracy, f-measure, identification time (IDT), and NAR.

2. UNDERLYING φ-OTDR WORKING PRINCIPLE

ϕ-OTDR systems are used to detect faults or intrusion events by analyzing changes in the vibration
signals of the fiber optic sensors and then identifying the precise locations of these events by
measuring the Rayleigh backscattered light from across the entire fiber optic spectrum [75]. Any
disturbance by a perturbation event leads to amplitude/phase changes of the backscattered light;
hence, by measuring the changes in intensity or phase, distributed acoustic sensors based on ϕ-OTDR
can provide important information about the position, frequency, pattern of the event, and so forth
[3].

ϕ-OTDR systems were originally developed based on OTDR, which uses a broadband light source, whereas
ϕ-OTDR uses a narrow linewidth laser (NLL, for long coherence length) as the light source [83,84].
Basically, two major detection schemes are employed in ϕ-OTDR, namely direct and coherent detection.
The direct detection scheme relies on registering local changes in the backscattered intensity over
time [48], whereas in coherent detection the backscattered signal is mixed with a local oscillator
[85] to enhance the SNR of the scattered light.

In ϕ-OTDR, the function generator (FG) generates pulses, and the coherent probe light from an NLL is
sent into an acoustic optical modulator (AOM), which converts the continuous wave (CW) light into
optical pulse signals, which are directed to an erbium-doped fiber amplifier (EDFA) in order to boost
the input power. The amplified signal is sent to the sensing fiber [or fiber under test (FUT)]
through the circulator. The Rayleigh backscattered light is then routed to a photoelectric detector
(PD) through the same circulator and recorded by the data acquisition card (DAQ), ready to be
processed by a computer for further analysis [84,86,87]. The architecture for ϕ-OTDR with the direct
detection scheme is shown in Fig. 1.

Figure 2 shows a sample architecture for a ϕ-OTDR with the coherent detection scheme. In this scheme,
the light from an NLL source is split into a lower and an upper branch through a coupler. The lower
branch serves as the local light to implement heterodyne detection, while the upper branch is
employed as probe light as in the direct detection scheme [88]. The Rayleigh backscattered light
coming from the fiber is then mixed with the local light through another coupler, sent to a BPD, and
finally received by the DAQ before being sent to a computer for further processing of the
backscattered light’s information [88].

[Figure 1 diagram blocks: FG; NLL; AOM; EDFA; Circulator; Sensing Fiber; PD; DAQ; PC]
Fig. 1. ϕ-OTDR architecture for direct detection scheme ([49], Fig. 2). (NLL, narrow linewidth laser;
FG, function generator; AOM, acoustic optical modulator; EDFA, erbium-doped fiber amplifier; PC,
personal computer; DAQ, data acquisition card; PD, photoelectric detector.)

[Figure 2 diagram blocks: FG; NLL; Coupler; AOM; EDFA; Circulator; Sensing Fiber; Coupler; BPD; DAQ; PC]
Fig. 2. ϕ-OTDR architecture for coherent detection scheme ([88], Fig. 5). (NLL, narrow linewidth
laser; FG, function generator; AOM, acoustic optical modulator; EDFA, erbium-doped fiber amplifier;
PC, personal computer; DAQ, data acquisition card; BPD, balanced photoelectric detector.)

3. SIGNAL PREPROCESSING TECHNIQUES IN φ-OTDR

Signal preprocessing in ϕ-OTDR systems mainly includes denoising and feature extraction; we mainly
focus on the latter in this paper. Classical ML algorithms require raw input signals to be processed
first so that their features can be extracted properly and fed as inputs into a classifier. For
end-to-end DL networks, however, feature extraction is not a required task, since DL networks can
learn features automatically. During feature extraction, important signal features are extracted
before being fed as inputs to the chosen traditional ML classification algorithm. In ϕ-OTDR systems,
vibration signals as initially recorded are not always meaningful and sparse, as they usually contain
additional random noise and, hence, may produce unsuitable results. DL algorithms are able to take in
raw data, learn the patterns through forward and backward propagation, and finally converge to an
output. For legacy ML algorithms, however, we may not get the desired results, and processing by the
classifier may take a long time, especially when dealing with large sets of un-normalized data. Thus,
good signal processing methods and techniques are needed in order to extract better quality features
for accurate event identification during the classification stage [28].

A number of signal processing techniques, also known as feature extraction or signal denoising
methods, exist. Commonly used techniques include, but are not limited to, the fast Fourier transform
(FFT), wavelet packet transform (WPT), discrete wavelet transform (DWT), continuous wavelet transform
(CWT), wavelet decomposition (WD), and wavelet packet decomposition (WPD). These methods operate in
the time domain, the frequency domain, or both. FFT is perhaps the best known, as it can transform
signals from the time domain into the frequency domain, and the inverse FFT (IFFT)

can be used to convert the frequency-domain signals back to time-domain signals after denoising or
similar processing [56]. DWT is a feature extraction method that uses the mother wavelet function to
analyze a signal simultaneously in the time and frequency domains [89].

However, some conventional signal processing methods such as FFT work better with stationary signals,
whereas in practice the events that cause most intrusion signals inside the structures are transient
and non-stationary. Thus, the conventional FFT may not be well suited [90]. Recent studies recommend
methods that can process and interpret signals in the time-frequency domain. According to [15,50,58],
conventional methods such as the short-time Fourier transform (STFT) and CWT are time consuming,
especially when multiple intrusion events occur simultaneously, as they first need to pinpoint the
exact location of the event before extracting its time-domain signals for the recognition process.
Both methods are therefore prone to unnecessarily long recognition times, because pinpointing the
precise location of the intrusion signal is usually difficult, as intrusions mostly occur within a
given range and not at just a single point [75]; this leads to an increase in the number of false
alarms. Owing to the non-linear and dynamic nature of ϕ-OTDR signals, the WPT is commonly suggested
to reduce the number of false alarms [15,75]. For the signal processing phase, Chen and Xu [82]
suggest that, for each frame of the disturbance signal, mel-frequency cepstral coefficients (MFCCs)
should be extracted as frequency-domain features, while the short-time energy ratio and the
short-time level crossing (LC) rate should be extracted as time-domain features.

The non-linearity, dynamism, and unpredictable nature of ϕ-OTDR signals have led to the introduction
of many different signal processing techniques, mostly derived from the traditional methods above. An
MFCC-related algorithm has been proposed to reduce NAR [91]. WD and WPD have been thoroughly compared
as preprocessing steps for a neural network on similar datasets, with the performance results showing
that WPD is more suitable, owing to its higher identification rate and accuracy with lower NARs, for
practical pre-warning applications in oil pipeline safety monitoring [40]. In [77], an empirical mode
decomposition (EMD) energy analysis method is proposed and argued to be advantageous over WD, as it
does not require the prior setting of the basis function. In 2019, Zhao et al. [29] proposed a
multi-dimensional feature extraction algorithm that uses polynomial least squares to remove trend
terms from vibration signals and wavelet threshold denoising to reduce noise interference, whereby
the multi-dimensional features of the signals are extracted using a combination of short-time
analysis (in the time domain) and wavelet analysis (in the wavelet domain). Cubical smoothing [49]
and spectral subtraction [56] are effective algorithms for signal denoising, with the latter being
popular and widely used to enhance signal features. Power spectrum estimation [72] and wavelet energy
spectrum analysis [76] have also been utilized to extract the feature vectors generated by acoustic
signals.

In summary, several signal preprocessing techniques/methods have been applied in ϕ-OTDR, most of
which have been discussed in this section. There is no single superior method suitable for all ϕ-OTDR
applications, as each method is preferred under different circumstances. The performance or
efficiency of each method depends on several factors including, but not limited to, the application
scenario, the nature of the data, the goal of the ϕ-OTDR application, and/or the classification
algorithm used.

However, WPD is the method most frequently used among the works covered in this review. Most of its
usage comes from oil-gas pipeline safety monitoring and a few other areas, and it has proved to
perform better than most signal processing methods. In [40], the authors compared WPD and WD for
extracting frequency-domain features for three events in a 65 km oil pipeline. According to the
authors, WPD is a suitable feature identification method for oil pipelines. WPD applies decomposition
to both the approximations and the details; as a result, it offers a far richer frequency analysis
than WD, which applies the decomposition only to the approximations. The authors added that WPD is a
far better frequency-spectrum analysis method because it can obtain a more accurate frequency-band
decomposition than WD. In [37], the authors also argue that WPD offers a much richer frequency
analysis than WD. According to that paper, a three-level WPD with a db6 mother wavelet separates the
useful signal from the noise signals more accurately than WD, which helps the classifiers yield
higher accuracy. In [26], the author suggests the use of WPD over WD because WPD can divide the high
frequency parts of the signals more finely than WD. In [30], the authors present a WPD for extracting
time-frequency-domain features, including the energy entropy and energy spectra of the WPD.

Spectral subtraction and FFT have also appeared in several similar cases, mostly in long perimeter
monitoring and seismic wave prediction, achieving very high accuracy (up to 99% in seismic waves).
Other methods perform fairly well, maintaining accuracy above 90%, while the poor performers include
short-time (ST)-FFT (68.11% accuracy and 55.6% NAR) and contextual feature extraction (69.7% accuracy
and 31.2% NAR) (see Table 1).

To conclude, although WPD has been widely used and has performed better in most instances covered in
this review, we believe that better or poorer results may or may not be due to the signal
preprocessing method used. However, we believe that a good signal preprocessing technique helps to
reduce noise, which in turn helps the classifier make better decisions. An overall summary of the
signal preprocessing methods, other parameters, and their performance results is shown in Table 1.

4. EVALUATION METRICS FOR MACHINE/DEEP LEARNING METHODS IN φ-OTDR

After signal preprocessing, the data are input into ML models/classifiers for analysis. In order to
evaluate the efficiency and effectiveness of any classifier, we need strong and sound performance
metrics. Usually, the performance of ML/DL algorithms is measured using the confusion matrix
parameters, namely accuracy, recall, precision, and f-measure (also called f1-score or f-score), as
explained in Table 2. However, for ϕ-OTDR, we need to include necessary parameters such as NAR, and
some studies have gone further by adding parameters such as IDT.
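
As an illustration of the metrics named above, the following is a minimal Python sketch that computes them from the four counts of a binary confusion matrix. The names `ConfusionCounts` and `classification_metrics` are hypothetical and chosen only for this example; accuracy, precision, recall, and f-measure use their standard textbook definitions, and NAR follows the definition given in Eq. (1) of this review (NAR = 1 − Recall). Other works may define NAR differently, so this should be read as a sketch under those assumptions rather than a reference implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives: events correctly reported as events
    fp: int  # false positives: non-events reported as events
    fn: int  # false negatives: events that were missed
    tn: int  # true negatives: non-events correctly ignored


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute standard metrics plus NAR as defined in this review (1 - recall)."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = _safe_div(c.tp + c.tn, total)
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    f_measure = _safe_div(2 * precision * recall, precision + recall)
    # Assumption: NAR follows Eq. (1) of the review, i.e. FN / (TP + FN).
    nar = 1.0 - recall
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "nar": nar,
    }


if __name__ == "__main__":
    # Made-up counts, used only to exercise the function.
    counts = ConfusionCounts(tp=90, fp=10, fn=5, tn=95)
    for name, value in classification_metrics(counts).items():
        print(f"{name}: {value:.4f}")
```

For multi-class event recognition, as in most of the works in Table 1, the same counts would typically be computed per class (one class against the rest) and then averaged; that averaging choice is not specified here and is left out of the sketch.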

A. Nuisance Alarm Rates

NAR is an erroneous or deceptive report of a non-event disturbance that draws unnecessary attention and results in the misuse of resources. ϕ-OTDR suffers from high NAR [92]; therefore, considerable effort has been devoted to alleviating NAR in ϕ-OTDR event classification systems [59]. Equation (1) shows the general formula for computing NAR in terms of recall, while Eq. (2) further breaks it down in terms of true positives (TP) and false negatives (FN),

$$\mathrm{NAR} = 1 - \mathrm{Recall}. \tag{1}$$

Since recall can be expressed in terms of the confusion matrix parameters (TP and FN),

$$\mathrm{NAR} = \frac{\mathrm{FN}}{\mathrm{TP} + \mathrm{FN}}. \tag{2}$$

B. Identification Time

IDT describes how fast the classifier can process and recognize signals before classification. It is the time taken to identify the class to which a given signal belongs.

C. Confusion Matrix

In supervised ML, we usually measure the efficiency of a classifier in terms of accuracy, recall, precision, and f-measure. Together these parameters are illustrated in terms of a table-like matrix called the confusion matrix or the error matrix, as presented in Table 2. This helps to visualize the performance of an algorithm in terms of its efficiency and effectiveness. A binary classifier can classify instances as either positives or negatives. Table 2 shows a confusion matrix with predicted and actual classes. The confusion matrix parameters TP, FP, FN, and TN are used to compute the precision, recall, f-measure, and accuracy of a classifier as shown in Eqs. (3)–(6),

$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \tag{3}$$

$$\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{4}$$

$$F\text{-measure} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \tag{5}$$

$$\mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}. \tag{6}$$

Table 2. General Structure of a Confusion Matrix

Population          Predicted Positive     Predicted Negative
Actual Positive     True Positive (TP)     False Negative (FN)
Actual Negative     False Positive (FP)    True Negative (TN)

5. MACHINE LEARNING ALGORITHMS FOR EVENTS CLASSIFICATIONS IN φ-OTDR

ML is a science of training a computer on how to act on newly incoming feature-sets based on the given input features (datasets). Usually in computer science, ML is categorized into four main parts, namely: supervised learning, semi-supervised learning, unsupervised learning, and reinforcement learning. Once given sufficient learning samples, also known as training datasets, the computer can learn and understand data patterns in a process called training; based on the training datasets, a classifier (classification model) can then identify a new, unknown feature and assign it to its most relevant category (class). In supervised learning, the machine uses labeled data, while in unsupervised learning the machine uses unlabeled data during the training process. Semi-supervised learning uses a mixture of both labeled and unlabeled training data, while reinforcement learning is based on trial and error, whereby the system is rewarded if the error is low and punished otherwise.

Some common algorithms for data pre-processing and classification as discussed in this paper are shown in Fig. 3. Classification algorithms can be divided into traditional ML and deep neural networks, which accomplish the same task (classification) but mainly differ in their structures and the way they process data. Both are useful, but their applications differ depending on the size of the datasets, the nature of the datasets, the classification task ahead, and so forth.

As far as event classification in ϕ-OTDR systems is concerned, many methods have been proposed for different kinds of environments, each with different contributions to the performance of ϕ-OTDR in the aforementioned real-life applications. Some papers interchangeably refer to these ML classification algorithms as pattern recognition systems, and several methods have been explained for event identifications. However, some studies reviewed in this paper have proposed different approaches toward solving identification problems in ϕ-OTDR that do not rely on ML, including Canny edge

[Figure 3: taxonomy diagram. Machine learning in φ-OTDR splits into data pre-processing (data generation: GAN; dimensionality reduction: PCA) and classification (classical: ANN, KNN, SVM, RF, HMM, ELM, GMM, XGBoost, DT; deep learning: CNN, LSTM, encoder-decoder).]

Fig. 3. Some current machine learning methods for event classification in φ-OTDR. (GAN, generative adversarial network; PCA, principal component analysis; ANN, artificial neural networks; KNN, k-nearest neighbors; SVM, support vector machine; RF, random forest; GMM, Gaussian mixture model; XGBoost, extreme gradient boosting; ELM, extreme learning machine; DT, decision tree; HMM, hidden Markov model; CNN, convolution neural network; LSTM, long short-term memory.)
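The evaluation metrics of Eqs. (1)–(6) and Table 2 can be sketched in a few lines of Python. This is a minimal illustration rather than code from any of the reviewed papers: it uses the standard definitions precision = TP/(TP + FP) and recall = TP/(TP + FN), and the class name `ConfusionCounts`, the helper `_ratio`, and the example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, named as in Table 2."""
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 for an empty denominator instead of raising ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    return _ratio(c.tp, c.tp + c.fp)  # Eq. (3)


def recall(c: ConfusionCounts) -> float:
    return _ratio(c.tp, c.tp + c.fn)  # Eq. (4)


def f_measure(c: ConfusionCounts) -> float:
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)  # Eq. (5)


def accuracy(c: ConfusionCounts) -> float:
    return _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn)  # Eq. (6)


def nuisance_alarm_rate(c: ConfusionCounts) -> float:
    # Eq. (2); equal to 1 - recall, Eq. (1), whenever TP + FN > 0.
    return _ratio(c.fn, c.tp + c.fn)


# Hypothetical counts, for illustration only.
counts = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
print(f"precision={precision(counts):.3f} recall={recall(counts):.3f} "
      f"f={f_measure(counts):.3f} accuracy={accuracy(counts):.3f} "
      f"NAR={nuisance_alarm_rate(counts):.3f}")
```

Keeping the four counts in one record makes it harder to mix up FP and FN, which is the usual source of swapped precision and recall denominators.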

detection [49] and GPU-based parallel computing to reduce time consumption [56].

A. Artificial Neural Networks

Artificial neural networks (ANNs), or simply neural networks, are a branch of ML with the ability to learn on their own how to extract relevant features once trained. They can be considered smart computational algorithms with a unique ability to extract meaningful information from a range of imprecise or complex datasets, draw out the patterns, and finally detect trends that are otherwise too convoluted for simpler ML techniques or algorithms. As they have been commonly used in so many domains, ϕ-OTDR systems have also seen a significant gain in performance from applying ANNs in a wide range of applications, including event localization and classification. Multilayer perceptrons (MLPs) have been commonly adopted by several researchers in ϕ-OTDR [38,41,63], and probability-based neural networks (PNNs) have also been incorporated into ϕ-OTDR systems [65,93–95]; despite their huge popularity, most studies have preferred DL approaches like convolutional neural networks (CNNs) and long short-term memory (LSTM), as these have appeared in the last one or two years. A typical architecture for an ANN used in ϕ-OTDR is shown in Fig. 4. This ANN has one input layer that begins the workflow by taking the initial data (feature vectors from the wavelet packet energy feature space) and performing some calculations via its neurons before sending its outputs to the subsequent layers, called hidden layers (h1 and h2), for further processing. Finally, the output layer takes the results of the hidden layers to provide the final results for event classification.

[Figure 4: network diagram. Input layer (features): wavelet packet energies E0, E1, ..., E7; hidden layers h1 and h2; output layer (targets): neurons O1 and O2, with targets 1. background noises, 2. artificial digging, 3. vehicle passing.]

Fig. 4. General ANN architecture with one input layer, two hidden layers (h1 and h2) and one output layer, where O1 and O2 are the output neurons ([40], Fig. 3).

Principally, an ANN is a layered approach with three different categories of layers, namely: the input layer, hidden layer(s), and output layer. The network can have as many hidden layers as necessary, depending on the complexity of the datasets; however, too many layers can lead to a slower network with longer processing time, or even overfitting. Below we present ANN and related algorithms as applied in ϕ-OTDR.

1. Basic ANN

In 2017, Wu et al. [40] presented a four-layer back-propagation (BP) ANN training model for event identification in order to reduce NARs in a 65 km long oil pipeline. The ANN classifier was fed input feature vectors extracted using two commonly used feature extraction methods, namely WD and WPD, which were comparatively analyzed for three different signals collected from the field as the testing datasets. In the experimental results, it is evident that WPD outperformed its counterpart, as it recorded up to a 94.4% identification rate with a 5.6% NAR. On the other hand, WD recorded only a 91.1% average identification rate and a considerably higher NAR of 8.9%. Therefore, these experimental results suggest that WPD is a better option for ϕ-OTDR signal processing in oil pipeline safety monitoring applications.

2. MLP

Classical ANNs are fully connected (FC) feed-forward networks where each neuron in one layer is connected to all neurons of the previous layer. This kind of ANN is also referred to as an MLP, which is a classical ML algorithm with good performance and low execution time [41]. In 2017, Tejedor et al. [38] proposed an MLP-based method for ϕ-OTDR, which achieved a fair result of 61.8% accuracy for eight event classes on a 45 km long fiber. In 2021, Bublin [63] reported an MLP plus feature extraction method; its accuracy is 99.88% with a processing time of 0.55 ms and <1 false alarm per month.

3. Probability-Based Neural Network

PNN is basically an implementation of the “kernel discriminant analysis” statistical algorithm. It was introduced in 1990 by Specht based on Bayesian probability theory [93]. It works by mapping the input patterns onto a number of different class levels, and its network can be organized into a multilayered feed-forward neural network with four major layers, namely: (1) input layer, (2) pattern layer, (3) summation layer, and (4) output layer [94]. Wu [65] presented a PNN with a 1.5% NAR and over 98% recognition accuracy with a 0% leakage alarm rate using a combination of FFT, power spectrum estimation, and WD feature extraction methods. The major advantages of PNN include faster training than back-propagation (BP) networks, no local minima problem, and guaranteed convergence to an optimal classifier with larger training datasets. Meanwhile, some of the key disadvantages of the PNN include slow network execution due to the composition of several layers and high memory requirements for their networks [95].

ANNs have been extensively used, and experimental results show an increase in performance by achieving a higher event identification rate with essentially low NAR. It can be observed that ANN training models or any of their derivatives like PNN work better with another conventional ML algorithm like k-nearest neighbors (KNNs), support vector machines (SVMs), and others of the like, combined with good feature extraction method(s) that can be used to increase the recognition rate and to lower the NAR.

B. Support Vector Machines

SVMs are supervised ML techniques that can perform both classification and regression. Linear SVMs were originally designed for binary classification problems [96]. By the use of the special decision boundary called the hyperplane, an SVM model can distinguish between two classes based on the closest datasets

from the hyperplane. SVM originated from research on the optimal separating hyperplane, which is required to separate all the samples exactly while making the margin between the two sides of the hyperplane maximal [72]. SVM is one of the most commonly used event classification methods in ϕ-OTDR and has achieved suitable results. In the subsections below, we provide a review of SVM algorithms and their variations as used in ϕ-OTDR. Figure 5 shows the architecture for a standard SVM used for event classifications in ϕ-OTDR that takes input features x1 to xn through the kernel function, which helps to standardize the inputs for smooth calculations.

[Figure 5: network diagram. The feature vectors x1, x2, x3, ..., xn feed the kernel functions K(x1, x), K(x2, x), K(x3, x), ..., K(xn, x); the output y is a linear combination of the inner products.]

Fig. 5. General SVM architecture ([71], Fig. 3).

1. Basic SVM

A perimeter security monitoring system using an SVM classifier and an FFT signal processing algorithm was presented [71] in order to classify five different events, namely: a stable state, walking on the lawn, a vibration exciter, shaking the fence, and the fence exposed to the wind, in a 50 km long ϕ-OTDR system. During the signal processing phase, three-dimensional (3D) feature vectors expressed in terms of the low-frequency to total energy ratio (Feature 1), total energy (Feature 2), and peak value to mean value ratio (Feature 3) were extracted as inputs to the SVM classifier, which used the radial basis function (RBF) kernel. The SVM achieved an average identification rate of 92.62%, an intrusion detection rate of up to 98.6%, and an event classification rate of 91.2%. The SVM classifier was also employed by Wiesmeyr et al. [73] to monitor and extract active positions of a train with a 40 km long fiber optic cable. In this case, an FFT was used for signal processing and principal component analysis (PCA) was used to reduce dimensionality from 10 feature values to 2 feature values. The experimental results recorded over 98% accuracy.

The acoustic signals generated by three different types of vehicles (cars, trucks, and tractors) were classified using Library-SVM (LIBSVM) with the RBF kernel function [72]. The feature vectors were generated using the power spectrum estimation method, and then PCA of the normalized spectrum lines was implemented in MATLAB to form four principal components with a cumulative contribution rate of above 90%, which were selected as the final feature vectors. The average identification accuracy on the training datasets is 95.5% while the average accuracy on the testing samples is 88.9%, and those results showed the effectiveness of the PCA-SVM approach for automatic vehicle type detection using acoustic signals.

Aiming at improving SNR, another multiclass SVM with a spectral subtraction feature processing method is presented [74]. Over 800 samples generated by four vibration events (tapping, shaking, striking, and crushing) were processed using the spectral subtraction method to reduce wideband background noise and to enhance the time-frequency properties of the generated signals. More than 90% recognition accuracy was achieved with under 0.6 s recognition time in a 20 km long ϕ-OTDR system.

2. Near Category Support Vector Machine

As an attempt to mitigate the rate of nuisance alarms, a paper [97] suggests near category support vector machines (NC-SVM) as an improvement of the legacy binary SVM classifier in order to support multiclass classifications using the KNN algorithm. In their experiments, five different event types, i.e., watering, climbing, pressing, knocking, and false disturbance, were trained and tested in the NC-SVM classification model within a 25.5 km range. Experimental results present a higher average identification rate of above 94% with 0.55 s IDT and a NAR of 5.62%.

3. Linear Support Vector Machine

The linear SVM can be efficiently used along with DL algorithms by replacing the softmax layer of the CNN to maximize recognition accuracy [96]. In [54], a linear support vector machine (LSVM)-based classifier was proposed to classify three different activities, namely: digging with a shovel, hammer, and pickaxe along the buried fiber. In the first stage, the wavelet denoising method was used to reduce excessive noise in the measured backscattered signal, then high-pass filtering was performed using a “difference in time-domain” approach, and finally an autocorrelation was applied to remove uncorrelated signals by comparing each signal to itself. In the second stage, a variational mode decomposition (VMD) technique was used to decompose the detected activity’s signals into a band-limited series from which the event signals are reconstructed. Finally, higher order statistical features are extracted, which include variance, skewness, and kurtosis. In the classification stage, LSVM is then employed under different levels of SNR. The confusion matrix shows higher accuracy of about 79.5% for higher SNR from −4 to −8 dB, while a lower SNR level from −8 to −18 dB leads to a decrease in accuracy to 75.2%.

4. Relevance Vector Machine

The relevance vector machine (RVM) is a Bayesian-based probability framework, which is generally sparser than the commonly implemented SVM algorithm, with a shorter recognition time and higher recognition accuracy [98]. Hence, it is more suitable for recognition in fiber optic pre-warning systems [37,76]. Sun et al. [75] analyzed signals in the two-dimensional (2D, time and space) domain during the feature extraction phase instead of a noisy and time-consuming one-dimensional (1D,

time) domain and then fed the extracted feature vectors into the proposed RVM classification model. Three events (walking, digging, and vehicle passing) were successfully identified in a 20 km fiber sensing system during the experiment. The RVM yielded 97.8% recognition accuracy with a short (<1 s) computation time.

Another study demonstrated the application of the RVM method [76], where an RVM classification algorithm is deployed to classify three events (walking, jogging, and striking through the fiber) in a 10 km sensing fiber. Ten-fold cross validation is applied to ensure standardized results. The results show a macro-accuracy of 88.6% and a precision of 88.61%, with a recall of 88.99% and an f-measure of 88.79%. Figures 6 and 7 show two major processes for RVM, namely: training and recognition, respectively.

[Figure 6: block diagram. Three classifiers RVM1, RVM2, and RVM3 are trained on the classes Walking, Digging, and Vehicle Passing.]

Fig. 6. Training phase for RVM with three classifiers ([75], Fig. 14).

[Figure 7: block diagram. An unknown event is passed to RVM1, RVM2, and RVM3 and recognized as Walking, Digging, or Vehicle Passing.]

Fig. 7. Recognition phase for RVM with three classifiers ([75], Fig. 15).

C. K-Nearest Neighbors

KNN is one of the simplest supervised ML algorithms for classification problems, although it has seen little use in ϕ-OTDR. In 2013, George et al. [66] presented an efficient detection and classification method to classify acoustic signals using ANN and KNN algorithms in order to help detect moving vehicles in traffic monitoring. The study shows extraction of MFCC features from different vehicles. However, results show that KNN has a poor classification accuracy of only 50.62%. Although this experiment may not involve ϕ-OTDR technology, the approach used is in many ways similar to the ones in ϕ-OTDR since both involve event classifications of acoustic signals. Therefore, we believe that this can be an opportunity for future works to adapt a similar approach and improve performance results for event classification in ϕ-OTDR.

In 2019, Jia et al. [97] presented a combination of KNN and SVM to form a hybrid classifier called near category SVM (NC-SVM) for five events (watering, climbing, pressing, knocking, and a disturbance event) in a 25.05 km ϕ-OTDR. According to the authors, the introduction of the KNN algorithm into SVM helped to effectively boost performance results by attaining higher classification accuracy of up to 94%, with 5.2% NAR and a good IDT of 0.55 s, as explained in Subsection 5.B.2 of this paper. The algorithm for computing KNN is shown in Fig. 8.

D. Random Forest

Random forest (RF) models are ML models that predict the output by combining outcomes from a set of decision trees (DTs). Each tree is constructed independently and depends on a random vector sampled from the input data [99–101]. Major advantages of an RF algorithm over a DT algorithm include, but are not limited to, easy-to-tune hyper-parameters, high accuracy without overfitting problems, no need for feature scaling, resilience to noise, and robustness when it comes to the selection of training samples in the training dataset [101]. However, the major disadvantage is that RF classifiers are harder to interpret than DT classifiers. Figure 9 shows the DT of an RF classifier with two classes.

In 2018, Wang et al. [79] used RF to identify two event signals, named digging and normal signals, which were extracted in time- and frequency-domains, respectively. The classifier showed a higher accuracy of 98.67%, as presented in Table 3. Wang et al. [78] presented an RF event classifier in ϕ-OTDR with the aim of reducing NAR. The RF is based on learning time-domain disturbance signal features prior to classification, and the experiment was done using four different events including three disturbance events: watering, pressing, and knocking

[Figure 8: flow chart. Begin; compute the distances from the testing sample to the training samples; list the training samples (S1, S2, ..., Sn) according to distance in ascending order; set L1 as the label of S1 and take i = 2; test whether the label of Si equals L1: if yes, set i = i + 1 and test again; if no, set L2 as the label of Si; End.]

Fig. 8. Flow chart for KNN algorithm ([97], Fig. 8).
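The flow chart of Fig. 8 can be sketched as a short Python routine that returns the labels L1 and L2 of the two nearest distinct categories. This is a rough sketch under stated assumptions: the function name `nearest_two_categories` is hypothetical, Euclidean distance is assumed because the flow chart only says "distances", and what the NC-SVM does with L1 and L2 afterwards is not spelled out in this section, so the sketch stops where the flow chart ends.

```python
import math


def nearest_two_categories(test_sample, training_samples, labels):
    """Return (L1, L2), the labels of the two nearest distinct categories."""
    if not labels or len(training_samples) != len(labels):
        raise ValueError("need one label per training sample, and at least one sample")

    # Compute the distance from the test sample to every training sample and
    # list the training samples in ascending order of distance.
    order = sorted(
        range(len(labels)),
        key=lambda k: math.dist(test_sample, training_samples[k]),
    )

    # L1 is the label of the nearest sample S1.
    l1 = labels[order[0]]

    # Walk through S2, S3, ... until a sample with a different label is found.
    for k in order[1:]:
        if labels[k] != l1:
            return l1, labels[k]  # L2 is the first label that differs from L1

    # Only one category is present in the training set.
    return l1, None


# Hypothetical two-dimensional feature vectors, for illustration only.
train = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_labels = ["watering", "watering", "knocking", "climbing"]
print(nearest_two_categories((0.2, 0.1), train, train_labels))
```

Restricting the decision to the two nearest categories is what would let a binary classifier handle a multiclass problem, which matches the stated purpose of NC-SVM in Subsection 5.B.2.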

along with one non-disturbance event. The experimental results recorded a 96.58% average accuracy, with individual accuracies being 93.79% for watering, 97.06% for pressing, 97.36% for knocking, and 98.12% for a non-disturbance event. These results are fairly high and almost equally balanced, as there is no huge gap between the highest and the lowest recorded accuracy for individual events. Even though the authors claim to have reduced NAR, the paper did not state the exact NAR obtained from their experiment, but the average identification accuracy (96.58%) is high and convincing.

[Figure 9: decision tree diagram for tree(i). Node11 tests feature f11 against threshold t11: f11 > t11 leads to Node21, f11 <= t11 leads to Leaf3 (class3). Node21 tests feature f21 against threshold t21: f21 > t21 leads to Leaf1 (class1), f21 <= t21 leads to Leaf2 (class2).]

Fig. 9. RF classifier for two classes ([78], Fig. 3).

E. Extreme Learning Machine

A combination of an extreme learning machine (ELM) and the feature extraction method called Fisher score is presented to form a method called F-ELM [80]. This method is proposed as an event identification technique to reduce NARs. The F-ELM is basically an improvement on previous algorithms, namely: between-category to within-category (BW) ELM [102] and distributed generalized and regularized ELM [103]. In 2020, Jia et al. [80] conducted an experiment to identify five kinds of events, namely: watering, climbing, pressing, knocking, and a false disturbance event in a 25.05 km long ϕ-OTDR. The results show over 95% average classification accuracy in less than 0.1 s (IDT) and 4.67% NAR by using 25 selected features. The authors also performed a mini survey of some other articles and presented a comparative analysis of a few classification methods including the SVM [75], CNN + SVM [40], RF [79], CNN [55], BP neural network [104], and RVM [76] before comparing their results with their newly proposed F-ELM algorithm. Experimental results suggest F-ELM is second only to RF in terms of identification rate, but with a much shorter IDT than SVM and CNN + SVM, as shown in Table 3.

Table 3. Analysis of Several Classification Algorithms Compared to F-ELM (a)

Classifier    No. of Events    Identification Rate (%)    Identification Time (s)
RF            2                98.67                      –
F-ELM         5                95.33                      < 0.1
BP Network    3                94.4                       –
SVM           4                93.8                       0.6
CNN + SVM     4                93.3                       0.6
RVM           3                88.6                       –
CNN           5                82.1                       –

(a) [80].

In Table 3, the identification rates are arranged in descending order, with the most accurate being the RF with an IR of 98.67%. However, the RF had the fewest disturbance events (only two), so it may not be safe to conclude that it is better than the other classifiers, as the proposed F-ELM achieved a high identification rate of 95.33% across five different events in less than 0.1 s, which makes it the most effective under the circumstances.

According to Zhang [105], features with higher intra-class similarity and lower inter-class similarity usually lead to the best classification accuracy, and vice versa. The Fisher score is important for removing irrelevant feature values; thus, for an m-class identification problem, the Fisher score can be given by Eq. (7) below [80]:

$$f(d) = \sum_{0<i<j<m}^{m} \frac{(\mu_{id} - \mu_{jd})^{2}}{\sigma_{id}^{2} + \sigma_{jd}^{2}}, \tag{7}$$

where $\mu_{id}$ and $\mu_{jd}$ are the means of classes “i” and “j” corresponding to the dth feature, and $\sigma_{id}^{2}$ and $\sigma_{jd}^{2}$ are the variances.

According to [80], during the feature selection phase, the Fisher scores of every feature should be computed first, and then only features with larger scores are selected for the next (classification) phase.

During the practical experiment, five events were used, and the experimental results show that four different types of disturbance events, which are watering, climbing, pressing, and knocking, could effectively be identified and separated from the fifth, false disturbance with above 95% average identification rate and less than 0.1 s IDT. Meanwhile, the NAR is about 4.67% using 25 selected features with a 25.05 km long fiber. The architecture for the ELM algorithm is shown in Fig. 10.

F. Extreme Gradient Boosting

In 2019, Timofeev and Groznov [51] presented a classification of seismic-acoustic waves using ϕ-OTDR in the time domain through time-reconstruction of the interference signal phase. In their experiments, they presented three classifiers including multiclass SVM, gradient boosting (GB), and extreme gradient boosting (XGBoost) algorithms, attaining classification accuracies of over 98% in a 20 km system. Alternating between FFT, wavelet denoising, and MFCC feature extraction methods, their experiments showed that the multilayered ANN lags behind with lower accuracy as compared to the above-mentioned classifiers, and multiclass SVMs proved to be more robust using a cross validation technique for generalization. In 2020, a paper by Wang et al. indicated that extreme gradient boosting (XGBoost) is superior to most other common classifiers including SVM, RF, and GB, especially when using the EMD energy analysis method for feature extraction [77].
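Returning to the Fisher score of Eq. (7) in Subsection 5.E, the feature selection step described there (score every feature, keep the ones with the largest scores) can be sketched as follows. This is an illustrative sketch only, not the implementation of [80]: the function names are hypothetical, the sum is taken over all distinct class pairs (our reading of the index bounds in Eq. (7)), and the population variance is assumed because the source does not say which variance estimator is used.

```python
import math
from itertools import combinations
from statistics import fmean, pvariance


def fisher_score(values_by_class):
    """Fisher score of ONE feature; values_by_class holds its values per class."""
    stats = [(fmean(v), pvariance(v)) for v in values_by_class]
    score = 0.0
    for (mu_i, var_i), (mu_j, var_j) in combinations(stats, 2):
        gap = (mu_i - mu_j) ** 2   # squared distance between class means
        spread = var_i + var_j     # summed within-class variances
        if spread > 0:
            score += gap / spread
        elif gap > 0:
            return math.inf        # separated classes with zero variance
    return score


def select_features(samples, labels, keep):
    """Return the indices of the `keep` highest-scoring features, plus all scores."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    scores = []
    for d in range(n_features):
        grouped = [[s[d] for s, y in zip(samples, labels) if y == c] for c in classes]
        scores.append(fisher_score(grouped))
    ranked = sorted(range(n_features), key=lambda d: scores[d], reverse=True)
    return ranked[:keep], scores


# Hypothetical data: feature 0 separates the two classes, feature 1 does not.
demo_samples = [[1.0, 5.0], [1.2, 5.1], [0.8, 4.9],
                [3.0, 5.0], [3.2, 5.2], [2.8, 4.8]]
demo_labels = ["a", "a", "a", "b", "b", "b"]
chosen, demo_scores = select_features(demo_samples, demo_labels, keep=1)
print(chosen, demo_scores)
```

A feature whose class means are far apart relative to the within-class variances gets a large score, which is why only the top-scoring features are passed on to the classifier.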

[Figure 10: network diagram. The n input neurons (xj1, ..., xjn) are connected through weights w to a single layer of hidden neurons with activation g(·), followed by m output neurons (Oi1, ..., Oim); the maximum output gives the label of the test sample.]

Fig. 10. Architecture of an ELM model for multiclass recognition ([80], Fig. 2).

G. Probabilistic Approach: Gaussian Mixture Models and Hidden Markov Models

A Gaussian mixture model (GMM) is a clustering algorithm based on a probabilistic approach that assumes that the generated samples or data points follow a finite mixture of Gaussian distributions with several unknown parameters [57,106], while a hidden Markov model (HMM) is a statistical model used to describe the evolution of observable events that depend on some internal factors that are not directly observable. An observed event is called a “symbol” while an invisible factor underlying the observation is called a “state” [107]. The hidden states of the HMM form a Markov chain, whereas the probability of the observed symbols usually depends on the underlying states.

1. HMMs

In 2019, Wu et al. [30] proposed a different approach called dynamic time sequence recognition and a knowledge-mining technique based on HMMs. According to the authors, this approach can deal with the non-linearity of non-stationary vibration signals in long-distance underground pipelines caused by a range of complicated dynamic events. The experimental results using real testing datasets from the field show a higher average recognition accuracy of 98.2% for five different commonly encountered events along buried pipelines. Additionally, other related performance metrics like precision, recall, and f-score are also better than those of traditional ML methods such as RF, XGB, DT, and Bayesian network (BN), as shown in Table 4. Table 4 presents a performance summary of six classifiers for five different events.

Table 4. Performance Comparison of Six Classification Methods Discussed (a)

Classifier    No. of Events    Accuracy    Precision    Recall    F1-Score
HMM           5                0.982       0.9905       0.9826    0.9860
SVM           5                0.919       0.9295       0.9402    0.9296
RF            5                0.928       0.9206       0.9264    0.9231
XGB           5                0.937       0.9327       0.9638    0.9462
DT            5                0.892       0.8829       0.8927    0.8860
BN            5                0.783       0.8301       0.7971    0.8046

(a) [30].

2. mCNN-HMM

In 2021, Wu et al. [81] proposed an end-to-end combined model with a modified multi-scale CNN and HMM for long-distance safety surveillance. According to the authors, this new approach can effectively identify vibrational signals by simultaneously extracting the multi-scale structural features as well as the sequential information of the signals. In their experiment, mCNN is used to extract local structural features of the DAS signals from a multi-level perspective and their relationship, whereas HMM is used for mining the sequential information of previously extracted features. The experiment was conducted on a 34 km long fiber cable, and the experimental results show 98.1% classification accuracy, 98.07% for both precision and recall, and a 98.05% f1-score. The authors then performed a comparison between their proposed mCNN-HMM model and three other models, namely: handcrafted feature with HMM (93.4% average accuracy), CNN-HMM (96% average accuracy), and MS-CNN (75% average accuracy). The comparison results show that the mCNN-HMM method achieved better results in terms of average accuracy than the rest of the models.

3. GMMs

In 2016, Fedorov et al. [57] used GMM to recognize two event classes (single target passage and digging near the cable), both following the Gaussian distribution. The feature space for their GMM was formed by cepstral coefficients using M = 10 as the optimal value. Experimental results showed that the highest probability of correct event recognition can reach up to 0.94. However, according to the authors, this probability of correct event identification depends on the number and properties of testing samples, which leaves room for future work.

GMM was also applied by Martins et al. [39] to cluster two classes (threats and non-threats) using real field data collected in the Fiber Network Distributed Acoustic Sensor (FINDAS) project for energy pipeline surveillance. In their experiments, GMM was used along with the ST-FFT signal processing method to generate the spectral information. Finally, the expectation-maximization algorithm was used for GMM training, and then acoustic input frames were assigned to the class with the highest probability. The experiment recorded a threat classification rate of 68.11%, and more than 55% false alarms (according to the authors) were detected using a six-fold cross validation method in a 45 km long ϕ-OTDR system. However, the authors claim that these results are just preliminary, and

future work is aimed at reducing noises and creating more 6. DEEP LEARNING ALGORITHMS FOR EVENT
robust feature vectors. CLASSIFICATION IN φ-OTDR
In 2016, a study based on GMM to monitor the integrity of Usually most ML algorithms are designed to work on simpli-
gas pipeline was presented [38]. In this paper, the contextual fied datasets with only up to few hundred features whereas DL
feature extraction based on the tandem approach is employed to algorithms can run the data through numerous layers of neural
produce the tandem feature vectors, and then a three-layer MLP networks with each layer processing the data to a more simpli-
is employed in order to integrate the feature-level contextual fied form before feeding it to the next layer until the final output.
information. The length of the fiber optic cable is 45 km, and In this section, we have surveyed recent DL methods deployed
eight different activities were recognized. The contextual feature in ϕ-OTDR systems like the recurrent neural network (RNN),
extraction results module recorded a 69.7% classification accu- CNN, temporal convolutional network (TCN), LSTM, gen-
racy, which is fairly low, and a 31.2% NAR, which is too high. erative adversarial network (GAN), and sparse stacked auto
Furthermore, the paper presented a fair 80.7% threat detection encoder (SSAE), as well as some hybrid approaches, which
involve both deep and ML methods, e.g., CNN + SVM,
rate; however, the authors did not clearly specify the differences
CNN + KNN, CNN + LSTM, CNN + RF, and others. These
between the two. Overall results are not very good as the results
techniques have proved to do better classification jobs to
in terms of classification accuracy and NAR are not satisfying increase accuracy while they lower NARs as the future of event
and convincing enough. classification greatly depends on smart DL algorithms with the
So, according to the above results, we can fairly say that abilities to handle large sets of data, to learn intelligently, and to
GMM is a good clustering model due to the fact that it could train with a higher degree of adaptation. They require less effort
achieve up to 94% clustering accuracy [57] under normal operating in feature processing as they have an ability to intelligently learn
conditions. However, Martins et al.’s [39] results are not as good and input the raw data into the network hence less manual work
as those of other classification methods. needed [31,43,45,98]. Also DL algorithms are able to combine
The authors urge that in the future works should focus more the softmax or the FC layer with a traditional ML classifier to
on better noise-reduction methods in order to extract robust form a single robust algorithm.
feature vectors as well as deploying new strategies to deal with An approach on how to develop good DL algorithms for sig-
non-linearity behaviors of the sensing system. nal recognition in long perimeter monitoring in fiber optic sen-
sors was presented [78]. The author demonstrated how an effi-
cient DL algorithm can be used to accurately identify an activity
4. GMM-HMM in long perimeter security using ϕ-OTDR. However, the author
did not review all the approaches based on DL classification but
In 2018, Tejedor et al. [35] proposed a system based on GMM- rather the underlying principles toward developing a strong and
HMM to detect potential threats in a ϕ-OTDR. The presented robust DL algorithm.
results show an improved performance of over 45.15% as com- In subsections below, we present some DL approaches from
pared to traditional GMM in terms of classification accuracy. different studies regarding event recognitions in ϕ-OTDR
The proposed algorithm also shows 91% of threat detection systems.
with a very high 53.7% of false alarms, unlike the traditional
GMM algorithm that shows 80% of threat detection and 40% A. Long Short-Term Memory
of false alarms according to the authors. The false alarm rates
The LSTM network is the most powerful and a common subset
demonstrated in this paper appeared to be too high, but the
of RNN, which is a kind of feed-forward neural network [82].
authors did not provide any explanations about that.
LSTM is a useful method in sequence prediction problems
Generally, if given enough normalized training samples for since it takes both sequence and time into account [108,109].
fairly easy and common activities, a GMM clustering has the LSTMs, especially when combined with CNNs, form robust,
potential of yielding higher clustering probability (i.e., higher intelligent, and more consolidated classification methods for
classification accuracy) during the pattern recognition process. event recognition in ϕ-OTDR systems. Bidirectional LSTM
It can also be used as the learning algorithm during feature (BiLSTM) is a popular variation of a regular LSTM with the
extraction for DL algorithms. HMM works best in some com- ability to feed forward and propagate backwards unlike its
plex activities where an activity has more than one behavior since counterpart with the ability of forward movement only [110].
it is capable of recording them in different states in a Markov So the Bi-LSTM networks are useful in pattern recognition for
chain. So a combination of both GMM and HMM provides ϕ-OTDR since they can better connect and retrieve features
a better threat detection rate as indicated in the studies above. from the sensing points using their bilateral spatial and unidirec-
Even though the GMM-HMM approach has higher detection tional time relationships. Figure 11 shows the architecture for a
Bi-LSTM network.
rate that in GMM and HMM, it suffers from higher NAR as
demonstrated in the FINDAS project. Initially the GMM-
HMM’s NAR was more than 53.7% whereas as the traditional 1. Basic LSTM
GMM only had 40% NAR. This suggests that a GMM-HMM Manie et al. [108] proposed a regular LSTM classification
pattern recognition technique should be best deployed in areas algorithm integrated with a DWT feature extraction technique
where event classification is of a higher priority than a low NAR. for signal denoising in a ϕ-OTDR system, and the results show

that an LSTM model without DWT achieves 92% accuracy, while the accuracy can rise to as high as 98% when DWT denoising is applied.

Fig. 11. Basic structure of the BLSTM network ([110], Fig. 3). (IL, input layer; FFL, forward feeding layer; BPL, backward propagation layer; AFL, activation function layer; OL, output layer.)

2. Attention-Based Long Short-Term Memory

Attention-based long short-term memory (ALSTM) is an improved version of the traditional LSTM method, which can focus most of its attention on the main or key parts of a signal. It is derived from the RNN [82]. Chen et al. [82] introduced an ALSTM and compared its performance with that of the legacy LSTM. In their experiment, they used five different kinds of disturbing activities, namely: digging, walking, vehicle passing, climbing, and digging at different positions in a 50 km long fiber buried 20 cm deep under the ground. Experimental results show that the ALSTM model has a faster convergence speed, a lower training loss, a higher classification accuracy of 94.3%, and a 0.91 s recognition time, while traditional LSTM has a classification accuracy of 90.6% and a 0.87 s recognition time. Additionally, the paper shows the superiority of ALSTM in accuracy over other well-reputed classification methods: CNN with 89.9% classification accuracy and 0.49 s recognition time, and morphologic feature extraction (MFE) with 88.1% accuracy and 0.21 s recognition time. Generally, with the above results, both LSTM and ALSTM with proper signal processing techniques can achieve high accuracy; while an ALSTM is more accurate and converges faster with a lower training loss, its counterpart is slightly faster in recognition time. However, in both cases, the papers did not report NARs or the other confusion matrix parameters.

B. Convolutional Neural Networks

A CNN, also referred to as ConvNet, is an essential class of DL, or deep ANNs, which is most commonly applied in computer vision [111] for visual imagery but has also claimed state-of-the-art performance in a wide range of tasks including natural language processing and others [112]. In recent years, CNNs have also been widely employed in various fields of ϕ-OTDR for pattern recognition. A typical CNN architecture usually consists of a few convolutional blocks, formed by a Conv layer, a pooling layer, an FC layer, and the softmax layer as shown in Fig. 12. Different CNN algorithms used for event classification in ϕ-OTDR are presented in this section.

Fig. 12. Basic CNN architecture. (Conv layer, convolution layer; FC, fully connected.)

1. Basic CNNs

Shi et al. [32] proposed a CNN to classify five different kinds of events, namely: background, walking, jumping, beating, and digging with a shovel. In this paper, the main difference between their approach and traditional CNN deployments is the input data matrix. The temporal-spatial data matrix acquired from the fiber was directly plugged into the CNN as input feature vectors. In their experiment, 5644 different event samples were processed in under 7 min, reaching a high classification accuracy of 96.67% with a very short recognition time for a 1 km long fiber. This method was compared to common legacy CNNs like LeNet, AlexNet, VggNet, GoogleNet, and ResNet, and the biggest advantage of the proposed method over the legacy algorithms is the reduced retraining time obtained by making the network relatively smaller and faster.

In 2021, Wang et al. [69] proposed a deep CNN to classify four events, namely: switches, highway below the railway, cracking, and beam crevices for a 1.5 km section of a high-speed railway track using ϕ-OTDR. Their experimental results yielded 98.04% accuracy. Although the results are quite satisfying, the authors believe that a large amount of data is needed, and since there is a limited amount of labeled data, future works should involve semi-supervised DL models to further improve performance.

Aktas et al. [67] put forward a deep CNN trained with real sensing data using ϕ-OTDR for event classification in a 40 km long fiber buried 1 m deep underground. The algorithm successfully achieved over 93% accuracy in classifying six events: walking, pickaxe digging, shovel digging, harrow digging, strong wind, and facility noise caused by water pipes, generators, and/or air conditioning. Other metrics from the confusion matrix show over 97.6% precision with a time-frequency signal approach, while the precision drops to 73.7% with time-domain signals.

Another attempt to protect pipelines against malicious activities was made by Peng et al. [25]. By using a CNN, they achieved over 85% accuracy in identifying four different events. In their experiment, they used three convolutional layers with a max pooling layer. However, the paper did not state the length of the fiber cable used.
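As a rough illustration of the Conv, pooling, FC, and softmax blocks that make up a basic CNN (Fig. 12), the following minimal sketch runs one forward pass over a 1D trace in plain Python. It is not the implementation of any of the cited papers: the kernel, the random untrained weights, the five-class output, and all function names are assumptions chosen only to make the data flow visible.

```python
import math
import random


def conv1d(signal, kernel):
    """Valid 1D cross-correlation followed by a ReLU activation."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]


def max_pool1d(features, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]


def fully_connected(features, weights, biases):
    """One dense layer: `weights` holds one row per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax that turns logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cnn_forward(signal, kernel, fc_weights, fc_biases, pool_size=2):
    """Conv -> pooling -> FC -> softmax, the block order described in the text."""
    pooled = max_pool1d(conv1d(signal, kernel), pool_size)
    return softmax(fully_connected(pooled, fc_weights, fc_biases))


# Hypothetical, untrained example: a synthetic trace and random FC weights.
rng = random.Random(0)
signal = [math.sin(0.3 * n) for n in range(32)]
kernel = [0.25, 0.5, 0.25]
n_classes = 5
n_features = (len(signal) - len(kernel) + 1) // 2  # 30 conv outputs -> 15 pooled
fc_weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
              for _ in range(n_classes)]
fc_biases = [0.0] * n_classes
probs = cnn_forward(signal, kernel, fc_weights, fc_biases)
```

In a trained network the kernel and FC weights would be learned from labeled event data; here they only show how each block reshapes the signal before the softmax assigns a probability to each event class.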

Chen et al. [26] introduced a 1D-CNN capable of intelligently learning and identifying the distinguishable features of different disturbance sources from raw event signals. According to their experimental results from a real oil pipeline monitoring environment, the proposed 1D-CNN performs slightly better than a 2D-CNN in terms of recognition metrics and processing speed. According to the authors, a 2D-CNN uses 2D convolution kernels in the convolution layers while a 1D-CNN uses 1D convolution kernels. The authors argue that in order for a 2D-CNN to recognize 1D sensing signals, the signal needs to be transformed into a 2D image through time-frequency analysis or reshaped as a matrix, which in both cases is time consuming and computationally expensive; hence, they introduce a 1D-CNN. Since no transformation of the raw signals is needed, the network structure is reduced, and as a result the computational efficiency increases. The performance metrics of the confusion matrix show an overall average accuracy of above 95% as compared to only 89.1% accuracy from a 2D-CNN. The results also include an average of 99.5% for precision, recall, and f1-score for the 1D-CNN, while the 2D-CNN lags behind again with an average of 93% for precision, recall, and f1-score. The two algorithms differ in their inputs: a 1D-CNN accepts a single vector, while the input of a 2D-CNN should be a 2D matrix. However, some key details like the length of the fiber covered were not stated.

Another paper also proposed a 1D-CNN [37] over a range of other conventional ML methods. In the first phase, the 1D-CNN was employed to extract distinguishable features of the signals from a ϕ-OTDR. In the second phase, the softmax layer is replaced by a classifier selected from optimal classifiers like SVM, RF, and GB. According to their experiments, this paper suggests a combination of a 1D-CNN and an SVM for feature classification, and this method can achieve up to 98% recognition accuracy using five classes of disturbance signals in oil/gas pipeline monitoring systems, which proved superior to 2D-CNN and most other conventional methods. However, the paper did not state important parameters like the length of the sensing fiber and the types of the events used.

In 2018, Xu et al. [55] employed a combination of a CNN and a multiclass SVM to form a hybrid classification method in order to intelligently identify four intrusion events: walking, digging, vehicle passing, and striking along a 40 km fiber. The network contains five convolution layers and two FC layers, while the SVM replaces the traditional softmax layer of the CNN. Two sets of experiments were conducted: one using a traditional CNN with its softmax layer, yielding 88% classification accuracy, and a second one using the combined CNN-SVM, in which the SVM replaces the softmax layer, yielding up to 93.3% classification accuracy.

Makarenko [68] demonstrated a well-designed DL classifier that uses three different CNNs as the primary classifier. Each of these CNNs has a separate FC layer and uses a sigmoid activation function; however, the main differences lie in their convolution layers. The primary classifier achieved up to 91% average detection accuracy, 92.06% average precision, and 91.39% f1-score using data from complicated real environments, with seven intrusion events detected with a 50 km long sensing system. In order to stabilize NARs a secondary classifier is added; however, Makarenko did not disclose more details about the secondary classifier as the research is still in progress.

In 2019, Peng et al. [42] presented a two-layer classifier based on CNN. In this case, layer 1 is designed to separate third-party threats from traffic as well as pedestrian noises, while layer 2 is aimed at determining the specific types of third-party interference. According to them, the NAR is reduced by implementing a time-space matrix that corrects possible errors in a real-time surveillance system for the safety monitoring of a buried municipal pipeline. Six activities that generate different vibration signals (excavation, hammering, electrical hammering, shoveling, pickaxing, and metro passing) were identified, and the results show 91% recognition accuracy in an 8 km long fiber.

In 2019, Wang et al. [34] presented an algorithm called DPN92. DPN stands for dual path network, which is an improved version of CNN. A DPN is basically a CNN with many layers, i.e., a deep neural network. The authors customized their DPN to have 92 layers; hence, it was called DPN92. In their experiment, they classified seven disturbance events, namely: excavator operation, concrete fence breaking, pedestrians walking, tamping operation, ambient noise, moving train, and local wind blowing. Their datasets were collected from a real-life Shanghai railway using a fiber optic cable buried just alongside the railway. The experimental results show the efficiency of DPN92 with 97% average classification accuracy and over 99% for precision, recall, and f-score. However, the paper did not state the length of the cable used.

2. Convolutional Long Short-Term Neural Network

Bai et al. [33] proposed a deep neural network based algorithm called the convolutional long short-term neural network (CLDNN) in a 33 km long fiber optic sensing system to identify external intrusion events for pipeline safety. The authors presented three classes of events called percussive tap (PT), mechanical digging (MD), and normal (non-intrusion) events. PT events include all harmful labor activities that involve using tools to tap the ground and produce soil vibration, for example, ramming, digging, and drilling. MD events include all activities done by heavy machines, like excavation. Both PT and MD are considered harmful for the safety of a pipeline. According to the authors, their proposed algorithm is composed of two convolutional layers, one LSTM layer, and one FC layer as shown in Fig. 13. The experimental results suggest CLDNN is an effective and robust algorithm for fast and accurate localization of event signals in complex environments. It works by directly inputting the time series data into the DL network. Performance results show that this method is a relatively better approach than others used in previous works, especially when dealing with huge volumes of data. Testing results show a 97.2% average recognition rate [33].
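The input-preparation difference between 1D and 2D CNNs noted for [26] in this subsection can be sketched as follows. This is a hedged illustration, not the pipeline of the cited paper: the synthetic signal, the frame length, the hop size, and the function names are assumptions, and a naive DFT stands in for whichever time-frequency analysis a real system would use.

```python
import cmath
import math


def reshape_to_matrix(signal, width):
    """Reshape a 1D signal into rows of `width` samples (any tail is dropped)."""
    rows = len(signal) // width
    return [signal[r * width:(r + 1) * width] for r in range(rows)]


def dft_magnitude(frame):
    """Magnitude of the one-sided naive DFT of a single frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(signal, frame_len, hop):
    """Time-frequency image: one row of DFT magnitudes per frame."""
    return [dft_magnitude(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]


# Hypothetical trace: 64 samples of a sinusoid with a 16-sample period.
signal = [math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]

input_1d = signal                                        # 1D-CNN: raw vector, no transformation
input_2d_reshaped = reshape_to_matrix(signal, 8)         # 2D-CNN option 1: 8 x 8 matrix
input_2d_tf = spectrogram(signal, frame_len=16, hop=8)   # 2D-CNN option 2: 7 frames x 9 bins
```

The 1D path needs no preprocessing at all, whereas both 2D options add a transformation step before the network sees the data, which is the extra cost the authors of [26] point to.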

Fig. 13. CLDNN architecture ([33], Fig. 6).

Table 5. Comparison of Experimental Results of Five Events for the Aforementioned Methods [15]

Classifier        No. of Events   Accuracy   Precision   Recall
1DCNN             5               0.9290     0.9276      0.9286
1DCNN & CNN       5               0.9490     0.9504      0.9496
2DCNN             5               0.9490     0.9466      0.9490
1DCNN & BiLSTM    5               0.9700     0.9762      0.9690

3. One-Dimensional Convolutional Neural Networks and Bidirectional Long Short-Term Memory

In [15], a 1D-CNN and a BiLSTM were combined into a 1DCNN-BiLSTM classification algorithm that outperformed the 1D-CNN, the 2D-CNN, and a regular CNN in terms of accuracy, precision, and recall on five events, as demonstrated in Table 5. In this model, the 1D-CNN was used to extract detailed temporal-structural features for each signal node, and then a customized Bi-LSTM network was applied to construct spatial relationships among several different signal nodes; thus, the proposed method works by creating a relationship between spatial and temporal information [15].

In 2021, Yang et al. [27] also proposed a 1DCNN and BiLSTM method for the classification of four events, namely: background noise, manual excavation, mechanical excavation, and vehicle driving in a pipeline early warning system. In their experiment, the input features are first fed into a 1DCNN to extract the spatial features before being fed into the BiLSTM to obtain bidirectional and complex relations. The experiment was done at two different frequencies, i.e., 500 Hz yielding 99.26% accuracy and 100 Hz yielding 97.20% accuracy.

In 2021, Tian et al. [70] designed an attention-based TCN with a BiLSTM model (ATCN-BiLSTM) for ϕ-OTDR, achieving an average classification accuracy of 99.6% with 0 NAR on three types of events. In the BiLSTM model, two LSTMs are separately used for accepting inputs: the first LSTM takes a forward sequence of raw data, while the second LSTM takes a reverse sequence of data. Applying the BiLSTM helps to obtain the long-term dependence and the bidirectional and complex relations in the space domain. On the other hand, the attention mechanism can be used to focus on the key features and fiber sections to help reduce the number of parameters and process faster, while the TCN is employed to look for the long-term dependence of signals in the time domain.

In summary, CNNs and similar networks have recently been widely trained and applied in tens of applications regarding ϕ-OTDR systems for event classification. However, in most cases, studies indicate that replacing the traditional softmax layer of the model with other classifiers like SVMs, either multiclass SVMs or linear SVMs, usually yields better classification results in terms of accuracy and other metrics of the confusion matrix. More than 93% accuracy was attained by replacing the softmax layer after the FC layer with a multiclass SVM in a 40 km fiber optic sensor cable [55]. Wu et al. demonstrated a clear rise in accuracy by combining a 1D-CNN with an SVM classifier, which achieved approximately 98% accuracy, better than the 95% accuracy obtained using a 1D-CNN with a regular softmax layer in oil and gas pipeline monitoring [37]. However, one thing to keep in mind is that the network should be kept as uncluttered and as neat as possible to make it small and fast, which eventually increases the training and classification speed.

C. Generative Adversarial Network

The GAN was introduced in recent years, and since then it has been used in many different applications, with ϕ-OTDR being one of them. A GAN is a useful tool for learning the distribution of given data; it usually consists of two models, namely a generator and a discriminator, which are trained as adversaries [64,113]. The generator is trained to capture the data distribution, while the discriminator is trained to differentiate between the generated data and the real data. When the generator generates data that the discriminator fails to discriminate from the real data, the training is eventually terminated [114].

In 2018, Shiloh et al. [50] introduced a GAN model for efficient training on ϕ-OTDR data. In this paper, the GAN was used to transform simulated data so that it mimics genuine data, based on a relatively small experimental dataset that is manually labeled. This technique is verified to be effective by yielding up to 94% classification accuracy in a 5 km long ϕ-OTDR sensing system; however, the authors did not say anything about the NAR.

In 2019, Shiloh et al. [64] demonstrated a modified GAN architecture called C-GAN to classify three different events, namely footsteps, noises, and vehicles, in 5 km and 20 km long fibers. The experimental results show 83% accuracy for the shorter fiber and 80.2% accuracy for the longer. In the 5 km long sensing fiber, the NAR was as high as 54% initially and was reduced only to 45% after fine-tuning the network with both experimental and simulation data, while the f1-score climbed from an initial 87.72% to 89.85% after refinement. For the 20 km fiber sensing system, the experiments show an evident decrease in SNR, with the f1-score reaching 87.23% from an initial 82.64% obtained when trained with experimental data only. In summary, these experiments were conducted using three different kinds of training data: case 1, experimental dataset only; case 2, simulation dataset only; and case 3, experimental dataset with simulation dataset and the refiner for fine-tuning the algorithms. It is evident that combining all of these leads the algorithm to the best results. Overall, GANs seem to provide good classification accuracy, but the NAR is still too high to be convincing for now.

D. Semi-Supervised DL method

In 2021, Yang et al. [24] proposed a novel semi-supervised DL model for long-distance pipeline safety early warning. Due to

higher costs of collecting data especially for longer pipes, the method is proposed in order to capitalize on the utilization of unlabeled data, as they believe this could reduce experimental costs and also address the model migration problem due to a smaller model size and latency. This event recognition and localization method is based on several components, namely: SSAE, BiLSTM, and self-attention. The SSAE is an event recognizer; it comprises a stacked auto-encoder (for layer-wise training of data) and a sparse auto-encoder (for sparse compression of data), which in the end helps to reduce model size and improve performance. The BiLSTM and self-attention techniques are utilized as independent components in the SSAE model.

The experiment was conducted to recognize four events, namely: background noise, manual excavation, mechanical excavation, and vehicle digging in different sections of an 85 km real-life oil pipeline deployed by PipeChina Northern Pipeline Company. The experimental results show high average recognition accuracies of 94.47% for 100 Hz data and 97.06% for 500 Hz data. Additionally, the average latency for 100 Hz data is 0.68 ms while for 500 Hz it is 1.73 ms. The architecture for SSAE is shown in Fig. 14.

Fig. 14. Sparse stacked auto-encoder model ([24], figure in abstract).

The authors suggest further using the few-shot learning mechanism and changing the implementation from TensorFlow to C++, by which they believe the model latency can be reduced by a factor of 15.

7. DISCUSSION

In Table 1, we presented a comparative analysis of different event classification methods in ϕ-OTDR systems. The results in this table have been summarized from the experimental results presented in the research papers discussed in the above sections of this review. We have organized our table such that, for every classification method, we have included key details: the number of disturbance events, the length of the fiber optic cable (measurement range), the application field, the spatial resolution, the feature extraction method used (if any), and the experimental results. The experimental results are discussed in terms of recognition accuracy (for most papers), while some papers also presented other performance metrics like precision, recall, f1-score, NAR, and IDT.

The results from Table 1 clearly indicate the efficiency and/or effectiveness of an algorithm; however, there may be several key factors influencing these results. Therefore, we should be careful when evaluating the performance of these algorithms. Results are affected by several issues. In some cases the training datasets are too small, which may cause a generalization problem, so we really cannot jump to conclusions. Due to the absence of public open datasets for ϕ-OTDR, it is hard to draw fair and consistent conclusions for all discussed experiments in ϕ-OTDR systems (the code of the models is not open either). Also, in some cases there can be over-fitting issues, where a classification model fits the training dataset too perfectly, hence leading to higher reported accuracy. Another issue is the inconsistency in the number of events to be classified: as can be seen from Table 1, some algorithms classified few events (the lowest being 2) while some classified up to eight events. In such cases, if an algorithm can recognize many types of events with higher accuracy, then it is much more efficient. Another factor to consider is the length of the sensing fiber cable. Some methods are only effective over a very short distance like 1 km, while some are more robust to noises and disturbances as they can still maintain accuracy over a long distance (up to 65 km).

In Table 1, we have seen MLP recording an accuracy of over 99% with a NAR of less than once per month, and this can be regarded as the most effective method since no other algorithm can come close in terms of NAR; however, a two-event dataset was used. PNN records an accuracy of over 98% with a NAR of just 1.5% using FFT, power spectrum, and wavelet denoising methods during signal processing. ATCN + BiLSTM records a high accuracy of 99.6% with the lowest NAR of 0% for three-event classification, although the dataset is relatively small. SVMs and their variations have also performed well, whether standalone or when combined with a CNN to replace the softmax layer. Nevertheless, only NC-SVM has had success in both attaining high accuracy (94.3%) for five different event classes and lowering the NAR (5.62%) in 0.55 s IDT, using the wavelet packet denoising technique for feature preprocessing. These are strongly convincing results, especially for an optical fiber over 25 km long. The XGBoost algorithm has proved to be one of the best classifiers according to [51], recording 99% accuracy for seven events in a 20 km cable using spectral subtraction and FFT signal processing methods, while in another study [77] XGBoost recorded over 95% accuracy using the EMD energy analysis signal processing method with a lower NAR of about 4.1% in 0.09 s IDT. The ELM method combined with the Fisher score feature extraction method to form the F-ELM model may not be a very popular method in ML, but it has proved to be very effective by recording a 4.67% NAR, which is the third lowest NAR after PNN and XGBoost. The F-ELM has a high classification accuracy (95%) for five different events in under 0.1 s IDT (which is even faster than the PNN) in a 25.05 km long fiber cable.

Moreover, a semi-supervised DL method, SSAE, for pipeline surveillance also achieved very good results of over 94% and 97% average identification accuracy for 100 Hz and 500 Hz data, respectively. The method has proved to be very effective since it uses a significant amount of unlabeled data with only a small amount of labeled data, which makes semi-supervised methods useful especially for real-time applications with new data coming in and an ever-growing dataset.

Another successful method in our review is the CLDNN, which has a high accuracy of 85% in a 40 km long fiber cable, and an 8% NAR, which is the lowest by any DL algorithm in

our review. Other metrics recorded an 85% f1-score with a lower recall of 69% [39].

A combination of CNN and HMM in [81] proved to be very effective by achieving over 98% for all confusion matrix parameters (accuracy, recall, precision, and f-score) in a 34 km long fiber cable. A combination of GMM and HMM in [35] also performed well by achieving a high accuracy of 91%, especially for a long-distance (50 km) cable, but with a very high NAR (53.7%).

RF was used on three different occasions, and in all cases it performed well. Its highest recorded accuracy is 98% with the lowest being 92%, while its counterpart DT recorded a fair 89% accuracy when classifying five events in a 37.5 km fiber cable. One GMM instance recorded 90% accuracy, which is quite satisfying; however, two other instances saw GMM recording the lowest results. In one instance, GMM achieved an accuracy of 68.11%, which is the lowest in this review, with the highest NAR of 55.6%, using the ST-FFT feature extraction method in a 45 km fiber cable; another instance of GMM using the contextual feature extraction method achieved 69.7% accuracy and a 31.2% NAR, which is still poor. For these latter two GMM cases, the results are not convincing compared to most algorithms discussed in our paper, and in both cases the authors did not state the reasons for such low accuracy and such high NARs.

In summary, most algorithms presented in this paper have managed to achieve a high classification accuracy of 90%–99.6%, some of them lie between 80% and 90%, and a few of them lie between 60% and 70%. Since these experiments are held in different environments under different conditions, it is unfair to jump to the conclusion that higher measurements always ensure better ϕ-OTDR system performance. However, the aim is to achieve as much accuracy (along with other confusion matrix parameters) as possible while maintaining the lowest NAR possible. A clear detailed summary of these methods is shown in Table 1.

A. Summary of the Comparative Analysis

Table 1 provides a descriptive analytical comparison between dozens of approaches applied in ϕ-OTDR with the main goal being to mitigate the major challenges facing ϕ-OTDR systems in different environments. There is no single optimal classifier for all ϕ-OTDR applications; the choice usually depends on a number of factors like the nature of the environment, the sensitivity of the sensors, the length of the sensing fiber cable, the application domain, and many others. However, in order to improve the efficiency and effectiveness of traditional ML algorithms, more attention should be paid to both the classification stage and the feature extraction stage, whereas for DL models we only need to concentrate on the classification stage, thanks to their automatic feature processing ability.

Currently, in order to accomplish the main ϕ-OTDR objectives, many attempts have been made, including the addition of intelligent and extra-sensitive sensors, overall ϕ-OTDR architecture improvements, the deployment of better signal processing methods for quality feature extraction, from traditional to modern techniques, and many others. However, most importantly, there has been a huge development in terms of event classification algorithms, ranging from conventional ML methods to DL classifiers that are capable of dealing with complex data generated in different application areas. As a matter of fact, the latter has been the basis of this paper.

The technique of combining two or more algorithms to form one coherent and robust classifier seems to work better, outperforming single-algorithm approaches in most cases; e.g., DL algorithms like the CNN in combination with methods like SVM, RF, or LSTM have demonstrated better results as compared to the CNN acting alone, provided other conditions remain constant. Even though most research works have presented good performance results, some works still lack adequate information on how particular results were attained, while some lack enough performance measures to evaluate the classifiers; for example, some articles claim higher event identification accuracy but do not report the NAR or the IDT, making it hard to judge the system’s performance based on accuracy alone. Some papers, despite having higher accuracy, lower NAR, and low IDT, do not state some useful experimental parameters they operated with, like the length of the sensing fiber, the number/types of events used for signal generation, or even the feature extraction method, which are key ingredients when assessing the success of ϕ-OTDR systems.

B. Future Works and Recommendation

According to our review, both classic ML and DL methods have demonstrated fair performances in ϕ-OTDR systems. Although DL models are not significantly better than their ML counterparts for the time being, they could likely be the future focus, as they can be greatly improved thanks to their unique merits, i.e., no need to design an extra feature extraction module, an end-to-end model, better transfer learning abilities, and potentially better performance with larger datasets, as for now the datasets are relatively small. Lowering the NAR while maintaining high accuracy is still an ongoing major challenge facing ϕ-OTDR systems, as most papers have managed to achieve high recognition accuracy but only a few of them have managed to record low NARs. We believe more attention might need to be paid to the algorithms that recorded lower NARs (below 10%), which include the PNN [65], recording just 1.5% while maintaining 98% accuracy, ANN [40], NC-SVM [97], XGBoost [77], F-ELM [80], ConvLSTM [47], and ATCN [70].

In future works, we believe the application of semi-supervised learning techniques should be recommended. Since semi-supervised learning models can work on huge amounts of unlabeled data with only a small amount of labeled data, this might be a huge advantage for improving the performance of ϕ-OTDR systems, especially for real-time applications where there is a limited amount of labeled data. Semi-supervised learning models might also be more useful since they can reduce data collection costs in environments where it is technically hard or very expensive to generate enough labeled data for training.

Additionally, ϕ-OTDR systems can be deployed in the wide range of applications discussed in Subsection 1.A of this paper. However, from our review, we have observed that most research works have focused on perimeter monitoring, followed by pipeline surveillance and high-speed railway tracking, while the rest of the areas have not been explored as much. So we
encourage future research works to explore other practical areas of ϕ-OTDR
applications, such as underwater surveillance, military applications, airport
surveillance, seismic wave prediction, and so forth.

Finally, almost every dataset in this review is small considering the requirements
of DL methods, and the datasets do not follow a uniform format. Also, unlike in the
computer science field, the code for each model is not always accessible outside
the respective research group. It would be very helpful for both scientific
research and industry in this area if a uniform, public, open database of datasets
were built and the code of the models were published for a fair comparison.

8. CONCLUSION

In this survey paper, we have first introduced the working mechanism of ϕ-OTDR. As
our main focus, we have reviewed and presented in detail the recent ML and DL
methods for event classification in ϕ-OTDR systems and also reviewed the feature
extraction methods used for signal processing in ϕ-OTDR systems. We have prepared
a table (Table 1) that summarizes each event classification algorithm reviewed
along with its important details, including the performance results. Finally, we
have discussed the performance of the event classification algorithms as presented
in recent papers, outlined some pros and cons of each of them, and recommended
possible directions for future work to improve the performance of ϕ-OTDR systems.

Funding. Fundamental Research Funds for the Central Universities (2020JBM024);
National Natural Science Foundation of China (61805008); Outstanding Chinese and
Foreign Youth Exchange Program of China Association of Science and Technology;
National Research Foundation Singapore (NRF) Central Gap Fund
(NRF2020NRF-CG001-040).

Acknowledgment. Deus F. Kandamali would like to extend his sincere gratitude to
the China Scholarship Council (CSC) scholarship for funding his Ph.D.

Disclosures. The authors declare no conflicts of interest.

Data availability. No data were generated or analyzed in the presented research.

REFERENCES

1. K. Yüksel, J. Jason, and M. Wuilpart, “Development of a phase-OTDR interrogator based on coherent detection scheme,” Uludağ Univ. J. Fac. Eng. 23, 355–370 (2018).
2. M. M. Sherif, E. M. Khakimova, J. Tanks, and O. E. Ozbulut, “Cyclic flexural behavior of hybrid SMA/steel fiber reinforced concrete analyzed by optical and acoustic techniques,” Compos. Struct. 201, 248–260 (2018).
3. Y. Wang, H. Yuan, X. Liu, Q. Bai, H. Zhang, Y. Gao, and B. Jin, “A comprehensive study of optical fiber acoustic sensing,” IEEE Access 7, 85821–85837 (2019).
4. K. O. Hill, Y. Fujii, D. C. Johnson, and B. S. Kawasaki, “Photosensitivity in optical fiber waveguides: application to reflection filter fabrication,” Appl. Phys. Lett. 32, 647–649 (1978).
5. C. Li, Z. Mei, J. Tang, K. Yang, and M. Yang, “Distributed acoustic sensing system based on broadband ultra-weak fiber Bragg grating array,” in 26th International Conference on Optical Fiber Sensors (Optical Society of America, 2018), paper ThE14.
6. S. K. Ibrahim, M. Farnan, D. M. Karabacak, and J. M. Singer, “Enabling technologies for fiber optic sensing,” Proc. SPIE 9899, 98990Z (2016).
7. Y.-N. Tan, Y. Zhang, and B.-O. Guan, “Simultaneous measurement of temperature, hydrostatic pressure and acoustic signal using a single distributed Bragg reflector fiber laser,” Proc. SPIE 7753, 77539S (2011).
8. C. Wang, Y. Shang, X. Liu, C. Wang, H. Wang, and G. Peng, “Interferometric distributed sensing system with phase optical time-domain reflectometry,” Photon. Sens. 7, 157–162 (2017).
9. M. Chojnacki and N. Palka, “Demodulation of output signals from unbalanced fibre optic Michelson interferometer,” in Modern Problems of Radio Engineering, Telecommunications and Computer Science (IEEE Cat. No.02EX542) (2002), pp. 249–250.
10. X. Liu, C. Wang, Y. Shang, C. Wang, W. Zhao, G. Peng, and H. Wang, “Distributed acoustic sensing with Michelson interferometer demodulation,” Photon. Sens. 7, 193–198 (2017).
11. O. Kilic, M. J. F. Digonnet, G. S. Kino, and O. Solgaard, “Miniature photonic-crystal hydrophone optimized for ocean acoustics,” J. Acoust. Soc. Am. 129, 1837–1850 (2011).
12. J. Leng and A. Asundi, “Structural health monitoring of smart composite materials by using EFPI and FBG sensors,” Sens. Actuators A Phys. 103, 330–340 (2003).
13. L. Liu, P. Lu, S. Wang, X. Fu, Y. Sun, D. Liu, J. Zhang, H. Xu, and Q. Yao, “UV adhesive diaphragm-based FPI sensor for very-low-frequency acoustic sensing,” IEEE Photon. J. 8, 1–9 (2016).
14. F. Wang, Z. Shao, Z. Hu, H. Luo, J. Xie, and Y. Hu, “Micromachined fiber optic Fabry-Perot underwater acoustic probe,” Proc. SPIE 9283, 52–58 (2014).
15. H. Wu, M. Yang, S. Yang, H. Lu, C. Wang, and Y. Rao, “A novel DAS signal recognition method based on spatiotemporal information extraction with 1DCNNs-BiLSTM network,” IEEE Access 8, 119448 (2020).
16. L. Tie-Gen, Y. Zhe, J. Jun-Feng, L. Kun, Z. Xue-Zhi, D. Zhen-Yang, W. Shunag, H. Hao-Feng, H. Qun, Z. Hong-Xia, and L. Zhi-Hong, “Advances of some critical technologies in discrete and distributed optical fiber sensing research,” Acta Phys. Sin. 66, 070705 (2017).
17. Q. Chen, C. Jin, Y. Bao, Z. Li, J. Li, C. Lu, L. Yang, and G. Li, “A distributed fiber vibration sensor utilizing dispersion induced walk-off effect in a unidirectional Mach-Zehnder interferometer,” Opt. Express 22, 2167–2173 (2014).
18. W. Fang, Q. Jia, S. Zhen, J. Chen, X. Cheng, and B. Yu, “Low coherence fiber differentiating interferometer and its passive demodulation schemes,” Opt. Fiber Technol. 21, 34–39 (2015).
19. M. Zyczkowski, M. Szustakowski, N. Palka, and M. Kondrat, “Fiber optic perimeter protection sensor with intruder localization,” Proc. SPIE 5611, 71–78 (2004).
20. R. Zinsou, X. Liu, Y. Wang, J. Zhang, Y. Wang, and B. Jin, “Recent progress in the performance enhancement of phase-sensitive OTDR vibration sensing systems,” Sensors (Switzerland) 19, 1709 (2019).
21. H. Wu, S. Xiao, X. Li, Z. Wang, J. Xu, and Y. Rao, “Separation and determination of the disturbing signals in phase-sensitive optical time domain reflectometry (Φ-OTDR),” J. Lightwave Technol. 33, 3156–3162 (2015).
22. M. R. Fernández-Ruiz, M. A. Soto, E. F. Williams, S. Martin-Lopez, Z. Zhan, M. Gonzalez-Herraez, and H. F. Martins, “Distributed acoustic sensing for seismic activity monitoring,” APL Photon. 5, 030901 (2020).
23. S. Merlo, P. Malcovati, M. Norgia, A. Pesatori, C. Svelto, A. Pniov, A. Zhirnov, E. Nesterov, and V. Karassik, “Runways ground monitoring system by phase-sensitive optical-fiber OTDR,” in IEEE International Workshop on Metrology for AeroSpace (MetroAeroSpace) (2017), pp. 523–529.
24. Y. Yang, H. Zhang, and Y. Li, “Long-distance pipeline safety early warning: a distributed optical fiber sensing semi-supervised learning method,” IEEE Sens. J. 21, 19453–19461 (2021).
25. Z. Peng, J. Jian, H. Wen, A. Gribok, M. Wang, H. Liu, S. Huang, Z.-H. Mao, and K. P. Chen, “Distributed fiber sensor and machine learning data analytics for pipeline protection against extrinsic intrusions and intrinsic corrosions,” Opt. Express 28, 27277–27292 (2020).
26. J. Chen, H. Wu, X. Liu, Y. Xiao, M. Wang, M. Yang, and Y. Rao, “A real-time distributed deep learning approach for intelligent event recognition in long distance pipeline monitoring with DOFS,”
in Proceedings International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) (2018), pp. 290–296.
27. Y. Yang, Y. Li, T. Zhang, Y. Zhou, and H. Zhang, “Early safety warnings for long-distance pipelines: a distributed optical fiber sensor machine learning approach,” Proc. AAAI Conf. Artif. Intell. 35, 14991–14999 (2021).
28. F. Peng, H. Wu, X.-H. Jia, Y.-J. Rao, Z.-N. Wang, and Z.-P. Peng, “Ultra-long high-sensitivity Φ-OTDR for high spatial resolution intrusion detection of pipelines,” Opt. Express 22, 13804–13810 (2014).
29. Z. Zhao, D. Liu, L. Wang, and S. Liu, Feature Extraction and Identification of Pipeline Intrusion Based on Phase-Sensitive Optical Time Domain Reflectometer BT—Wireless and Satellite Systems, M. Jia, Q. Guo, and W. Meng, eds. (Springer International Publishing, 2019), pp. 665–675.
30. H. Wu, X. Liu, Y. Xiao, and Y. Rao, “A dynamic time sequence recognition and knowledge mining method based on the hidden Markov models (HMMs) for pipeline safety monitoring with Φ-OTDR,” J. Lightwave Technol. 37, 4991–5000 (2019).
31. J. Li, Y. Wang, P. Wang, Q. Bai, Y. Gao, H. Zhang, and B. Jin, “Pattern recognition for distributed optical fiber vibration sensing: a review,” IEEE Sens. J. 21, 11983–11998 (2021).
32. Y. Shi, Y. Wang, L. Zhao, and Z. Fan, “An event recognition method for Φ-OTDR sensing system based on deep learning,” Sensors (Switzerland) 19, 3421 (2019).
33. Y. Bai, J. Xing, F. Xie, S. Liu, and J. Li, “Detection and identification of external intrusion signals from 33 km optical fiber sensing system based on deep learning,” Opt. Fiber Technol. 53, 102060 (2019).
34. Z. Wang, H. Zheng, L. Li, J. Liang, X. Wang, B. Lu, Q. Ye, R. Qu, and H. Cai, “Practical multi-class event classification approach for distributed vibration sensing using deep dual path network,” Opt. Express 27, 23682–23692 (2019).
35. J. Tejedor, J. Macias-Guarasa, H. F. Martins, S. Martin-Lopez, and M. Gonzalez-Herraez, “A Gaussian mixture model-hidden Markov model (GMM-HMM)-based fiber optic surveillance system for pipeline integrity threat detection,” in Optics InfoBase Conference Papers (2018), Part F124, pp. 3–6.
36. H. Maral and M. Aktaş, “Field independent target classification analysis in distributed acoustic sensing systems,” in 27th Signal Processing and Communications Applications Conference (SIU) (2019), pp. 1–4.
37. H. Wu, J. Chen, X. Liu, Y. Xiao, M. Wang, Y. Zheng, and Y. Rao, “One-dimensional CNN-based intelligent recognition of vibrations in pipeline monitoring with DAS,” J. Lightwave Technol. 37, 4359–4366 (2019).
38. J. Tejedor, J. Macias-Guarasa, H. F. Martins, D. Piote, J. Pastor-Graells, S. Martin-Lopez, P. Corredera, and M. Gonzalez-Herraez, “A novel fiber optic based surveillance system for prevention of pipeline integrity threats,” Sensors (Switzerland) 17, 355 (2017).
39. H. F. Martins, D. Piote, J. Tejedor, J. Macias-Guarasa, J. Pastor-Graells, S. Martin-Lopez, P. Corredera, F. De Smet, W. Postvoll, C. H. Ahlen, and M. Gonzalez-Herraez, “Early detection of pipeline integrity threats using a smart fiber optic surveillance system: the PIT-STOP project,” Proc. SPIE 9634, 96347X (2015).
40. H. Wu, Y. Qian, W. Zhang, and C. Tang, “Feature extraction and identification in distributed optical-fiber vibration sensing system for oil pipeline safety monitoring,” Photon. Sens. 7, 305–310 (2017).
41. J. Tejedor, J. Macias-Guarasa, H. F. Martins, J. Pastor-Graells, P. Corredera, and S. Martin-Lopez, “Machine learning methods for pipeline surveillance systems based on distributed acoustic sensing: a review,” Appl. Sci. 7, 841 (2017).
42. R. Peng, Z. Liu, and S. Li, “Perimeter monitoring of urban buried pipeline subject to third-party intrusion based on fiber optic sensing and convolutional neural network,” Proc. SPIE 11209, 112091Z (2019).
43. J. Tejedor, H. F. Martins, D. Piote, J. Macias-Guarasa, J. Pastor-Graells, S. Martin-Lopez, P. C. Guillén, F. De Smet, W. Postvoll, and M. González-Herráez, “Toward prevention of pipeline integrity threats using a smart fiber-optic surveillance system,” J. Lightwave Technol. 34, 4445–4453 (2016).
44. Y. Hu, Z. Meng, M. Zabihi, Y. Shan, S. Fu, F. Wang, X. Zhang, Y. Zhang, and B. Zeng, “Performance enhancement methods for the distributed acoustic sensors based on frequency division multiplexing,” Electronics 8, 617 (2019).
45. A. Lv and J. Li, “On-line monitoring system of 35 kV 3-core submarine power cable based on ϕ-OTDR,” Sens. Actuators A Phys. 273, 134–139 (2018).
46. M. L. Filograno, C. Riziotis, and M. Kandyla, “A low-cost phase-OTDR system for structural health monitoring: design and instrumentation,” Instruments 3, 46 (2019).
47. Z. Li, J. Zhang, M. Wang, Y. Zhong, and F. Peng, “Fiber distributed acoustic sensing using convolutional long short-term memory network: a field test on high-speed railway intrusion detection,” Opt. Express 28, 2925–2938 (2020).
48. J. Jason, K. Yüksel, and M. Wuilpart, “Laboratory evaluation of a phase-OTDR setup for railway monitoring applications,” in IEEE Photonics Society, 22nd Annual Symposium (2017), pp. 2–6.
49. M. He, L. Feng, and J. Fan, “A method for real-time monitoring of running trains using Φ-OTDR and the improved Canny,” Optik (Stuttgart) 184, 356–363 (2019).
50. L. Shiloh, A. Eyal, and R. Giryes, “Deep learning approach for processing fiber-optic DAS seismic data,” in Optics InfoBase Conference Papers (2018), Part F124.
51. A. V. Timofeev and D. I. Groznov, “Classification of seismoacoustic emission sources in fiber optic systems for monitoring extended objects,” Optoelectron. Instrum. Data Process. 56, 50–60 (2020).
52. I. Ölçer and A. Öncü, “Adaptive temporal matched filtering for noise suppression in fiber optic distributed acoustic sensing,” Sensors (Switzerland) 17, 1288 (2017).
53. S. Grosswig, H. Dijk, M. Den Hartogh, T. Pfeiffer, M. Rembe, M. Perk, and L. Domurath, “Leakage detection in a casing string of a brine production well by means of simultaneous fibre optic DTS/DAS measurements,” Oil Gas Eur. Mag. 45(4), 161–169 (2019).
54. S. A. Abufana, Y. Dalveren, A. Aghnaiya, and A. Kara, “Variational mode decomposition-based threat classification for fiber optic distributed acoustic sensing,” IEEE Access 8, 100152–100158 (2020).
55. C. Xu, J. Guan, M. Bao, J. Lu, and W. Ye, “Pattern recognition based on time-frequency analysis and convolutional neural networks for vibrational events in ϕ-OTDR,” Opt. Eng. 57, 016103 (2018).
56. T. Wen, P. Zhu, W. Ye, M. Bao, and J. Guan, “Application of graphics processing unit parallel computing in pattern recognition for vibration events based on a phase-sensitive optical time domain reflectometer,” Appl. Opt. 58, 7127–7133 (2019).
57. A. K. Fedorov, M. N. Anufriev, A. A. Zhirnov, K. V. Stepanov, E. T. Nesterov, D. E. Namiot, V. E. Karasik, and A. B. Pnev, “Note: Gaussian mixture model for event recognition in optical time-domain reflectometry based sensing systems,” Rev. Sci. Instrum. 87, 036107 (2016).
58. H. Wu, X. Li, Z. Peng, and Y. Rao, “A novel intrusion signal processing method for phase-sensitive optical time-domain reflectometry (Φ-OTDR),” Proc. SPIE 9157, 915750 (2014).
59. M. Adeel, C. Shang, K. Zhu, and C. Lu, “Nuisance alarm reduction: using a correlation based algorithm above differential signals in direct detected phase-OTDR systems,” Opt. Express 27, 7685 (2019).
60. L. Y. Shao, S. Liu, S. Bandyopadhyay, F. Yu, W. Xu, C. Wang, H. Li, M. I. Vai, L. Du, and J. Zhang, “Data-driven distributed optical vibration sensors: a review,” IEEE Sens. J. 20, 6224–6239 (2020).
61. J. Wu, Z. Peng, M. Wang, R. Cao, M. J. Li, H. Wen, H. Liu, and K. P. Chen, “Fabrication of ultra-weak fiber Bragg grating (UWFBG) in single-mode fibers through Ti-doped silica outer cladding for distributed acoustic sensing,” in Optical Sensors and Sensing Congress (ES, FTS, HISE, Sensors), OSA Technical Digest (Optica Publishing Group, 2019), paper ETh1A.4.
62. O. V. Butov, Y. K. Chamorovskii, K. M. Golant, A. A. Fotiadi, J. Jason, S. M. Popov, and M. Wuilpart, “Sensitivity of high Rayleigh scattering fiber in acoustic/vibration sensing using phase-OTDR,” Proc. SPIE 10680, 106801B (2018).
63. M. Bublin, “Event detection for distributed acoustic sensing: combining knowledge-based, classical machine learning, and deep learning approaches,” Sensors 21, 7527 (2021).
64. L. Shiloh, A. Eyal, S. Member, and R. Giryes, “Efficient processing of distributed acoustic sensing data using a deep learning approach,” J. Lightwave Technol. 37, 4755–4762 (2019).
65. L. Wu, “Study on the fiber-optic perimeter sensor signal processor based on neural network classifier,” in Proceedings—IEEE 2011 10th International Conference on Electronic Measurement & Instruments (ICEMI) (2011), Vol. 1, pp. 93–97.
66. J. George, L. Mary, and K. S. Riyas, “Vehicle detection and classification from acoustic signal using ANN and KNN,” in International Conference on Control Communication & Computing (ICCC) (2013), pp. 436–439.
67. M. Aktas, T. Akgun, M. U. Demircin, and D. Buyukaydin, “Deep learning based multi-threat classification for phase-OTDR fiber optic distributed acoustic sensing applications,” Proc. SPIE 10208, 102080G (2017).
68. A. V. Makarenko, “Deep learning algorithms for signal recognition in long perimeter monitoring distributed fiber optic sensors,” IEEE International Workshop on Machine Learning for Signal Processing (MLSP), November 2016, pp. 1–11.
69. S. Wang, F. Liu, and B. Liu, “Research on application of deep convolutional network in high-speed railway track inspection based on distributed fiber acoustic sensing,” Opt. Commun. 492, 126981 (2021).
70. M. Tian, H. Dong, and K. Yu, “Attention based Temporal convolutional network for ϕ-OTDR event classification,” in 19th International Conference on Optical Communications and Networks (ICOCN) (2021), pp. 1–3.
71. C. Cao, X. Fan, Q. Liu, and Z. He, “Practical pattern intrusion monitoring distributed optical fiber recognition system for based on Φ-COTDR,” ZTE Commun. 27, 2282–2283 (2017).
72. X. X. Qi, J. W. Ji, X. W. Han, and Z. H. Yuan, “An approach of passive vehicle type recognition by acoustic signal based on SVM,” in 3rd International Conference on Genetic and Evolutionary Computing (WGEC) (2009), pp. 545–548.
73. C. Wiesmeyr, M. Litzenberger, M. Waser, A. Papp, H. Garn, G. Neunteufel, and H. Döller, “Real-time train tracking from distributed acoustic sensing data,” Appl. Sci. 10, 448 (2020).
74. C. Xu, J. Guan, M. Bao, J. Lu, and W. Ye, “Pattern recognition based on enhanced multifeature parameters for vibration events in ϕ-OTDR distributed optical fiber sensing system,” Microw. Opt. Technol. Lett. 59, 3134–3141 (2017).
75. Q. Sun, H. Feng, X. Yan, and Z. Zeng, “Recognition of a phase-sensitivity OTDR sensing system based on morphologic feature extraction,” Sensors 15, 15179–15197 (2015).
76. Y. Wang, P. Wang, K. Ding, H. Li, J. Zhang, X. Liu, Q. Bai, D. Wang, and B. Jin, “Pattern recognition using relevant vector machine in optical fiber vibration sensing system,” IEEE Access 7, 5886–5895 (2019).
77. Z. Wang, S. Lou, S. Liang, and X. Sheng, “Multi-class disturbance events recognition based on EMD and XGBoost in ϕ-OTDR,” IEEE Access 8, 63551–63558 (2020).
78. X. Wang, Y. Liu, S. Liang, W. Zhang, and S. Lou, “Event identification based on random forest classifier for Φ-OTDR fiber-optic distributed disturbance sensor,” Infrared Phys. Technol. 97, 319–325 (2019).
79. J. Wang, Y. Hu, and Y. Shao, “The digging signal identification by the random forest algorithm in the phase-OTDR technology,” IOP Conf. Ser. Mater. Sci. Eng. 394, 032005 (2018).
80. H. Jia, S. Lou, S. Liang, and X. Sheng, “Event identification by F-ELM model for ϕ-OTDR fiber-optic distributed disturbance sensor,” IEEE Sens. J. 20, 1297–1305 (2020).
81. H. Wu, S. Yang, X. Liu, C. Xu, H. Lu, C. Wang, K. Qin, Z. Wang, Y. J. Rao, and A. O. Olaribigbe, “Simultaneous extraction of multi-scale structural features and the sequential information with an end-to-end mCNN-HMM combined model for DAS,” J. Lightwave Technol. 39, 6606–6616 (2021).
82. X. Chen and C. Xu, “Disturbance pattern recognition based on an ALSTM in a long-distance ϕ-OTDR sensing system,” Microw. Opt. Technol. Lett. 62, 168–175 (2020).
83. F. Uyar, T. Onat, C. Unal, T. Kartaloglu, E. Ozbay, and I. Ozdur, “A direct detection fiber optic distributed acoustic sensor with a mean SNR of 7.3 dB at 102.7 km,” IEEE Photon. J. 11, 1–8 (2019).
84. A. Ahmed and Z. Yu, “Research status of distributed optical fiber sensing system based on phase-sensitive optical time domain reflectometry,” Int. J. Sci. Res. 9, 834–840 (2019).
85. Y. Muanenda, “Recent advances in distributed acoustic sensing based on phase-sensitive optical time domain reflectometry,” J. Sens. 2018, 3897873 (2018).
86. Z. Pan, K. Liang, Q. Ye, H. Cai, R. Qu, and Z. Fang, “Phase-sensitive OTDR system based on digital coherent detection,” in Asia Communications and Photonics Conference and Exhibition (ACP) (2011), pp. 1–6.
87. F. Uyar, T. Onat, C. Unal, T. Kartaloglu, I. Ozdur, and E. Ozbay, “94.8 km-range direct detection fiber optic distributed acoustic sensor,” in Conference on Lasers Electro-Optics (CLEO) (2019), pp. 2–3.
88. H. He, L. Yan, H. Qian, Y. Zhou, X. Zhang, B. Luo, W. Pan, X. Fan, and Z. He, “Suppression of the interference fading in phase-sensitive OTDR with phase-shift transform,” J. Lightwave Technol. 8724, 295–302 (2020).
89. X. Liang, Z. Ge, L. Sun, M. He, and H. Chen, “LSTM with wavelet transform based data preprocessing for stock price prediction,” Math. Probl. Eng. 2019, 1340174 (2019).
90. Z. Qin, L. Chen, and X. Bao, “Continuous wavelet transform for non-stationary vibration detection with phase-OTDR,” Opt. Express 20, 20459–20465 (2012).
91. K. Liu, P. Ma, J. An, Z. Li, J. Jiang, P. Li, L. Zhang, and T. Liu, “Endpoint detection of distributed fiber sensing systems based on STFT algorithm,” Opt. Laser Technol. 114, 122–126 (2019).
92. S. Liang, X. Sheng, and S. Lou, “Experimental investigation on lower nuisance alarm rate phase-sensitive OTDR using the combination of a Mach–Zehnder interferometer,” Infrared Phys. Technol. 75, 117–123 (2016).
93. Z. Rehman, M. T. Mirza, A. Khan, and H. Xhaard, “Predicting G-protein-coupled receptors families using different physiochemical properties and pseudo amino acid composition,” in G Protein Coupled Receptors, P. Conn, ed. (Academic, 2013), Vol. 522, Chap. 4, pp. 61–79.
94. S. K. Satapathy, S. Dehuri, A. K. Jagadev, and S. Mishra, “Introduction,” in EEG Brain Classification for Epileptic Seizure Detection, S. K. Satapathy, S. Dehuri, A. K. Jagadev, and S. Mishra, eds. (Academic, 2019), Chap. 1, pp. 1–25.
95. B. Mohebali, A. Tahmassebi, A. Meyer-Baese, and A. H. Gandomi, “Probabilistic neural networks: a brief overview of theory, implementation, and application,” in Handbook of Probabilistic Models, P. Samui, D. Tien Bui, S. Chakraborty, and R. C. Deo, eds. (Butterworth-Heinemann, 2020), Chap. 14, pp. 347–367.
96. Y. Tang, “Deep learning using linear support vector machines,” arXiv:1306.0239 (2013).
97. H. Jia, S. Liang, S. Lou, and X. Sheng, “A k-nearest neighbor algorithm-based near category support vector machine method for event identification of ϕ-OTDR,” IEEE Sens. J. 19, 3683–3689 (2019).
98. J. Hu and P. W. Tse, “A relevance vector machine-based approach with application to oil sand pump prognostics,” Sensors (Basel) 13, 12663–12686 (2013).
99. D. Shrivastava, S. Sanyal, A. K. Maji, and D. Kandar, “Bone cancer detection using machine learning techniques,” in Smart Healthcare for Disease Diagnosis and Prevention, S. Paul and D. Bhatia, eds. (Academic, 2020), Chap. 17, pp. 175–183.
100. B. Williams, C. Halloin, W. Löbel, F. Finklea, E. Lipke, R. Zweigerdt, and S. Cremaschi, “Data-driven model development for cardiomyocyte production experimental failure prediction,” in 30 European Symposium on Computer Aided Process Engineering, S. Pierucci, F. Manenti, G. L. Bozzano, and D. Manca, eds. (Elsevier, 2020), Vol. 48, pp. 1639–1644.
101. S. Misra and H. Li, “Noninvasive fracture characterization based on the classification of sonic wave travel times,” in Machine Learning for Subsurface Characterization, S. Misra, H. Li, and J. He, eds. (Gulf Professional Publishing, 2020), Chap. 9, pp. 243–287.
102. Z.-L. Sun, H. Wang, W.-S. Lau, G. Seet, and D. Wang, “Application of BW-ELM model on traffic sign recognition,” Neurocomputing 128, 153–159 (2014).
103. F. K. Inaba, E. O. Teatini Salles, S. Perron, and G. Caporossi, “DGR-ELM–distributed generalized regularized ELM for classification,” Neurocomputing 275, 1522–1530 (2018).
104. C. Cao, X. Fan, Q. Liu, and Z. He, “Practical pattern recognition system for distributed optical fiber intrusion monitoring system based on phase-sensitive coherent OTDR,” in Asia Communication Photonics Conference (ACPC) (2015), pp. 2–4.
105. M. Zhang, Y. Li, J. Chen, Y. Song, J. Zhang, and M. Wang, “Event detection method comparison for distributed acoustic sensors using ϕ-OTDR,” Opt. Fiber Technol. 52, 101980 (2019).
106. P. Kannadaguli and V. Bhat, “A comparison of Gaussian mixture modeling (GMM) and hidden Markov modeling (HMM) based approaches for automatic phoneme recognition in Kannada,” in International Conference on Signal Processing and Communication (ICSC) (2015), pp. 257–260.
107. B.-J. Yoon, “Hidden Markov models and their applications in biological sequence analysis,” Curr. Genomics 10, 402–415 (2009).
108. Y. C. Manie, J. W. Li, P. C. Peng, R. K. Shiu, Y. Y. Chen, and Y. T. Hsu, “Using a machine learning algorithm integrated with data de-noising techniques to optimize the multipoint sensor network,” Sensors (Switzerland) 20, 1070 (2020).
109. Z. Jiang, Y. Lai, J. Zhang, H. Zhao, and Z. Mao, “Multi-factor operating condition recognition using 1D convolutional long short-term network,” Sensors (Switzerland) 19, 5488 (2019).
110. Ö. Yildirim, “A novel wavelet sequences based on deep bidirectional LSTM network model for ECG signal classification,” Comput. Biol. Med. 96, 189–202 (2018).
111. B. Rosenhahn and B. Andres, Pattern Recognition, D. Hutchison and T. Kanade, eds. (Springer Nature, 2016).
112. T. N. Sainath, A. R. Mohamed, B. Kingsbury, and B. Ramabhadran, “Deep convolutional neural networks for LVCSR,” in IEEE International Conference on Acoustics, Speech and Signal Processing—Proceedings (ICASSP) (2013), pp. 8614–8618.
113. J. Yoon, D. Jarrett, and M. van der Schaar, “Time-series generative adversarial networks,” in Advances in Neural Information Processing Systems (2019), Vol. 32, pp. 1–11.
114. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems (2014), Vol. 3, pp. 2672–2680.
