1. Introduction
Pipelines play an important role in the industrial sector by distributing fluids and gases to various locations. However, pipelines are susceptible to defects, including corrosion cracks, fatigue cracks, stress cracks, and structural discontinuities, which can lead to leakage [1,2,3]. The consequences of such leaks are severe: they waste valuable resources, endanger public safety, and cause substantial economic losses [4]. In certain regions, leaks in pipeline networks account for the loss of over 40% of water resources [5]. In 2013, pipeline leaks in China caused 130 injuries and 60 fatalities. Given these consequences, detecting and locating leaks early is of the utmost importance. The backfill slurry within a pipeline can produce high-pressure conditions, sometimes reaching several megapascals. This heightened pressure intensifies the risk of pipeline rupture, bursts, and other failures. Such mishaps can cause roadway pollution, necessitate additional pipe cleaning and borehole sealing, and lead to delays in mining backfilling operations, financial losses, and human fatalities [6,7]. Therefore, the real-time detection of abnormal conditions in a pipeline is of paramount importance, as it allows for an immediate response that can significantly mitigate the associated losses [8].
Historically, various nondestructive testing methods have been employed to identify and localize pipeline leakage, including pressure monitoring [9], ultrasonic surveillance [10], time-domain reflectometry [11], accelerometer-based monitoring, and acoustic emission (AE)-based surveillance [12]. In industry, AE is the most widely favored of these techniques, primarily because of its real-time feedback, high sensitivity, prompt leak detection, and straightforward implementation [13]. Given these benefits, AE technology is the focus of the present investigation, where it is used to detect and localize pipeline leaks.
In the relevant leak detection algorithms, the conventional practice often involves mass or volume balancing, which quantifies the difference in mass or volume flow between the two ends of a pipeline [14]. Pressure point analysis (PPA), conversely, represents a more streamlined approach, comparing the observed flow rate or pressure with the statistical trend of historical data [15]. Currently, AI and machine learning (ML) methodologies are being extensively deployed across a multitude of smart systems, primarily for detection and prediction tasks [16,17,18,19]. Wang et al. [20] designed a refined deep convolutional neural network, integrated with a long short-term memory (LSTM) model, for forecasting crack development in natural gas pipelines. Li et al. [21] proposed a forecasting model for estimating the maximum depth of pitting corrosion in offshore oil pipelines, leveraging an LSTM method integrated with a sparrow search algorithm. AI-based leak detection techniques can be broadly grouped into three categories according to the feature domain: time domain (TD), frequency domain (FD), and time-frequency domain (TFD) [22]. Li et al. [23] detected operational states in pipelines utilizing an artificial neural network (ANN) [24] model trained on TD statistical attributes. However, the AE signals from a pipeline are significantly affected by background interference and attenuation. Consequently, the TD distribution of the AE signal varies, rendering the statistical characteristics vulnerable to these variations and potentially leading to false-positive indications of pipeline conditions. When a pipeline leaks, it discharges elastic energy in the form of stress waves and pressure waves. The interaction of these stress and pressure waves distorts the frequency spectrum [25]. Hence, the FD features of the AE signal can accurately reflect the conditions of the pipeline. Zhou et al. (2021) [26] used empirical mode decomposition (EMD) and continuous wavelet transforms (CWTs), both time-frequency methods, to accurately localize a leak. Similarly, T. Xu et al. (2021) [27] implemented variational mode decomposition (VMD) as an efficient strategy for denoising AE signals. They then extracted a distinct feature set from the highly correlated VMD coefficients via Mel-frequency cepstral coefficients (MFCCs), which were classified with a support vector machine (SVM) to evaluate the pipeline's condition. While these approaches significantly improved the accuracy of pipeline leak detection, they have drawbacks. For instance, the inherent noise of the AE signal can trigger false alarms if the threshold defined for feature extraction is poorly chosen. Beyond threshold determination, extracting features from AE signals demands extensive domain expertise. Using EMD to extract the intrinsic modes of AE signals also carries the risk of severe interpolation errors [28], and mode mixing poses an additional challenge.
Extensive research has been conducted utilizing ML and deep learning (DL) to distinguish pipeline leaks from normal operation. Various techniques have been applied, including linear discriminant analysis (LDA), K-nearest neighbor (KNN) [29], random forest [30], and SVM [31]. However, these methods typically require significant manual effort to select essential data features, which can be time-consuming. Recent studies have focused on deep learning, which can automate this process. Advanced techniques, such as one-dimensional convolutional neural networks (CNNs), recurrent neural networks (RNNs), LSTM, and gated recurrent units (GRUs), can achieve superior results compared to traditional ML approaches.
The primary contributions of this study are as follows.
- I. Proposed an automatic pipeline leakage detection system for the efficient transportation of liquid (water) and gas across cities.
- II. Proposed a time-series-based intelligent framework leveraging the AE signal for pipeline leakage detection and leak size identification.
- III. Developed a multi-sensor-input deep learning approach based on three sequential deep learning models for predicting the leakage status of the pipeline: long short-term memory (LSTM), bi-directional LSTM (Bi-LSTM), and gated recurrent units (GRUs).
- IV. Compared the performance of LSTM, Bi-LSTM, and GRU for pipeline leak detection and leak size identification, considering two different transportation media (fluid and gas) at different pressure levels.
- V. Utilized real-world industrial fluid pipeline data for leak detection and size identification using advanced DL models.
- VI. Fine-tuned and calibrated the settings of the sequential deep learning algorithms to determine the optimal parameters and achieve high accuracy in classifying leak and normal conditions in the pipeline.
The remainder of this paper is structured as follows. Section 2 presents the proposed framework and the sequential deep learning models for leakage detection. Section 3 presents the pipeline experimental tests and results. Section 4 concludes the study.
2. The Proposed Framework for Pipeline Leakage Detection Leveraging AE Sensors and Time-Series-Based Sequential Deep Learning
This section outlines the experimental setup for AE sensor data from pipeline systems. It also provides a comprehensive overview of the DL models employed for identifying leaks and pinhole sizes.
Figure 1 shows a flow diagram of the proposed technique. The proposed approach comprises the following steps:
Step 1: AE sensors are strategically placed along pipelines to capture high-frequency acoustic signals that indicate potential leaks. Data are acquired at 1M samples/second and subsequently decimated to 4K samples/second for effective real-time processing.
Step 2: The high-frequency signals are first decimated to reduce the data volume while retaining the critical information needed for deep learning analysis. The decimated signal is then preprocessed in three stages: a discrete wavelet transform (DWT) denoises and cleans the signal, a moving average filter smooths it by reducing short-term variations, and normalization scales it to a specific range, ensuring uniformity across analyses.
Step 3: After normalization, the data are fed into a time-series deep learning model designed to classify pipeline conditions, distinguishing between normal and leak scenarios. The model exploits the sequential and temporal patterns within the normalized data through its LSTM, Bi-LSTM, or GRU layers to identify and classify the state of the pipeline. By analyzing the sequence of data points over time, the model can predict potential leaks with high accuracy, ensuring that timely preventive measures and maintenance can be initiated.
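The preprocessing chain of Step 2 can be illustrated with a minimal NumPy sketch. The paper does not state the wavelet family, decomposition level, threshold rule, moving-average window, or normalization range, so the one-level Haar transform, the universal soft threshold, the window of 5, the block-mean decimation, and the [0, 1] range used here are all assumptions chosen only for illustration; the function names are hypothetical.

```python
import numpy as np

def decimate_by_mean(x, factor):
    """Reduce the sample rate by averaging non-overlapping blocks (assumed scheme)."""
    n = (len(x) // factor) * factor
    return x[:n].reshape(-1, factor).mean(axis=1)

def haar_dwt_denoise(x):
    """One-level Haar DWT with soft thresholding of the detail coefficients."""
    n = len(x) - (len(x) % 2)                     # Haar pairs need an even length
    pairs = x[:n].reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    # Universal threshold with a median-based noise estimate (assumed choice)
    sigma = np.median(np.abs(detail)) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(n))
    detail = np.sign(detail) * np.maximum(np.abs(detail) - thr, 0.0)
    out = np.empty(n)
    out[0::2] = (approx + detail) / np.sqrt(2.0)  # inverse Haar transform
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def moving_average(x, window):
    """Smooth short-term variations with a centered moving average."""
    return np.convolve(x, np.ones(window) / window, mode="same")

def min_max_normalize(x):
    """Scale the signal to the range [0, 1]."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def preprocess(raw, factor=250, window=5):
    """Decimate (1M -> 4K samples/second for factor=250), denoise, smooth, normalize."""
    x = decimate_by_mean(np.asarray(raw, dtype=float), factor)
    x = haar_dwt_denoise(x)
    x = moving_average(x, window)
    return min_max_normalize(x)

# One second of synthetic noise standing in for a raw AE recording
rng = np.random.default_rng(0)
one_second = preprocess(rng.normal(size=1_000_000))
print(one_second.shape)
```

With a decimation factor of 250, one second of data at 1M samples/second becomes 4,000 samples, matching the rates quoted in Step 1.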
2.1. The Time-Series Sequential Models for Pipeline Leakage Detection Modeling
In pipeline leak detection, DL techniques play a primary role in automatically identifying features from AE signals. To enhance the accuracy of signal categorization, multi-layered neural networks can be implemented [32]. However, when working with limited datasets, issues like overfitting and performance degradation may arise. Hyperparameters such as the learning rate, batch size, and weight decay may need adjustment to overcome these issues. Thus, the AE sensor-based system for pipeline leak detection must use an appropriate number of neural layers to achieve the desired performance level. This research utilized sequential DL models that are efficient in handling sparse datasets and require less computational power. The sequential DL models used in this study are LSTM, GRU, and Bi-LSTM.
2.1.1. Long Short-Term Memory (LSTM)
LSTM networks are advanced versions of RNNs. They were developed by Hochreiter and Schmidhuber and feature several gate structures capable of simultaneously storing short-term and long-term data. This feature enables them to efficiently overcome the issues of vanishing and exploding gradients, which are common during the training of RNN models. Three essential gates define the LSTM networks: the input gate, forget gate, and output gate.
They control the propagation of input and output data, selectively retaining crucial long-term information and disregarding irrelevant details. This ability distinguishes LSTM networks from traditional RNNs and contributes to their superior performance in tasks involving sequential data. The fundamental structure of an LSTM cell is depicted in Figure 2a. In the diagram, $x_t$ represents the input sequence, and $h_t$ is the output of the hidden layers. $f_t$ is the forget gate, $i_t$ is the input gate, and $o_t$ signifies the output gate. The activation functions $\sigma$ and $\tanh$ denote the sigmoid and hyperbolic tangent functions, respectively.
- The forget gate: The forget gate plays a major role in deciding which information to retain and which to disregard entirely. Its output is computed as follows:
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$
where $W_f$ and $b_f$ represent the weight and bias matrices, respectively.
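- The input gate: After the forget gate has been applied, the information passes to the input gate, which determines which parameters need to be altered and how they should be modified. In the standard LSTM formulation, the outcome is presented as:
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
where $\tilde{C}_t$ is the candidate cell state, $C_t$ represents the current cell state value, and $\odot$ denotes element-wise multiplication.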
- The output gate: Once the information has been processed by the above two gates, it ultimately reaches the output gate. The output gate's function is to determine which information should be presented in the output, which is then expressed as:
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t \odot \tanh(C_t)$$
where $h_t$ represents the output value of the current unit.
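As an illustration of how the three gates interact within a single time step, the following NumPy sketch implements one step of a standard LSTM cell. It is not the authors' implementation: the weights are random, the input dimension of 6 is an arbitrary toy value, and the dictionary keys and function names are hypothetical. Only the four hidden units follow the configuration described in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; each W[k] has shape (units, units + input_dim)."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])     # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat         # updated cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                 # hidden state (cell output)
    return h_t, c_t

rng = np.random.default_rng(0)
units, input_dim = 4, 6                      # input_dim is a toy value
W = {k: rng.normal(scale=0.1, size=(units, units + input_dim)) for k in "fico"}
b = {k: np.zeros(units) for k in "fico"}

h, c = np.zeros(units), np.zeros(units)
for x_t in rng.normal(size=(3, input_dim)):  # three toy time steps
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape)
```

Because the hidden state is the product of a sigmoid output and a hyperbolic tangent, every component of `h` stays strictly inside (-1, 1) regardless of the input scale.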
The LSTM architecture described in Table 1 is designed to differentiate between leak and normal conditions in a pipeline system. It starts with an LSTM layer that processes sequential data, capturing temporal patterns crucial for identifying anomalies such as leaks. The LSTM outputs a four-feature representation at each time step, indicated by the shape (None, 1, 4), which is essential for detecting the subtle signal variations that indicate leaks. A dropout layer follows, reducing overfitting by randomly omitting features during training to enhance the model's generalization. The sequence is then flattened by a flatten layer, converting the LSTM output into a single vector suitable for the subsequent dense layer. This dense layer outputs two values, classifying each instance as either leak or normal. These two outputs represent the probabilities of each condition, facilitating effective monitoring and prompt response to pipeline leaks.
2.1.2. Gated Recurrent Unit (GRU)
To reduce the computational cost of LSTM architectures, the GRU network was proposed [33]. This network merges the forget and input gates of the LSTM network into a single gate called the update gate. The GRU architecture includes two gates, namely the update gate and the reset gate, as illustrated in Figure 2b. It also combines the cell state and hidden state of the standard LSTM cell. As a result, the GRU network is simpler than the traditional LSTM network and has become increasingly popular.
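- The reset gate: The reset gate enables the network to utilize its short-term memory. It determines how much information from the previous time step should be forgotten and how much should be retained. In the standard GRU formulation, the output of the reset gate is:
$$r_t = \sigma(W_r [h_{t-1}, x_t] + b_r)$$
- The update gate: The update gate enables the network to utilize its long-term memory by determining how much information from the previous hidden state should be passed on to the current step. The resulting output is expressed as:
$$z_t = \sigma(W_z [h_{t-1}, x_t] + b_z)$$
Finally, a candidate state is produced by applying the hyperbolic tangent function to the input and the reset-gated previous hidden state, and the output is the update-gate-weighted combination of the previous and candidate states:
$$\tilde{h}_t = \tanh(W_h [r_t \odot h_{t-1}, x_t] + b_h)$$
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$$
where $W$ and $b$ denote the weight and bias matrices, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication.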
The neural network configuration provided is tailored for distinguishing leak and non-leak conditions in pipeline systems, employing a series of specialized layers for sequence data processing. Beginning with a GRU (Gated Recurrent Unit) layer, the model learns temporal dependencies within the data, outputting sequences with dimensions (None, 1, 4) and containing 144,072 parameters. Subsequently, a dropout layer is introduced to mitigate overfitting by randomly deactivating neurons during training, improving the model's generalization. The flatten layer then simplifies the output structure into a one-dimensional array with four elements, facilitating further processing. Finally, a dense layer with ten parameters produces a final output shape (None, 2), categorizing inputs into 'leak' or 'non-leak' conditions. This architecture combines temporal understanding, regularization, and feature simplification for accurate classification of pipeline conditions.
Table 2 provides a detailed architecture summary of the GRU network.
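To make the roles of the reset and update gates concrete, the following NumPy sketch implements one step of a standard GRU cell. It is a sketch under stated assumptions rather than the authors' code: the weights are random, the input dimension of 6 is a toy value, and the convention in which the update gate weights the previous hidden state is assumed because it matches the description of the update gate given in this subsection. All names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, b):
    """One GRU time step; each W[k] has shape (units, units + input_dim)."""
    z_in = np.concatenate([h_prev, x_t])
    r_t = sigmoid(W["r"] @ z_in + b["r"])    # reset gate
    z_t = sigmoid(W["z"] @ z_in + b["z"])    # update gate
    # Candidate state uses the reset-gated previous hidden state
    h_hat = np.tanh(W["h"] @ np.concatenate([r_t * h_prev, x_t]) + b["h"])
    # z_t controls how much of the previous state is carried forward
    return z_t * h_prev + (1.0 - z_t) * h_hat

rng = np.random.default_rng(1)
units, input_dim = 4, 6                      # input_dim is a toy value
W = {k: rng.normal(scale=0.1, size=(units, units + input_dim)) for k in "rzh"}
b = {k: np.zeros(units) for k in "rzh"}

h = np.zeros(units)
for x_t in rng.normal(size=(3, input_dim)):  # three toy time steps
    h = gru_step(x_t, h, W, b)
print(h.shape)
```

Compared with the LSTM cell, the GRU keeps a single state vector and three weight blocks instead of four, which is where its lower computational cost comes from.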
2.1.3. Bi-Directional LSTM (Bi-LSTM)
The Bi-LSTM is an extension of the LSTM that can learn information from both past and future states. It connects two LSTM hidden layers running in opposite directions to the same output [33]. One LSTM cell processes the input sequence as a forward-state layer, while the reversed input sequence is simultaneously fed into a second LSTM cell as a backward-state layer. This allows information to flow both from start to end and vice versa. By contrast, the hidden state of a standard LSTM only retains information from the past and does not consider future information. To address this, the forward and backward hidden states are combined at each time step to produce the output, as shown in Figure 2c. Mathematically, it can be computed as follows:
$$y_t = g(\overrightarrow{h}_t, \overleftarrow{h}_t)$$
where $g$ is any function used to combine the two output sequences, and $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ represent the forward and backward hidden layer states at time $t$, respectively.
Table 3 presents a Bi-LSTM model tailored for pipeline leak detection. It begins with a bi-directional layer outputting (None, 1, 8) and utilizing 384,160 parameters to assimilate data from both directions of the sequence, crucial for recognizing patterns indicative of leaks. A subsequent dropout layer mitigates overfitting by intermittently deactivating neurons during training, thus enhancing model reliability. The output is then streamlined by a flatten layer that transforms it into a single vector of eight elements, preparing it for classification. The final dense layer, with an output (None, 2) and 18 parameters, decisively categorizes the pipeline status into either ‘leak’ or ‘non-leak.’ This structure effectively merges bi-directional data integration, regularization, and data simplification for accurate leak detection in pipeline systems.
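The parameter counts quoted for the GRU and Bi-LSTM models can be checked with a short calculation. The sketch below assumes an input of 12,000 features per time step (three sensors with 4K samples each) and four recurrent units, and it assumes the Keras default GRU variant (`reset_after=True`), which keeps two bias vectors per gate. These assumptions are inferred from the reported totals rather than stated in the paper, and the helper functions are hypothetical.

```python
def lstm_params(input_dim, units):
    # Four weight blocks (three gates plus the candidate), each with
    # input weights, recurrent weights, and one bias vector
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units, reset_after=True):
    # Three weight blocks (two gates plus the candidate); the Keras default
    # reset_after=True stores two bias vectors per block
    biases = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + biases)

def dense_params(inputs, outputs):
    # Fully connected layer: one weight per input-output pair plus output biases
    return inputs * outputs + outputs

input_dim, units = 12_000, 4                       # assumed input size and unit count

gru_total = gru_params(input_dim, units)           # GRU layer
bilstm_total = 2 * lstm_params(input_dim, units)   # forward + backward LSTM
gru_dense = dense_params(units, 2)                 # dense layer after a 4-element flatten
bilstm_dense = dense_params(2 * units, 2)          # dense layer after an 8-element flatten

print(gru_total, bilstm_total, gru_dense, bilstm_dense)
# prints: 144072 384160 10 18
```

Under these assumptions the calculation reproduces the 144,072 and 10 parameters reported for the GRU model and the 384,160 and 18 parameters reported for the Bi-LSTM model. The same formula would give 192,080 parameters for a single four-unit LSTM layer, a value derived here and not taken from the paper.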
2.2. t-Distributed Stochastic Neighbor Embedding (t-SNE)
The t-SNE is a powerful technique for reducing the dimensionality of high-dimensional data while preserving local relationships among data points [34,35]. In the proposed study, t-SNE plots were used to visualize the performance of the three sequential models (LSTM, Bi-LSTM, and GRU) in classifying leak and non-leak instances. In the plot of the LSTM model, two distinct clusters are observable: orange points represent leak instances, while blue points represent normal (non-leak) instances. This clear separation indicates effective classification. The Bi-LSTM plot exhibits an even clearer separation between leak and non-leak instances, suggesting that it outperforms the LSTM and GRU models in classification strength. The GRU plot also displays distinct clusters but with a slight overlap between the leak and non-leak instances, indicating strong but less definitive classification performance than the LSTM and Bi-LSTM models. A detailed comparison is shown in Figure 8.
3. Experiment, Results, and Discussion
This section gives a detailed description of the experimental setup, the outcomes acquired through experimentation, and an extensive analysis incorporating all relevant details for a comprehensive understanding.
3.1. The Test Setup for Pipeline under Test
The data acquisition process for the AE involves utilizing a segment of an extensive industrial fluid pipeline. The diagram in
Figure 3a offers a visual representation of the pipeline testbed, while
Figure 3b provides detailed schematics of the testbed.
Table 4 presents the specific parameters used in the experiment while collecting the data.
An electric drill was used to create a hole in the pipeline. To replicate leaks of varying magnitudes, a fluid control valve was welded to the pipeline near the hole. Water was used as the fluid medium in the pipeline for the experiment because it is non-hazardous to both the environment and the personnel involved in the operation.
3.1.1. Acoustic Emission Data Acquisition System Development
Figure 3c illustrates the sequential flow of the AE data acquisition system. AE data were collected using MISTRAS Corporation R15I-AST high-sensitivity sensors. The system incorporates a National Instruments (NI) module, the NI-9223, which integrates a 16-bit analog-to-digital converter (ADC), offers an adjustable sampling frequency, and interfaces through a universal serial bus (USB). A computer with one terabyte of storage, compatible with the NI-9223, was used to store the recorded AE signals.
To monitor the fluid pipeline, AE sensors (R15I-AST) were attached as per the specifications in Table 4. Plastic tape was used to attach the sensors to the pipeline, and a specialized gel was applied to ensure proper contact between the sensors and the pipeline surface. Data acquisition software, created by Ulsan Industrial Artificial Intelligence Laboratory using NI interface libraries and the Python language, was employed to manage the data acquisition process. Prior to collecting pipeline leak data, the sensor calibration and the sensitivity of the acquisition system were tested via the Hsu–Nielsen test. After confirming the sensitivity of the AE sensors and the functionality of the acquisition system, a reliable pipeline leak dataset was recorded.
3.1.2. Dataset Collection and Its Description
This study involved collecting datasets from a pipeline in both normal and leaking conditions. When the control valve is closed, it refers to the normal condition of the pipeline, with the pressure of the fluid controlled by a centrifugal pump (CP) at either 13 or 18 bar. The environmental temperature during data collection was approximately 25 °C. Data were collected for two minutes for each pressure condition, at a 1 MHz sampling frequency. AE signals were collected for each pressure condition under three different leak states, with leak diameters of 0.3, 0.5, and 1 mm. The collected datasets are presented in
Table 5.
The data collection process involved first closing the fluid control valve and turning on the CP, then recording data for two minutes when the pressure in the pipeline was maintained at p1 and p2, which is considered Dataset-1 and Dataset-2, for the normal condition, respectively. After collecting the normal condition data, the control valve was opened to 0.3 mm under p1 and p2, and data were recorded for two minutes. The acquisition process was not interrupted during the condition switching, allowing for continuous data acquisition. This process was repeated to collect data for leak sizes of 0.5 and 1 mm under both p1 and p2.
Figure 4 presents a detailed analysis of AE signals under different conditions. (a) illustrates the AE signals when comparing leak versus non-leak scenarios at a pressure of 13 bar. (b) extends this comparison to a higher pressure of 18 bar, showcasing how AE signals vary between leaking and non-leaking states under different pressure conditions.
3.2. Model Architecture and Hyperparameters
Each record in the dataset comprises 12K samples, with 4K samples of input data from each of the three sensors. In deep learning (DL), crucial hyperparameters include the number of epochs (iterations over the training data), batch size (data processed before updating model parameters), and learning rate (magnitude of weight adjustments). Experiments manipulating these parameters demonstrate notable enhancements in classification performance. The data are randomly split, allocating 80% for training and 20% for validation. To ensure a fair comparison, the architecture and hyperparameters of the three sequential DL models are kept identical. For instance, the Adam optimization algorithm is uniformly employed, and categorical cross-entropy loss is utilized for the multi-class classification tasks. Each model consists of one hidden layer with four units, followed by a dropout layer with a rate of 0.5. A dense layer with two units and a SoftMax activation function is incorporated into every model. Training concludes after 100 epochs, and the weights corresponding to the highest validation accuracy are retained for subsequent testing.
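The record layout and the 80/20 split described above can be sketched as follows. The sketch uses synthetic data in place of the preprocessed AE signals, and it assumes that each record is presented to the recurrent layer as a single time step with 12,000 features, which is an inference from the (None, 1, 4) output shapes reported for the models. The number of records and the random seed are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(42)
n_records, n_sensors, samples_per_sensor = 50, 3, 4000   # n_records is a toy value

# Synthetic stand-in for preprocessed AE data: one 4K-sample segment per sensor
sensor_data = rng.normal(size=(n_records, n_sensors, samples_per_sensor))
labels = rng.integers(0, 2, size=n_records)              # 0 = normal, 1 = leak

# One time step per record with 12K features (3 sensors x 4K samples)
X = sensor_data.reshape(n_records, 1, n_sensors * samples_per_sensor)
y = np.eye(2)[labels]        # one-hot targets for a two-unit SoftMax output

# Random 80/20 split into training and validation sets
idx = rng.permutation(n_records)
n_train = int(0.8 * n_records)
X_train, y_train = X[idx[:n_train]], y[idx[:n_train]]
X_val, y_val = X[idx[n_train:]], y[idx[n_train:]]

print(X_train.shape, X_val.shape)
```

One-hot targets are used because categorical cross-entropy with a two-unit SoftMax layer expects one probability per class.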
3.3. Evaluation Metrics
To comprehensively evaluate the performance of the employed models—LSTM, Bi-LSTM, and GRU network—four important performance measures were calculated: accuracy (Acc), precision (Pre), F1 score (F1sco), and recall (Rec). It is worth noting that these metrics have been commonly used by researchers since 1950 to assess classification techniques. Therefore, we can assess these metrics using the following formulas:
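$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Pre} = \frac{TP}{TP + FP}$$
$$\mathrm{Rec} = \frac{TP}{TP + FN}$$
$$\mathrm{F1}_{sco} = \frac{2 \times \mathrm{Pre} \times \mathrm{Rec}}{\mathrm{Pre} + \mathrm{Rec}}$$
where $TP$, $TN$, $FP$, and $FN$ represent the True-Positive, True-Negative, False-Positive, and False-Negative labels, respectively.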
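As a worked example of these four metrics, the following sketch computes them from the entries of a binary confusion matrix. The counts are invented purely for illustration and are not results from this study; the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)
    return {"accuracy": acc, "precision": pre, "recall": rec, "f1": f1}

# Illustrative counts only: 90 leaks detected, 10 missed, 15 false alarms
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the F1 score is the harmonic mean of precision and recall, it always lies between the two, which provides a quick consistency check when reading a table of reported metrics for a single class.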
3.4. Experimental Setup
The simulation tests were conducted using a 7th Generation Intel® Core™ i7-7700K processor at 3.60 GHz, equipped with 16 GB of RAM and an NVIDIA® GeForce GTX 1080 Ti graphics card. The algorithms were developed using Python 3.10.11, with Keras and TensorFlow 2.12.0 for backend processing and Pandas 1.5.3 for data handling.
3.5. Results and Discussions
This study presents a method using AE technology for detecting and analyzing leaks in pipelines. Traditional methods typically involve physical contact and visual inspection of the pipeline, whereas the proposed method uses a time-series-based deep learning approach for more efficient and safer real-time detection. The framework combines AE technology with time-series deep learning algorithms to identify leaks through subtle changes in the time-series data of AE signals. Sequential deep learning models, including LSTM, Bi-LSTM, and GRU, are employed to classify the acoustic emissions into leak and normal conditions and to distinguish leak sizes ranging from minor seepage to major ruptures. The system incorporates three AE sensors positioned in varied configurations along the pipeline. These sensors collect data at a rate of 1M samples/second, which is then reduced to 4K samples/second to accommodate the memory limitations of remote systems.
As mentioned earlier, the proposed study uses three sequential DL models for the classification of pipeline leakage: LSTM, Bi-LSTM, and GRU. All of them share the same structure, architecture, and hyperparameters to ensure a fair performance comparison. The performance evaluation is based on convergence curves, confusion matrices, and accuracies, along with additional evaluation metrics such as precision, F1 score, and recall.
The accuracy and loss convergence curves of each of the three models are evaluated by varying the epoch size, as depicted in
Figure 5. Careful examination of the training and validation accuracy curves shows that, while both reach a high level of accuracy, there are only marginal discrepancies between them. A 50-epoch duration is deemed appropriate for assessing the accuracy and loss characteristics of the model. Within an epoch range of 10 to 20, all three models consistently achieve training, testing, and validation accuracies ranging from 90% to 95%. As the number of epochs extends from 20 to 40, the models maintain an accuracy level spanning from 97% to 99%. As the number of epochs approaches 50, the proposed models attain an accuracy level ranging from 99% to 100%. The accuracy curves show that the proposed models achieve peak accuracy with a close alignment between the training and validation datasets. All models were trained for 100 epochs, yielding results for both training and validation accuracies and losses. At 100 epochs, the models exhibit a loss of less than 0.10. Hence, based on the analysis of loss, all models achieved minimal error values through successful training. Nonetheless, certain models, such as the GRU in Figure 5f, may show signs of overfitting. Despite this, their performance remains noteworthy, likely owing to the dropout rate of 0.5.
The confusion matrices for all three models are depicted in
Figure 6, where only the testing labels are presented and randomly selected through data splitting. Based on these confusion matrices, a quantitative comparison is provided in
Table 6, illustrating precision, recall, F1 score, and accuracy for each model considering batch sizes of 64, 128, and 256. The analysis presents evaluation metric results for each model, showcasing individual and aggregated quantitative values for precision, recall, and F1 score for the leak and normal conditions of the pipeline. The Bi-LSTM model demonstrates superior performance compared to the remaining two models across all parameters. Specifically, in terms of precision score, the LSTM model achieved 99.71%, the GRU model scored 99.73%, and the Bi-LSTM model achieved the highest precision score of 99.79%. For recall scores, the LSTM model achieved 99.72%, the GRU model scored 99.69%, and the Bi-LSTM model achieved the highest recall score of 99.76%. Regarding the F1 score, the LSTM model scored 99.75%, the GRU model attained 99.68%, and the Bi-LSTM model exhibited the highest F1 score of 99.80%. Overall, the Bi-LSTM model demonstrates superior performance across all metrics. This superior performance of the Bi-LSTM can be explained as follows. A leak in the pipeline results in the release of elastic energy. The elastic energy detected by the AE sensor in the form of an AE hit changes the distribution of the AE signal. These changes in the AE signal vary according to the pipeline health conditions. As such, the Bi-LSTM efficiently utilized these changes as compared to other sequential DL methods. From
Figure 7, it is evident that the latent spaces obtained from Bi-LSTM are highly discriminant, as compared to the reference sequential DL methods. It is a known fact that the classification accuracy of a classifier strongly depends on the discrimination of the features. These models were also tested for leak size identification. The confusion matrices for leak size identification using sequential models are shown in
Figure 8. The visualization demonstrates the performance of the different models in classifying various leak sizes. Notably, the Bi-LSTM model continues to outperform the LSTM and GRU models in identifying leak sizes, mirroring its superior performance in the initial leak detection task. To provide additional insight into the models' ability to distinguish between different leak sizes, t-SNE plots are presented in
Figure 9. These plots illustrate the distribution and clustering of data points corresponding to different leak sizes in a lower-dimensional space.
Figure 9 provides evidence of the superior performance of Bi-LSTM in identifying the pipeline health state and leak size compared to the reference sequential DL methods.
Figure 10 offers a visual representation of the quantitative conclusions for comparative analysis.
3.6. Proposed Method Surveillance Zone
As per the ISO standard 18211:2016, evaluating the surveillance zone involves assessing the attenuation properties of AE signals generated by noise from the AE source. This evaluation is conducted before data collection from the AE sensor begins. In acoustic emissions, attenuation denotes a decrease in signal intensity, typically quantified in decibels (dB). The attenuation characteristics of the AE sensor were determined using Equation (18):
$$A_{\mathrm{dB}} = 20 \log_{10}\left(\frac{V}{V^{*}}\right) \quad (18)$$
Understanding the attenuation characteristics of AE signals is vital for data collection. In Equation (18), V denotes the measured potential, while V* indicates the reference potential. The term measured AE potential pertains to the AE signal recorded by the AE sensor. In the context of acoustic emission, a 0 dB reference corresponds to an AE signal potential of 1 μV at the sensor, with no amplification applied.
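The decibel relation described above can be illustrated with a short calculation. The sketch assumes the common AE convention of 20 times the base-10 logarithm of the ratio between the measured and reference potentials, with the 1 μV reference stated in the text. The voltages are invented for illustration and are not measurements from this study; the function names are hypothetical.

```python
import math

def ae_level_db(v_microvolt, v_ref_microvolt=1.0):
    """AE signal level in dB relative to the reference potential (1 uV by default)."""
    return 20.0 * math.log10(v_microvolt / v_ref_microvolt)

def attenuation_db(v_near_microvolt, v_far_microvolt):
    """Drop in signal level between a near and a far measurement point."""
    return ae_level_db(v_near_microvolt) - ae_level_db(v_far_microvolt)

# Illustrative values: 1000 uV close to the source, 100 uV further along the pipe
near_level = ae_level_db(1000.0)
loss = attenuation_db(1000.0, 100.0)
print(f"level near source: {near_level:.1f} dB, attenuation: {loss:.1f} dB")
```

In this convention every tenfold drop in the measured potential corresponds to 20 dB of attenuation, so a 25 dB limit corresponds to the signal falling to roughly 5.6% of its amplitude at the source.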
This study employed the Hsu–Nielsen test as an active AE source to evaluate the attenuation characteristics of the AE sensor. The Hsu–Nielsen test involves performing a pencil lead break test, where a 0.5 mm diameter lead is applied to the surface of the pipeline to generate an acoustic emission event. The AE signals detected during this test closely resembled those from natural AE sources, such as leaks.
Figure 11 illustrates the attenuation characteristics of a fluid-filled industrial pipeline with an outer diameter of 114.3 mm. The proposed method demonstrated over 95% accuracy in classifying normal and Hsu–Nielsen test activities when the attenuation of the AE signals collected by the R15I-AST AE sensor was below 25 dB.
Figure 11 shows the attenuation level at a distance of 10.8 m, indicating that the method can provide effective surveillance up to this distance.
4. Conclusions
This study introduced a framework for pipeline leakage detection utilizing AE sensors and sequential deep learning techniques. The proposed methodology aims to identify leakage and pinhole size. Our approach uses AE sensors, positioned strategically along pipelines, to detect subtle acoustic signals generated by leaks of different sizes, ranging from minor seepages to significant ruptures. The captured AE data underwent preprocessing, including cleaning, smoothing, and normalization, before being fed into the deep learning models. We explored three deep learning architectures, LSTM, Bi-LSTM, and GRU, to classify these signals into leak and normal conditions. Our findings indicate that the models are highly effective, achieving classification accuracies exceeding 90% within the first 10 to 20 epochs of training. Among the models tested, the Bi-LSTM model proved to be particularly effective, outperforming the others in terms of accuracy, precision, and recall. This superior performance can be attributed to its bi-directional structure, which processes each sequence in both the forward and backward directions and yields more discriminative latent features. The implementation of this AE-based deep learning framework marks a significant advancement in the field of pipeline integrity monitoring, providing a method that is not only more accurate but also safer and more efficient than traditional detection methods. Future work will focus on enhancing the capability of our models to not only detect but also precisely localize leaks within the pipeline system, leveraging spatial data from multiple AE sensors.