Keywords: NDT; NDE 4.0; Aircraft production; Quality control; Machine learning; POD

Non-destructive evaluation of aircraft production is optimised and digitalised with Industry 4.0. The aircraft structures produced using fibre metal laminate are traditionally inspected using water-coupled ultrasound scans and manually evaluated. This article proposes Machine Learning models to examine the defects in ultrasonic scans of A380 aircraft components. The proposed approach includes embedded image feature extraction methods and classifiers to learn defects in the scan images. The proposed algorithm is evaluated by benchmarking embedded classifiers and further promoted to research with an industry-based certification process. The HoG-Linear SVM classifier has outperformed SURF-Decision Fine Tree in detecting potential defects. The certification process uses the Probability of Detection function, substantiating that the HoG-Linear SVM classifier detects minor defects. The experimental trials prove that the proposed method will be helpful to examiners in the quality control and assurance of aircraft production, thus leading to significant contributions to non-destructive evaluation 4.0.
∗ Corresponding author.
E-mail address: navya.prakash@dfki.de (N. Prakash).
https://doi.org/10.1016/j.ndteint.2023.102885
Received 6 September 2022; Received in revised form 13 May 2023; Accepted 21 May 2023
Available online 26 May 2023
0963-8695/© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-
nc-nd/4.0/).
N. Prakash et al. NDT and E International 138 (2023) 102885
discrete labels) and regression (predicts a continuous quantity). Next, unsupervised learning requires no data labels for training; dimensionality reduction and clustering are its two significant methodologies. Following is reinforcement learning: the agent (training) sends an action (a move causing change) to the environment (real or virtual world) and, in return, the environment sends the state and its reward (evaluation of the action, either positive or negative) to the agent; real-time decisions and gaming models are its prototypes. Additionally, semi-supervised learning is a combination of supervised and unsupervised learning methodologies.

Supervised learning examples [11] include Support Vector Machine (SVM) [12], Decision Trees [13,14], Random Forest (RF) [15], 𝑘-Nearest Neighbour (𝑘-NN) [16], Naïve Bayes [17], Linear Discriminant Analysis (LDA) [18] and Logistic Regression [19]. The Fuzzy C-means (FCM) [20], 𝑘-means [21] and Principal Component Analysis (PCA) [22] are a few state-of-the-art unsupervised learning techniques. SVM predicts classes based on an optimal hyperplane creating margins to find similar features from each class and classifies them together. Decision Trees predict a class by learning the decision rules from the data features of that class. Random Forest combines the outcome of multiple Decision Trees into a prediction. 𝑘-Nearest Neighbour predicts using the proximity of the 𝑘 nearest data points for classification. Naïve Bayes classifies based on the probability of data points by applying Bayes’ theorem. Using Fisher’s algorithm, LDA finds a linear combination of data features to characterise different classes. Logistic regression finds the probability of an event occurring, such as voted or no vote, based on the data variables. Fuzzy C-means is similar to 𝑘-means but is a soft clustering where a data point can belong to one or more clusters. 𝑘-means is a hard clustering that partitions data points into 𝑘 clusters, each point belonging to the one cluster with the nearest mean value. PCA reduces data dimensionality and increases its interpretability with less information loss.

Deep Learning is a subset of Machine Learning applied to images, videos, text and other data formats. It comprises multi-layer Artificial Neural Networks (ANN) [23]. A Deep Neural Network (DNN) has many hidden layers of neural networks to perform classification and regression. The state-of-the-art neural networks are Radial Basis Function (RBF) [24], Autoencoders (AEC) [25], Multi-Layer Perceptron (MLP) [26], VGG-F [27], Fast R-CNN [28], ResNet v2 [29] and Transformer [30].

The Machine Learning algorithm often includes a feature extraction process depending on the input data type to improve its performance [31]. A few state-of-the-art image feature extraction methods are Local Binary Patterns (LBP) [32], Maximally Stable Extremal Regions (MSER) [33], KAZE [34], Speeded Up Robust Features (SURF) [35] and Histogram of Oriented Gradients (HoG) [36]. LBP labels pixels in an image by thresholding each pixel neighbourhood, resulting in a binary number to encode local texture information. For blob detection, the MSER method uses co-variant regions in corresponding grey-level cells in images. KAZE works on non-linear scale space and determinants of the Hessian matrix with the local difference binary descriptor to detect multi-scale corner features from the scale space. SURF detects interest points and local neighbourhoods to match, finds features in the Gaussian scale space, can distinguish between background and foreground features in an image, finds blob features and is partially influenced by Scale-Invariant Feature Transform (SIFT) [37]. HoG is a feature descriptor that describes the image features by calculating the frequency of gradients oriented in localised parts of an image; it encodes local shape information.

Previous research that uses Machine Learning for NDT data defect analysis is as follows. [38] analysed UT C-mode scanning acoustic microscopy (C-SAM) data of integrated circuits, using the Mumford–Shah model for grayscale image processing and SVM for defect classification, with an 80% recognition rate. This technique needs more training data to improve classification accuracy. The research of [39] implemented crack shape estimation with height, length and depth parameters using Eddy Current and SVM for regression (SVR) with an RBF kernel in conductive materials. It achieved a maximum error rate of 0.3 mm in defect length, but height and depth detection needed more training. Following this, [40] used an ANN, MLP (with back-propagation) and SVR (with RBF kernel) for crack defect classification. SVR outperformed MLP with a maximum error rate of 0.8 mm on a 5 mm crack length, but height and depth parameters needed more SVR model tuning. Dynamic PCA, 𝑘-NN, MLP, RBF and SVM were implemented by [41] for defect depth in infrared NDT of CFRP composite material. The MLP outperformed RBF and SVM for complex composite, whereas the dynamic PCA and 𝑘-NN could estimate defect depth on plane composite and detection limit for classifiers. Defects in the NDT data of oil or gas pipelines were detected by [42] using LDA, MLP, SVR, RBF, PCA and 𝑘-NN; SVR outperformed all other methods with 98.28%. SVM and ANN were trained by [43] with NDT rail data for real-time defect processing and SVM outperformed ANN with 97% accuracy. For fabric defect image analysis, [44] implemented AdaBoost [45] and HoG for feature extraction with SVM for classification. This method identified most defects with fewer false rejections. SVM and ANN classified NDT data of construction structures in [46], with Fast Fourier Transforms (FFT) and RBF for feature extraction; SVM outperformed ANN with 93% accuracy.

Aerospace structure defects were classified based on their shapes (Shape Geometric Descriptor (SGD)) using J48 Decision Tree [47], MLP and Naïve Bayes classifiers with Content-Based Image Retrieval (CBIR) and SGD for feature extraction in the research of [48]. MLP outperformed J48 Decision Trees (96%) and Naïve Bayes (95%) with 98% accuracy. Another study [49] trained J48 Decision Trees and Random Forest to determine weld quality in NDT data of Shielded Metal Arc Welding (SMAW) of carbon steel plates. Random Forest outperformed J48 Decision Trees (70.78%) with 88.69% accuracy. Automatic NDT aircraft defects were diagnosed by [50] using SVM and SURF with AlexNet [51] and VGG-F Deep Neural Network as feature extraction methods. SVM gained the highest accuracy of 96% with SURF for Region of Interest (RoI) selection. Mobile panel surface defects were inspected by [52] with LBP and HoG feature extractors that trained Naïve Bayes and SVM. The HoG-SVM classifier outperformed all other feature extractors and Naïve Bayes with >90% average accuracy. Random Forest with RoI classified defects on alloys and achieved >90% accuracy [53].

An Aeronautics Engine Radiographic Testing Inspection System Net (AE-RTISNet) with Fast R-CNN was developed to inspect defects in aeronautical engines [54]. It contains RoI as a feature extractor and obtained a mean average precision (mAP) of 90% compared to YOLO [55]. The Aluminium Conductor Composite Core (ACCC) with NDT X-ray images was analysed for defects using Inception ResNet v2 [56]. This Deep Neural Network, Inception ResNet v2, maintained 97.01% accuracy compared with Res2Net-18 (96.28%) and ResNet-v2-50 (96.15%) after data augmentation. Random Forest, RBF-SVM and hidden Markov model (HMM) [57] were implemented by [58] for training with autoencoders-FFT, low-pass filtering and PCA for feature extraction to measure defects in aerospace CFRP aluminium plates. AEC-PCA outperformed all other classifiers with >0.9 clustering scores. Convolutional Neural Networks (CNN) determined aerospace NDT defects using spot classifiers in the research of [59] and the Indirect spot CNN classifier outperformed the Direct spot CNN classifier with 98% accuracy. Another CNN approach [60] was developed to detect defects in NDT data of stainless steel and welded Gas Tungsten Arc Welding (GTAW) or Shielded Metal Arc Welding (SMAW) joints. This CNN resembles VGG-16 and gained a Probability of Detection (POD) of $a_{90/95}$ = 2.1 mm, where $a$ is the defect size and 90/95 denotes 90% POD with 95% CNN model confidence. An ANN was developed by [61] to monitor defects in NDT data of mechanical, aerospace and civil structures consisting of aluminium and magnesium alloys and inferred >95% precision.
Table 1
A brief literature survey (ordered by publication year)
Source | NDT data | Feature extraction | Machine learning | Performance analysis
Zhang et al. (2005) [38] | Integrated circuits | Mumford–Shah model | SVM | Recognition rate: 80%
Bernieri et al. (2006) [39] | Conductive materials | RoI | SVM regression (SVR) with RBF | Maximum error rate (length): 0.3 mm
Bernieri et al. (2008) [40] | Conductive materials | RoI | ANN-MLP (reference) and SVR with RBF | SVR: maximum error rate (length) of 0.8 mm; SVR outperformed MLP
Benítez et al. (2009) [41] | CFRP structure | RoI | Dynamic PCA, 𝑘-NN, MLP, RBF and SVM | MLP outperformed RBF and SVM
Khodayari-Rostamabad et al. (2009) [42] | Oil, gas pipelines | PCA | 𝑘-NN, SVR, RBF, LDA, MLP | Accuracy: SVR - 98.28%
Wei & Cheng-Tong (2009) [43] | Rail flaws | RoI | SVM, ANN | Accuracy: SVM - 97%
Shumin et al. (2011) [44] | Fabric | HoG, AdaBoost | SVM | Detection rate: SVM - high, less false rejections
Saechai et al. (2012) [46] | Construction cement structure | FFT, RBF | SVM, ANN | Accuracy: SVM - 93%
D’Angelo & Rampone (2015) [48] | Aerospace structure | SGD, CBIR | J48 Decision Trees, Multilayer Perceptron (MLP) and Naïve Bayes | Accuracy: MLP - 98%
Sumesh et al. (2015) [49] | SMAW carbon steel plates | Statistical approach | J48 Decision Trees, Random Forest | Accuracy: Random Forest - 88.69%
Internal Study: Schmidt, T et al. (2015) [10] | CFRP C-scans | Measured values of all sections, mean or variance, gradient histograms | SVM, Random Forest | AUC: Gradient histogram-SVM - 0.987
Malekzadeh et al. (2017) [50] | Aircraft surface | LBP, RGB and HSV histograms, AlexNet, VGG-F DNN, SURF | SVM | Accuracy: SVM-SURF - 96%
Huang et al. (2017) [52] | Mobile phone panel | LBP, HoG | Naïve Bayes, SVM | Average accuracy: HoG-SVM - >90%
Internal Study: University of Augsburg [64] | GLARE®-NDT C-scan images | Laplace filter, material thickness, edge information | CNN-ASPP, SGD, softmax | High exclusion rate of manual inspection for component area - 97.36%
Shipway et al. (2019) [53] | Titanium alloy plates | RoI | Decision Trees, Random Forest | Accuracy: >90%
Chen & Juang (2020) [54] | Aeronautical engine | RoI | Fast R-CNN, YOLO | mAP: Fast R-CNN - 90%
Hu et al. (2021) [56] | Aluminium conductor composite core | Image normalisation | Inception ResNet v2, ResNet-18, ResNet-v2-50 | Accuracy: Inception ResNet v2 - 97.01%
Kraljevski et al. (2021) [58] | Sensor network signals of aluminium and CFRP plates | FFT, low-pass filtering, PCA | AEC, HMM, RBF-SVM, Random Forest | Clustering score: AEC-PCA - >0.9
Niccolai et al. (2021) [59] | Aerospace structures | RoI | Direct and Indirect spot CNN | Accuracy: Indirect spot CNN - 98%
Siljama et al. (2021) [60] | Stainless steel | Normalisation | CNN | POD: a90/95 = 2.1 mm
Fakih et al. (2022) [61] | Aerospace/mechanical/civil structures | Geometric constraints, Approximate Bayesian computation | ANN | Precision: ANN - >95%
Le et al. (2022) [62] | Aircraft structure | PCA | SVM, Naïve Bayes, 𝑘-NN, Random Forest, Logistic Regression | Average accuracy: SVM - 89.48%
Risheh et al. (2022) [63] | Steel structures | RoI, threshold selection, image segmentation, Canny edge detection | 𝑘-means clustering | Defects detected accurately
Aircraft structure corrosion was analysed using NDT data with PCA for feature extraction and SVM, Naïve Bayes, Random Forest, 𝑘-NN and Logistic regression models [62]. SVM outperformed all other models with 89.48% average accuracy. 𝑘-means clustering for NDT steel structures was developed by [63] to determine defects with RoI, thresholds, image segmentation and Canny edge detection techniques. This method does not need training and can detect defects accurately in smaller datasets.

Further, [10] was an automated evaluation of CFRP component NDT data with discontinuities such as delaminations, layer porosity, volume porosity and foreign bodies. These CFRP C-scans were converted to .png images using ULTIS® NDT Kit software and used to train Machine Learning classifiers. The positive class had 37 annotated discontinuities, with 18 delaminations and 19 porosities, and the training set consisted of 222 samples in total. Gradient histograms for feature extraction were combined with SVM and Random Forest to classify discontinuities. The gradient histogram-SVM had the highest AUC of 0.987 and a 10% FP rate, but the gradient histogram-Random Forest classifier had a lower FP rate for the positive class. In contrast, the gradient histogram-Random Forest classifier gained lower confidence than the gradient histogram-SVM. More training data with positive class samples is required to increase the classification rate.

Following, [64] detected anomalies using a Deep Learning technique with the same GLARE® NDT dataset used in the proposed model.
The NDT scans were converted to grayscale images with Python programming. These images were pre-processed using a Laplace filter to extract local material thickness and edge information as features, leading to an advantage in differentiating faulty and splice regions. These features trained the Deep Learning architecture with the first six CNN layers and one Atrous Spatial Pyramid Pooling (ASPP) layer that helps for significant faulty pixel classifications and another CNN layer with the last layer of Upsampling. The Stochastic Gradient Descent (SGD) for the learning method and Softmax cross entropy for the error function were used in this research. This classifier achieved an average high exclusion rate (manual inspection) of 97.36% for the component area on the test data; training steps are inversely proportional to the True Positive rate. The disadvantages of this classifier are that the exclusion rate varies with the component type and that it has a higher False Positive rate. This classifier determines non-faulty regions instead of differentiating faults and displays additional faults even in non-faulty regions. This method needs more training data for faulty regions to improve its performance and to be usable in real-time offline-QA of aircraft production.

The proposed research aims to develop an automated evaluation of aircraft NDT data, i.e., an offline-QA to help human examiners. Learning defects from aircraft production involves data acquisition, pre-processing, Machine Learning training, predictions and determining the model’s confidence. Choosing an appropriate Machine Learning algorithm can seem complicated because many supervised and unsupervised algorithms use different learning strategies. However, choosing an algorithm depends on the quantity of data, data type, applicable insights and the requirement to utilise the model’s evaluation results. Highly flexible models tend to overfit data by modelling minor variations that could be noise. Simple models are easier to interpret but might have lower accuracy. Therefore, choosing a suitable algorithm requires trading one benefit against another, including model speed, accuracy and complexity. In contrast to the literature survey (Table 1), the proposed work comprises state-of-the-art Machine Learning classifiers with distinct image feature extraction methods to detect two classes (binary classification): defects and good components in the aircraft ultrasonic-scan imageset.

2.1. NDT dataset

GLARE® [67,68] is a new FML class used to produce A380 aircraft structures. The A380 comprises the 15.1, 18.1, 18.14, 18.16 and 18.17 components, as in Fig. 1.

The FML of the A380 NDT inspection technique is explained in the Airbus Test Method for inspection processes (AITM) AITM6-4001 (confidential). The aircraft production company, Premium AEROTECH GmbH (PAG), followed signal analysis requirements according to the AITM6-4001 and generated inspection reports. These inspection reports classified defects according to the AITM6-4001 and provided ground truth values (C-scans) for automated evaluation. The data collected from NDT inspection reports are plotted on a plane view of the component as images, known as C-scans (process mentioned in AITM6-4001). Fig. 2 shows a sample C-scan with denoted defects.

In the proposed approach, the NDT ultrasonic inspection report of FML A380 contains C-scans of each aircraft component. These scans (.xml file – raw dataset) were analysed using the quality software ULTIS®-TESTIA (NDT Kit). The experts at DLR-ZLP denoted the defects in the raw dataset with the help of PAG inspection reports and visualised them using this software, forming the ground truth data for this research. This NDT Kit creates three files, .nkc, .nkd and .nkz, for each C-scan. The .nkc file has the original C-scan data consisting of two blocks: the first block is the header of the file, with a length (in bytes) defined by the data offset field and written in ASCII format (indications and values). The second block of .nkc contains the physical data written in binary format. The .nkd file contains defect information such as file name, defect surface (mm²), outline surface (mm²), outline length (mm) and comments. Any other information is stored in the .nkz file.

In the proposed approach, the defect classes of the C-scans are categorised as porosity (Fig. 2), fold, twist, overlap, gap and foreign body, as illustrated in Fig. 3. There were 343 data samples, of which 99 contained at least one defect, as illustrated in Fig. 4. Fig. 4 shows that the minimum number of defects in an image of a component is one and the maximum is 15. Most defects belong to the porosity category (distribution over the different defect types is confidential). The proposed method pre-processes the data using these 343 data samples for further processing. The quantity of data samples used in this study is limited because of the industrial aircraft production rate.

Fig. 3. Defect categories [1,2,65,66].

2.2. Machine learning model

The proposed model comprises training and evaluation (Section 4) processes. Preparation for the training process includes three primary
Fig. 4. Distribution of defects from NDT data.

$$ds = \sqrt{\mathrm{length} \times \mathrm{breadth}} \tag{1}$$

where 𝑑𝑠 is the defect size in pixels (px), and length and breadth are the dimensions of the rectangular defect label.

Further, pre-processing includes calculating the defect size (𝑑𝑠) in the image labels. 𝑑𝑠 is defined as the square root of the defect area, as in Eq. (1). The defect area is obtained from the rectangular image label dimensions (length, breadth). A square root over the defect area is used for two reasons: it standardises all the defect data, and most defects are not frame-filling, i.e., the defect pixel area is not equal to the rectangular label area, for example, twist, fold, pores (Fig. 3). The minimum defect size encountered in the image labels is 6 px and the maximum is 383 px. According to the defect size, all 208 defects were cropped to their equivalent defect label size and stored as the labelled defect (positive) imageset. The 244 labelled non-defective images formed the good (negative) imageset.

The processing step has feature extraction and training. It includes training the proposed Machine Learning model with a feature set from the training imageset (positive and negative imageset) and class labels – defect and good. The feature set is obtained from different image feature extraction techniques: LBP, MSER, KAZE, SURF and HoG. Each feature extractor has a bag-of-features to store its features. Each bag-of-features (feature set) is input to each state-of-the-art Machine Learning model for binary classification: SVM, Decision Trees, Random Forest, 𝑘-NN and Naïve Bayes. MATLAB’s Classification Learner application was loaded with the training set (a feature set and class labels). During the training process, the Cross Validation (CV) [69,70] technique is applied to the training set to prevent overfitting (model overtraining) and underfitting (insufficient model training), and to observe the model’s reaction to a similar independent dataset and the prediction error function.

Fig. 5. Input image in RGB format and its corresponding grayscale image. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The proposed Machine Learning model pipeline comprises two training steps: feature extraction and classification (including validation). The in-built functions of MATLAB were used with the Classification Learner App for the proposed model. An algorithm for the proposed model is as follows:

(1) Feature Extraction: input the positive imageset (208 cropped defect images) and the negative imageset (244 good images) of RGB or truecolour images (as shown in Fig. 5) as an image datastore to form a training imageset. A datastore can store larger feature vector sizes and increases the processing rate.

        trainingImds = imageDatastore(folderPath)

    (a) These labelled images of both classes have features extracted using custom extractors as follows:
        (i) Convert all input RGB images to grayscale (Fig. 5) for LBP, MSER, KAZE and SURF feature extraction (HoG can extract features from RGB and grayscale images)

                grayscaleImage = rgb2gray(rgbImage)

        (ii) LBP (Fig. 6) and HoG (Fig. 7) features of each input image

                features = extractLBPFeatures(grayscaleImage)
                features = extractHOGFeatures(rgbImage)
        (v) Load scene data as an encoded bag-of-features from each custom extractor and training imageset
        (vi) Load all labels of the training imageset to scene labels as an attribute to scene data; label names ‘defect’ and ‘good’ are stored as scene type

(2) Training: Open the Classification Learner App and load scene data and scene type
    (a) Select all scene data as predictors
    (b) Simultaneously apply Cross-Validation with 10-fold
    (c) Start the session and store validation results (Section 3)
    (d) In the Classification Learner App, use parallel computing to train all available Machine Learning classifiers at once
    (e) Store all trained classifiers for further analysis (Sections 3, 4)

Due to data confidentiality, only a part of the data from the A380 component is visualised in Fig. 5 (cropped smaller section of a good part); the input RGB image is converted to grayscale for the feature extraction processes (except for HoG).

Fig. 6 represents the LBP feature graph of encoded local texture information in binary format extracted from Fig. 5. The LBP feature partitions the grayscale image into non-overlapping cells. The histogram bins represent the number of features from each cell in the grayscale image and the bins depend on the number of neighbours of each cell. The uniform feature set values of each cell (local texture information) are plotted with LBP histogram bins and each histogram describes an LBP feature.

Fig. 7 illustrates HoG features (zoomed-in, marked in white colour) extracted from an RGB input image (Fig. 5) converted to a binary image. HoG decomposes this binary image into small square cells and computes a histogram of oriented gradients in each cell. Then, it normalises the result using a block-wise pattern and returns a descriptor for each cell.

Fig. 8. MSER features.

Fig. 8 shows MSER feature extraction (zoomed-in) for Fig. 5. From the grayscale image, co-variant regions (MSER regions, shown as coloured regions) are extracted by checking the variation of the region area size between different intensity thresholds. Ellipses (marked in black colour) and centroids (marked with a black plus) from MSER regions are stable connected components of the grayscale image.

Fig. 9 displays KAZE features (zoomed-in) from Fig. 5. The grayscale image is used to obtain KAZE points (marked with blue ellipses and a black plus), with non-linear diffusion to construct a scale space for the grayscale image and then detect multi-scale corner features from that scale space.

Fig. 10 shows SURF points (zoomed-in, marked in black colour) extracted from Fig. 5. These SURF points are obtained using a Hessian blob detector, and their feature vectors are computed from Haar wavelet responses in the grayscale image.

During the training process, HoG extracted 34,596 features from each image and 422 × 34,596 feature vectors were selected with the strongest features from each class. These strongest HoG feature vectors created a bag-of-features with 500 clusters. SURF extracted 12,093 features (total – 422 × 12,093) and the strongest features from each class formed 50 bag-of-features clusters. MSER extracted 10,644 features with 500 bag-of-features clusters. KAZE extracted 9124 features with 500 clusters and LBP extracted 420 features with 302 bag-of-features clusters. Overall, in the training process, HoG produces the most feature vectors in this setup and more features are required to train Machine Learning classifiers to gain better prediction results.

The classifiers trained in the proposed method from the Classification Learner App include 𝑘-NN – fine, medium, coarse, cosine, cubic,
weighted and Decision Trees – fine, medium, coarse. Random Forest – ensemble boosted trees, ensemble bagged trees, ensemble subspace discriminant, ensemble subspace 𝑘-NN, ensemble RUS boosted trees; SVM – linear, quadratic, cubic, fine Gaussian, medium Gaussian, coarse Gaussian and Naïve Bayes. The performance of all these classifiers with image feature extraction methods is discussed in Section 3.

Fig. 9. KAZE features.

Fig. 10. SURF features.

3. Experimental result and discussion

The proposed Machine Learning model is evaluated using metrics such as accuracy, precision, recall, F1-score, Receiver Operating Characteristic curve with Area Under the Curve (ROC-AUC) [71], 𝑘-fold Cross Validation and POD certification. The classifier’s confidence is designated based on the values of true-positive (TP), true-negative (TN), false-positive (FP) and false-negative (FN).

Table 2
Possibilities of predictions.
Type | Predicted | Actual value
True positive | Defect | Defect
True negative | Good | Good
False positive | Defect | Good
False negative | Good | Defect

From Table 2, a prediction is a TP or TN when the predicted and actual values are the same; TP is when a defect is classified as the defect class and TN is when a good part is classified as the good class. An FP or FN occurs when the predicted and actual values are different; FP is a classification with the predicted value of a defect when the actual value is a good aircraft part, and FN is vice-versa. A matrix representation of all these values forms a confusion matrix.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3}$$

Precision is the ratio of correctly predicted defects to the total positive predictions made by the trained Machine Learning model (Eq. (3)).

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{4}$$

Recall or sensitivity is the ratio of correctly predicted defects to the total positive instances in the test data (Eq. (4)).

$$\mathrm{F1\text{-}score} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{6}$$

$$\mathrm{FPR} = 1 - \mathrm{Specificity} = \frac{FP}{TN + FP} \tag{7}$$

The F1-score is the harmonic mean of precision and recall (Eq. (5)). The Receiver Operating Characteristic (ROC) curve is a probability curve [72] and the Area under the ROC curve (AUC) is the degree of separability. ROC-AUC evaluates the trained classifier’s performance in distinguishing the ‘defect’ and ‘good’ classes with the values of True Positive Rate (TPR) (recall or sensitivity) and False Positive Rate (FPR). The FPR is calculated based on the specificity (Eq. (6)) of the trained model using Eq. (7). The ROC curve is plotted with FPR on the x-axis against TPR on the y-axis. The higher the AUC value, the better the trained model can separate defects from good aircraft structures.

3.1. Cumulative models

Figs. 11–14 illustrate the analysis to choose the best accuracy of cumulative Machine Learning classifiers with image feature extraction methods.

From Fig. 11, LBP-Fine 𝑘-NN has the highest accuracy of 59.3% and LBP-Coarse 𝑘-NN the lowest with 55%. MSER-Cosine 𝑘-NN has the highest accuracy of 92.4% and MSER-Coarse 𝑘-NN the lowest with 45%. KAZE-Cosine 𝑘-NN has a high accuracy of 80.5% and KAZE-Coarse 𝑘-NN a low of 55.2%. SURF-Fine 𝑘-NN has 95.2% highest accuracy and 87.4% with KAZE-Cubic 𝑘-NN. HoG-Cosine 𝑘-NN has 90% accuracy and HoG-Coarse 𝑘-NN a low of 55.5%.
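The accuracy values quoted in this section, like the other metrics of Eqs. (2)–(7), all derive from the four confusion-matrix counts. The following is a minimal Python sketch of those formulas, not the MATLAB Classification Learner workflow used in the study. The class `ConfusionCounts`, the function `evaluate` and the example counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'defect' is the positive class."""
    tp: int  # defect predicted, defect actual
    tn: int  # good predicted, good actual
    fp: int  # defect predicted, good actual
    fn: int  # good predicted, defect actual


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> dict:
    """Compute the metrics of Eqs. (2)-(7) from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)                             # Eq. (3)
    recall = _ratio(c.tp, c.tp + c.fn)                                # Eq. (4)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),  # Eq. (2)
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),    # Eq. (5)
        "specificity": _ratio(c.tn, c.tn + c.fp),                    # Eq. (6)
        "fpr": _ratio(c.fp, c.tn + c.fp),                            # Eq. (7)
    }


# Hypothetical counts for illustration only (not data from the study).
counts = ConfusionCounts(tp=90, tn=95, fp=5, fn=10)
metrics = evaluate(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Returning 0.0 for an empty denominator is a design assumption made here to keep the sketch total; a production evaluation might instead report such a metric as undefined.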
Table 3
Consolidated accuracy chart.
Feature extraction Classifiers Accuracy (%)
LBP Linear SVM 66.4
Ensemble Subspace Discriminant 66.4
Fine 𝑘-NN 59.3
Naïve Bayes 55
MSER Quadratic SVM 92.6
Ensemble Bagged Trees 91.9
Cosine 𝑘-NN 92.4
Naïve Bayes 57.6
KAZE Linear SVM 92.1
Ensemble Subspace Discriminant 94.2
Cosine 𝑘-NN 80.5
Naïve Bayes 94.3
SURF Linear SVM 96.9
Fig. 13. Random Forest accuracy.
Decision Fine Tree 97.9
Fine 𝑘-NN 95.2
Naïve Bayes 59.5
Fig. 12 shows performance analysis of the Decision Tree, the combi- HoG Linear SVM 99
nation of LBP-Decision Fine Tree has 64.3% accuracy and less of 56.4% Ensemble RUS Boosted Trees 93.5
Cosine 𝑘-NN 90
with LBP-Decision Coarse Tree. MSER-Decision Fine Tree has a high
Naïve Bayes 56
accuracy of 90.7% and MSER-Decision Coarse Tree has low accuracy
of 80.7%. KAZE-Decision Fine Tree and KAZE-Decision Medium Tree
have a similarly high accuracy of 91% and KAZE-Decision Coarse Tree Table 4
has low of 88.6% accuracy. SURF-Decision Fine Tree has the highest Evaluation chart.
accuracy of 97.9% and SURF-Decision Medium Tree and SURF-Decision Classifiers Accuracy (%) Recall Precision F1-score
Coarse Tree have an accuracy of 97%. KAZE-Decision Fine Tree gained HoG-SVM 99 0.9919 0.9880 0.984
92.14% high accuracy and KAZE-Decision Coarse Tree of 82.1% low SURF-Fine Tree 97.9 0.9839 0.97 0.97
accuracy.
Fig. 13 demonstrates the detection rate of the Random Forest or Ensemble Trees classifiers. LBP-Ensemble Subspace Discriminant gained 66.4%, with a low of 49.5% for LBP-Ensemble Subspace 𝑘-NN. MSER-Ensemble Boosted Trees has a high of 91.9% and MSER-Ensemble Bagged Trees a low of 56.9%. KAZE-Ensemble Subspace Discriminant and KAZE-Ensemble Subspace 𝑘-NN achieved the highest accuracy of 94.2%, but KAZE-Ensemble Boosted Trees has a low accuracy of 55%. SURF-Ensemble Bagged Trees, SURF-Ensemble Subspace Discriminant and SURF-Ensemble Subspace 𝑘-NN have the same high accuracy of around 95.6%, with a low of 55% for SURF-Ensemble Bagged Trees. HoG-Ensemble RUS Boosted Trees gained an accuracy of 93.5%, with a low of 59% for HoG-Ensemble Bagged Trees.

Fig. 14 illustrates the detection rate of the SVM classifiers. LBP-Linear SVM, LBP-Quadratic SVM, LBP-Cubic SVM, LBP-Fine Gaussian SVM and LBP-Medium Gaussian SVM have a similarly high accuracy of 66.4%; LBP-Coarse Gaussian SVM has a low accuracy of 55%. MSER-Quadratic SVM and MSER-Cubic SVM have a similarly high accuracy of 92.6%, with a low of 69.8% for MSER-Fine Gaussian SVM. KAZE-Linear SVM gained a high accuracy of 92.1% and KAZE-Coarse Gaussian SVM a low of 63.6%. SURF-Linear SVM, SURF-Quadratic SVM, SURF-Coarse SVM, SURF-Medium Gaussian SVM and SURF-Coarse Gaussian SVM have a matching high accuracy of around 96%. SURF-Fine Gaussian SVM achieved a low accuracy of 90.7%. The highest accuracy of 99% is gained by HoG-Linear SVM, with a low of 72.6% for HoG-Fine Gaussian SVM.

LBP had the lowest feature extraction performance with all classifiers compared to MSER, KAZE, SURF and HoG. The second-lowest feature extraction performance was that of MSER, followed by KAZE. The selection of the best feature extraction methods influences the classifiers. The SURF and HoG feature extraction methods were selected as the best to combine with classifiers to avoid false negatives. Naïve Bayes (Table 3) and 𝑘-NN were not applicable with most feature extraction methods and thus were eliminated from the further evaluation process.

3.2. Best performing models

The highest accuracies from all classifiers are consolidated in Table 3. The embedded classifiers HoG-Linear SVM and SURF-Decision Fine Tree achieved the highest accuracies of 99% and 97.9%, respectively. Therefore, these two classifiers are further evaluated with recall, precision and F1-score metrics, as exhibited in Table 4. HoG-Linear SVM gains the highest F1-score of 0.984 compared to the SURF-Decision Fine
Tree F1-score of 0.97. With the trained HoG-Linear SVM model, 98.4% of the total defect samples in the test data are predicted correctly. In contrast, with the test data, SURF-Decision Fine Tree has fewer correct defect predictions.

The selection of the best-fitting model depends on factors such as a low FN rate and high recall, precision and F1-score. Apart from accuracy, the confusion matrix and ROC-AUC curve help calculate these influencing scores and calibrate the model. The confusion matrices of HoG-Linear SVM (Fig. 15) and SURF-Decision Fine Tree (Fig. 16) reveal the lowest FN rate, with the former having 2% for the positive class and zero for the negative class. The latter has an FN rate of 13% and 6% for the positive and negative classes, respectively.

The TP rate of HoG-Linear SVM is 98% for the positive class and 100% for the negative class, and the TP rate of SURF-Decision Fine Tree is 87% for the positive class and 94% for the negative class.

The ROC-AUC curves give HoG-Linear SVM an AUC of 1.00 and a prediction probability of zero for the negative and 0.98 for the positive class, as demonstrated in Fig. 17. The SURF-Decision Fine Tree has an AUC of 0.92 and a prediction probability of 0.06 for the negative and 0.87 for the positive class, as represented in Fig. 18.

After assessing all the evaluation metrics from Table 3 and Table 4, the best-performing embedded Machine Learning classifiers are HoG-Linear SVM and SURF-Decision Fine Tree. These have negligible FN, leading to Recall ≈ 1.00, high precision and a high F1-score. The robust requirement for the proposed model is to achieve a 100% TP rate on the prediction data and a zero FN rate. The FN rate is essential for calibrating the proposed model, and ROC-AUC curves help with calibration. The threshold curve is the ROC curve that separates positive and negative classes, selected to obtain a significantly lower or zero FN rate and maximum TP rate in the prediction process. From Figs. 17 and 18, as the TP rate increases, the FP rate also increases. If the AUC of HoG-Linear SVM decreases below 1.00 and SURF-Decision Fine Tree above 0.92, their FN rate increases. The FP rate is negligible (an experienced examiner can scrutinise the FP visually) for the real-time usage of the proposed system, but the FN rate should not be increased because of the risk involved in the industrial offline-QA of aircraft production.

As SVM is primarily a binary classifier and HoG-Linear SVM has performed best with the prediction data, selecting it as the predominant classifier for the proposed approach is beneficial. Hence, it is further evaluated with the POD certification process. SURF-Decision Fine Tree can be an option for multi-class classification.

3.3. Comparison and constraints

The proposed HoG-Linear SVM classifier performs better than [10, 64]. However, it has some constraints: the Linear SVM classifier is a black box, as the path to its predictions is unknown. In contrast, the Decision Fine Tree is a grey box, as its prediction path is returned as a binary tree split into branching nodes based on input data values.

A binary tree resulting from one of the proposed prediction analyses is illustrated in Fig. 19. This binary tree starts with the root and has two branches at each node; the nodes contain conditions for the predictions. This tree has four and three levels, with the leaf nodes having the predicted classes, thus explicitly demonstrating the prediction analysis.

The SURF-Decision Fine Tree can be feasible for real-time offline-QA in aircraft industries for NDE 4.0 and inline-QA, but it could be complicated with a heap of Decision Trees and branches. The HoG-Linear SVM analysis from the proposed prediction dataset may not reflect an accurate performance in the real-time industrial offline-QA due to a deficit in additional positive class training data from each

Fig. 16. SURF-Decision Fine Tree confusion matrix.

Fig. 17. HoG-Linear SVM ROC-AUC curve.
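The calibration goal stated in Section 3.2 (a zero FN rate with the TP rate at its maximum, accepting some FP that an examiner can review) can be illustrated with a minimal Python sketch. It assumes that the classifier returns one score per scan and that a higher score means "more likely defective". The function names, scores and labels are hypothetical and are not taken from the study, and the sketch is not the calibration procedure used by the authors.

```python
from typing import Sequence, Tuple


def zero_fn_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Highest threshold at which every positive (defect) sample is still detected."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        raise ValueError("need at least one positive (defect) sample")
    # Predicting 'defect' for score >= min(positive scores) misses no defect.
    return min(positive_scores)


def rates_at(scores: Sequence[float], labels: Sequence[int],
             threshold: float) -> Tuple[float, float, float]:
    """Return (TP rate, FN rate, FP rate) when 'defect' means score >= threshold."""
    tp = fn = fp = tn = 0
    for s, y in zip(scores, labels):
        predicted_defect = s >= threshold
        if y == 1 and predicted_defect:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_defect:
            fp += 1
        else:
            tn += 1
    positives, negatives = tp + fn, fp + tn
    tp_rate = tp / positives if positives else 0.0
    fn_rate = fn / positives if positives else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    return tp_rate, fn_rate, fp_rate


# Hypothetical scores: three defective scans (label 1), four good scans (label 0).
scores = [0.98, 0.91, 0.62, 0.70, 0.15, 0.06, 0.02]
labels = [1, 1, 1, 0, 0, 0, 0]

t = zero_fn_threshold(scores, labels)
print(t, rates_at(scores, labels, t))       # zero FN, one good scan flagged
print(0.8, rates_at(scores, labels, 0.8))   # stricter threshold: no FP, but a missed defect
```

In this toy data, lowering the threshold far enough to catch every defect also flags one good scan, which mirrors the observation that the FP rate rises with the TP rate.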
The data loss is calculated as in Eq. (8). The worst-case data loss is 1.32% and the average data loss is 0.66%. The HoG-Linear SVM classifier was trained and tested with these .bmp images, and it was observed that the data loss had no influence on its performance.
4. Certification
CRediT authorship contribution statement

Navya Prakash: Methodology, Software, Validation, Formal analysis, Writing – original draft, Visualization. Dorothea Nieberl: Conceptualization, Data curation, Writing – review & editing, Supervision, Project administration. Monika Mayer: Conceptualization, Data curation, Writing – review & editing, Supervision, Project administration. Alfons Schuster: Writing – review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data that has been used is confidential.

Acknowledgement

We thank Prof. Dr. Jens Lehmann (Informatik-University of Bonn) for supervising this research (2019).

References

[1] Ucan H, Scheller J, Nguyen C, Nieberl D, Mayer M, et al. Automated, quality assured and high volume oriented production of fibre metal laminates (FML) for the next generation of passenger aircraft fuselage shells. Sci Eng Compos Mater 2019;26:502–8. http://dx.doi.org/10.1515/secm-2019-0031, https://elib.dlr.de/129574/.
[2] Apmann H, Mayer M, et al. Verfahren der INLINE-qualitätssicherung und der zerstörungsfreien prüfung innerhalb der fertigungslinie von faser-metall-laminaten. In: DLR congress (DLRK) conference - FML. 2017, https://elib.dlr.de/117260/.
[3] Bisle W, Meier T, Mueller S, Rueckert S. In-service inspection concept of GLARE® – an example for the use of new UT array inspection systems, ECNDT. 2006, https://www.ndt.net/search/docs.php3?id=3540.
[4] Vrana J, Singh R. NDE 4.0 - a design thinking perspective. J Nondestruct Eval 2021;24. http://dx.doi.org/10.1007/s10921-020-00735-9.
[5] Schmidt T, Dutta S. Automation in production integrated NDT using thermography. In: International symposium on NDT in aerospace. 2012, https://www.ndt.net/article/aero2012/papers/we3b1.pdf.
[6] Wunderlich C, Tschöpe C, Duckhorn F. Advanced methods in NDE using machine learning approaches. In: AIP conference proceedings 1949-020022. 2018, http://dx.doi.org/10.1063/1.5031519.
[7] Ren I, Zahiri F, et al. A deep ensemble classifier for surface defect detection in aircraft visual inspection, smart sustain. Manuf Syst 2020;4(1):20200031. http://dx.doi.org/10.1520/SSMS20200031.
[8] Nieberl D, Mayer M, Stefani T, Willmeroth M. Automated manufacturing of large fibre-metal-laminate parts. In: European conference on composite materials. 2018, https://elib.dlr.de/124296/.
[9] Schuster A, Mayer M, Willmeroth M, Brandt L, Kupke M. Inline quality control for thermoplastic automated fibre placement. In: Procedia manufacturing, Vol. 51. Elsevier, FAIM; 2021, p. 505–11. http://dx.doi.org/10.1016/j.promfg.2020.10.071.
[10] Internal Study, Schmidt T, Mayer M, Rainer L, Kupke M. Pilotstudie automatisierte auswertung von NDT daten. DLR-IB 435-2015/32. 43 S, DLR-Interner Bericht, Unpublished https://elib.dlr.de/101533/.
[11] Caruana R, N-Mizil A. An empirical comparison of supervised learning algorithms. In: International conference on machine learning. 2006, p. 161–8. http://dx.doi.org/10.1145/1143844.1143865.
[12] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20:273–97. http://dx.doi.org/10.1007/BF00994018.
[13] Breiman L. Bagging predictors. Mach Learn 1996;24(2):123–40. http://dx.doi.org/10.1007/BF00058655.
[14] Rokach L, Maimon O. Decision trees. In: Data mining and knowledge discovery handbook. Springer; 2005, p. 165–92. http://dx.doi.org/10.1007/0-387-25465-X_9.
[15] Breiman L. Random forests. Mach Learn 2001;45:5–32. http://dx.doi.org/10.1023/A:1010933404324.
[16] Fix E, Hodges Jr JL. Discriminatory analysis - nonparametric discrimination: consistency properties. Technical Report 4, USAF School of Aviation Medicine; 1951, https://apps.dtic.mil/sti/pdfs/ADA800276.pdf.
[17] Bayes. An essay towards solving a problem in the doctrine of chances. In: FRS communicated by Mr. Price in a letter to John Canton, A.M. FRS. 1763, https://royalsocietypublishing.org/doi/pdf/10.1098/rstl.1763.0053.
[18] Fisher A. The use of multiple measurements in taxonomic problems. Ann Eugen 1936;7(2):179–88. http://dx.doi.org/10.1111/j.1469-1809.1936.tb02137.x.
[19] Cramer JS. The Origins of Logistic Regression. Tinbergen institute working paper No. 2002-119/4, 2002, http://dx.doi.org/10.2139/ssrn.360300.
[20] Bezdek JC, Ehrlich R, Full W. FCM: The fuzzy c-means clustering algorithm. Comput Geosci 1984;10(2–3):191–203. http://dx.doi.org/10.1016/0098-3004(84)90020-7.
[21] Faber V. Clustering and the continuous K-means algorithm. Los Alamos Sci 1994;22:138–44, https://www.cs.kent.edu/zwang/schedule/lj9.pdf.
[22] Jolliffe IT. Principal Component Analysis. Springer Series in Statistics; 2002, http://cda.psych.uiuc.edu/statistical_learning_course/Jolliffe%20I.%20Principal%20Component%20Analysis%20(2ed.,%20Springer,%202002)(518s)_MVsa_.pdf.
[23] McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 1943;5:115–33. http://dx.doi.org/10.1007/BF02478259.
[24] Broomhead DS, Lowe D. Radial basis functions, multi-variable functional interpolation and adaptive networks. In: Royal signals and radar establishment malvern (United Kingdom). Complex Systems Publications, Inc.; 1988, p. 321–55, RSRE-MEMO-4148 https://apps.dtic.mil/sti/citations/ADA196234.
[25] Kramer MA. Nonlinear principal component analysis using autoassociative neural networks. AIChE J 1991;37(2):11, https://people.engr.tamu.edu/rgutier/web_courses/cpsc636_s10/kramer1991nonlinearPCA.pdf.
[26] Van Der Malsburg C. Frank rosenblatt: Principles of neurodynamics: Perceptrons and the theory of brain mechanisms. In: Brain theory. Springer Berlin Heidelberg; 1986, p. 245–8. http://dx.doi.org/10.1007/978-3-642-70911-1_20.
[27] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations (ICLR). 2015, http://dx.doi.org/10.48550/arXiv.1409.1556.
[28] Girshick R. Fast R-CNN. In: IEEE international conference on computer vision. ICCV, 2015, p. 1440–8. http://dx.doi.org/10.1109/ICCV.2015.169.
[29] Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, inception-ResNet and the impact of residual connections on learning. 2016, http://dx.doi.org/10.48550/arXiv.1602.07261.
[30] Vaswani A, Shazeer N, et al. Attention is all you need. Neural Information Processing Systems 2017. http://dx.doi.org/10.48550/arXiv.1706.03762, https://dl.acm.org/doi/10.5555/3295222.3295349.
[31] Medjahed SA. A comparative study of feature extraction methods in images classification. IJIGSP 2015;7(3):16–23. http://dx.doi.org/10.5815/ijigsp.2015.03.03.
[32] Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 2002;24(7):971–87. http://dx.doi.org/10.1109/TPAMI.2002.1017623.
[33] Matas J, Chum O, et al. Robust wide baseline stereo from maximally Stable Extremal Regions. Image Vis Comput 2004;22(10):761–7. http://dx.doi.org/10.1016/j.imavis.2004.02.006.
[34] Alcantarilla PF, Bartoli A, Davison AJ. KAZE features. In: Computer Vision - ECCV 2012. Springer Berlin Heidelberg; 2012, p. 214–27. http://dx.doi.org/10.1007/978-3-642-33783-3_16.
[35] Bay H, Tuytelaars T, Van Gool L. SURF: Speeded up robust features. In: Computer vision – ECCV 2006, Vol. 3951. Springer Berlin Heidelberg; 2006, p. 404–17. http://dx.doi.org/10.1007/11744023_32.
[36] Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition (CVPR’05). 2005, http://dx.doi.org/10.1109/cvpr.2005.177.
[37] Lowe GD. Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004;60:91–110. http://dx.doi.org/10.1023/B:VISI.0000029664.99615.94.
[38] Zhang YL, Guo N, et al. Automated defect recognition of C-SAM images in IC packaging using support vector machines. Int J Adv Manuf Technol 2005;25(11–12):1191–6. http://dx.doi.org/10.1007/s00170-003-1942-1.
[39] Bernieri A, Ferrigno L, et al. An SVM approach to crack shape reconstruction in eddy current testing. In: IEEE instrumentation and measurement technology conference proceedings. 2006, p. 2121–6. http://dx.doi.org/10.1109/IMTC.2006.328502.
[40] Bernieri A, Ferrigno L, et al. Crack shape reconstruction in eddy current testing using machine learning systems for regression. IEEE Trans Instrum Meas 2008;57(9):1958–68. http://dx.doi.org/10.1109/TIM.2008.919011.
[41] Benítez HD, Loaiza H, et al. Defect characterization in infrared non-destructive testing with learning machines. NDT E Int. 2009;42(7):630–43. http://dx.doi.org/10.1016/j.ndteint.2009.05.004.
[42] Khodayari-Rostamabad A, Reilly JP, et al. Machine learning techniques for the analysis of magnetic flux leakage images in pipeline inspection. IEEE Trans Magn 2009;45(8):3073–84. http://dx.doi.org/10.1109/TMAG.2009.2020160.
[43] Wei H, C-Tong L. Automatic real time SVM based ultrasonic rail flaw detection and classification system. J Graduate Sch Chin Acad Sci 2009;26(4):517–21, http://journal.ucas.ac.cn/EN/Y2009/V26/I4/517.
[44] Shumin D, Zhoufeng L, Chunlei L. Adaboost learning for fabric defect detection based on HOG and SVM. In: International conference on multimedia technology. IEEE; 2011, p. 2903–6. http://dx.doi.org/10.1109/ICMT.2011.6001937.
[45] Freund Y, Schapire RE. A short introduction to boosting. J Japan Soc Artif Intell 1999;14(5):771–80, https://cseweb.ucsd.edu/yfreund/papers/IntroToBoosting.pdf.
[46] Saechai S, Kongprawechnon W, Sahamitmongkol R. Test system for defect detection in construction materials with ultrasonic waves by support vector machine and neural network. In: SCIS-ISIS. 2012, p. 1034–9. http://dx.doi.org/10.1109/SCIS-ISIS.2012.6505090.
[47] Salzberg SL. Book review C4.5: Programs for machine learning by j. Ross quinlan. Morgan Kaufmann publishers, inc. 1993. Mach Learn 1994;16(3):235–40. http://dx.doi.org/10.1007/BF00993309.
[48] D’Angelo G, Rampone S. Shape-based defect classification for non destructive testing. IEEE Metrol Aerospace (MetroAeroSpace) 2015;406–10. http://dx.doi.org/10.1109/MetroAeroSpace.2015.7180691.
[49] Sumesh A, Rameshkumar K, et al. Use of machine learning algorithms for weld quality monitoring using acoustic signature. Procedia Comput Sci 2015;50:316–22. http://dx.doi.org/10.1016/j.procs.2015.04.042.
[50] Malekzadeh T, Abdollahzadeh M, et al. Aircraft fuselage defect detection using deep neural networks. In: The IEEE global conference on signal and information processing. 2017, http://dx.doi.org/10.48550/arXiv.1712.09213, arXiv:1712.09213.
[51] Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Commun ACM 2017;60(6):84–90. http://dx.doi.org/10.1145/3065386.
[52] Huang H, Hu C, et al. Surface defects detection for mobilephone panel workpieces based on machine vision and machine learning. In: IEEE international conference on information and automation. ICIA, 2017, p. 370–5. http://dx.doi.org/10.1109/ICInfA.2017.8078936.
[53] Shipway NJ, Huthwaite P, et al. Performance based modifications of random forest to perform automated defect detection for fluorescent penetrant inspection. J Nondestruct Eval 2019;38(2):37. http://dx.doi.org/10.1007/s10921-019-0574-9.
[54] Chen Z-H, Juang J-C. AE-rtisnet: Aeronautics engine radiographic testing inspection system net with an Improved Fast Region-based convolutional neural network framework. Appl Sci 2020;10(23):8718. http://dx.doi.org/10.3390/app10238718.
[55] Redmon J, Divvala S, et al. You only look once: Unified, real-time object detection. 2016, http://dx.doi.org/10.48550/arXiv.1506.02640.
[56] Hu Y, Wang J, et al. Automatic defect detection from X-ray scans for aluminium conductor composite core wire based on classification neutral network. NDT E Int 2021;124:102549. http://dx.doi.org/10.1016/j.ndteint.2021.102549.
[57] Rabiner LR, Juang BH. An introduction to hidden Markov models. IEEE ASSP Mag 1986;12. http://dx.doi.org/10.1109/MASSP.1986.1165342.
[58] Kraljevski I, Duckhorn F, et al. Machine learning for anomaly assessment in sensor networks for NDT in aerospace. IEEE Sens J 2021;21(9):11000–8. http://dx.doi.org/10.1109/JSEN.2021.3062941.
[59] Niccolai A, Caputo D, et al. Machine learning-based detection technique for NDT in industrial manufacturing. In: Mathematics, Vol. 9. MDPI; 2021, p. 1251. http://dx.doi.org/10.3390/math9111251, (11).
[60] Siljama O, Koskinen T, et al. Automated flaw detection in multi-channel phased array ultrasonic data using machine learning. J Nondestruct Eval 2021;40(3):67. http://dx.doi.org/10.1007/s10921-021-00796-4.
[61] Fakih MA, Chiachío M, et al. A Bayesian approach for damage assessment in welded structures using lamb-wave surrogate models and minimal sensing. NDT E Int 2022;128:102626. http://dx.doi.org/10.1016/j.ndteint.2022.102626.
[62] Le M, Luong VS, et al. Auto-detection of hidden corrosion in an aircraft structure by electromagnetic testing: A machine-learning approach. Appl Sci 2022;12(10):5175. http://dx.doi.org/10.3390/app12105175, MDPI.
[63] Risheh A, Tavakolian P, et al. Infrared computer vision in non-destructive imaging: Sharp delineation of subsurface defect boundaries in enhanced truncated correlation photothermal coherence tomography images using K-means clustering. NDT E Int 2022;125:102568. http://dx.doi.org/10.1016/j.ndteint.2021.102568.
[64] Internal Study: University of Augsburg, Detection of anomalies in ultrasonic images of fibre-metal-laminate skin fields, DLR Augsburg, (Unpublished).
[65] Ucan H, Apmann H, Grassel G, Krombholz C, Fortkamp K, Nieberl D, Ehmke F, Nguyen C, Akin D. Produktionstechnologien für leichtbaustrukturen aus faser-metall-laminaten im flugzeugrumpf. Deutscher Luft- und Raumfahrtkongress; 2017, https://elib.dlr.de/114906/ https://www.researchgate.net/publication/321964549.
[66] Zapp P, Pantelelis N, Ucan H. The way to decrease the curing time by 50% in the manufacturing of structural components using the example of FML fuselage panels. In: SAMPE Europe conference. 2019, https://elib.dlr.de/130943/.
[67] Wanhill RJH. GLARE: A versatile fibre metal laminate (FML) concept. 2017, http://dx.doi.org/10.1007/978-981-10-2134-3_13.
[68] Etr HE, Korkmaz ME, Gupta MK, Gunay M, Xu J. A state-of-the-art review on mechanical characteristics of different fibre metal laminates for aerospace and structural application. In: International journal of advanced manufacturing technology, Vol. 123. Springer; 2022, p. 2965–91. http://dx.doi.org/10.1007/s00170-022-10277-1.
[69] Stone M. Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B Stat Methodol 1973;36(2):111–33. http://dx.doi.org/10.1111/j.2517-6161.1974.tb00994.x.
[70] Berrar D. Cross-validation. In: Encyclopedia of bioinformatics and computational biology, Vol. 1. Elsevier; 2018, p. 542–5. http://dx.doi.org/10.1016/B978-0-12-809633-8.20349-X, Elsevier.
[71] Jarvis R, Cawley P, Nagy PB. Performance evaluation of a magnetic field measurement NDE technique using a model assisted probability of detection framework. NDT E Int 2017;91:61–70. http://dx.doi.org/10.1016/j.ndteint.2017.06.006.
[72] Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett 2006;27(8):861–74. http://dx.doi.org/10.1016/j.patrec.2005.10.010.
[73] Zhou W, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: From error visibility to structural similarity. IEEE Trans Image Process 2004;13(4):600–12. http://dx.doi.org/10.1109/TIP.2003.819861.
[74] Georgiou GA. PoD curves, their derivation, applications and limitations. Insight 2006;49:409–14. http://dx.doi.org/10.1784/insi.2007.49.7.409.
[75] Harding CA, Hugo GR. Guidelines for interpretation of published data on probability of detection for non-destructive testing. 2011, p. 31, https://apps.dtic.mil/sti/pdfs/ADA398282.pdf.
[76] Matzkanin GA, Yolken HT. Probability of Detection (POD) for nondestructive evaluation. NDE, Defense Technical Information Center; 2001, http://dx.doi.org/10.21236/ADA398282.
[77] Sause MGR, Jasiuniene E. Structural health monitoring damage detection systems for aerospace. Cham: Springer International Publishing; 2021, http://dx.doi.org/10.1007/978-3-030-72192-3.
[78] Zolfaghari A, Kolahan F. Reliability and sensitivity of visible liquid penetrant NDT for inspection of welded components. Mater Test 2017;59(3):290–4. http://dx.doi.org/10.3139/120.111000.
[79] Tschöke K, et al. Feasibility of model-assisted probability of detection principles for structural health monitoring systems based on guided waves for fibre-reinforced composites. IEEE Trans Ultrason Ferroelectr Freq Control 2021;68(10):3156–73. http://dx.doi.org/10.1109/TUFFC.2021.3084898.
[80] Silva RR da, Padu GX de. Nondestructive inspection reliability: State of the art. In: Nondestructive testing methods and new applications. InTech; 2012, http://dx.doi.org/10.5772/37112.
[81] Schnars U, Kück A. Application of POD analysis at airbus. In: 4th European-american workshop on reliability of NDE. 2009, https://www.ndt.net/?id=8320.
[82] Topp M, Strothmann L. How can NDT 4.0 improve the probability of detection (POD)? e-J Nondestruct Test (NDT) 2021;26(4). https://www.ndt.net/?id=26013.