Efficient Fusion of Handcrafted and Pre-Trained CNNs Features To Classify Melanoma Skin Cancer
https://doi.org/10.1007/s11042-020-09637-4
Youssef Filali 1 · Hasnae EL Khoukhi 1 · My Abdelouahed Sabri 1 · Abdellah Aarab 2

1 Department of Computer Science, Faculty of Sciences Dhar-Mahraz, USMBA, Fez, Morocco
2 Department of Physics, Faculty of Sciences Dhar-Mahraz, USMBA, Fez, Morocco

* Correspondence: Youssef Filali, filaliucf@gmail.com
Abstract
Skin cancer is one of the most aggressive cancers in the world. A Computer-Aided Diagnosis (CAD) system for cancer detection and classification is a top-rated solution that decreases human effort and time while achieving very high classification accuracy. Machine learning (ML) and deep learning (DL) based approaches have been widely used to develop robust skin-lesion classification systems. Each of the techniques excels where the other fails, and their performances are closely related to the size of the learning dataset: ML-based approaches are less powerful than DL-based ones when working with large datasets, and vice versa. In this article, we propose a powerful skin-lesion classification approach based on a fusion of handcrafted features (shape, skeleton, color, and texture) and features extracted from the most powerful DL architectures. This combination makes it possible to remedy the limitations of
both the ML and DL approaches for the case of large and small datasets. Features engineering
is then applied to remove redundant features and to select only relevant features. The proposed
approach is validated and tested on both small and large datasets. A comparative study is also
conducted to compare the proposed approach with different and recent approaches applied to
each dataset. The results obtained show that this features-fusion based approach is very
promising and can effectively combine the power of ML and DL based approaches.
Keywords Skin cancer · Melanoma · Handcrafted features · CNNs · Features fusion · Genetic algorithm
1 Introduction
The incidence rate of skin cancer has been increasing in the last decades. The reduction of the ozone layer that protects the human body from radiation, the abusive
exposure of the body to the sun, and the use of tanning devices partly explain the present trend all over
the world. The difficulty in distinguishing between melanoma and non-melanoma skin cancer
pushes many medical communities to invest money, time, and effort to raise awareness of the danger that this type of cancer presents. However, it is also important to invest in the
development of techniques that can be used in the early prevention of this cancer. There are
different techniques; one of them is the acquisition of an image of the skin lesion, which can be acquired with either macroscopic or dermoscopic devices. Macroscopic images, also called clinical images, are taken with mobile phones or standard cameras. The launch of Medicine
4.0 has led to increased efforts to develop both hardware and software platforms, and the Internet of Things (IoT) has had a notable impact on the growth of the healthcare industry [44, 51].
On the other hand, dermoscopic images are taken using a specific device, the dermoscope. Considerable efforts have been made by many researchers to develop a
Computer-Aided Diagnosis (CAD) tool that can help dermatologists in their diagnoses. This
system follows a specific architecture: i) a pre-processing step; ii) segmentation of the image; iii) features extraction; iv) features engineering (an optional step); v) a classification step.
• Pre-processing step: a process applied to images that have insufficient quality or contain artifacts that may negatively influence the subsequent steps.
• Segmentation of the image: to distinguish between melanoma and benign lesions, we must begin by isolating the lesion from the healthy skin that surrounds it using a segmentation algorithm. Many segmentation techniques have been applied to help the diagnosis of skin-lesion images [65, 67].
• Features extraction: a crucial step to obtain a good classification rate. Finding appropriate features is a big challenge that a lot of research has faced [45]. It can be divided into two parts:
  1. Handcrafted features: describe the image using broad descriptors such as the ABCD rules ("Asymmetry, Border, Color, and Diameter"), which are inspired by their clinical medical meaning [14, 63], or textural features [2, 30, 57].
  2. Deep learning features: use convolutional layers to describe and learn from the images [35, 36].
• Features engineering: comprises features normalization, dimensionality reduction, and features selection; this process first normalizes all the features and then selects features to reduce dimensionality and remove redundant ones [23].
• Classification step: the last step in such a system, which classifies the lesion as melanoma or non-melanoma. Several classifiers are used in this field, such as Artificial Neural Networks (ANN), Decision Trees (DT), Support Vector Machines (SVM), and ensemble methods [29, 47].
Machine learning (ML) and Deep Learning (DL) based approaches have been widely used to
develop robust skin-lesion classification systems. Each of the techniques excels when the other
fails. Their performances are closely related to the size of the learning dataset. Thus, approaches
that are based on the ML are less powerful than those based on the DL when working with large
datasets, and DL is less powerful than ML when working with small datasets. A machine-learning pipeline involves pre-processing, segmentation, feature extraction, and classification. The main difficulty in the machine-learning approach is to extract and select the relevant features from the dataset. Machine-learning approaches have shown their effectiveness on small datasets. On the other hand, deep learning is based on a convolutional neural network (CNN) that contains convolution, pooling, and fully connected layers. A deep-learning algorithm can very easily fall into misclassification when it does not have a training dataset of sufficient size and quality. Building an algorithm based on the fusion of DL and ML will give better performance when databases of different sizes are used. The main contribution of the present work is to propose a powerful skin-lesion
classification approach based on a fusion of handcrafted features (shape, skeleton, color, and
texture) and features extracted from the most powerful DL architectures (AlexNet, VggNet,
GoogLeNet, and ResNet) to diagnose melanoma skin cancer. This combination will make it
possible to remedy the limitations of both ML and DL approaches for the case of large and small
datasets. Each descriptor (handcrafted and pre-trained CNNs) is first evaluated separately. A fusion of all the different descriptors with features engineering is then applied to remove redundant features and select only relevant ones. In total, we evaluated 4034 features and defined an optimal subset that gives the best score.
The rest of the article is organized as follows. Section 2 gives background on the segmentation and classification of skin cancer, going through features extraction and selection. Section 3 describes the proposed approach. A description of the two databases, the metrics, and the results is given in Section 4. Finally, the conclusion and some perspectives are presented in Section 5.
2 Background
CNNs are the most widely used deep models in skin-cancer analysis; most architectures contain three main types of layers (convolution, pooling, and fully connected layers). The first are the convolutional layers, where the image is convolved with kernels of various dimensions. Secondly, the pooling layers are applied to reduce the dimensionality of the feature maps; average and max-pooling are the most used. Finally, the fully connected layers work as a simple neural network [16, 55, 58]. VggNet, ResNet, AlexNet, and GoogLeNet are the most used Convolutional Neural Network (CNN) architectures [60].
Features selection is an important step to improve classification accuracy. Many algorithms have been used to select the most important features and to remove the redundant ones, such as Relief, Correlation-based Feature Selection (CFS), chi-squared, and Recursive Feature Elimination (RFE) [23].
Classification is the last step in a computer-aided diagnosis system, and more than one method has been applied in search of the best outcome. Support Vector Machines (SVM) [11, 17], Logistic Regression [19], Decision Trees [64], and K-Nearest Neighbours [15] are examples of classifiers used in skin cancer.
The work proposed by Kasmi [32] used the ABCD rule for feature extraction. For the diameter criterion, they represent differential dermoscopic structures to evaluate the presence of streaks, pigmented networks, globules, and dots. In the tests, 200 dermoscopic images from the EDRA Interactive Atlas of Dermoscopy were used, and the method achieved an accuracy of 94%.
Bhati et al. [7] also used the ABCD rules for feature extraction. Furthermore, their system is based on the TDS (Total Dermoscopy Score), and they analyzed the results in terms of sensitivity and specificity, obtaining 92.30% and 84.61%, respectively. In another study, Sanchez [56]
proposed a system that combines GLCM, LBP, Markov Random Field, and ABCD-rule features to describe the lesion. The database used contained 556 images from the Atlas of Dermoscopy. In the classification step, the author tried several methods; the best result was obtained with logistic regression using the Initial Variables and Product Units method, with a 68.51% accuracy
rate. Codella [12] used sparse coding and deep learning methods to extract features from skin-cancer images and an SVM as the classifier; the database used is the ISIC challenge [13]. Chang [10] used a CAD system to diagnose skin cancer, with color, shape, and texture used in the feature-extraction step. The images were collected from the Department of Dermatology, Kaohsiung Medical University, with a total of 769 images; an SVM classifier reached an accuracy of 90.64%. Amirreza Mahbod [37] proposed an
automatic computerized method for skin lesion classification, which employs deep features
from CNNs. They use three pre-trained deep models. The extracted features are used to train
support vector machine classifiers and got an 83.83% accuracy rate. The database used is the
ISIC challenge [13]. N. Moura [42] classifies skin lesions using a hybrid descriptor obtained
by combining features of color, shape, texture, and pre-trained Convolutional Neural Net-
works. These features are used as inputs to a Multilayer Perceptron classifier and reached a 96.5% accuracy rate. The dataset used is the PH2 database [40].
3 Proposed approach

The proposed approach aims to identify and classify skin cancer. Our work emphasizes the fusion of features based on handcrafted and pre-trained CNNs descriptors, evaluated on two databases of different ('small' and 'large') sizes. As shown in Fig. 1, we use two kinds of features: four
handcrafted features and four pre-trained CNNs features. For handcrafted features, we will
extract shape, color, texture, and skeleton features. On the other hand, we extract features from
the pre-trained CNNs: AlexNet, VggNet, GoogLeNet, and ResNet. A comparison of the
extracted features is made for both databases, followed by a feature selection to keep only the best and most relevant ones using a genetic algorithm. Finally, a classification based on the new features labels the skin cancer as melanoma or non-melanoma.
3.1 Features extraction

The features extraction is composed of two parts: handcrafted features and pre-trained CNNs features.
The handcrafted features extracted are the shape, skeleton, texture, and color of the lesion. Before extracting features, a pre-processing step is mandatory to remove the artifacts that the lesion image contains; this improves the result of the segmentation and of the extracted features. A multi-scale decomposition using the Aujol model in the pre-processing step proved its effectiveness in our previous work [21]. The decomposition of the original image gives two components: an object image and a texture image. The object component will be used for the segmentation of the lesion and thus for the extraction of the shape, skeleton, and color features, while the texture component will be used for the extraction of the textural features:
• Shape features: the segmented image gives a good description of the malignancy of the lesion's shape. Eight features are extracted: area, greatest diameter, smallest diameter, perimeter, eccentricity, extent, equivalent diameter, and circularity [27] (see Fig. 2(a)); a computation sketch is given below.
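For illustration (the authors implemented their system in Matlab; the sketches in this article are in Python, and all helper names are ours), these geometric descriptors can be computed from a binary lesion mask with scikit-image:

```python
import numpy as np
from skimage.measure import label, regionprops

def shape_features(mask):
    """Geometric descriptors of the largest connected region of a binary mask."""
    props = max(regionprops(label(mask.astype(int))), key=lambda p: p.area)
    area, perimeter = props.area, props.perimeter
    return {
        "area": area,
        "greatest_diameter": props.major_axis_length,
        "smallest_diameter": props.minor_axis_length,
        "perimeter": perimeter,
        "eccentricity": props.eccentricity,
        "extent": props.extent,                          # area / bounding-box area
        "equivalent_diameter": props.equivalent_diameter,
        "circularity": 4 * np.pi * area / perimeter**2,  # equals 1.0 for a perfect disk
    }
```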
• Skeleton features: the lesion skeleton provides information on the general shape of the lesion [24] (see Fig. 2(b)). Our descriptor based on the lesion skeleton consists of nine features: the number of endpoints, the number of branch-points, the number of sub-branches, the size of the skeleton, the length of the skeleton, the width of the skeleton image, the ratio between the width and length of the skeleton, and the maximum and minimum lengths between two endpoints; a counting sketch follows the figure caption.
Fig. 2 Example of features extraction from a melanoma and non-melanoma skin cancer (a) segmented images,
(b) skeletonization of the images, (c) textural images, (d) color images
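A possible way to compute part of this descriptor (a sketch assuming an 8-connected skeleton; the function name is ours):

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def skeleton_features(mask):
    """Count endpoints and branch-points of the lesion skeleton."""
    skel = skeletonize(mask.astype(bool))
    # Number of 8-connected skeleton neighbours of each pixel (centre excluded).
    neighbours = convolve(skel.astype(int), np.ones((3, 3), int), mode="constant") - skel
    endpoints = int(np.sum(skel & (neighbours == 1)))
    branch_points = int(np.sum(skel & (neighbours >= 3)))
    ys, xs = np.nonzero(skel)
    length = int(ys.ptp()) + 1 if ys.size else 0   # bounding-box height of the skeleton
    width = int(xs.ptp()) + 1 if xs.size else 0    # bounding-box width of the skeleton
    return {"endpoints": endpoints, "branch_points": branch_points,
            "skeleton_size": int(skel.sum()), "length": length, "width": width,
            "width_length_ratio": width / length if length else 0.0}
```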
• Textural features: the projection of the region of interest of the segmented image onto the texture component is used to extract the textural features [22, 36] (see Fig. 2(c)). In our approach, we extract 11 features, which are:
• Correlation:

$$\mathrm{Corl} = \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} \frac{(i\,j)\,P(i,j) - \mu_x\,\mu_y}{\sigma_x\,\sigma_y} \quad (1)$$

Correlation is a measure of the gray-level linear dependence between the pixels at the specified positions relative to each other, where μx, μy and σx, σy are the means and standard deviations of Px and Py.
• Contrast:

$$\mathrm{Cont} = \sum_{n=0}^{G-1} n^2 \left\{ \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} P(i,j) \;:\; |i-j| = n \right\} \quad (2)$$

This measure of contrast, or local intensity variation, favours contributions from P(i,j) away from the diagonal. G is the number of gray levels used.
• Entropy:

$$\mathrm{Entropy} = -\sum_{i=0}^{G-1}\sum_{j=0}^{G-1} P(i,j)\,\log\bigl(P(i,j)\bigr) \quad (3)$$

A homogeneous scene has a high entropy, while inhomogeneous scenes have a low first-order entropy.
• Angular Second Moment (ASM):

$$\mathrm{ASM} = \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} P(i,j)^2 \quad (4)$$

ASM is a measure of the homogeneity of an image. A homogeneous scene contains only a few gray levels, giving a GLCM with only a few, but relatively high, values of P(i,j).
• Inverse Difference Moment (IDM):

$$\mathrm{IDM} = \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} \frac{1}{1+(i-j)^2}\,P(i,j) \quad (5)$$

The Inverse Difference Moment is also influenced by the homogeneity of the image.
• Mean:

$$\mathrm{Mean}(\mu) = \frac{1}{N}\sum_{i,j} P(i,j) \quad (6)$$

• Standard deviation:

$$\mathrm{Stddev}(\sigma) = \sqrt{\frac{1}{N}\sum_{i,j} \bigl(P(i,j)-\mu\bigr)^2} \quad (7)$$
• Skewness:

$$\mathrm{Skewness} = \frac{1}{\sigma^3}\sum_{i=0}^{N} (i-\mu)^3\,P(i,j) \quad (8)$$

Skewness measures the asymmetry of the intensity-level distribution about the mean.
• Kurtosis:

$$\mathrm{Kurtosis} = \frac{1}{\sigma^4}\sum_{i=0}^{N} (i-\mu)^4\,P(i,j) \quad (9)$$
• Variance:

$$E_i = \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} (i-\mu)^2\,P(i,j) \quad (10)$$

This feature puts relatively high weights on the elements that differ from the average value of P(i,j), where μ is the mean value of P(i,j).
• Root Mean Square (RMS):

$$\mathrm{RMS} = \sqrt{\frac{1}{n\,m}\sum_{j=1}^{m}\sum_{i=1}^{n} P(i,j)^2} \quad (11)$$

RMS is defined as the square root of the mean of the squared values, where n and m are the dimensions of the image.
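For illustration, the GLCM descriptors of Eqs. (1)-(5) can be computed with scikit-image (a sketch assuming scikit-image ≥ 0.19; graycoprops's 'homogeneity' coincides with Eq. (5), and entropy is computed by hand since graycoprops does not provide it):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray, levels=8):
    """GLCM texture descriptors, Eqs. (1)-(5), on a quantized gray image."""
    img = (gray.astype(float) / gray.max() * (levels - 1)).astype(np.uint8)
    glcm = graycomatrix(img, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    P = glcm[:, :, 0, 0]
    return {
        "correlation": graycoprops(glcm, "correlation")[0, 0],  # Eq. (1)
        "contrast":    graycoprops(glcm, "contrast")[0, 0],     # Eq. (2)
        "entropy":     -np.sum(P[P > 0] * np.log(P[P > 0])),    # Eq. (3)
        "ASM":         graycoprops(glcm, "ASM")[0, 0],          # Eq. (4)
        "IDM":         graycoprops(glcm, "homogeneity")[0, 0],  # Eq. (5)
    }
```

The first-order statistics of Eqs. (6)-(11) can be obtained analogously from the pixel values (e.g., with scipy.stats.skew and scipy.stats.kurtosis).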
• Color features: based on the usual diagnosis made by dermatologists, the number and percentage of colors identified in the skin and the lesion can help classify the lesion as melanoma or non-melanoma. These colors are white, red, light brown, dark brown, blue-gray, and black [25] (see Fig. 2(d)):
$$I_{\mathrm{White}} = \sum\bigl(I_{NR} > 0.8 \;\&\; I_{NG} > 0.8 \;\&\; I_{NB} > 0.8\bigr) \times 100 / N_T \quad (12)$$

$$I_{\mathrm{Red}} = \sum\bigl(I_{NR} > 0.8 \;\&\; I_{NG} < 0.2 \;\&\; I_{NB} < 0.2\bigr) \times 100 / N_T \quad (13)$$

$$I_{\mathrm{Light\ brown}} = \sum\bigl(0.6 < I_{NR} < 1 \;\&\; 0.32 < I_{NG} < 0.72 \;\&\; 0.05 < I_{NB} < 0.45\bigr) \times 100 / N_T \quad (14)$$

$$I_{\mathrm{Blue\ gray}} = \sum\bigl(I_{NR} < 0.2 \;\&\; 0.32 < I_{NG} < 0.72 \;\&\; 0.34 < I_{NB} < 0.74\bigr) \times 100 / N_T \quad (15)$$

$$I_{\mathrm{Dark\ brown}} = \sum\bigl(0.2 < I_{NR} < 0.6 \;\&\; 0.06 < I_{NG} < 0.46 \;\&\; 0 < I_{NB} < 0.33\bigr) \times 100 / N_T \quad (16)$$

$$I_{\mathrm{Black}} = \sum\bigl(I_{NR} < 0.2 \;\&\; I_{NG} < 0.2 \;\&\; I_{NB} < 0.2\bigr) \times 100 / N_T \quad (17)$$

where I_NR, I_NG, and I_NB are, respectively, the normalized pixel intensities of the red, green, and blue components in the RGB representation, and N_T is the number of pixels in the lesion.
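A direct transcription of Eqs. (12)-(17) (a sketch assuming an 8-bit RGB image and a non-empty binary lesion mask; the function name is ours):

```python
import numpy as np

def color_percentages(rgb, mask):
    """Percentage of lesion pixels falling in each reference color, Eqs. (12)-(17)."""
    R, G, B = [rgb[..., k].astype(float) / 255.0 for k in range(3)]
    inside = mask.astype(bool)
    n_t = inside.sum()  # number of pixels in the lesion (assumed non-zero)
    rules = {
        "white":       (R > 0.8) & (G > 0.8) & (B > 0.8),
        "red":         (R > 0.8) & (G < 0.2) & (B < 0.2),
        "light_brown": (R > 0.6) & (R < 1.0) & (G > 0.32) & (G < 0.72) & (B > 0.05) & (B < 0.45),
        "blue_gray":   (R < 0.2) & (G > 0.32) & (G < 0.72) & (B > 0.34) & (B < 0.74),
        "dark_brown":  (R > 0.2) & (R < 0.6) & (G > 0.06) & (G < 0.46) & (B > 0.0) & (B < 0.33),
        "black":       (R < 0.2) & (G < 0.2) & (B < 0.2),
    }
    return {name: 100.0 * (rule & inside).sum() / n_t for name, rule in rules.items()}
```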
A Convolutional Neural Network (CNN) can be used in two different ways: the first is to design a classification model, and the second is to extract features using transfer learning [26, 33]. In our work, we use transfer learning with pre-trained networks; we extract features from four different pre-trained architectures: AlexNet, VggNet, GoogLeNet, and ResNet. Features are extracted from the last layer of the fully connected part of the network when the model has multiple FC layers (a code sketch is given after the following list).
• AlexNet: developed by Alex Krizhevsky [34]. It competed in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, carrying out training and classification on the ImageNet database, and won the competition. It contains eight layers: five convolutional layers, some of them followed by max-pooling, and, at the end, three fully connected layers [6].
• VggNet: developed by Simonyan and Zisserman [59]. It is very similar to the AlexNet architecture, with many filters and an emphasis on depth. VggNet is an excellent choice for image feature extraction. VGG16 contains 13 convolutional layers and 3 fully connected layers.
• GoogLeNet: the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014 [62]. The network uses an architecture similar to AlexNet but implements a new element, the inception module, which is based on several small convolutions and uses normalization. The GoogLeNet architecture is a 22-layer deep network.
• ResNet: the Residual Neural Network, developed by Kaiming He in 2015 [28] and inspired by VggNet, introduced a new architecture with skip connections and batch normalization, which helps to build deeper networks. ResNet has fewer filters and lower complexity than VggNet, and the network contains only one fully connected layer.
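As a sketch of this transfer-learning step (in Python with torchvision ≥ 0.13 rather than the authors' Matlab; the 1000-dimensional output of the final FC layer serves as the feature vector, matching the per-network feature count reported in Section 4):

```python
import torch
from torchvision import models, transforms

# ImageNet-pretrained AlexNet in inference mode; VggNet, GoogLeNet, and ResNet
# feature extractors are built analogously from their torchvision counterparts.
net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def cnn_features(pil_image):
    """Return the 1000-D output of the network's last fully connected layer."""
    with torch.no_grad():
        x = preprocess(pil_image).unsqueeze(0)  # shape (1, 3, 224, 224)
        return net(x).squeeze(0).numpy()        # 1000 features per image
```

Concatenating the four networks' vectors yields the 4 × 1000 CNN features fused with the handcrafted ones.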
Table 1 presents a comparison between the different architectures used. AlexNet and VggNet are almost the same, with the difference that VggNet is a bit deeper, has more parameters, and uses 3 × 3 filters. The GoogLeNet architecture is 22 layers deep and reduces the number of parameters from 60 million (AlexNet) to 4 million. GoogLeNet has a quite different architecture from both: it uses combinations of inception modules, including pooling and convolutions at different scales, and concatenation operations. It also uses 1 × 1 convolutions that work like feature selectors. ResNet is very deep compared to AlexNet, VGG, and GoogLeNet. This architecture introduced the concept of "skip connections"; moreover, layers in a ResNet also use batch normalization.
Table 1 Comparison between the different architectures used: model size, classification error rate, and model depth

3.2 Features engineering

Given the high number of features extracted from the handcrafted descriptors and the four CNN architectures, it is wise to perform a features selection to keep only the best and most relevant ones. Features engineering, which comprises features normalization and features selection, will be used to improve the quality of the extracted features and the classification accuracy rate.
• Features normalization: uses the Z-score transformation [50] to normalize each feature element using the mean and the standard deviation.
• Features selection: the process that selects the relevant features that represent the data exactly. We focus on the genetic algorithm, which belongs to the family of evolutionary algorithms [3]; their goal is to obtain an approximate solution to an optimization problem. Genetic algorithms use the notion of natural selection and apply it to a population of potential solutions to the given problem [31, 61]. A genetic algorithm based on a wrapper feature-selection approach is employed in the proposed method to select the optimal feature set. The wrapper technique incorporates an inductive classification algorithm to measure the goodness of the selected subset of features.
The chromosome is structured as a binary sequence whose length equals the number of features in the dataset: each feature corresponds to one bit in the chromosome, and the absence or presence of that feature is denoted by 0 or 1, respectively. Fitness is measured as a function of the accuracy of the classifier with which the genetic algorithm is wrapped. The population size is set to 20, and an initial population of random chromosomes is generated. On these we perform crossover, the operation by which two population members exchange genetic material, with a probability of 0.6, and mutation with a probability of 0.033, to create the next generation. The result of this module is an optimal feature set; this reduced set of features is used as input for the classifier. A sketch of this wrapper selection is given below.
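A minimal sketch of such a wrapper in Python (the authors used Matlab; all names are ours, the features are first z-score normalized as described above, and the fitness mirrors the SVM settings of Section 3.3):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def fitness(chrom, X, y):
    """Wrapper fitness: cross-validated accuracy of the SVM on the selected subset."""
    if chrom.sum() == 0:
        return 0.0
    return cross_val_score(SVC(kernel="rbf", gamma=0.1, C=10),
                           X[:, chrom.astype(bool)], y, cv=5).mean()

def ga_select(X, y, pop_size=20, generations=30, p_cross=0.6, p_mut=0.033):
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)    # Z-score normalization
    n_feat = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat))     # random binary chromosomes
    for _ in range(generations):
        scores = np.array([fitness(c, X, y) for c in pop])
        # Roulette-wheel selection proportional to fitness.
        probs = scores / scores.sum() if scores.sum() else np.full(pop_size, 1 / pop_size)
        children = pop[rng.choice(pop_size, size=pop_size, p=probs)].copy()
        for i in range(0, pop_size - 1, 2):               # single-point crossover (p = 0.6)
            if rng.random() < p_cross:
                cut = int(rng.integers(1, n_feat))
                children[[i, i + 1], cut:] = children[[i + 1, i], cut:]
        flip = rng.random(children.shape) < p_mut          # bit-flip mutation (p = 0.033)
        children[flip] = 1 - children[flip]
        pop = children
    best = pop[np.argmax([fitness(c, X, y) for c in pop])]
    return best.astype(bool)                               # mask of selected features
```

ga_select(X, y) returns a boolean mask, and X[:, mask] is the reduced feature set fed to the classifier.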
3.3 Classification
The last step after feature selection is to use the selected features to classify the skin cancer. Both datasets are first divided randomly into two sets, a training set and a test set, and we use cross-validation to maintain the fairness of the performance estimate. The training set is used to build a model that learns from the samples, and the test set to evaluate the model's performance. The Support Vector Machine (SVM) is used in this work to evaluate both datasets with the extracted features. This algorithm uses a hyperplane to separate the data according to the class labels. The separation of the data is simple and gives good results thanks to the different kernels that can be used; linear, quadratic, and RBF kernels are examples. In our approach, we use the RBF kernel (with gamma = 0.1) and a complexity parameter C of 10.
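In Python with scikit-learn, this evaluation could look as follows (a sketch; X_selected and y stand for the selected feature matrix and the class labels):

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# RBF-kernel SVM with the stated settings: gamma = 0.1, complexity parameter C = 10,
# evaluated with 5-fold cross-validation on the selected features.
clf = SVC(kernel="rbf", gamma=0.1, C=10)
scores = cross_val_score(clf, X_selected, y, cv=5, scoring="accuracy")
print(f"accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```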
All the algorithms in this work were implemented with Matlab 2018 on a computer with an Intel Core™ i7 processor, an NVIDIA® GeForce® GTX 1660 Ti (Max-Q design, 6 GB dedicated GDDR6), and 8 GB of memory.
4 Results and discussion

Our proposed approach is evaluated and validated using two datasets: PH2 [40] and the ISIC challenge [13]. PH2 is a small dataset, while the ISIC challenge is a large one. These two reference datasets were chosen because machine learning is more powerful than deep learning on small datasets and vice versa on large ones. The idea of fusing the handcrafted features with those extracted from the four architectures is to design a classification model that is efficient for both small and large image databases.
4.1 Datasets
The PH2 dataset contains 200 images: 160 non-melanomas (80 common nevi, 80 atypical nevi) and 40 melanomas. These images are in the RGB (red, green, blue) color system and have a resolution of 764 × 575 pixels. The first row of Fig. 3 shows examples from this database. The ISIC 2017 dataset contains 2000 images: 1626 non-melanomas (254 seborrheic keratoses, 1372 atypical nevi) and 374 melanomas. These images are in the RGB color system and have a resolution of 767 × 1022 pixels. The second row of Fig. 3 shows examples from this database. Both databases were classified by experts and contain the segmentation ground truth. As is customary, to evaluate our proposed approach the dataset is randomly divided into training and test sets using k-fold cross-validation (5 folds in this study), which preserves the fairness of the performance estimate.
4.2 Performance metrics

To evaluate our proposed approach, the performance measures used in this paper are Recall, Specificity, Precision, Accuracy, F-measure, and the Kappa index:

$$\mathrm{Recall\ (Sensitivity)} = \frac{TP}{TP + FN} \quad (18)$$

$$\mathrm{Specificity} = \frac{TN}{FP + TN} \quad (19)$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (20)$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (21)$$

$$\mathrm{F\text{-}measure} = \frac{2 \times (\mathrm{Precision} \times \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}} \quad (22)$$

Fig. 3 Examples of melanoma and non-melanoma skin cancer from the PH2 database and the ISIC 2017 challenge
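These measures follow directly from the confusion-matrix counts; a small helper (ours) implementing Eqs. (18)-(22):

```python
def classification_metrics(tp, fn, fp, tn):
    """Eqs. (18)-(22) from binary confusion-matrix counts (melanoma = positive class)."""
    recall = tp / (tp + fn)                                     # Eq. (18)
    specificity = tn / (fp + tn)                                # Eq. (19)
    precision = tp / (tp + fp)                                  # Eq. (20)
    accuracy = (tp + tn) / (tp + tn + fp + fn)                  # Eq. (21)
    f_measure = 2 * precision * recall / (precision + recall)   # Eq. (22)
    return recall, specificity, precision, accuracy, f_measure

# One plausible reading of Table 5's PH2 counts; accuracy evaluates to 196/200 = 0.98,
# matching the 98% reported for PH2.
print(classification_metrics(tp=37, fn=3, fp=1, tn=159))
```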
4.3 Classification results

In this part, the classification results, going through features extraction and features engineering, are depicted and discussed.
Features extraction in our approach is composed of two parts: handcrafted and pre-trained CNNs features. The first part extracts the features from the lesion. The segmented lesion gives us a description of the lesion in terms of shape using the area, greatest diameter, smallest diameter, perimeter, eccentricity, extent, equivalent diameter, and circularity; the segmented lesion is then transformed into its skeleton, from which nine effective features describing the lesion are extracted: the number of endpoints, the number of branch-points, the number of sub-branches, the size of the skeleton, the length of the skeleton, the width of the skeleton image, the ratio between the width and length of the skeleton, and the maximum and minimum lengths between two endpoints. The texture and color features are extracted from the projection of the segmented lesion onto, respectively, the texture component resulting from the PDE decomposition and the original image. A total of 34 features are extracted as handcrafted features. Secondly, we extract features from the pre-trained CNNs: AlexNet, VggNet, GoogLeNet, and ResNet. The features are extracted from the last fully connected (FC) layers of the pre-trained AlexNet and Vgg16. For ResNet18 and GoogLeNet, since each has only one FC layer, we extract features from the last convolutional layer of these pre-trained models. We extract 1000 features from each pre-trained CNN, a total of 4000 features for all the pre-trained models. In Table 2 we compare the classification results for the two databases separately, using the handcrafted features and the pre-trained CNNs.
Table 2 Classification results for both datasets using the Support Vector Machine

These results show the classification performance using the handcrafted features, AlexNet, GoogLeNet, Vgg16, and ResNet18 separately, and the fusion of all the features. For both datasets, the fusion of all the features gives the best classification result. Thus, in the following step we use the fusion of all the features, which is a simple concatenation of the descriptor vectors (sketched below), and proceed to features engineering to select only the relevant ones.
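A sketch of this fusion (the five arrays are assumed to hold the per-image features described above; 34 + 4 × 1000 = 4034 columns in total):

```python
import numpy as np

# 34 handcrafted features plus 4 x 1000 CNN features = 4034 features per image.
fused = np.hstack([handcrafted, alexnet_feats, vgg16_feats, googlenet_feats, resnet18_feats])
```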
To further improve the classification rate, we use feature selection to keep only the best features, because not all of them are relevant. The results of using a genetic algorithm for feature selection on both databases are as follows:
• PH2 database: 406 features are selected, of which 10 are handcrafted features and 396 come from the pre-trained CNNs.
• ISIC challenge database: 160 features are selected, of which 7 are handcrafted features and 153 come from the pre-trained CNNs.
Even though the number of handcrafted features is very small (34 features) compared to the number of Deep Learning features (4000), the percentage of features selected from the handcrafted set (33.33% for PH2 and 20.58% for the ISIC challenge) is much higher than the percentage selected from the DL features (9.9% for PH2 and 3.825% for the ISIC challenge). Table 3 shows the features selected from the handcrafted and pre-trained CNNs descriptors, with the impact of each feature.
Table 4 Result of SVM classification using the features obtained after feature selection

Table 4 presents the classification results using the genetic algorithm for feature selection on both databases, and Fig. 4 shows the impact of the GA-based selection in terms of the Accuracy and F-measure metrics. Table 5 presents the confusion matrices, which show the performance of our classification approach when predicting the skin-lesion type using cross-validation (k = 5) for the PH2 and ISIC challenge datasets, respectively. The classification results for the two databases are 94.69%, 96.63%, and 98% for the PH2 database, and 62.73%, 55.68%, and 87.8% for the ISIC challenge, using the F-measure, Kappa index, and accuracy rate, respectively.
The confusion matrices are presented to show the efficiency of the proposed approach, where the False Positive count is very low: only 1 and 76 non-melanoma lesions were wrongly classified as melanoma for the PH2 and ISIC datasets, respectively. This parameter is very significant in the field of medicine.
4.4 Comparative study

In this section, a comparative study between our proposed approach and recent approaches in the literature is discussed.
Table 6 presents the PH2 dataset classification accuracy using our proposed approach in
comparison with recent approaches proposed respectively by J. A. Salido [55], N. Moura [42],
S. Pathan [49], M. Nasir [43], and Tallha Akram [1].
In [55], Julie Ann A. Salido uses a pre-trained AlexNet as a feature extractor and got a 93.00% accuracy rate. On the other hand, in [42] N. Moura classifies skin lesions using a
hybrid descriptor obtained by combining features of color, shape, texture and pre-trained
Convolutional Neural Networks. These features are used as inputs of a Multilayer Perceptron
classifier and got a 96.5% accuracy rate. In [49] S. Pathan uses a robust ensemble architecture,
which is developed using dynamic classifier selection techniques to detect malignancy and got
97% accuracy. In [43], Muhammad Nasir uses a serial-based method that extracts and fuses traits such as color, texture, and HOG (shape).

Fig. 4 Classification results for both datasets using the F-measure, Kappa, and Accuracy

Table 5 The confusion matrices using the genetic algorithm for feature selection on the PH2 and ISIC databases

                        PH2 (actual)              ISIC (actual)
Predicted         Melanoma  Non-melanoma    Melanoma  Non-melanoma
Melanoma              37          1             206        76
Non-melanoma           3        159             168      1550
The fused features are then selected by implementing a novel Boltzmann-entropy method; finally, the selected features are classified by a Support Vector Machine, reaching an accuracy rate of 97.5%. In [1], Tallha Akram uses a feature extraction and dimensionality reduction scheme that combines conventional as well as recent feature extraction techniques, and obtained 97.5% classification accuracy.
Table 7 presents the ISIC dataset classification accuracy of our proposed approach in
comparison with recent approaches proposed by Iván González-Díaz [18], Kazuhisa
Matsunaga [39], Amirreza Mahbod [37], Lei Bi [8] respectively.
Iván González-Díaz [18] incorporated the expert knowledge of dermatologists into Convolutional Neural Networks (CNN) and got a classification rate of 82.3%. On the other hand, Kazuhisa Matsunaga [39] used a pre-trained deep neural network ensemble to classify melanoma and got an 82.8% accuracy rate. Amirreza Mahbod [37] proposed an
automatic computerized method for skin-lesion classification, which employs deep features from
CNNs. They use three pre-trained deep models, namely AlexNet, VGG16, and ResNet-18.
The extracted features are used to train support vector machine classifiers and got an 83.83%
accuracy rate. Lei Bi [8] performed automatic skin-lesion analysis using the Deep Residual Network (ResNet) architecture and got an 85.5% accuracy rate.
Our proposed approach achieved a higher score than recent approaches from the literature, with classification results of 94.69%, 96.63%, and 98% for the PH2 database and 62.73%, 55.68%, and 87.8% for the ISIC challenge, using the F-measure, Kappa index, and accuracy rate, respectively. This is due, firstly, to the fusion of handcrafted and pre-trained features and, secondly, to the genetic algorithm used for feature selection, which proves its effectiveness in retaining only the best and most relevant features.
Even though our proposed approach gives high performance on both the PH2 and ISIC datasets, there are some limitations that can be addressed in future work. The number of pre-trained and handcrafted methods that we have evaluated is limited; incorporating more pre-trained models such as DenseNet and Inception could lead to further improvements in classification performance. Also, extending the training data is expected to lead to better results for each method as well as for their combinations. The difficulty is to find the best and most relevant features that work for all databases and give the best score.
Table 6 Results of the proposed approach compared to recent approaches from the literature for PH2 dataset
Table 7 Results of the proposed approach compared to recent approaches from the literature for ISIC dataset
5 Conclusion
Melanoma is considered the most dangerous skin cancer. If diagnosed and treated early, it is usually curable. Computer-Aided Diagnosis (CAD) is widely used to automatically diagnose a skin lesion as melanoma or not. The process is often done in three basic steps: segmentation, features extraction, and classification. Well-extracted features give better classification accuracy.
Machine learning (ML) and deep learning (DL) based approaches have been widely used to
develop robust skin-lesion classification systems. Each of the techniques excels when the other
fails. Their performances are closely related to the size of the learning dataset. Thus,
approaches that are based on the ML are less powerful than those based on the DL when
working with large datasets, and vice versa. In this paper, we proposed to merge features provided by handcrafted descriptors (shape, skeleton, color, and texture) and pre-trained convolutional neural networks (CNNs) to obtain new descriptors of the lesion. The approach was validated and tested on both a small and a large dataset, PH2 and the ISIC challenge. The proposed fusion of handcrafted and pre-trained features allowed a clear improvement of the classification rate, with 94.69%, 96.63%, and 98% for the PH2 database and 62.73%, 55.68%, and 87.8% for the ISIC challenge using the F-measure, Kappa index, and accuracy rate, respectively. This is due, firstly, to the fusion of handcrafted and pre-trained features and, secondly, to the use of the genetic algorithm for feature selection, which proves its effectiveness in retaining only the best and most relevant features. Our future work will concentrate on further improving feature extraction using new deep-learning methods.
Acknowledgments This research work has been funded by the LIIAN and LESSI laboratories and the Faculty of Sciences, University Sidi Mohamed Ben Abdellah, Fez, Morocco.
References
1. Akram T, Khan MA, Sharif M, Yasmin M (2018) Skin lesion segmentation and recognition using
multichannel saliency estimation and M-SVM on selected serially fused features. J Ambient Intell
Humaniz Comput 1–20
2. Almansour E, Jaffar MA (2016 Apr 30) Classification of Dermoscopic skin cancer images using color and
hybrid texture features. IJCSNS Int J Comput Sci Netw Secur 16(4):135–139
3. Anirudha RC, Kannan R, Patil N (2014) Genetic algorithm based wrapper feature selection on hybrid
prediction model for analysis of high dimensional data. In2014 9th international conference on industrial
and information systems (ICIIS) (pp. 1-6). IEEE
4. Arifin MS, Kibria MG, Firoze A, Amini MA, Yan H (2012) Dermatological disease diagnosis using color-
skin images. In 2012 international conference on machine learning and cybernetics (Vol. 5, pp. 1675-1680).
IEEE
5. Barata C, Celebi ME, Marques JS (2018 Jun 11) A survey of feature extraction in dermoscopy image
analysis of skin cancer. IEEE Journal of biomedical and health informatics 23(3):1096–1109
6. Berseth M, Logix NLP (2017) ISIC 2017 – Skin Lesion Analysis Towards Melanoma Detection, pp. 1–4
7. Bhati P, Singhal M (2015) Early stage detection and classification of melanoma. In: Communication,
control and intelligent systems (CCIS), 2015, pp 181–185. IEEE
8. Bi L, Kim J, Ahn E, Feng D (2017) Automatic skin lesion analysis using large-scale dermoscopy images
and deep residual networks. arXiv preprint arXiv: 1703.04197
9. Celebi ME, Kingravi HA, Uddin B, Iyatomi H, Aslandogan YA, Stoecker WV, Moss RH (2007 Sep 1) A
methodological approach to the classification of dermoscopy images. Comput Med Imaging Graph 31(6):
362–373
10. Chang WY, Huang A, Yang CY, Lee CH, Chen YC, Wu TY, Chen GS (2013) Computer-aided diagnosis
of skin lesions using conventional digital photography: a reliability and feasibility study. PLoS One 8(11):
e76 212
11. Codella N, Cai J, Abedini M, Garnavi R, Halpern A, Smith JR (2015) Deep learning, sparse coding, and
SVM for melanoma recognition in dermoscopy images. In international workshop on machine learning in
medical imaging (pp. 118-126). Springer, Cham
12. Codella N, Cai J, Abedini M, Garnavi R, Halpern A, Smith JR (2015) Deep learning, sparse coding, and
svm for melanoma recognition in dermoscopy images. In: International workshop on machine learning in
medical imaging, pp. 118–126. Springer, Berlin.
13. Codella NC, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, Liopyris K, Mishra N,
Kittler H, Halpern A (2018) Skin lesion analysis toward melanoma detection: a challenge at the 2017
international symposium on biomedical imaging (isbi), hosted by the international skin imaging collabora-
tion (isic). In2018 IEEE 15th international symposium on biomedical imaging (ISBI 2018) (pp. 168-172).
IEEE
14. Correa DN, Paniagua LR, Noguera JL, Pinto-Roa DP, Toledo LA (2015) Computerized diagnosis of
melanocytic lesions based on the ABCD method. In2015 Latin American computing conference (CLEI)
(pp. 1-12). IEEE.
15. Dalila F, Zohra A, Reda K, Hocine C (2017 Jul 1) Segmentation and classification of melanoma and benign
skin lesions. Optik. 140:749–761
16. Dara S, Tumma P (2018) Feature extraction by using deep learning: a survey. In2018 second international
conference on electronics, communication and aerospace technology (ICECA) (pp. 1795-1801). IEEE
17. Deepa SN, Devi BA (2011 Nov 1) A survey on artificial intelligence approaches for medical image
classification. Indian J Sci Technol 4(11):1583–1595
18. Díaz IG (2017) Incorporating the knowledge of dermatologists to convolutional neural networks for the
diagnosis of skin lesions. arXiv preprint arXiv:1703.01976
19. Dreiseitl S, Ohno-Machado L, Kittler H, Vinterbo S, Billhardt H, Binder M (2001 Feb 1) A comparison of
machine learning methods for the diagnosis of pigmented skin lesions. J Biomed Inform 34(1):28–36
20. Fan H, Xie F, Li Y, Jiang Z, Liu J (2017 Jun 1) Automatic segmentation of dermoscopy images using
saliency combined with Otsu threshold. Comput Biol Med 85:75–85
21. Filali Y, Sabri MA, Aarab A (2017) An improved approach for skin lesion analysis based on multiscale
decomposition. In2017 international conference on electrical and information technologies (ICEIT) (pp. 1-
6). IEEE
22. Filali Y, Ennouni A, Sabri MA, Aarab A (2017) Multiscale approach for skin lesion analysis and
classification. In2017 international conference on advanced Technologies for Signal and Image
Processing (ATSIP) (pp. 1-6). IEEE
23. Filali Y, Ennouni A, Sabri MA, Aarab A (2018) A study of lesion skin segmentation, features selection and
classification approaches. In2018 international conference on intelligent systems and computer vision
(ISCV) (pp. 1-7). IEEE
24. Filali Y, El Khoukhi H, Sabri MA, Yahyaouy A, Aarab A (2019) New and Efficient Features for Skin
Lesion Classification based on Skeletonization". In2019 Journal of Computer Science. Volume 15, Issue 9.
pp 1225.1236
25. Filali Y, Abdelouahed S, Aarab A (2019 May 19) An improved segmentation approach for skin lesion
classification. Statistics, Optimization & Information Computing 7(2):456–467
26. Filali Y, El Khoukhi H, Sabri MA, Yahyaouy A, Aarab A (2019) Texture classification of skin lesion using
convolutional neural network. In2019 international conference on wireless technologies, embedded and
intelligent systems (WITS) (pp. 1-5). IEEE
27. Filali Y, Sabri MA, Aarab A (2020) Improving skin Cancer classification based on features fusion and
selection. In embedded systems and artificial intelligence (pp. 379–387). Springer, Singapore.
28. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. InProceedings of the
IEEE conference on computer vision and pattern recognition (pp. 770–778)
29. Immagulate I, Vijaya MS (2015) Categorization of non-melanoma skin lesion diseases using support vector
machine and its variants. International Journal of Medical Imaging 3(2):34–40
30. Jain S, Pise N (2015 Jan 1) Computer-aided melanoma skin cancer detection using image processing.
Procedia Computer Science 48:735–740
31. Kannan V (2018) Feature selection using genetic algorithms
32. Kasmi R, Mokrani K (2016) Classification of malignant melanoma and benign skin lesions: implementation
of automatic abcd rule. IET Image Process 10(6):448–455
33. Kassani SH, Kassani PH (2019 Jun 1) A comparative study of deep learning architectures on melanoma
detection. Tissue Cell 58:76–83
34. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural
networks. In Advances in neural information processing systems (pp. 1097–1105)
35. Li Y, Shen L (2018) Skin lesion analysis towards melanoma detection using deep learning network.
Sensors. 18(2):556
36. Litjens G, Kooi T, Bejnordi BE, Setio AA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B,
Sánchez CI (2017 Dec 1) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88
37. Mahbod A, Schaefer G, Wang C, Ecker R, Ellinge I (2019) Skin lesion classification using hybrid deep
neural networks. In ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal
processing (ICASSP) (pp. 1229-1233). IEEE
38. Marques JS, Barata C, Mendonça T (2012) On the role of texture and color in the classification of
dermoscopy images. In2012 annual international conference of the IEEE engineering in medicine and
biology society (pp. 4402-4405). IEEE
39. Matsunaga K, Hamada A, Minagawa A, Koga H (2017) Image classification of melanoma, nevus and
seborrheic keratosis by deep neural network ensemble. arXiv preprint arXiv:1703.03108
40. Mendonça T, Ferreira PM, Marques JS (2013) PH2 – a dermoscopic image database for research and benchmarking, pp 1–5
41. Mohanaiah P, Sathyanarayana P, GuruKumar L (2013 May) Image texture feature extraction using GLCM
approach. Int J Sci Res Publ 3(5):1
42. Moura N, Veras R, Aires K, Machado V, Silva R, Araújo F, Claro M (2018) Combining ABCD Rule,
texture features and transfer learning in automatic diagnosis of melanoma. In2018 IEEE symposium on
computers and communications (ISCC) (pp. 00508-00513). IEEE
43. Nasir M, Attique Khan M, Sharif M, Lali IU, Saba T, Iqbal T (2018 Jun) An improved strategy for skin
lesion detection and classification using uniform segmentation and feature selection based approach.
Microsc Res Tech 81(6):528–543
44. Nauman A, Qadri YA, Amjad M, Zikria YB, Afzal MK, Kim SW (2020) Multimedia internet of things: a
comprehensive survey. IEEE Access 8:8202–8250
45. Oliveira RB, Mercedes Filho E, Ma Z, Papa JP, Pereira AS, Tavares JM (2016 Jul 1) Computational
methods for the image segmentation of pigmented skin lesions: a review. Comput Methods Prog Biomed
131:127–141
46. Oliveira RB, Pereira AS, Tavares JM (2018) Computational diagnosis of skin lesions from dermoscopic
images using combined features. Neural Comput & Applic 1–21
47. OZKAN IA, KOKLU M (2017 Dec 28) Skin lesion classification using machine learning algorithms.
International Journal of Intelligent Systems and Applications in Engineering 5(4):285–289
48. Pathan S, Prabhu KG, Siddalingaswamy PC (2018 Nov 1) Hair detection and lesion segmentation in
dermoscopic images using domain knowledge. Medical & biological engineering & computing 56(11):
2051–2065
49. Pathan S, Prabhu KG, Siddalingaswamy PC (2019 May 1) Automated detection of melanocytes related
pigmented skin lesions: a clinical framework. Biomedical Signal Processing and Control 51:59–72
50. Patro S, Sahu KK (2015) Normalization: A preprocessing stage. arXiv preprint arXiv:1503.06462
51. Qadri YA, Nauman A, Zikria YB, Vasilakos AV, Kim SW (2020) The future of healthcare internet of
things: a survey of emerging technologies. IEEE Communications Surveys & Tutorials 22(2):1121–1167
52. Rubegni P, Cevenini G, Burroni M, Perotti R, Dell'Eva G, Sbano P, Miracco C, Luzi P, Tosi P, Barbini P,
Andreassi L (2002 Oct 20) Automated diagnosis of pigmented skin lesions. Int J Cancer 101(6):576–580
53. Sabri MA, Filali Y, Ennouni A, Yahyaouy A, Aarab A (2019) An overview of skin lesion segmentation, features engineering, and classification. In: Intelligent decision support systems. De Gruyter, Berlin, Boston, pp 31–52. https://doi.org/10.1515/9783110621105-002
54. Salido JA, Ruiz C Jr (2018) Hair artifact removal and skin lesion segmentation of dermoscopy images, vol 11, no 3, pp 2–5
55. Salido JA, Ruiz C (2018) Using deep learning for melanoma detection in dermoscopy images. International
Journal of Machine Learning and Computing 8(1):61–68
56. Sanchez-Monedero J, Saez A, Perez-Ortiz M, Gutierrez PA, Hervás-martínez C (2016) Classification of
melanoma presence and thickness based on computational image analysis. In: International conference on
hybrid artificial intelligence systems. Springer, Berlin, pp 427–438
57. Schaefer G, Krawczyk B, Celebi ME, Iyatomi H (2014 Dec 1) An ensemble classification approach for
melanoma diagnosis. Memetic Computing 6(4):233–240
58. Shoieb DA, Youssef SM, Aly WM (2016 Dec) Computer-aided model for skin diagnosis using deep
learning. Journal of Image and Graphics 4(2):122–129
59. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition.
arXiv preprint arXiv:1409.1556
60. Singhal A, Ramesht Shukla PK, Dubey S, Singh S, Pachori RB Comparing the capabilities of transfer
learning models to detect skin lesion in humans
61. Srividya TD, Arulmozhi V (2018) Detection of skin cancer - A genetic algorithm approach, vol. 7, pp. 131–
135
62. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015)
Going deeper with convolutions. InProceedings of the IEEE conference on computer vision and pattern
recognition (pp. 1–9)
63. Vasconcelos MJ, Rosado L, Ferreira M (2015) A new risk assessment methodology for dermoscopic skin
lesion images. In2015 IEEE international symposium on medical measurements and applications (MeMeA)
proceedings (pp. 570-575). IEEE.
64. Victor A, Ghalib M (2017) Automatic detection and classification of skin cancer. International Journal of
Intelligent Engineering and Systems 10(3):444–451
65. Xu L, Jackowski M, Goshtasby A, Roseman D, Bines S, Yu C, Dhawan A, Huntley A (1999 Jan 1)
Segmentation of skin cancer images. Image Vis Comput 17(1):65–74
66. Zhang X (2017) Melanoma segmentation based on deep learning. Computer assisted surgery 22(sup1):267–
277
67. Zhou H, Schaefer G, Celebi ME, Iyatomi H, Norton KA, Liu T, Lin F (2010) Skin lesion segmentation
using an improved snake model. In2010 annual international conference of the IEEE engineering in
medicine and biology (pp. 1974-1977). IEEE.