Healthcare Analytics
journal homepage: www.elsevier.com/locate/health
1. Introduction

1.1. Background

Skin is the most vital and sensitive organ in the human body, shielding against heat, injury, and infections. Unfortunately, the skin's condition is sometimes disrupted by bacterial and viral infections, fungus, a weak immune system, and genetic imbalances. In many cases, diseases caused by those factors have devastating effects on human life. In addition, some skin diseases are contagious, putting at risk not only the infected individuals but also those around them. Statistics [1] report that over 100 million people all over the world suffer from different types of skin disorders; the most frequent are Atopic dermatitis, Eczema, Herpes, Nevus, Warts, Ringworm, Chickenpox, and Melanoma. The American Cancer Society reported [2] that, by the end of the year 2020, 100,350 new melanoma cases would be diagnosed and almost 6850 people would die of melanoma.

Most skin diseases have revealing symptoms such as rashes, ulcers, lesions, and moles. However, the diagnosis of skin diseases faces some difficulties. The most common obstacle is that many skin conditions have similarities that are not distinguishable visually. Besides, symptoms change constantly over a long process. Even physicians are subject to visual imperfections due to the lighting conditions of the environment, the skin color of the patient, and their professional experience. In most cases, early detection of skin diseases reduces the risk factors: the mortality rate of some highly lethal skin diseases can be reduced by up to 90% if they are diagnosed at an early stage [3].

1.2. Motivation

Researchers are actively investigating methods to develop skin disease recognition systems. Many studies have utilized image processing techniques incorporating statistical analysis to extract information
about skin conditions [4–8]. In these approaches, researchers tried to recognize skin diseases by analyzing textures, structures, and colors. Methods like the Self-Organizing Map (SOM), Radial Basis Function (RBF), and Gray Level Co-occurrence Matrix (GLCM) were used for such approaches. But all these methods lack precision and accuracy, since they require sufficient data and good coverage of the input space, and they depend heavily on texture features such as contrast, correlation, and entropy.

In recent times, Artificial Intelligence (AI) has evolved enormously in the clinical and medical field [9,10]. In the medical field, Machine Learning (ML) and Deep Learning (DL) algorithms have proved their worth in implementing smart and automated AI-based systems [10–13]. Researchers have pushed to develop more advanced frameworks that can be applied in various image-based applications. The Convolutional Neural Network (CNN) is considered the state-of-the-art method in the analysis of visual imagery. In medical image analysis, such as X-ray and MRI images, the CNN model and its derivatives, such as ResNet, VGG-16, GoogleNet, and AlexNet, have shown significant results in detection, recognition, and classification tasks [14]. However, deep learning architectures like CNN require immense computational resources as well as a lot of image data to train the proposed model [15]. Due to the lack of sufficient data and resources, the field of medical image analysis for skin diseases has yet to be explored to its full extent. Pretrained CNN models have been adopted by researchers to aid this purpose. Besides, image analysis techniques such as augmentation are widely used to construct generalized models and robust systems where training data is inadequate.

CNN architectures like MobileNet and Xception are nowadays helping researchers bring out new intelligent systems. For example, the MobileNet model shows high accuracy for the classification task in [16], where welding defects were analyzed from images. In medical imaging, such as children's colonoscopy [17], a combination of MobileNet with DenseNet is proposed for better classification results. In [18], lung diseases were analyzed and detected from chest X-ray images using the MobileNet model. In language processing tasks [19,20], the MobileNet model was studied for the recognition of handwritten Bangla characters and for complex sign language translation. The Xception model is also widely used for different computer vision-based tasks. For example, chest X-ray images were analyzed using the Xception model in [21,22] to differentiate between the COVID-19 lung condition and normal pneumonia. In [23], an Xception-based framework is used to classify and authenticate forensic images. Researchers also implemented this model for the garbage image classification task in [24] for a productive garbage management system.

1.3. Contribution

In this work, we implement an automated system based on computer vision techniques in which two structured Convolutional Neural Network architectures, MobileNet [25] and Xception [26], contribute to the recognition of different types of dermatological diseases, namely Atopic dermatitis, Eczema, Herpes, Nevus, and Melanoma. In order to construct an accurate model, we combined these two architectures with transfer learning and a real-time image augmentation process. In addition, we evaluated the effectiveness of our propositions by comparing their performance with state-of-the-art deep learning models such as ResNet50, InceptionV3, Inception-ResNet, and DenseNet.

Besides, we proposed and implemented a web-based architecture for the real-time recognition of diseases. We deployed our trained models on the web using the Flask framework [27], so the recognition of skin diseases can be done remotely using this system. Our proposed approach can aid health professionals by recognizing different skin diseases more efficiently and making the diagnosis process more user-friendly for patients. Moreover, during pandemics and natural disasters, a cloud-based healthcare system can be built to operate the healthcare system remotely. We sum up the contributions of this work below:

• Propose an automated framework for skin disease recognition based on pre-trained CNN architectures, namely MobileNet and Xception.
• Include augmentation and transfer learning techniques for a more robust and generalized model.
• Propose and implement a web-based application to recognize skin diseases remotely.
• Evaluate the models' performance by comparing it with other deep learning models such as ResNet50, InceptionV3, Inception-ResNet, and DenseNet.

2. Literature review

Researchers have been trying to develop efficient and effective systems that visually recognize different classes of skin diseases. Some of the approaches include image processing techniques with statistical methods and texture and color analysis. A.D. Mengistu and D.M. Alemayehu [4] proposed image processing techniques for recognizing and predicting skin cancers. Predefined classes of skin cancers collected from the American Cancer Society and DERMOFIT were used in this experiment. A hybrid method that integrates two image processing techniques, namely a Self-Organizing Map (SOM) and a Radial Basis Function (RBF), was used in this recognition task, and image features such as color, texture, and image structure were combined. Further, the acquired results were compared with other approaches such as KNN, Naïve Bayes, and ANN. The reported results revealed that the overall accuracy of this hybrid method was 93.15%. Manish Pawar et al. [5] identified different skin disease conditions based on feed-forward backpropagation neural networks. Texture features, analyzed with the GLCM method, were used as the key attributes for image recognition. Three skin conditions were selected for the classification task, and the overall accuracy was reported at 66.66%. To enhance the scope for identifying multiple skin diseases, Li-sheng et al. [6] proposed a method that combines both color and texture features. The preprocessing task included noise and background removal through filtering and transformations. The GLCM approach was implemented to extract texture features such as contrast, correlation, and entropy, and the watershed algorithm was used for color feature extraction. For this research, three types of common skin diseases, namely herpes, dermatitis, and psoriasis, were classified using a support vector machine (SVM) classifier. The average accuracy while recognizing those three classes of skin disease images reached 90% using the SVM classifier and the combined color and texture features. Md. Nazrul Islam et al. [7] established a system for recognizing multiclass skin diseases that relied on image texture. Different preprocessing operations, such as resizing, grayscale conversion, contrast enhancement, and noise removal, were conducted for this experiment. Image textures were extracted using the GLCM method, and segmentation was carried out using Maximum Entropy Thresholding. Finally, the Backpropagation (BPN) algorithm was used to classify three different classes of skin disease images: Eczema, Impetigo, and Psoriasis. The obtained accuracy for this method was reported at 80%, along with a sensitivity and specificity of 71.4% and 87.5%, respectively.

Rahat Yasir et al. [28] proposed a computer vision-based approach for recognizing skin diseases from images. Different preprocessing algorithms, like sharpening, median and smoothing filters, binary masks, histograms, and YCbCr, were used for feature extraction. An artificial neural network (ANN) was used for training and testing purposes. On a real-time dataset, the proposed model obtained a classification accuracy of 90%. To classify between skin conditions such as normal, spots, and wrinkles, Jhan S. Alarifi et al. [29] used traditional ML approaches based on SVM and CNN. The SVM used feature extraction techniques like LBP and HOG. For the CNN, the GoogleNet architecture was implemented with different optimizers. The experimental results showed that GoogleNet with the NAG optimizer outperformed the SVM in all aspects, reaching an accuracy level of 89%. Yuexiang Li and Linlin Shen [30] proposed
Fig. 2. Proposed system architecture. (A systematic representation of our proposed approach including data acquisition, preprocessing using augmentation, transfer learning,
training, testing, and predictions carried out in building and deployment phases.)
Table 1
All the necessary steps and the setup carried out throughout the experiment.

Algorithm: Experimental setup
Input: 1. Collect images of 5 classes of skin diseases.
Environment configuration: 2. Google Colab. 3. Import all necessary libraries and packages.
Directories configuration: 4. Import the images. 5. Construct directories for training, testing, and validation.
Training and testing: 6. Build CNN models; for transfer learning, use a model trained on the ImageNet dataset. 7. Fine-tune the models by adding an additional global average pooling layer, a fully connected layer, and a Softmax classification layer.
Model compilation: 8. Compile the model with the RMSProp optimizer and a learning rate of 0.001. 9. Set 100 epochs for model fitting. 10. Use a val_accuracy monitor as the model checkpoint. 11. Save the model.
Performance evaluation: 12. Generate the classification report and confusion matrix. 13. Generate the AUC–ROC curve. 14. Generate model accuracy and loss reports.
Prediction: 15. Load the best model. 16. Load random images. 17. Predict the disease classes.
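Steps 6 to 11 of Table 1 can be made concrete with a short TensorFlow/Keras sketch along the following lines; the dense-layer width, file names, and the `train_gen`/`val_gen` generators are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of Table 1, steps 6-11, using TensorFlow/Keras.
# Paths, layer widths, and generator names are placeholders.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5  # Atopic dermatitis, Eczema, Herpes, Nevus, Melanoma

# Step 6: base CNN pretrained on ImageNet (transfer learning).
base = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Step 7: fine-tuning head - global average pooling, a fully
# connected layer, and a Softmax classification layer.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),   # width is an assumption
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Step 8: compile with RMSProp and a learning rate of 0.001.
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
    loss="categorical_crossentropy",
    metrics=["accuracy"])

# Steps 9-11: fit for 100 epochs, checkpointing on val_accuracy.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.h5", monitor="val_accuracy", save_best_only=True)
# model.fit(train_gen, validation_data=val_gen,
#           epochs=100, callbacks=[checkpoint])
```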
We have implemented six different CNN-based architectures, namely ResNet50, InceptionV3, Inception-ResNet, DenseNet, MobileNet, and Xception, but we focused more specifically on the MobileNet and Xception models. The remaining models are used in this study to compare the performance of our propositions.

3.2.1. Convolutional Neural Networks (CNN)

The CNN is the most popular artificial neural network specially designed for computer vision-based applications that involve analyzing visual imagery [44]. The network takes an image as input and processes it to extract different features and patterns. These features are also made distinguishable by the network. Both spatial and temporal characteristics are captured using a CNN, and these characteristics are used to differentiate between different classes of images. The feature detection task is the backbone of the CNN model and is carried out using a feature extractor filter, or kernel.

The learning process of a CNN constitutes convolutional layers, non-linear processing units, and layers for subsampling tasks [45]. A CNN implements a layered architecture, as presented in Fig. 3. Three main layers, namely the convolution, pooling, and fully connected layers, are used to build a CNN model [46]. Convolutional layers have a convolutional kernel that works as a feature extractor. These kernels slice the input image into receptive fields. The relation between the input feature map and the output feature map can be expressed using the convolutional operation, i.e.,

$$F(x, y) = (f * k)(x, y) = \sum_{i} \sum_{j} f(i, j)\, k(x - i,\, y - j),$$

where $F(x, y)$ and $f(x, y)$ correspond to the output and input feature maps, and $k(x, y)$ represents an element of the corresponding kernel. The pooling layer involves an operation that sums up all the relevant and similar information from a neighborhood; it reduces the size of the input feature map by cutting down the number of parameters. The pooling operation can be formulated as $Z = g_p(f)$, where $Z$ is the pooled feature map obtained by operating on the input feature map $f$. Finally, the classification task is carried out by a global operation in a fully connected (FC) layer. All the extracted features are analyzed in this layer, and a non-linearity is created between them.

Fig. 3. A convolutional neural network (CNN) architecture with its dimensions. (A layered representation of a CNN architecture, consisting of convolution layers, pooling layers, and a fully connected layer, for performing operations like convolution and pooling.)

3.2.2. MobileNet

MobileNet is a popular deep CNN, widely used in computer vision-based applications such as image classification, categorization, or segmentation for its lightweight, small architecture and fast operational characteristics [25]. The fabrication of MobileNet is established on the depthwise separable filters represented in Fig. 4. The main focus of this model is to optimize latency with a small network and produce a model suitable for deployment on mobile devices. The MobileNet architecture incorporates two steps, namely depthwise convolutions and pointwise convolutions. First, the feature extraction process is carried out by depthwise convolutions, where a single filter processes each input channel. Then a pointwise 1 × 1 convolution is applied that combines the features obtained from the depthwise convolutions. In depthwise separable convolutions, the extraction of features and the combination of those features are done by separate layers. This results in a reduction of computation time, computation cost, and model size.

Fig. 4. Architecture of MobileNet. (A CNN architecture performing depthwise and pointwise convolution on the input image for the completion of the filtering task and the creation of linear output combinations.)

There exist some architectural differences between the general convolutional layer and the depthwise convolutional layer. The input taken by a standard convolutional layer can be expressed as a feature map $F$ of size $D_F \times D_F \times M$, which produces a feature map $G$ of size $D_G \times D_G \times N$. The value $D_F \times D_F$ represents the dimension (height × width) of the input image, and $D_G \times D_G$ represents the dimension (height × width) of the output image. Here $M$ is the number of input channels, or input depth, and $N$ is the number of output channels, or output depth. For standard convolution layers, $K$ is the kernel of size $D_K \times D_K \times M \times N$, where $D_K \times D_K$ denotes the dimension of the kernel. The output feature map is given by the following equation:

$$G_{(k,l,n)} = \sum_{(i,j,m)} K_{(i,j,m,n)} \cdot F_{(k+i-1,\, l+j-1,\, m)} \tag{1}$$

For the depthwise convolution layer, the depthwise convolution kernel is denoted by $\hat{K}$, and its size is $D_K \times D_K \times M$. So the depthwise convolution for the input depth can be written as

$$\hat{G}_{(k,l,m)} = \sum_{(i,j)} \hat{K}_{(i,j,m)} \cdot F_{(k+i-1,\, l+j-1,\, m)} \tag{2}$$

Here the $m$th filter in $\hat{K}$ is applied to the $m$th channel in $F$ to produce the $m$th channel of $\hat{G}$. The total computational cost of the depthwise convolutions is given by $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F$.
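To make the saving concrete, a standard convolution uses $D_K \cdot D_K \cdot M \cdot N$ parameters, while a depthwise separable one uses $D_K \cdot D_K \cdot M + M \cdot N$. The following minimal sketch, with example sizes that are assumptions rather than MobileNet's actual layer shapes, compares the two counts.

```python
# Parameter counts for one hypothetical layer, following the notation
# above: kernel D_K x D_K, M input channels, N output channels.
D_K, M, N = 3, 64, 128  # example sizes, not taken from MobileNet itself

standard = D_K * D_K * M * N   # full convolution
depthwise = D_K * D_K * M      # one D_K x D_K filter per input channel
pointwise = M * N              # 1x1 convolution combining the channels
separable = depthwise + pointwise

print(f"standard:  {standard} parameters")        # 73728
print(f"separable: {separable} parameters")       # 8768
print(f"reduction: {standard / separable:.1f}x")  # ~8.4x
```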
Fig. 5. Architecture of Xception. (A layered architecture of Xception consisting of 36 convolutional layers and 14 modules. It implements a 1 × 1 pointwise convolution followed by a 3 × 3 depthwise convolution.)

3.2.3. Xception

Xception is another class of deep CNN, adapted from the Inception-V3 model [26]. The model is constructed based on the intuition of the depthwise separable convolutional module, with a modification made in the inception block of the Inception-V3 model. The modified Xception architecture has a wider inception block than Inception-V3: the spatial dimensions of 1 × 1, 5 × 5, and 3 × 3 are replaced in the Xception model with single dimensions of size 3 × 3 and 1 × 1, i.e., the convolution is divided into spatial and pointwise convolutions. Fig. 5 illustrates the architecture of the Xception network. First, a 1 × 1 pointwise convolution is applied, and then a 3 × 3 depthwise convolution is applied [45]. This approach results in a reduction of parameters and layers and makes the network lightweight. The disengagement of this correlation follows Eqs. (3) and (4):

$$f^{k}_{(l+1)}(p, q) = \sum_{(x,y)} f^{k}_{l}(x, y) \cdot e^{k}_{l}(u, v) \tag{3}$$

$$F^{k}_{(l+2)} = g_c\big(F^{k}_{(l+1)}, K_{(l+1)}\big) \tag{4}$$

Here, $F$ corresponds to the feature map of the $l$ transformation layers, and $(x, y)$ and $(u, v)$ are the spatial indices of the feature map $F$ and the kernel $K$, which has depth one. The kernel $K$ is spatially convolved across the feature map $F$, and $g_c(\cdot)$ indicates the convolution operation. In total, a basic Xception model has 36 convolutional layers and 14 modules. Among these, 12 modules are connected with a residual layer, boosting the merging process and paving the way for higher accuracy. Architecturally, the Xception network consists of 3 flows, namely the Entry flow, Middle flow, and Exit flow. Downsampling of input images with dimensionality reduction is carried out by the Entry flow. Learning from features and optimizing those features is done by the Middle flow of the network. Finally, the Exit flow carries out the integration of features.

Fig. 6. The process of transfer learning. (Pretrained weights from earlier tasks conducted on a very large dataset are used for the purpose of transporting knowledge. An additional global average pooling layer, a fully connected layer, and a Softmax layer are added for fine-tuning the network.)

3.3. Transfer learning

The Transfer Learning (TL) approach in the context of deep learning is a pervasive method in computer vision-related tasks. However, creating a robust and generalized deep learning model requires a lot of images and computational resources [47]. To overcome this, deep learning models can utilize the TL approach, in which a model that has been trained for one task is used as a baseline model for another. This method of reusing models that were previously trained with a large amount of data can be applied to a training process that has only a small amount of data, paving the way to higher accuracy [48]. In general, weights are initialized with random numbers in the training process of neural networks, and these weights are then slowly updated during training. So in most cases, training with a small amount of training data cannot achieve sufficient accuracy. To perform the transfer learning process, we should prepare a neural network model trained with a large amount of data of a similar type, which becomes the source model for the transfer.

In the transfer learning process, features learned from huge image sets such as ImageNet are highly transferable to a variety of image recognition tasks [49]. This process is depicted in Fig. 6. There are several ways to transfer knowledge from one model to another. One approach is to take the top layer of the already pretrained model, replace it with a randomly initialized one, and then train the top-layer parameters for the new task while all other parameters remain fixed. This approach best suits a task with maximum similarity between the pretrained model and the new task. If we have more data, then we can train the entire network by unfreezing the transferred parameters. In that case, only the initial values of the parameters are transferred: the weights are initialized from pretrained models instead of being initialized randomly, boosting the convergence process.
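A minimal sketch of the two strategies just described, assuming a Keras Xception base; the layer sizes and learning rate are illustrative, not the authors' exact settings.

```python
# Sketch of the two transfer-learning strategies described above.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))

# Strategy 1: replace the top layer with a randomly initialized one and
# train only its parameters; all transferred parameters stay fixed.
base.trainable = False
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),  # new, randomly initialized top
])

# Strategy 2 (with more data): unfreeze the transferred parameters so the
# whole network trains, using the pretrained weights only as initial values.
base.trainable = True
# Recompile after changing trainable flags so the change takes effect.
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```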
made by the classifiers [53]. A classifier is used to predict some classes
4. Experimental evaluation and result analysis that can be either true or false. There can be four cases as output
while classifying some data belonging to more than one class. Firstly
4.1. Environment specifications all the predictions (true or false) are correct, which is indicated by True
Positive (TP) and True Negative (TN). However, there can be another
Image analysis or classification requires intense computing powers, case in which the prediction is true, but in reality, it is false, and vice-
and GPU (Graphics Processing Unit) can provide such computing com- versa. These two cases are called False Positive (FP) and False Negative
patibility. But GPU installation is expensive and requires additional (FN). Not only that, we can calculate some more specific metrics from
hardware to support the computing task. So we use the Google Colab1 the confusion matrix that can be deciding factors for revealing the
platform to train our model, which provides us with high-end GPU on classification performance of our models. These metrics are Accuracy,
the cloud. It comes with all the necessary packages which are used in Precision, Recall, and F1-score. These metrics are calculated using the
the training process, so there is no burden of installing packages or following formulas.
extra storage [50]. Google Colab comes with NVIDIA K80 GPU, GPU Accuracy: Accuracy is the indicator of how well a model can predict
memory of 12 GB, Up to 2.91 teraflops double-precision rendition, true and false classes precisely and expressed using formula (5).
and disk space of 358 GB. These specs give an enormous computation ∑𝑁
𝑖 𝑀𝑖
environment to train Deep Learning models. 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = ∑𝑁 × 100% (5)
| |
𝑖 |𝑇𝑖 |
∑
4.2. Dataset description where, 𝑁 𝑀𝑖 indicates the total number of correct predictions, and
∑𝑁 | | 𝑖
𝑖 |𝑇𝑖 | is the total number of predictions.
We have used 5 classes of skin diseases, namely Atopic dermatitis, When it comes to binary classification, Accuracy is represented
Eczema, Herpes, Nevus, and Melanoma. Since there is no available using the following formula (6)
dataset that contains images of all these classes, we prepared our 𝑇𝑃 + 𝑇𝑁
dataset by collecting images from two different sources. We have 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (6)
𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁
collected images for Atopic dermatitis, Eczema, Nevus, and Herpes from
where, 𝑇 𝑃 = True Positives, 𝑇 𝑁 = True Negatives, 𝐹 𝑃 = False Posi-
Dermnet [51]. For Melanoma images, we have used the HAM10000
tives, and 𝐹 𝑁 = False Negatives.
dataset [52]. A total number of 18692 images are used in our approach, Precision: Precision indicates how well a classier performs in terms
split for training, validation, and testing purposes. A glimpse of images of predicting correct outcomes that are positive. Mathematically repre-
constituting our dataset is given in Fig. 7 Splitting the dataset into sentation can be established using the formula (7)
training and testing datasets depicts in Table 2.
𝑇𝑃
𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = (7)
𝑇𝑃 + 𝐹𝑃
4.3. Data preprocessing
Recall: Recall indicates the performance of a classier by measur-
ing the proportion of true positive observations that were correctly
The proposed CNN architecture MobileNet and Xception require
predicted. Formally Eq. (8) defines Recall,
very less preprocessing images as they extract features directly from
images. MobileNet model requires an input shape of 224 × 224, and 𝑇𝑃
𝑅𝑒𝑐𝑎𝑙𝑙 = (8)
the Xception model requires images of dimension 229 × 299. So firstly, 𝑇𝑃 + 𝐹𝑁
images are resized according to the measurement for each model. F1 score (F-measure): F1 score is the symphonic average of pre-
Since a robust model requires many images to train and validate cision and recall. Formally it is represented mathematically as Eq. (9)
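The resizing and real-time augmentation described in this subsection could be wired up with Keras generators roughly as follows; the directory layout and augmentation parameters are assumptions for illustration only.

```python
# Sketch of resizing plus real-time augmentation with Keras generators.
# Directory layout and augmentation parameters are illustrative only.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,        # example augmentation settings
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

# MobileNet expects 224x224 input; Xception expects 299x299.
train_gen = train_datagen.flow_from_directory(
    "dataset/train", target_size=(224, 224),
    batch_size=32, class_mode="categorical")
val_gen = test_datagen.flow_from_directory(
    "dataset/validation", target_size=(224, 224),
    batch_size=32, class_mode="categorical")
```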
4.4. Evaluation metrics

The performance of a classifier is described through the confusion matrix, which gives an insight into the correct and incorrect predictions made by the classifier [53]. A classifier is used to predict classes that can be either true or false, so there can be four cases as output when classifying data belonging to more than one class. First, the predictions (true or false) may be correct, which is indicated by True Positives (TP) and True Negatives (TN). However, the prediction may also be true when in reality it is false, and vice versa; these two cases are called False Positives (FP) and False Negatives (FN). Moreover, we can calculate some more specific metrics from the confusion matrix that can be deciding factors in revealing the classification performance of our models. These metrics are Accuracy, Precision, Recall, and F1-score, and they are calculated using the following formulas.

Accuracy: Accuracy indicates how well a model can predict the true and false classes precisely and is expressed using formula (5):

$$Accuracy = \frac{\sum_{i}^{N} M_i}{\sum_{i}^{N} |T_i|} \times 100\% \tag{5}$$

where $\sum_{i}^{N} M_i$ indicates the total number of correct predictions, and $\sum_{i}^{N} |T_i|$ is the total number of predictions.

When it comes to binary classification, Accuracy is represented using the following formula (6):

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}$$

where TP = True Positives, TN = True Negatives, FP = False Positives, and FN = False Negatives.

Precision: Precision indicates how well a classifier performs in terms of predicting correct outcomes that are positive. Its mathematical representation is given by formula (7):

$$Precision = \frac{TP}{TP + FP} \tag{7}$$

Recall: Recall indicates the performance of a classifier by measuring the proportion of true positive observations that were correctly predicted. Formally, Eq. (8) defines Recall:

$$Recall = \frac{TP}{TP + FN} \tag{8}$$

F1-score (F-measure): The F1-score is the harmonic mean of precision and recall. Formally, it is represented mathematically as Eq. (9):

$$F1\ score = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{9}$$
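Eqs. (5)–(9) can be computed directly from predictions; a minimal scikit-learn sketch, using toy label vectors rather than the study's outputs, is given below.

```python
# Computing Eqs. (5)-(9) from predictions with scikit-learn.
# y_true / y_pred here are toy vectors, not results from the study.
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

y_true = [0, 0, 1, 1, 2, 2, 3, 4, 4, 4]
y_pred = [0, 1, 1, 1, 2, 2, 3, 4, 4, 0]

print(confusion_matrix(y_true, y_pred))   # correct/incorrect per class
print(accuracy_score(y_true, y_pred))     # Eq. (5): 0.8
# Per-class precision (Eq. 7), recall (Eq. 8), and F1 (Eq. 9):
print(classification_report(y_true, y_pred, digits=4))
```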
Table 3
Class-wise classification results of MobileNet and Xception. (Values of the evaluation metrics Precision, Recall, and F1-score for the MobileNet and Xception models with the transfer learning approach are presented for each disease class.)

Model      Class              Recall (%)  Precision (%)  F1 (%)
MobileNet  Atopic dermatitis  97.00       90.70          93.71
           Eczema             89.00       95.70          92.22
           Herpes             95.00       96.94          95.96
           Melanoma           100.00      97.08          98.51
           Nevus              99.00       100.00         99.50
Xception   Atopic dermatitis  96.00       97.00          96.50
           Eczema             90.00       95.74          92.80
           Herpes             99.00       92.52          95.65
           Melanoma           100.00      100.00         100.00
           Nevus              100.00      100.00         100.00

Table 4
Class-wise classification results of MobileNet and Xception. (Values of the evaluation metrics Precision, Recall, and F1-score for the MobileNet and Xception models without transfer learning and without augmentation are presented for each disease class.)

Method     Class              Recall (%)  Precision (%)  F1 (%)
MobileNet  Atopic dermatitis  88.3        91.0           89.6
           Eczema             85.7        66.0           74.5
           Herpes             84.6        99.0           91.2
           Melanoma           89.4        93.0           91.1
           Nevus              100.0       99.0           99.4
Xception   Atopic dermatitis  84.2        91.0           87.4
           Eczema             91.0        71.0           79.7
           Herpes             80.4        99.0           88.7
           Melanoma           97.8        93.0           95.3
           Nevus              100.0       96.0           97.9
Sometimes accuracy and F1-score are not enough for evaluating predictive models, so another metric, called the Receiver Operating Characteristic (ROC) curve, is also used for evaluation. With the AUC, an accumulated measure of performance can be defined at every possible classification threshold. From the ROC curve, the area under the ROC curve (AUC) is derived, which is a compatibility indicator of a predictive model. The ROC is obtained when the True Positive Rate (TPR) is plotted against the False Positive Rate (FPR). The true positive rate is nothing but Recall, and the FPR is defined by Eq. (10):

$$FPR = \frac{FP}{FP + TN} \tag{10}$$
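A micro-average ROC curve of the kind reported later (Fig. 10) can be derived by pooling all class decisions; a minimal sketch, with stand-in one-hot labels and scores, follows.

```python
# Micro-average ROC/AUC from one-hot labels and predicted probabilities,
# using Eq. (10); y_onehot and y_score are stand-ins for model outputs.
import numpy as np
from sklearn.metrics import roc_curve, auc

y_onehot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_score = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1],
                    [0.1, 0.2, 0.7], [0.4, 0.5, 0.1]])

# Micro-averaging pools every (class, sample) decision into one curve.
fpr, tpr, _ = roc_curve(y_onehot.ravel(), y_score.ravel())
print("micro-average AUC:", auc(fpr, tpr))
```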
4.5. Results

In this segment, we demonstrate the results of our proposed architectures (MobileNet and Xception) to scrutinize the robustness of the models. Additionally, the experiment was conducted on the other deep learning models, ResNet50, InceptionV3, Inception-ResNet, and DenseNet, to compare and evaluate the performance of our propositions. Finally, we present the performance comparison of the proposed architectures with some graphical presentations and tables.

4.5.1. Classification performance of proposed MobileNet and Xception models

The classification results of our proposed models MobileNet and Xception according to our classes (skin diseases) are illustrated in Tables 3 and 4. We have shown the results based on our propositions with transfer learning (TL) and augmentation for each model, and without TL and augmentation. To give an overall insight into our classification results in terms of the number of right classifications and misclassifications, we present confusion matrices for the MobileNet and Xception models in Fig. 9. Fig. 9(a) illustrates the confusion matrix produced by the MobileNet architecture. From this representation, it can be observed that the Herpes and Eczema classes achieved 100% right prediction scores for this approach. The classification performance of the Xception architecture is illustrated in Fig. 9(b).

A more comprehensive representation of the classification results of our proposed MobileNet and Xception models is depicted in Table 5. In addition, a comparison is established with the other models, such as ResNet50, InceptionV3, Inception-ResNet, and DenseNet.

Table 5
Overall classification report. (Comparison results between the ResNet50, InceptionV3, Inception-ResNet, DenseNet, MobileNet, and Xception models based on average values of Precision, Recall, and F1-score.)

Model             Recall (%)  Precision (%)  F1 (%)
ResNet50          87.00       87.00          87.00
Inception-V3      93.00       93.00          93.00
Inception-ResNet  95.00       95.00          95.00
DenseNet          93.00       93.00          93.00
MobileNet         96.00       96.00          96.00
Xception          97.00       97.00          97.00

Another compatibility indicator for our proposed models is the ROC, which is presented in Fig. 10. The highest reported micro-average AUC score is 0.9974, for the MobileNet model. The lowest micro-average AUC score is reported for the ResNet50 model. The ROC of the Xception model is the second highest, at 0.9972. The other models also showed good AUC scores.

4.5.2. Prediction accuracy and loss

In this segment, the accuracy and loss for our approaches are depicted for all six models. In Table 6, the validation and testing accuracy and loss are presented. The highest testing accuracy is 97.00%, and the lowest loss is 0.16, reported for the Xception model with TL and augmentation. For MobileNet, the highest accuracy is 96.00%. ResNet50 showed the lowest test accuracy (86.60%) and the highest loss score (2.40) compared to the other models.

In Fig. 11(a) and (b), line charts illustrate the accuracy and loss of the MobileNet and Xception models over 100 epochs. It is seen from Fig. 11(a) that the accuracy is quite high and consistent for the approach using TL and augmentation for the MobileNet model, though there are some reductions and fluctuations per epoch for both models. For loss, Fig. 11(a) demonstrates that the lowest loss rate is obtained by implementing both the TL and augmentation approaches. For the Xception model, Fig. 11(b) demonstrates the accuracy and loss for each epoch. As with MobileNet, high and consistent accuracy scores per epoch are observed by implementing TL and augmentation. From Fig. 11(b), insight into the loss per epoch can be achieved: low loss scores were reported per epoch by implementing TL and augmentation.
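Per-epoch curves like those in Fig. 11 are typically drawn from the Keras training history; a minimal sketch, assuming `history` was returned by `model.fit` with validation data, is shown below.

```python
# Plotting per-epoch accuracy and loss, assuming `history` came from
# model.fit(..., validation_data=...); labels are illustrative.
import matplotlib.pyplot as plt

def plot_history(history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(history.history["accuracy"], label="train")
    ax1.plot(history.history["val_accuracy"], label="validation")
    ax1.set_xlabel("epoch"); ax1.set_ylabel("accuracy"); ax1.legend()
    ax2.plot(history.history["loss"], label="train")
    ax2.plot(history.history["val_loss"], label="validation")
    ax2.set_xlabel("epoch"); ax2.set_ylabel("loss"); ax2.legend()
    plt.show()
```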
Fig. 9. Confusion matrices presenting the total number of right and wrong predictions that occur in the testing process for the MobileNet and Xception models.
Table 7
Total running time for each model.

Model             Runtime (s)
ResNet50          14514
Inception-V3      5436
Inception-ResNet  22971
DenseNet          16379
MobileNet         7869
Xception          10877
Fig. 10. ROC curves for the deep learning models. (This representation depicts the micro-average areas under the ROC curve (AUROC) for each of the models.)

Table 6
Accuracy and loss for the best models.

Model                          Validation acc. (%)  Test acc. (%)  Train loss  Validation loss
ResNet50                       96.70                86.60          0.19        2.40
Inception-V3                   98.0                 93.0           0.05        0.45
Inception-ResNet               98.80                94.80          0.99        0.42
DenseNet                       94.20                92.80          0.98        0.34
MobileNet (without TL)         94.45                89.38          0.8         1.77
Xception (without TL)          95.55                89.79          0.76        1.13
MobileNet (proposed, with TL)  96.00                96.0           0.15        0.21
Xception (proposed, with TL)   97.94                97.0           0.07        0.16

Finally, the running time of our training process is given in Table 7 for each of our models. The MobileNet model with TL and augmentation takes the shortest time (7869 s) to complete the 100 epochs. The longest execution time, 22971 s, is reported for the Inception-ResNet model with the TL approach. The Xception model also took comparatively little time to complete the training process, at 10877 s.

We have also presented, in Table 8, a comparison with recent deep learning approaches proposed in different computer vision-based works. From this comparison, it can be clearly seen that our models with augmentation and transfer learning techniques have better prediction accuracy.

The effectiveness of our approach in recognizing skin diseases is depicted in Figs. 12 and 13. We have used both the MobileNet and Xception models with TL and augmentation to predict diseases as a part of the deployment phase. From this presentation, it can be seen that both models recognize the disease classes correctly.

4.6. Discussion

In this research work, we proposed implementing two deep learning-based architectures, MobileNet and Xception, for recognizing different classes of skin diseases in computer vision-based applications. Besides, other deep learning models, such as ResNet50, InceptionV3, Inception-ResNet, and DenseNet, were also implemented to compare our approaches' effectiveness. Finally, we scrutinized the performance of our different propositions for the skin recognition task based on classification reports, confusion matrices, ROC curves, and classification accuracy.

From Table 3, a decision can be reached about the class-wise classification of both models. The highest Precision score is 100%, which is achieved for the Nevus class using both the MobileNet and Xception models. Additionally, the Xception model also achieved a Precision score of 100% for the Melanoma class. This tells us that our approaches result in a very good measure of the positive predictions that were actually correct. For Recall, a maximum score of 100% is observed for the Nevus and Melanoma classes in Xception and the Melanoma class in MobileNet. Since we have used an imbalanced dataset, the F1-score can be a deciding factor: the maximum score is achieved for the Melanoma and Nevus classes using the Xception model. From Table 4, it can be seen that Precision, Recall, and F1-score are all much lower for the cases with no TL and augmentation. We observed the highest F1-score of 97% for the Xception (TL+A) model and 96.38% for the MobileNet (TL+A) model. This is an indication that our proposed approach with TL and augmentation also has good classification capability for an imbalanced dataset.

The more comprehensive representation of Precision, Recall, and F1-score is given in Table 5, where an overall score for each of the metrics is given for each of the models. The highest precision is 97.05%, which is reported for the Xception model. This means that the Xception model predicts the correct class of skin disease most of the time. The highest recall value is 97.00% for Xception and 96.00% for the MobileNet model, i.e., both of these models correctly identify most skin diseases. However, other models such as ResNet50 performed poorly, achieving a low score.
Table 8
Comparison between existing approaches and our proposed approaches.

Method/Work done             Dataset              Used architecture                          Classification accuracy        Best model
Yasir et al. [28]            775 clinical images  CNN with adaptive learning                 90%                            CNN
Alarifi et al. [29]          Clinical images      SVM + CNN                                  89%                            CNN with SVM
Li and Shen [30]             ISIC 2017            FCRN with LICU                             91%                            FCRN
Rathod et al. [31]           DermNet              CNN                                        70%                            CNN
Milton [33]                  ISIC 2018            CNN (PNASNet-5-Large, InceptionResNetV2,   76%, 70%, 74%, 67%             PNASNet-5-Large
                                                  SENet154, InceptionV4)
Liao [34]                    DermNet and OLE      CNN (VGG16)                                91% (DermNet), 69.5% (OLE)     VGG16
Shanthi et al. [35]          DermNet              CNN (AlexNet)                              93.3%                          AlexNet
Kalaiyarivu and Nalini [40]  Clinical images      CNN                                        87.5%                          CNN
Kousis et al. [41]           HAM10000             CNN                                        92.25%                         DenseNet169
Ahmad et al. [42]            Customized           CNN + stacked BLSTM                        91.73%                         –
Gupta et al. [39]            ISIC                 VGG16, VGG19, and Inception V3             82.4%, 83.0%, 83.2%            Inception V3
Proposed                     DermNet + ISIC 2018  ResNet50, InceptionV3, Inception-ResNet,   86.60%, 93%, 94.80%, 92.80%,   Xception
                                                  DenseNet, MobileNet, and Xception          96%, and 97%
Fig. 11. Accuracy and loss for the MobileNet and Xception models with transfer learning and augmentation techniques.
We observed the highest F1-score of 97.00% for the Xception model and 96.38% for the MobileNet model. This indicates that our proposed approach with TL and augmentation has better classification capability for imbalanced datasets than the other models presented in this study.

For illustrating the overall classification and misclassification, the confusion matrices are depicted as heatmaps in Fig. 9. Using the transfer learning and augmentation approach, both of our models performed very satisfactorily, outperforming the other models: the MobileNet and Xception models reported only 20 and 15 misclassification cases, respectively.

The accuracy and loss reported by our models are presented in Table 6. The highest classification accuracy is 97.00%, which is observed for the Xception model. The MobileNet model also gives a tremendous performance, with a classification accuracy of 96.00%. ResNet50 seems to be a bad choice in terms of testing accuracy, achieving 86.60%. Both models with TL and augmentation also reported very low loss scores, but the approaches with no TL and augmentation reported higher loss scores and lower accuracy than the other approaches.

With the ROC curve presented in Fig. 10, a relation is established between the false positive rate and the true positive rate. The highest micro-average AUC score, 0.9974, is reported for the MobileNet model.
4.7. Deployment of web application

Finally, we use the Flask [54] web framework to deploy our trained model. We created a web application that detects skin conditions by analyzing the skin photograph supplied by the client. For the Flask deployment, we need two routes. First, we created an index page route, which helps users upload their images. Then, a prediction route creates an inference from our saved model.

The web application is created using the Xception model that has been trained on our skin dataset. Fig. 14 shows the web interface developed for the clients. The user uploads an image of the suspected diseased area using any smart device and submits it to the developed expert system, the web application, through the interface. Feedback is then generated from our trained model by classifying the image into the different skin conditions.
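A minimal sketch of the two routes described above; the template name, model path, preprocessing, and class order are assumptions rather than the authors' exact implementation.

```python
# Sketch of the two Flask routes described above: an index page for
# uploading an image and a prediction route that queries the saved model.
# File names, paths, and class order are illustrative assumptions.
import numpy as np
from flask import Flask, render_template, request
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

app = Flask(__name__)
model = load_model("best_model.h5")  # trained Xception/MobileNet model
CLASSES = ["Atopic dermatitis", "Eczema", "Herpes", "Melanoma", "Nevus"]

@app.route("/")
def index():
    return render_template("index.html")  # upload form

@app.route("/predict", methods=["POST"])
def predict():
    f = request.files["file"]
    f.save("upload.jpg")
    img = image.load_img("upload.jpg", target_size=(299, 299))
    x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
    probs = model.predict(x)[0]
    return {"prediction": CLASSES[int(np.argmax(probs))]}

if __name__ == "__main__":
    app.run()
```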
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] V. Balaji, S. Suganthi, R. Rajadevi, V.K. Kumar, B.S. Balaji, S. Pandiyan, Skin disease detection and segmentation using dynamic graph cut algorithm and classification through Naive Bayes classifier, Measurement (2020) 107922.
[2] American Cancer Society, Cancer facts & figures for Hispanics/Latinos 2018–2020, 2020.
[3] R. Kasmi, K. Mokrani, Classification of malignant melanoma and benign skin lesions: implementation of automatic ABCD rule, IET Image Process. 10 (6) (2016) 448–455, http://dx.doi.org/10.1049/iet-ipr.2015.0385.
[4] A.D. Mengistu, D.M. Alemayehu, Computer vision for skin cancer diagnosis and recognition using RBF and SOM, Int. J. Image Process. (IJIP) 9 (6) (2015) 311–319.
[5] M. Pawar, D.K. Sharma, R. Giri, Multiclass skin disease classification using neural network, Int. J. Comput. Sci. Inform. Technol. Res. 2 (4) (2014) 189–193.
[6] L.-s. Wei, Q. Gan, T. Ji, Skin disease recognition method based on image color and texture features, Comput. Math. Methods Med. 2018 (2018).
[7] M.N. Islam, J. Gallardo-Alvarado, M. Abu, N.A. Salman, S.P. Rengan, S. Said, Skin disease recognition using texture analysis, in: 2017 IEEE 8th Control and System Graduate Research Colloquium, ICSGRC, 2017, pp. 144–148, http://dx.doi.org/10.1109/ICSGRC.2017.8070584.
[8] A. Nawar, N.K. Sabuz, S.M.T. Siddiquee, M. Rabbani, A.A. Biswas, A. Majumder, Skin disease recognition: A machine vision based approach, in: 2021 7th International Conference on Advanced Computing and Communication Systems, vol. 1, ICACCS, 2021, pp. 1029–1034, http://dx.doi.org/10.1109/ICACCS51430.2021.9441980.
[9] F. Curia, Features and explainable methods for cytokines analysis of Dry Eye Disease in HIV infected patients, Healthc. Anal. 1 (2021) 100001.
[10] V. Chang, V.R. Bhavani, A.Q. Xu, M. Hossain, An artificial intelligence model for heart disease detection using machine learning algorithms, Healthc. Anal. 2 (2022) 100016.
[11] S. Dev, H. Wang, C.S. Nwosu, N. Jain, B. Veeravalli, D. John, A predictive analytics approach for stroke prediction using machine learning and neural networks, Healthc. Anal. 2 (2022) 100032, http://dx.doi.org/10.1016/j.health.2022.100032.
[12] R. AlSaad, Q. Malluhi, I. Janahi, S. Boughorbel, Predicting emergency department utilization among children with asthma using deep learning models, Healthc. Anal. 2 (2022) 100050, http://dx.doi.org/10.1016/j.health.2022.100050.
[13] M. Ahammed, M.A. Mamun, M.S. Uddin, A machine learning approach for skin disease detection and classification using image segmentation, Healthc. Anal. 2 (2022) 100122, http://dx.doi.org/10.1016/j.health.2022.100122.
[14] S. Serte, A. Serener, F. Al-Turjman, Deep learning in medical imaging: A brief review, Trans. Emerg. Telecommun. Technol. (2020) e4080.
[15] N.C. Thompson, K. Greenewald, K. Lee, G.F. Manso, The computational limits of deep learning, 2020, arXiv preprint arXiv:2007.05558.
[16] H. Pan, Z. Pang, Y. Wang, Y. Wang, L. Chen, A new image recognition and classification method combining transfer learning algorithm and MobileNet model for welding defects, IEEE Access 8 (2020) 119951–119960.
[17] W. Wang, Y. Li, T. Zou, X. Wang, J. You, Y. Luo, A novel image classification approach via dense-MobileNet models, Mob. Inf. Syst. 2020 (2020).
[18] K. Sriporn, C.-F. Tsai, C.-E. Tsai, P. Wang, Analyzing lung disease using highly effective deep learning techniques, Healthcare 8 (2) (2020) 107.
[19] T. Ghosh, M.M.-H.-Z. Abedin, S.M. Chowdhury, Z. Tasnim, T. Karim, S.S. Reza, S. Saika, M.A. Yousuf, Bangla handwritten character recognition using MobileNet V1 architecture, Bullet. Electr. Eng. Inform. 9 (6) (2020) 2547–2554.
[20] T.M. Angona, A. Siamuzzaman Shaon, K.T.R. Niloy, T. Karim, Z. Tasnim, S. Reza, T.N. Mahbub, Automated Bangla sign language translation system for alphabets by means of MobileNet, Telkomnika 18 (3) (2020).
[21] M. Rahimzadeh, A. Attar, A modified deep convolutional neural network for detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of Xception and ResNet50V2, Inform. Med. Unlocked (2020) 100360.
[22] E. Ayan, H.M. Ünver, Diagnosis of pneumonia from chest X-ray images using deep learning, in: 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science, EBBT, IEEE, 2019, pp. 1–5.
[23] L. Yang, P. Yang, R. Ni, Y. Zhao, Xception-based general forensic method on small-size images, in: Advances in Intelligent Information Hiding and Multimedia Signal Processing, Springer, 2020, pp. 361–369.
[24] C. Shi, R. Xia, L. Wang, A novel multi-branch channel expansion network for garbage image classification, IEEE Access 8 (2020) 154436–154452.
[25] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, MobileNets: Efficient convolutional neural networks for mobile vision applications, 2017, arXiv:1704.04861.
[26] F. Chollet, Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251–1258.
[27] D.S. Reddy, P. Rajalakshmi, A novel web application framework for ubiquitous classification of fatty liver using ultrasound images, in: 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), IEEE, 2019, pp. 502–506.
[28] R. Yasir, M.A. Rahman, N. Ahmed, Dermatological disease detection using image processing and artificial neural network, in: 8th International Conference on Electrical and Computer Engineering, IEEE, 2014, pp. 687–690.
[29] J.S. Alarifi, M. Goyal, A.K. Davison, D. Dancey, R. Khan, M.H. Yap, Facial skin classification using convolutional neural networks, in: International Conference on Image Analysis and Recognition, vol. 10317, Springer, Cham, 2017, pp. 479–485, http://dx.doi.org/10.1007/978-3-319-59876-5_53.
[30] Y. Li, L. Shen, Skin lesion analysis towards melanoma detection using deep learning network, Sensors 18 (2) (2018) 556.
[31] J. Rathod, V. Wazhmode, A. Sodha, P. Bhavathankar, Diagnosis of skin diseases using convolutional neural networks, in: 2018 Second International Conference on Electronics, Communication and Aerospace Technology, ICECA, IEEE, 2018, pp. 1048–1051.
[32] M. Chen, P. Zhou, D. Wu, L. Hu, M.M. Hassan, A. Alamri, AI-skin: Skin disease recognition based on self-learning and wide data collection through a closed-loop framework, Inf. Fusion 54 (2020) 1–9.
[33] M.A.A. Milton, Automated skin lesion classification using ensemble of deep neural networks in ISIC 2018: Skin lesion analysis towards melanoma detection challenge, 2019, arXiv preprint arXiv:1901.10802.
[34] H. Liao, A deep learning approach to universal skin disease classification, 2015.
[35] T. Shanthi, R. Sabeenian, R. Anand, Automatic diagnosis of skin diseases using convolution neural network, Microprocess. Microsyst. (2020) 103074.
[36] P.N. Srinivasu, J.G. SivaSai, M.F. Ijaz, A.K. Bhoi, W. Kim, J.J. Kang, Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM, Sensors 21 (8) (2021) 2852.
[37] I. Iqbal, M. Younus, K. Walayat, M.U. Kakar, J. Ma, Automated multi-class classification of skin lesions through deep convolutional neural network with dermoscopic images, Comput. Med. Imaging Graph. 88 (2021) 101843, http://dx.doi.org/10.1016/j.compmedimag.2020.101843.
[38] H.C. Reis, V. Turk, K. Khoshelham, S. Kaya, InSiNet: a deep convolutional approach to skin cancer detection and segmentation, Med. Biol. Eng. Comput. 60 (3) (2022) 643–662.
[39] S. Gupta, A. Panwar, K. Mishra, Skin disease classification using dermoscopy images through deep feature learning models and machine learning classifiers, in: IEEE EUROCON 2021 - 19th International Conference on Smart Technologies, 2021, pp. 170–174, http://dx.doi.org/10.1109/EUROCON52738.2021.9535552.
[40] M. Kalaiyarivu, N. Nalini, Hand image based skin disease identification using machine learning and deep learning algorithms, ECS Trans. 107 (1) (2022) 17381.
[41] I. Kousis, I. Perikos, I. Hatzilygeroudis, M. Virvou, Deep learning methods for accurate skin cancer recognition and mobile application, Electronics 11 (9) (2022) 1294.
[42] B. Ahmad, M. Usama, T. Ahmad, S. Khatoon, C.M. Alam, An ensemble model of convolution and recurrent neural network for skin disease classification, Int. J. Imaging Syst. Technol. 32 (1) (2022) 218–229.
[43] S.F. Aijaz, S.J. Khan, F. Azim, C.S. Shakeel, U. Hassan, Deep learning application for effective classification of different types of psoriasis, J. Healthc. Eng. 2022 (2022).
[44] C.D.S. Duong, Automated fruit recognition using EfficientNet and MixNet, Comput. Electron. Agric. 171 (2020) 105326.
[45] A. Khan, A. Sohail, U. Zahoora, A.S. Qureshi, A survey of the recent architectures of deep convolutional neural networks, Artif. Intell. Rev. 53 (8) (2020) 5455–5516.
[46] Q. Li, W. Cai, X. Wang, Y. Zhou, D.D. Feng, M. Chen, Medical image classification with convolutional neural network, in: 2014 13th International Conference on Control Automation Robotics & Vision, ICARCV, IEEE, 2014, pp. 844–848.
[47] O. Ukwandu, H. Hindy, E. Ukwandu, An evaluation of lightweight deep learning techniques in medical imaging for high precision COVID-19 diagnostics, Healthc. Anal. 2 (2022) 100096, http://dx.doi.org/10.1016/j.health.2022.100096.
[48] K. Guzel, G. Bilgin, Classification of breast cancer images using ensembles of transfer learning, Sakarya Üniv. Fen Bilimleri Enstitüsü Dergisi 24 (5) (2020) 791–802.
[49] T.H. Sanford, L. Zhang, S.A. Harmon, J. Sackett, D. Yang, H. Roth, Z. Xu, D. Kesani, S. Mehralivand, R.H. Baroni, et al., Data augmentation and transfer learning to improve generalizability of an automated prostate segmentation model, Am. J. Roentgenol. (2020) 1–8.
[50] E. Bisong, Google Colaboratory, in: Building Machine Learning and Deep Learning Models on Google Cloud Platform, Springer, 2019, pp. 59–64.
[51] Dermnet, 2020, URL http://www.dermnet.com/. (Accessed 04 November 2020).
[52] P. Tschandl, C. Rosendahl, H. Kittler, The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Sci. Data 5 (2018) 180161.
[53] N.D. Marom, L. Rokach, A. Shmilovici, Using the confusion matrix for improving ensemble classifiers, in: 2010 IEEE 26th Convention of Electrical and Electronics Engineers in Israel, IEEE, 2010, pp. 000555–000559.
[54] P. Singh, A. Verma, J.S.R. Alex, Disease and pest infection detection in coconut tree through deep learning techniques, Comput. Electron. Agric. 182 (2021) 105986.