AI-Based Automatic Detection and Classification of Diabetic Retinopathy Using U-Net and Deep Learning

Bilal, Anas; Zhu, Liucun; Deng, Anan; Lu, Huihui; Wu, Ning

doi:10.3390/sym14071427


by Anas Bilal ¹, Liucun Zhu ²,*, Anan Deng ¹, Huihui Lu ¹ and Ning Wu ¹,*

¹ College of Electronics and Information Engineering, Beibu Gulf University, Qinzhou 535011, China
² Advanced Science and Technology Research Institute, Beibu Gulf University, Qinzhou 535011, China
* Authors to whom correspondence should be addressed.

Symmetry 2022, 14(7), 1427; https://doi.org/10.3390/sym14071427

Submission received: 14 June 2022 / Revised: 28 June 2022 / Accepted: 6 July 2022 / Published: 12 July 2022

(This article belongs to the Special Issue Symmetry in Artificial Intelligence and Edge Computing)


Abstract: Artificial intelligence is widely applied to automate diabetic retinopathy (DR) diagnosis. Diabetes-related retinal vascular disease is one of the world’s leading causes of blindness and vision impairment. Automated DR detection systems would therefore greatly benefit the early screening and treatment of DR and help prevent the vision loss it causes. Researchers have proposed several systems to detect abnormalities in retinal images in the past few years. However, automatic DR detection methods have traditionally relied on hand-crafted feature extraction from the retinal images, followed by a classifier to obtain the final classification. Deep neural networks (DNNs) have advanced considerably in recent years and help overcome this limitation. In this research, we propose a novel two-stage approach for automated DR classification. Because of the low fraction of positive instances in the asymmetric optic disc (OD) and blood vessel (BV) detection system, preprocessing and data augmentation techniques are used to enhance image quality and quantity. The first stage uses two independent U-Net models for OD and BV segmentation. In the second stage, following OD and BV extraction, a symmetric hybrid CNN-SVD model extracts and selects the most discriminant features, and a transfer learning-based Inception-V3 model detects DR by recognizing retinal biomarkers such as microaneurysms (MA), hemorrhages (HM), and exudates (EX). On EyePACS-1, Messidor-2, and DIARETDB0, the proposed methodology demonstrated state-of-the-art performance, with average accuracies of 97.92%, 94.59%, and 93.52%, respectively. Extensive testing and comparisons with baseline approaches indicate the efficacy of the proposed methodology.

Keywords: diabetic retinopathy (DR); DR detection and classification; automatic diagnosis; feature extraction; multi-class segmentation and classification; fundus images (FIs); transfer learning

1. Introduction

Diabetic retinopathy (DR) is an ocular manifestation of diabetes that can lead to blindness. According to the WHO report on diabetes, about 415 million people suffer from diabetes mellitus, and the prevalence of the disease has doubled in the last three decades. People with diabetes between the ages of 20 and 74 may develop blindness due to diabetic retinopathy. Many studies have highlighted that an early diagnosis could save 90% of diabetic patients from this outcome. People with diabetes are at higher risk of developing DR [1]. Like other body tissue, retinal tissue receives its blood supply through the vasculature: it is fed by microscopic blood vessels and depends on a stable blood glucose level for continuous blood flow. When glucose accumulates in the blood, these microscopic vessels begin to break down because oxygen is inadequately distributed to the cells. Any obstruction in these vessels results in serious injury to the retina.

Consequently, the microvasculature fails to supply the retina with nutrients as normal, leading to ischemia, or reduced blood flow [2]. Figure 1 illustrates the difference between a normal and a diabetic retina. Figure 1a depicts a normal retina free of DR signs, whereas Figure 1b illustrates a retina with numerous DR symptoms, including hemorrhages, microaneurysms, cotton wool spots, and exudates.

In the last several years, advances in computer-aided diagnosis (CAD) have led to automated approaches for detecting and grading DR using fundus images [3]. Segmentation of blood vessels (BVs), optic discs (ODs), and lesions from fundus images is among the significant problems in developing a CAD system for DR detection [4]. Traditional ML approaches are only practical when the handcrafted features are carefully selected [5], and generalizing this feature extraction and selection process is time-consuming and complicated. Deep learning (DL) is a powerful tool for automatically extracting features from images, and it has lately shown impressive accuracy in classifying medical images [6]. The effectiveness of DR detection and grading systems varies with the dataset size, the deep neural network (DNN) architecture, and parameter tuning. In addition, the overall performance is determined by the efficacy of several steps, such as OD extraction, BV segmentation, and DR classification.

A two-stage approach was developed in [7] to discover bright lesions in retinal fundus images. For DR diagnosis, an intelligent decision support model was used that exploits texture and color characteristics to distinguish between exudate and non-exudate pixels. First, edge detection and modulation operations are used for OD segmentation. Second, periodic energy and color measurements are performed to gather tissue characteristics from the retinal region. Finally, a fuzzy SVM classifier is used for the classification. In [8], an automatic decision support system (DSS) is designed to detect microaneurysms and hemorrhages. The severity of DR depends on the position and number of microaneurysms and hemorrhages. The technique was tested on 98 fundus images. The reported sensitivities were 87.53% and 84.31%, and the specificities 95.08% and 93.63%, for detecting hemorrhages and microaneurysms, respectively. Numerous research projects have used OD segmentation for DR detection. In [9], an approach for detecting DR based on OD and blood vessel segmentation is discussed. The watershed transform and an RBFNN were used to segment the OD and classify DR, respectively, with a sensitivity of 87% and a specificity of 93%.

In [9], Wang introduced a deep learning framework for detecting the optic disc in DR. The author constructed a CNN structure based on the U-Net model for the explicit recognition of optic discs. In this task, CNNs independently processed the gray and color retinal fundus images to obtain different segmentation results. Using the U-Net framework, the author presented a nested procedure to identify local images used for further segmentation. In [10], the authors perform OD and optic cup (OC) identification by computing the segmented areas of the OD and OC. Watershed transform and morphological filtering techniques are used to find and identify the OD. The technique was tested on 30 separate color retinal fundus images and achieved a sensitivity and a mean predictive performance of 92.8% and 92.4% [11]. In [12], the authors used a multi-model approach based on ML, DL, and conventional methods to segment MRI tumor images. Moreover, in [13], the authors modified the basic U-Net model using residual and identity mapping for the segmentation of microaneurysms. The Harris corner detector was used to find the OD center, achieving 97.8% on the local dataset and 97.5% and 86.75% on the DRIVE and STARE datasets [14].

The optic disc in certain fundus images is brighter than the background; hence, inaccurate optic disc segmentation can lead to incorrect identification of bright lesions, such as MA and HM. In contrast, the blood vessels are darker than the background, which can cause confusion with EX [15]. To overcome such limitations, a novel architecture for DR detection and classification is proposed in this study. In the first phase, data were preprocessed and augmented using Gaussian scale-space theory and other general augmentation settings. During preprocessing, we used two different U-Net models to segment the optic disc and blood vessels. The salient features were then extracted using the hybrid CNN-SVD approach, which combines a DCNN model with singular value decomposition (SVD). Finally, transfer learning (TL)-based Inception-V3, GoogLeNet, AlexNet, and ResNet models are used to classify fundus images. In total, 11,841 retinal fundus images from three publicly available datasets (Messidor-2, EyePACS-1, and DIARETDB0) were used to train the proposed method. The model’s effectiveness is assessed using precision, F1-score, sensitivity, accuracy, specificity, and area under the curve (AUC). The major contributions of this study are as follows:

A novel two-stage classification system for diabetic retinopathy is proposed.
After preprocessing and data augmentation, two independent U-Net models are used to segment the optic disc and blood vessels.
Features are extracted from the fundus images using a novel hybrid CNN-SVD model created in this study. In total, 256 features are extracted from the processed FIs using the CNN. SVD then reduces these to the 100 most significant features, reducing model complexity and enhancing model performance.
The transfer learning (TL)-based Inception-V3 model is used to classify fundus images.
A comprehensive DR classification system is created that is accurate, dependable, and intuitive.
Three large public datasets were used to test the model’s performance: Messidor-2, EyePACS-1, and DIARETDB0.
The diagnostic capabilities of the proposed model are verified using several performance metrics: precision, F1-score, sensitivity, accuracy, specificity, and AUC.

The remainder of the paper is organized as follows: the current research status is given in Section 2. The datasets used in this study are described in Section 3. The proposed methodology is presented in Section 4. The results and discussion are presented in Section 5, and the study is concluded in Section 6.

2. Current Research Status

In recent decades, retinal image analysis has received a lot of attention, and the automated diagnosis of diabetic retinopathy in particular has attracted significant interest [16]. The following sections briefly review the most popular methods for detecting the major DR characteristics (hemorrhages, microaneurysms, and exudates).

2.1. DR Detection Based on Classical Machine Learning

Researchers have established numerous predictive models based on machine learning to help ophthalmologists detect and classify diabetic retinopathy during the last several years. Much work has been performed on the early diagnosis of diabetic retinopathy and multi-stage DR classification using handcrafted feature extraction. Microaneurysm detection is vital for early DR diagnosis. A K-nearest neighbor (KNN) classifier was used to detect microaneurysms in [17]. For MA detection in fundus images, morphological operators were introduced; the method achieved 81.61% sensitivity, 99.99% specificity, and 63.76% precision [18]. Deep learnable features were extracted from retinal fundus images, and ResNet-50 with an SVM was used to detect exudates. The author summarizes different CNN methods and applies efficient ones to differentiate exudates. The reported accuracy of the method is 98% [19].

Exudate, the bright white structure on the retina, is one of the most critical indicators of DR. Based on previous research, exudate detection methods fall into three groups: machine learning (ML), mathematical morphology, and pixel-based methods. In [20], a matched filter was used to exclude the vessels and optic disc before detecting exudates. A random forest algorithm was employed to locate the exudate area using the saliency map, and it was shown to be 79% efficient. To identify exudates, [21] used mathematical morphology and SVM classifiers. Using a private dataset, the author assessed the classifier’s accuracy and found that it had an AUC of 95%. A number of ML-based algorithms have demonstrated significant results for exudate detection [22].

In [23], the authors proposed an automated strategy for detecting red lesions in diabetic retinopathy, including hemorrhages and microaneurysms, from retinal images. Frangi-based filtering was applied to identify blood vessels in this approach. In the initial phase, the input image was decomposed into small sub-images, and filtering was applied to each sub-image. The filtered features were fed into an SVM to classify whether the input images contained a lesion. The experiments were conducted separately on 143 fundus images and achieved accuracies of 97% and 87% for microaneurysms and hemorrhages, respectively. The literature shows that combining high-level and low-level features may improve diagnostic accuracy. Moreover, in [24], the authors used SVM voting methods to detect and classify bright and red lesions using the IDRiD dataset.

ML techniques for identifying and classifying diabetic retinopathy face three main challenges. First, the extracted handcrafted features need to be verified by ophthalmologists, which depends on the subjective judgment of the expert, takes time, and increases the retinal expert’s workload. Second, baseline methods are limited in generalization and robustness, since most studies are trained on small training sets. Third, clinical signs of diabetic retinopathy are ambiguous, and the size of the blood vessels in the retinal image makes them difficult for expert graders to assess. As a result, extracting DR-indicating retinal biomarkers from fundus images is difficult. Deep learning models, in contrast, automate both the feature extraction and the classification process. The following section covers the details of different DL models used for DR classification.

2.2. DR Detection Based on Deep Learning

Deep learning methods are widely used to solve various medical image analysis problems while avoiding the drawbacks of conventional ML methods. In contrast to ML models, deep learning models quickly learn high-level features from retinal images without human involvement. Deep learning models were developed for lesion detection with image patch classification [25]. In this procedure, 243 retinal images, verified by ophthalmologists, were tested. The Kaggle input dataset is split into image patches covering microaneurysms, hemorrhages, exudates, and the normal retinal structure. The authors used CNNs to detect lesions and classify them into five grades. An ensemble-based framework to improve microaneurysm detection has also been proposed [26]. The findings of [27,28] are particularly noteworthy because they represent the culmination of a thorough systematic review and meta-analysis. The first paper used the EyePACS dataset of 35,126 images to train a VGG16 CNN with binary cross-entropy as the loss function. The authors also ran two experiments in which they combined VGG16 with either a linear support vector machine (SVM) or a softmax function on the fully connected output layer; the SVM approach yielded the highest specificity and sensitivity, 93.0% and 85%, respectively. In the second paper, a CNN was combined not only with an SVM but also with Teaching Learning Based Optimization (TLBO). A binary classification experiment using the same EyePACS dataset yielded a specificity of 90.89%, an accuracy of 91.05%, and a sensitivity of 89.30%.

In [29], Lončarić S. and Prentašić P. proposed an automated exudate detection technique based on a DCNN. In the proposed exudate extraction method, a CNN was used for feature extraction and an SVM classifier for classification. Grayscale morphological operations were applied in particular areas, and an active contour model was used to identify the exudate boundaries. A Naïve Bayes classifier was then applied to the region-wise classification of exudates [30]. In [31], the authors proposed U-Net and transfer learning-based models to classify DR. A CNN-based methodology for retinal image quality assessment was developed in [32]. This method relies on saliency maps to gather unsupervised information for deciding on retinal image quality. The saliency maps capture both local and global information on retinal images at various scales for each pixel. Reference [33] describes a method for detecting diabetic retinopathy using shallow convolutional networks, with 85% accuracy on 35,000 images. In [34], the authors used the grey wolf optimization (GWO) evolutionary algorithm with a CNN for the classification of DR, and later, in [35], they improved the GWO for binary classification of DR. The authors of [3] trained a lightweight CNN on 768 FIs, yielding an accuracy of 88.4%. A weighted path CNN was proposed in [36] for binary classification employing the STRAE dataset with 60,000 images, yielding 90.84% accuracy and a 93.4% F1 score.

Deep learning models often use several image patches to perform image-wise classification, either by integrating short-distance dependencies or through ensemble approaches such as SVM or majority voting. All such approaches disregard long-distance dependencies. In addition, the ensemble methods typically pool the final fully connected layer into a one-dimensional feature vector, which is unsuitable for patch-wise synthesis. To address these constraints, we present new techniques for handling the shortage of annotated data, selecting a relevant region of interest, and improving classification performance.

3. Materials

3.1. Datasets

Three popular public datasets, Messidor-2, EyePACS-1, and DIARETDB0, were used to examine the efficacy of the proposed system, comprising a total of 10,966 retinal FIs.

3.1.1. EyePACS-1

The EyePACS-1 dataset [37] consists of 9088 retinal images: 7552 normal, 842 mild, 545 moderate, 54 severe, and 95 proliferative DR (PDR) images.

3.1.2. Messidor-2

The Messidor-2 dataset comprises 1748 retinal fundus images [38]. The dataset is also highly imbalanced, with 1017 normal, 270 mild, 347 moderate, 75 severe, and 35 PDR images. The Messidor-2 FIs were captured using a Topcon digital fundus camera with a 45-degree field of view.

3.1.3. DIARETDB0

The DIARETDB0 dataset [39] is publicly accessible for DR detection and classification. It contains 130 fundus images, 110 classified with DR and 20 deemed normal. Images were taken using a digital fundus camera with a 50-degree field of view and unknown camera settings. The data relate to real-world scenarios and can be used to assess the overall performance of diagnostic techniques.

The distribution of DR severity grades in the Messidor-2, EyePACS-1, and DIARETDB0 datasets is given in Table 1, and Figure 2 shows the class distribution graphically. A set of sample images from the datasets is presented in Figure 3.
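To make the class imbalance concrete, the short sketch below computes the share of each severity grade in EyePACS-1 from the counts quoted in Section 3.1.1. The dictionary and function names are illustrative, not part of the original work.

```python
# Per-class image counts for EyePACS-1 as quoted in Section 3.1.1.
eyepacs1 = {"normal": 7552, "mild": 842, "moderate": 545, "severe": 54, "PDR": 95}

def class_shares(counts):
    """Return the fraction of the dataset that belongs to each class."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

shares = class_shares(eyepacs1)
for label, share in shares.items():
    print(f"{label:>8}: {share:6.1%}")
```

The normal grade accounts for most of the images, which is the skew that the data augmentation step is meant to counteract.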

4. Methodology

This paper presents a novel methodology for detecting diabetic retinopathy (DR) using retinal fundus images. Three publicly available datasets, Messidor-2, EyePACS-1, and DIARETDB0, are used. The quality of the FIs is enhanced by preprocessing techniques, such as image scaling, green channel extraction (GCE), and top-bottom hat transformation. In addition, two different U-Net models are presented for extracting the OD and BVs from the enhanced FIs during preprocessing to counteract the influence of these retinal structures on DR detection. The improved image is obtained after applying the preprocessing techniques and extracting the OD and blood vessels. A novel hybrid CNN-SVD model is then applied for feature extraction and for selecting the most suitable features. Finally, the improved Inception-V3 model based on transfer learning is used to diagnose DR using the improved image dataset. Sensitivity, accuracy, precision, F1-score, specificity, and area under the curve (AUC) are among the performance metrics used to evaluate the proposed approach. Figure 4 depicts a flowchart of the proposed methodology.

4.1. Preprocessing and Data Augmentation

4.1.1. Preprocessing

Messidor-2, DIARETDB0, and EyePACS-1 were used to test the efficiency of the proposed technique. This research considers 10,966 retinal fundus images (Messidor-2: 1748, EyePACS-1: 9088, and DIARETDB0: 130 images). Table 1 shows the distribution of the EyePACS-1, DIARETDB0, and Messidor-2 datasets. The performance of a deep learning model can be affected by differing FI sizes. To solve this problem, we scaled all images to the same size (256 × 256). Directly resizing FIs is difficult, however, because significant blood vessels and the optic disc can vanish. The retinal FIs were therefore resized using bicubic interpolation that maintained the aspect ratio. As demonstrated in Figure 5, the green channel of the FIs carries more information than the red and blue channels, making it an ideal choice for our investigation. A top-bottom hat transformation improves the quality of the retinal image. Figure 6 depicts the use of several preprocessing steps.
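The green channel extraction and top-bottom hat enhancement can be sketched as follows. This is a minimal NumPy illustration under assumptions the paper does not confirm: a square structuring element of hypothetical size `k`, edge-replicated borders, and the common contrast formula green + top-hat − bottom-hat. The bicubic resize is omitted because it would require an imaging library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _windows(img, k):
    """View every k x k neighbourhood of img (k odd, borders edge-replicated)."""
    padded = np.pad(img, k // 2, mode="edge")
    return sliding_window_view(padded, (k, k))

def erode(img, k):
    return _windows(img, k).min(axis=(-2, -1))

def dilate(img, k):
    return _windows(img, k).max(axis=(-2, -1))

def enhance_green_channel(rgb, k=15):
    """Extract the green channel and sharpen contrast with top/bottom hats."""
    green = rgb[..., 1].astype(np.float64)
    opened = dilate(erode(green, k), k)   # morphological opening
    closed = erode(dilate(green, k), k)   # morphological closing
    top_hat = green - opened              # small bright details
    bottom_hat = closed - green           # small dark details
    # Assumed enhancement rule: boost bright details, deepen dark ones.
    return np.clip(green + top_hat - bottom_hat, 0, 255).astype(np.uint8)

# Toy image: uniform green background with one brighter pixel.
rgb = np.zeros((32, 32, 3), dtype=np.uint8)
rgb[..., 1] = 100
rgb[16, 16, 1] = 150
enhanced = enhance_green_channel(rgb, k=5)
```

In the toy image the isolated bright pixel is amplified while the flat background is left unchanged, which is the behaviour expected from a top-hat based contrast enhancement.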

4.1.2. Data Augmentation

The size of the training dataset is one of the most critical factors in the effective training of DL models. Deep learning networks require an extensive dataset to avoid overfitting and generalization difficulties. The dataset distribution across classes is considerably skewed, with most images coming from grade 0 (Normal). This extremely skewed dataset could lead to incorrect classification. We used data augmentation techniques to enlarge the retinal dataset at multiple scales and to reduce noise in the fundus images. The primary data augmentation processes we performed are listed below.

Rotation: Images were rotated from 0 to 360 degrees at random.
Shearing: Sheared at a random angle ranging from 20 to 200 degrees.
Image flipping: Images were flipped horizontally and vertically.
Zoom: Images were randomly stretched in the (1/1.3, 1.3) range.
Cropping: At random, images were shrunk to 85–95% of their original length.
Image translation: Images were randomly moved between −25 and 25 pixels.

Several examples of postaugmentation images are shown in Figure 7.
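Two of the listed augmentations, flipping and translation, can be sketched with NumPy alone. The function below is illustrative rather than the authors' pipeline: the 50% flip probability and zero-filled borders are assumptions, and rotation, shearing, zoom, and cropping are left out because they need an interpolation routine from an imaging library.

```python
import numpy as np

def augment(image, rng, max_shift=25):
    """Randomly flip an image and translate it by up to max_shift pixels."""
    out = image
    if rng.random() < 0.5:            # horizontal flip (assumed probability)
        out = out[:, ::-1]
    if rng.random() < 0.5:            # vertical flip (assumed probability)
        out = out[::-1, :]
    # Random translation in the stated range of -25..25 pixels.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    h, w = out.shape[:2]
    shifted = np.zeros_like(out)      # uncovered border pixels stay zero
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = out[src_y, src_x]
    return shifted

rng = np.random.default_rng(seed=0)
image = np.ones((256, 256, 3), dtype=np.uint8)
augmented = augment(image, rng)
```

Calling `augment` repeatedly on the minority-class images yields new training samples while keeping the input size fixed at 256 × 256.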

4.2. Optic Disc (OD) and Blood Vessel (BV) Segmentation

The variable intensity level of FIs may hamper the extraction of specific biomarkers. The OD is more intense than the background, which might lead to incorrect segmentation of the OD and affect the detection of bright lesions, such as microaneurysms and hemorrhages. The BVs are darker than the background, which might lead to confusion with exudates. To overcome the effect of these retinal structures, we developed two separate U-Net models for the segmentation of the OD and the BVs. U-Net is a convolutional neural network (CNN) designed for biomedical image segmentation and is a distinct alternative to the standard CNN for disease detection and abnormality localization. It has shown significant effectiveness on various biological image segmentation tasks even with a small training database [40,41]. The U-Net design is shaped like a U and consists of two paths: a contracting path and an expansive path. Convolution, ReLU, and pooling operations are performed on the contracting path, while up-sampling is performed on the expansive path. The proposed model resizes the original image while preserving the aspect ratio to convey the exact contextual information of FIs. Figure 8a,b show the architecture of the fundamental U-Net model for segmenting the OD and BVs, respectively.
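The U shape of the network can be illustrated by tracing feature-map sizes along the two paths. The sketch below assumes a standard U-Net layout (four levels, 64 base filters, size-preserving convolutions, 2 × 2 pooling and up-sampling); the paper does not state these values, so they are hypothetical defaults.

```python
def unet_shapes(size=256, depth=4, base_filters=64):
    """Trace (height, width, channels) through a standard U-Net layout."""
    encoder = []
    s, f = size, base_filters
    for _ in range(depth):
        encoder.append((s, s, f))   # conv + ReLU block output on the contracting path
        s //= 2                     # 2 x 2 max pooling halves the resolution
        f *= 2                      # the next level doubles the filter count
    bottleneck = (s, s, f)
    decoder = []
    for _ in range(depth):
        s *= 2                      # up-sampling doubles the resolution
        f //= 2                     # conv after the skip concatenation halves the filters
        decoder.append((s, s, f))
    return encoder, bottleneck, decoder

encoder, bottleneck, decoder = unet_shapes()
for name, shapes in (("contracting", encoder), ("expansive", decoder)):
    print(name, shapes)
```

Each expansive-path level mirrors the contracting-path level of the same resolution, which is what allows the skip connections to concatenate the two feature maps.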

4.3. Dimensionality Reduction Using CNN-SVD

When a dataset has many features, some of them contribute little to predicting the target variable or carry redundant information. The dimensionality of the feature space strongly influences a classifier’s performance; this phenomenon is known as the curse of dimensionality. Dimensionality reduction methods are therefore used to lower the complexity and time cost. They reduce the original feature space to a minimal one that retains the nonredundant information without considerable loss [42]. Principal component analysis, linear discriminant analysis, and singular value decomposition (SVD) are a few well-known approaches for this purpose. In this study, a CNN was first used to extract features. The features were then normalized. Finally, dimensionality reduction was achieved using SVD.

4.3.1. Feature Extraction by CNN from FIs

A basic CNN is proposed in this section to extract the most salient features of FIs. If the essential characteristics that discriminate among the several DR stages are extracted, the classification performance of the model will improve. A simple CNN model was therefore employed. Figure 9 depicts the arrangement of the CNN feature extractor.

These derived features can be effectively used to classify DR stages. Batch normalization and max-pooling layers have been applied after each CNN convolutional layer (CL). Batch normalization was used since it speeds up training and enhances the model’s performance by re-centering and re-scaling the inputs of the layers [43]. Max-pooling is used to extract the essential features from the processed FIs by selecting the maximum value from each cluster of neurons [44,45]. During the training phase, dropout prevents overfitting by randomly bypassing nodes in each layer; this also significantly speeds up the training process. Adam was chosen as the optimizer because of its excellent performance when working with large amounts of data [46]. The final dense layer was then used to extract 256 discriminating features from every FI.
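The three layer operations described above can be written out in a few lines of NumPy. This is a simplified sketch for intuition only: the learnable scale and shift of batch normalization are omitted, the pooling window is assumed to be 2 × 2, and the dropout rate is a hypothetical value.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Re-center and re-scale each feature over the batch axis (axis 0)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2 x 2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[: h - h % 2, : w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def dropout(x, rate, rng):
    """Zero a random subset of activations and rescale the survivors."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(64, 256))   # 64 samples, 256 features
normalized = batch_norm(batch)
pooled = max_pool_2x2(np.arange(16.0).reshape(4, 4))
dropped = dropout(batch, rate=0.5, rng=rng)               # rate is an assumption
```

After `batch_norm`, every feature column has approximately zero mean and unit variance, and `max_pool_2x2` halves each spatial dimension while retaining the strongest response in each block.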

4.3.2. Features Reduction by SVD

This approach is related to the basic notion of the FFT (Fast Fourier Transform). Mathematically, a matrix $A_{(m \times n)}$ can be factored into three matrices as $A = PQR^{*}$. Every matrix admits such a factorization, and its singular values (the diagonal entries of $Q$) are unique. Here, $P_{(m \times m)}$ and $R_{(n \times n)}$ are unitary matrices, and $R^{*}$ denotes the conjugate transpose of $R$. For a real-valued matrix $W$, $W^{*} = W^{T}$, and $W$ is termed a unitary matrix if $WW^{*} = I$. Let $Q_{(m \times n)}$ be the diagonal matrix whose diagonal entries are non-negative and arranged in descending order, with zero off-diagonal elements. The number of positive diagonal entries of $Q$ equals the rank of $A$, that is, the number of linearly independent columns or rows of $A$. Like the FFT, $A = PQR^{*}$ may be stated in a series form. As a small illustration, consider three $2 \times 2$ matrices $p$, $q$, and $r$, with $q$ diagonal. Their product is:

$$pqr = \begin{bmatrix} p_{1} & p_{2} \\ p_{3} & p_{4} \end{bmatrix} \begin{bmatrix} q_{1} & 0 \\ 0 & q_{2} \end{bmatrix} \begin{bmatrix} r_{1} & r_{2} \\ r_{3} & r_{4} \end{bmatrix} = \begin{bmatrix} p_{1} q_{1} & p_{2} q_{2} \\ p_{3} q_{1} & p_{4} q_{2} \end{bmatrix} \begin{bmatrix} r_{1} & r_{2} \\ r_{3} & r_{4} \end{bmatrix} = \begin{bmatrix} p_{1} q_{1} r_{1} + p_{2} q_{2} r_{3} & p_{1} q_{1} r_{2} + p_{2} q_{2} r_{4} \\ p_{3} q_{1} r_{1} + p_{4} q_{2} r_{3} & p_{3} q_{1} r_{2} + p_{4} q_{2} r_{4} \end{bmatrix} = q_{1} \begin{bmatrix} p_{1} \\ p_{3} \end{bmatrix} \begin{bmatrix} r_{1} & r_{2} \end{bmatrix} + q_{2} \begin{bmatrix} p_{2} \\ p_{4} \end{bmatrix} \begin{bmatrix} r_{3} & r_{4} \end{bmatrix}$$

Hence, $PQR^{*}$ can be written as:

$$PQR^{*} = \begin{bmatrix} | & | & & | \\ P_{1} & P_{2} & \dots & P_{m} \\ | & | & & | \end{bmatrix} \begin{bmatrix} Q_{1} & & \\ & Q_{2} & \\ & & \ddots \end{bmatrix} \begin{bmatrix} - & R_{1}^{*} & - \\ - & R_{2}^{*} & - \\ & \vdots & \\ - & R_{n}^{*} & - \end{bmatrix} = Q_{1} P_{1} R_{1}^{*} + Q_{2} P_{2} R_{2}^{*} + \dots + Q_{k} P_{k} R_{k}^{*}$$

where $P_{i}$ is the $i$-th column of $P$, $R_{i}^{*}$ is the $i$-th row of $R^{*}$, and $k = \min(m, n)$.

To obtain an optimal lower-rank approximation of $A$, SVD retains only the components whose values $Q_{i}$ exceed a given threshold.
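The reduction from 256 CNN features to 100 can be sketched with NumPy's SVD routine. The example uses random data in place of real CNN features, and keeping a fixed number of components (rather than thresholding the singular values) is an assumption made to match the 256-to-100 reduction stated in the contributions; `np.linalg.svd` returns the third factor already transposed, which plays the role of R* here.

```python
import numpy as np

def svd_reduce(features, k=100):
    """Keep the k strongest singular components of an (n_samples, n_features) matrix."""
    # Economy SVD: features = P @ diag(q) @ Rh, with q in descending order.
    P, q, Rh = np.linalg.svd(features, full_matrices=False)
    reduced = features @ Rh[:k].T               # k-dimensional representation
    low_rank = (P[:, :k] * q[:k]) @ Rh[:k]      # rank-k approximation of the input
    return reduced, low_rank, q

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 256))                 # stand-in for 256 CNN features per image
reduced, low_rank, q = svd_reduce(X, k=100)
print(reduced.shape)
```

The Frobenius norm of the discarded part, `X - low_rank`, equals the root of the summed squares of the dropped singular values, which is why truncating the smallest components loses the least information.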

4.4. Transfer Learning Models

It is usually unwise to train a CNN classifier from scratch on a small biomedical dataset. Transfer learning (TL) models are commonly used for biomedical image classification to overcome this constraint. Pre-trained models, such as Inception-V3 [47], GoogLeNet [48], AlexNet [49], and ResNet [50], can be used to transfer knowledge from one task to another similar task. These models have been trained on the ImageNet [51] dataset, which contains over fourteen million images in one thousand categories. TL aims to improve the network’s performance on the target dataset. TL-based models have demonstrated substantial performance in several medical image classification tasks, including cataract identification [52], breast cancer classification [53], glaucoma diagnosis [54], and diabetic retinopathy detection [31].

Inception-V3, GoogLeNet, AlexNet, and ResNet models based on transfer learning are used in this study for the diagnosis of four classes of diabetic retinopathy: normal, mild, moderate, and severe.

4.4.1. Standard Classifiers

Inception-V3 is a deep neural network (DNN) that can classify 1000 different objects [47]. The model is trained on images from a wide variety, and the model may be retrained for a smaller dataset while keeping the training information. This advantage of the Inception-V3 CNN model is that it eliminates the need for practical training, resulting in improved classification accuracy and reduced processing time. It is the primary objective of the Inception-V3 network to eliminate the limiting representation of subsequent network layers, which significantly reduces the input size of the subsequent layer. The factorization technique is used to lower the computational complexity of the network.

Google Net [48] is also termed as Inception-V1 topology. It is from Google and based on LeNet’s conception component. It was the winner of the ILSVRC-2014 Challenge. Google Net is a 22-layer deep neural network trained using the ImageNet, with one thousand item classifications. The highest error rate on Google Net is 6.67%, which is highly similar to human competence (5.1%). The network comprises pooling layers, convolution layers, rectified linear (Relu) layers, as well as fully connected layers.

AlexNet [49], created by Alex Krizhevsky, achieved the ImageNet 2012 Large Scale Visual Recognition Challenge with a top-five error rate of 0.1530. It has three fully-connected layers and five convolutional layers, with the Relu-activation function implemented after each convolutional and fully connected layer. Before the first two completely associated layers, a dropout value of 0.5 is employed. ImageNet is used to train the network with 100 distinct categories (1000-way softmax).

He et al. won the ILSVRC-2015 Challenge with the Residual Neural Network (ResNet) [50]. The design is built on batch normalization and skip connections, which play a role similar to the gating in the gated recurrent units of RNNs. ResNet has 152 layers and achieves a top-5 error of 3.57% (0.0357), surpassing the reported human-level performance.
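A skip connection can be illustrated in a few lines. The following is a minimal sketch of an identity skip, assuming the block output is relu(F(x) + x) as in [50]; the names `residual_block` and `zero_transform` are hypothetical, and plain Python lists stand in for feature maps.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return relu(F(x) + x): the skip connection adds the input back to F(x)."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for an identity skip")
    return relu([a + b for a, b in zip(fx, x)])


def zero_transform(v: Vector) -> Vector:
    """Stand-in for a learned transform whose output is all zeros."""
    return [0.0 for _ in v]


# With a zero transform, non-negative inputs pass through the block unchanged.
print(residual_block([1.0, 2.0, 0.5], zero_transform))  # [1.0, 2.0, 0.5]
```

Because the block only has to learn the difference from the identity mapping, very deep stacks of such blocks remain trainable.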

4.4.2. Experimental Configuration

The hardware environment for the proposed model comprises an E5-2609 CPU, 32 GB of RAM, and a Quadro K620 GPU. The model is implemented with the open-source Python package Keras using the TensorFlow backend. The Adam optimizer with a categorical cross-entropy loss function is used to train the proposed model for 100 epochs. Other settings are a momentum of 0.9, a batch size of 64, a learning rate of 0.01, and a weight decay of 0.005. Table 2 shows the hyperparameter configuration.
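To make the role of these settings concrete, the following is a minimal sketch of a single Adam update in plain Python. It is not the Keras implementation used in the study. It assumes that the reported momentum of 0.9 corresponds to Adam's first-moment coefficient `beta1`, that weight decay is applied as an L2 term added to the gradient, and that `beta2` and `eps` take their common default values; the learning rate is left as a required argument.

```python
import math
from typing import List, Tuple


def adam_step(
    params: List[float],
    grads: List[float],
    m: List[float],
    v: List[float],
    t: int,
    lr: float,
    beta1: float = 0.9,          # assumed to match the reported momentum of 0.9
    beta2: float = 0.999,        # common default, not stated in the text
    eps: float = 1e-8,           # common default, not stated in the text
    weight_decay: float = 0.005,
) -> Tuple[List[float], List[float], List[float]]:
    """One Adam update; weight decay enters as an L2 term in the gradient (assumption)."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        g = g + weight_decay * p
        m_i = beta1 * m_i + (1.0 - beta1) * g      # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)           # bias correction, t starts at 1
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


params, m, v = adam_step([1.0], [0.5], [0.0], [0.0], t=1, lr=0.01)
print(params)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate.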

The loss value was computed with the categorical cross-entropy loss function, which is calculated from the activation probabilities of the output layer and the target class. The categorical cross-entropy loss can be expressed mathematically as:

$$L(a, \hat{a}) = -\sum_{i=1}^{N} \sum_{j=1}^{M} a_{ij} \log(\hat{a}_{ij}) \tag{1}$$

Here $L(a, \hat{a})$ compares the predicted and actual data distributions, $a_{ij}$ is the actual value, and $\hat{a}_{ij}$ is the predicted value, with N and M being the sample and label counts, respectively. The loss for each label is computed independently and then summed over all M labels and all N samples.
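Equation (1) can be evaluated directly. The sketch below is a plain-Python illustration with made-up one-hot targets and predicted probabilities for two samples and four classes; the function name and the clipping constant `eps`, which guards against log(0), are assumptions. It returns the sum in Equation (1), whereas many libraries report the mean over samples.

```python
import math
from typing import Sequence


def categorical_cross_entropy(
    actual: Sequence[Sequence[float]],
    predicted: Sequence[Sequence[float]],
    eps: float = 1e-12,
) -> float:
    """Equation (1): minus the sum over samples i and labels j of a_ij * log(a_hat_ij)."""
    total = 0.0
    for a_row, p_row in zip(actual, predicted):
        for a, p in zip(a_row, p_row):
            total -= a * math.log(max(p, eps))  # clip to avoid log(0)
    return total


# Hypothetical example: two samples, four classes, one-hot targets.
actual = [[1, 0, 0, 0], [0, 0, 1, 0]]
predicted = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.2, 0.6, 0.1]]
loss = categorical_cross_entropy(actual, predicted)
print(round(loss, 4))  # 0.8675
```

With one-hot targets, only the probability assigned to the true class of each sample contributes, so the result here is -log(0.7) - log(0.6).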

4.5. Performance Evaluation Metrics

Detecting diabetic retinopathy at an early stage from retinal images captured with a fundus camera requires basic preprocessing before image processing algorithms are developed. Several preprocessing methods, such as contrast adjustment, average filtering, adaptive histogram equalization, homomorphic filtering, and median filtering, are applied to the retinal fundus image dataset. After the algorithm is applied to the retinal images, the mean square error (MSE) and the peak signal-to-noise ratio (PSNR) are calculated to assess how well it performs. The PSNR is expressed on a logarithmic decibel scale. A higher PSNR value indicates a higher-quality processed image relative to the original image.
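The two image-quality measures can be computed as follows. This is a minimal sketch using the standard definitions of MSE and PSNR, assuming 8-bit grayscale images stored as nested lists with a maximum pixel value of 255; the function names and the toy 2×2 images are illustrative.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def mse(original: Image, processed: Image) -> float:
    """Mean square error between two images of identical size."""
    diffs = [
        (o - p) ** 2
        for row_o, row_p in zip(original, processed)
        for o, p in zip(row_o, row_p)
    ]
    return sum(diffs) / len(diffs)


def psnr(original: Image, processed: Image, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mse(original, processed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy 2x2 images in which every pixel differs by one gray level.
original = [[100, 110], [120, 130]]
processed = [[101, 109], [121, 131]]
print(mse(original, processed), round(psnr(original, processed), 2))  # 1.0 48.13
```

Because the PSNR is logarithmic in the MSE, halving the error raises the PSNR by about 3 dB.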

The data used in medical statistics generally fall into two kinds: data related to the disease and disease-free data. Sensitivity and specificity estimate how correct a method is. In diabetic retinopathy analysis, as in medical research generally, sensitivity and specificity are computed for each digital fundus image. A true negative (TN) is a non-lesion pixel correctly identified as such, and a true positive (TP) is a lesion pixel correctly identified in the fundus image. Conversely, a false negative (FN) is a lesion pixel missed by the algorithm, and a false positive (FP) is a non-lesion pixel mistakenly marked as a lesion by the algorithm [55,56]. The performance of the proposed methodology was measured using sensitivity, specificity, accuracy, F1-score, precision, and AUC.
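The metrics above follow from the four confusion counts. The sketch below uses the standard definitions; the function name and the example counts are made up for illustration, and all denominators are assumed to be non-zero.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard metrics from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)   # share of lesion pixels that were detected
    specificity = tn / (tn + fp)   # share of non-lesion pixels correctly rejected
    precision = tp / (tp + fp)     # share of detections that are real lesions
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, not taken from the experiments.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts the sensitivity is 0.90, the specificity 0.85, and the accuracy 0.875.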

5. Results and Discussion

Two separate U-Nets are employed in the preprocessing step to segment the optic disc and the blood vessels. The enhanced image produced by the preprocessing steps is fed into the transfer learning-based CNN models. The proposed model is tested on three publicly available fundus image datasets: EyePACS-1, Messidor-2, and DIARETDB0. Accuracy, sensitivity, precision, specificity, F1-score, and area under the curve (AUC) are the performance indicators used to assess the proposed approach. Table 3, Table 4 and Table 5 list the performance of the TL-based models on EyePACS-1, Messidor-2, and DIARETDB0, respectively. According to these tables, the Inception-V3 model obtained an average accuracy of 97.92%, 94.59%, and 93.52% on the EyePACS-1, Messidor-2, and DIARETDB0 datasets, respectively. On EyePACS-1 (Table 3), Inception-V3, GoogLeNet, AlexNet, and ResNet reach accuracies of 97.92%, 96.15%, 95.70%, and 96.90%, respectively. On Messidor-2 (Table 4), the same models obtain 94.59%, 93.75%, 93.15%, and 94.00%, respectively.

According to Table 5, the accuracy on DIARETDB0 is 93.52% for Inception-V3, whereas GoogLeNet, AlexNet, and ResNet reach 92.05%, 91.30%, and 92.45%, respectively. Because of the low resolution of the retinal images and the small number of training examples, DR classification on DIARETDB0 is more difficult than on the EyePACS-1 and Messidor-2 datasets. The ROC curve represents the performance of Inception-V3 on the EyePACS-1 dataset at various thresholds. Figure 10a–c show the graphical analysis of the evaluation metrics for Messidor-2, EyePACS-1, and DIARETDB0, respectively. With accuracies of 94.59%, 97.92%, and 93.52% on these datasets, Inception-V3 is the most accurate of the four models. When tested on the enhanced retinal images, Inception-V3 showed the best accuracy, outperforming the other networks. Table 6 compares the proposed Inception-V3 model, combined with U-Net-based OD and BV segmentation for DR diagnosis, with several state-of-the-art approaches to assess its efficacy.
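The area under the ROC curve summarizes performance across all thresholds in a single number. The following is a minimal sketch for the binary case, assuming the rank-based formulation in which the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count one half). The function name and the toy scores are illustrative, and a multi-class grading task would need a one-vs-rest extension that is not shown here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary ROC AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count one half
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but would be replaced by a sort-based computation on large test sets.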

6. Conclusions

Early detection is essential in the treatment of diabetic retinopathy patients, and it is advancing in step with technology. This study used AI models to classify the severity of fundus images. We propose a novel two-stage DR detection system consisting of OD and BV segmentation followed by DR classification based on transfer learning. Green channel extraction, uniform resizing, top-bottom hat transformation, and OD and BV segmentation were performed during the preprocessing phase. Then, for DR classification, a transfer learning-based model, Inception-V3, is trained on the publicly available Messidor-2, EyePACS-1, and DIARETDB0 datasets. The findings of this study suggest that the proposed Inception-V3, evaluated on the EyePACS-1 dataset, has high potential for use in clinical applications. In the future, SVD could be replaced with gradient descent to reduce the computational expense. The approach could also be used to diagnose other retinal disorders such as cataracts and glaucoma, and the model's classification performance could be enhanced by employing ensembles of machine learning and deep learning techniques.

Author Contributions

A.B. conceived this study. N.W. and A.D. were engaged in the planning of this study. The study was reviewed, drafted, and revised by L.Z., and H.L. performed the editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work is partially supported by the 100 Scholar Plan and Special Fund for Bagui Scholars of the Guangxi Zhuang Autonomous Region.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The EyePACS-1 dataset [38], Messidor-2 dataset [39], and DIARETDB0 dataset [40] are available publicly.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kauppi, T.; Kalesnykiene, V.; Kamarainen, J.K.; Lensu, L.; Sorri, I.; Raninen, A.; Voutilainen, R.; Pietilä, J.; Kälviäinen, H.; Uusitalo, H. The DIARETDB1 diabetic retinopathy database and evaluation protocol. In Proceedings of the British Machine Vision Conference, Coventry, UK, 10–13 September 2007.
2. Kayal, D.; Banerjee, S. A new dynamic thresholding based technique for detection of hard exudates in digital retinal fundus image. In Proceedings of the 2014 International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 20–21 February 2014.
3. Shaban, M.; Mahmoud, A.H.; Shalaby, A.; Ghazal, M.; Sandhu, H.; El-Baz, A. Low-complexity computer-aided diagnosis for diabetic retinopathy. In Diabetes and Retinopathy; Elsevier: Amsterdam, The Netherlands, 2020.
4. Kanimozhi, J.; Vasuki, P.; Roomi, S.M.M. Fundus image lesion detection algorithm for diabetic retinopathy screening. J. Ambient. Intell. Humaniz. Comput. 2020, 12, 7407–7416.
5. Manjaramkar, A.; Kokare, M. Automated Red Lesion Detection: An Overview. In Advances in Intelligent Systems and Computing; Springer: Singapore, 2020; Volume 1089.
6. Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 2017, 29, 2352–2449.
7. Saleh, M.D.; Eswaran, C. An automated decision-support system for non-proliferative diabetic retinopathy disease based on MAs and HAs detection. Comput. Methods Programs Biomed. 2012, 108, 186–196.
8. Lachure, J.; Deorankar, A.; Lachure, S.; Gupta, S.; Jadhav, R. Diabetic Retinopathy using morphological operations and machine learning. In Proceedings of the 2015 IEEE International Advance Computing Conference (IACC), Bangalore, India, 12–13 June 2015.
9. Wang, L.; Liu, H.; Lu, Y.; Chen, H.; Zhang, J.; Pu, J. A coarse-to-fine deep learning framework for optic disc segmentation in fundus images. Biomed. Signal Process. Control 2019, 51, 82–89.
10. Kwasigroch, A.; Jarzembinski, B.; Grochowski, M. Deep CNN based decision support system for detection and assessing the stage of diabetic retinopathy. In Proceedings of the 2018 International Interdisciplinary PhD Workshop (IIPhDW), Swinoujscie, Poland, 9–12 May 2018.
11. Dehghani, A.; Moin, M.-S.; Saghafi, M. Localization of the optic disc center in retinal images based on the Harris corner detector. Biomed. Eng. Lett. 2012, 2, 198–206.
12. Zhang, W.; Wu, Y.; Yang, B.; Hu, S.; Wu, L.; Dhelim, S. Overview of multi-modal brain tumor mr image segmentation. Healthcare 2021, 9, 1051.
13. Qomariah, D.; Nopember, I.T.S.; Tjandrasa, H.; Fatichah, C. Segmentation of Microaneurysms for Early Detection of Diabetic Retinopathy using MResUNet. Int. J. Intell. Eng. Syst. 2021, 14, 359–373.
14. Walter, T.; Klein, J.-C.; Massin, P.; Erginay, A. A contribution of image processing to the diagnosis of diabetic retinopathy—Detection of exudates in color fundus images of the human retina. IEEE Trans. Med. Imaging 2002, 21, 1236–1243.
15. Zhou, W.; Yi, Y.; Gao, Y.; Dai, J. Optic Disc and Cup Segmentation in Retinal Images for Glaucoma Diagnosis by Locally Statistical Active Contour Model with Structure Prior. Comput. Math. Methods Med. 2019, 2019, 1–16.
16. Bilal, A.; Sun, G.; Mazhar, S. Survey on recent developments in automatic detection of diabetic retinopathy. J. Fr. Ophtalmol. 2021, 44, 420–440.
17. Sopharak, A.; Uyyanonvara, B.; Barman, S. Automatic microaneurysm detection from non-dilated diabetic retinopathy retinal images using mathematical morphology methods. IAENG Int. J. Comput. Sci. 2011, 38, 295–301.
18. Lam, C.; Yu, C.; Huang, L.; Rubin, D. Retinal lesion detection with deep learning using image patches. Investig. Ophthalmol. Vis. Sci. 2018, 59, 590–596.
19. Jaya, T.; Dheeba, J.; Singh, N.A. Detection of Hard Exudates in Colour Fundus Images Using Fuzzy Support Vector Machine-Based Expert System. J. Digit. Imaging 2015, 28, 761–768.
20. Liu, Q.; Zou, B.; Chen, J.; Ke, W.; Yue, K.; Chen, Z.; Zhao, G. A location-to-segmentation strategy for automatic exudate segmentation in colour retinal fundus images. Comput. Med. Imaging Graph. 2017, 55, 78–86.
21. Zhang, X.; Thibault, G.; Decencière, E.; Marcotegui, B.; Laÿ, B.; Danno, R.; Cazuguel, G.; Quellec, G.; Lamard, M.; Massin, P.; et al. Exudate detection in color retinal images for mass screening of diabetic retinopathy. Med. Image Anal. 2014, 18, 1026–1043.
22. Fraz, M.M.; Jahangir, W.; Zahid, S.; Hamayun, M.M.; Barman, S.A. Multiscale segmentation of exudates in retinal images using contextual cues and ensemble classification. Biomed. Signal Process. Control 2017, 35, 50–62.
23. Srivastava, R.; Wong, D.W.; Duan, L.; Liu, J.; Wong, T.Y. Red lesion detection in retinal fundus images using Frangi-based filters. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2015), Milan, Italy, 25–29 August 2015.
24. Bilal, A.; Sun, G.; Li, Y.; Mazhar, S.; Khan, A.Q. Diabetic Retinopathy Detection and Classification Using Mixed Models for a Disease Grading Database. IEEE Access 2021, 9, 23544–23553.
25. Krishna, N.V.; Reddy, N.V.; Ramana, M.V.; Kumar, E.P. The communal system for early detection microaneurysm and diabetic retinopathy grading through color fundus images. Int. J. Sci. Eng. Technol. 2013, 2, 228–232.
26. Khojasteh, P.; Júnior, L.A.P.; Carvalho, T.; Rezende, E.; Aliahmad, B.; Papa, J.P.; Kumar, D.K. Exudate detection in fundus images using deeply-learnable features. Comput. Biol. Med. 2019, 104, 62–69.
27. Seth, S.; Agarwal, B. A hybrid deep learning model for detecting diabetic retinopathy. J. Stat. Manag. Syst. 2018, 21, 569–574.
28. Li, Y.H.; Yeh, N.N.; Chen, S.J.; Chung, Y.C. Assisted diagnosis for diabetic retinopathy based on fundus images using deep convolutional neural network. Mob. Inf. Syst. 2019, 2019, 6142839.
29. Prentasic, P.; Loncaric, S. Detection of exudates in fundus photographs using convolutional neural networks. In Proceedings of the 2015 9th International Symposium on Image and Signal Processing and Analysis (ISPA), Zagreb, Croatia, 7–9 September 2015.
30. Harangi, B.; Lazar, I.; Hajdu, A. Automatic exudate detection using active contour model and regionwise classification. In Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012.
31. Bilal, A.; Sun, G.; Mazhar, S.; Imran, A.; Latif, J. A Transfer Learning and U-Net-based automatic detection of diabetic retinopathy from fundus images. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2022, 1–12.
32. Mahapatra, D.; Roy, P.K.; Sedai, S.; Garnavi, R. Retinal image quality classification using saliency maps and CNNs. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Cham, Switzerland, 2016; Volume 10019.
33. Chen, W.; Yang, B.; Li, J.; Wang, J. An approach to detecting diabetic retinopathy based on integrated shallow convolutional neural networks. IEEE Access 2020, 8, 178552–178562.
34. Bilal, A.; Sun, G.; Mazhar, S. Diabetic Retinopathy detection using Weighted Filters and Classification using CNN. In Proceedings of the 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 25–27 June 2021.
35. Bilal, A.; Sun, G.; Mazhar, S.; Imran, A. Improved Grey Wolf Optimization-Based Feature Selection and Classification Using CNN for Diabetic Retinopathy Detection. In Evolutionary Computing and Mobile Sustainable Networks; Lecture Notes on Data Engineering and Communications Technologies; Springer: Singapore, 2022; Volume 116, pp. 1–14.
36. Liu, Y.-P.; Li, Z.; Xu, C.; Li, J.; Liang, R. Referable diabetic retinopathy identification from eye fundus images with weighted path for convolutional neural network. Artif. Intell. Med. 2019, 99, 101694.
37. EyePACS-1. The World of Eyepacs. 2015. Available online: http://www.eyepacs.com (accessed on 10 February 2022).
38. Decencière, E.; Zhang, X.; Cazuguel, G.; Lay, B.; Cochener, B.; Trone, C.; Gain, P.; Ordóñez-Varela, J.-R.; Massin, P.; Erginay, A.; et al. Feedback on a publicly distributed image database: The Messidor database. Image Anal. Stereol. 2014, 33, 231–234.
39. Kauppi, T.; Kalesnykiene, V.; Kamarainen, J.K.; Lensu, L.; Sorri, I.; Uusitalo, H.; Kälviäinen, H.; Pietilä, J. DIARETDB0: Evaluation Database and Methodology for Diabetic Retinopathy Algorithms; Lappeenranta University of Technology: Lappeenranta, Finland, 2006.
40. Dunnhofer, M.; Antico, M.; Sasazawa, F.; Takeda, Y.; Camps, S.; Martinel, N.; Micheloni, C.; Carneiro, G.; Fontanarosa, D. Siam-U-Net: Encoder-decoder siamese network for knee cartilage tracking in ultrasound images. Med. Image Anal. 2020, 60, 101631.
41. Karimi, D.; Zeng, Q.; Mathur, P.; Avinash, A.; Mahdavi, S.; Spadinger, I.; Abolmaesumi, P.; Salcudean, S.E. Accurate and robust deep learning-based segmentation of the prostate clinical target volume in ultrasound images. Med. Image Anal. 2019, 57, 186–196.
42. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6–11 July 2015; Volume 1.
43. Mittal, S. A survey of FPGA-based accelerators for convolutional neural networks. Neural Comput. Appl. 2020, 32, 1109–1139.
44. Ciregan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012.
45. Karoly, P.; Ruehlman, L.S. Psychological ‘resilience’ and its correlates in chronic pain: Findings from a national community sample. Pain 2006, 123, 90–97.
46. Gómez-Valverde, J.J.; Antón, A.; Fatti, G.; Liefers, B.; Herranz, A.; Santos, A.; Sánchez, C.I.; Ledesma-Carbayo, M.J. Automatic glaucoma classification using color fundus images based on convolutional neural networks and transfer learning. Biomed. Opt. Express 2019, 10, 892–913.
47. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
48. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
49. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
50. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
51. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009.
52. Imran, A.; Li, J.; Pei, Y.; Akhtar, F.; Yang, J.-J.; Dang, Y. Automated identification of cataract severity using retinal fundus images. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2020, 8, 691–698.
53. Shayma’a, A.H.; Sayed, M.S.; Abdalla, M.I.; Rashwan, M.A. Breast cancer masses classification using deep convolutional neural networks and transfer learning. Multimed. Tools Appl. 2020, 79, 30735–30768.
54. Hemelings, R.; Elen, B.; Barbosa-Breda, J.; Lemmens, S.; Meire, M.; Pourjavan, S.; Vandewalle, E.; Van De Veire, S.; Blaschko, M.B.; De Boever, P.; et al. Accurate prediction of glaucoma from colour fundus images with a convolutional neural network that relies on active and transfer learning. Acta Ophthalmol. 2020, 98, e94–e100.
55. Grubbs, F.E. Errors of Measurement, Precision, Accuracy and the Statistical Comparison of Measuring Instruments. Technometrics 1973, 15, 53–66.
56. Sharma, M.; Sharma, S.; Singh, G. Performance analysis of statistical and supervised learning techniques in stock data mining. Data 2018, 3, 54.

Figure 1. (a) Normal retina, (b) Retina with DR.

Figure 2. Graphical representation of classes distribution. (a) EyePACS-1; (b) Messidor-2; (c) DiaretDB0.

Figure 3. Sample images from the datasets.

Figure 4. Proposed methodology.

Figure 5. FI RGB channel. (a) OI (Original Image); (b) RCH; (c) GCH; (d) BCH.

Figure 6. Preprocessing steps. (a) Original Image, (b) Resized Image, (c) G_C Image, (d) TB_H_T Image.

Figure 7. Using preprocessed images, multiple augmentation processes were applied to augment the retinal dataset.

Figure 8. The OD and BV segmentation from retinal FIs by employing U-Net. (a) OD segmentation; (b) BV segmentation.

Figure 9. CNN model for the features extraction from FIs.

Figure 10. Accuracy (Acc), F1-Score (F1-S), Sensitivity (Sen), Specificity (Spc), Precision (Pre), and AUC comparison for all models evaluated on (a) Messidor-2, (b) EyePACS-1, and (c) DIARETDB0 datasets.

Table 1. Dataset distribution of Messidor-2, EyePACS-1, and DIARETDB0.

Class	Severity Grade	Messidor-2	EyePACS-1	DIARETDB0
0	Normal	1017	7552	20
1	Mild	270	842	50
2	Moderate	347	545	35
3	Severe	75	54	15
4	PDR	35	95	10

Table 2. Hyperparameter configuration: momentum (M), batch size (B.S), learning rate (L.R), weight decay (W.D), optimizer (OPT), loss function (L.F), dropout (DO), class weight (C.W), and epochs (E).

Hyperparameter	Value
M	0.90
B.S	64
L.R	0.001
W.D	0.005
OPT	ADAM
L.F	Categorical Cross-Entropy
DO	0.5
C.W	[−1,1]
E	100

Table 3. Proposed model performance on the EyePACS-1 dataset.

Model	Accuracy	Sensitivity	Precision	Specificity	F1-Score	AUC
Inception-V3	0.9792	0.9694	0.9744	0.969	0.9710	0.9798
GoogLeNet	0.9615	0.9475	0.945	0.9622	0.9339	0.9815
AlexNet	0.9570	0.9308	0.9375	0.9439	0.9254	0.9748
ResNet	0.969	0.9381	0.9595	0.9615	0.9469	0.9705

Table 4. Proposed model performance on the Messidor-2 dataset.

Model	Accuracy	Sensitivity	Precision	Specificity	F1-Score	AUC
Inception-V3	0.9459	0.9481	0.9512	0.9435	0.9299	0.969
GoogLeNet	0.9375	0.943	0.923	0.9266	0.903	0.964
AlexNet	0.9315	0.9305	0.92	0.9234	0.9181	0.961
ResNet	0.94	0.933	0.9415	0.937	0.9365	0.97

Table 5. Proposed model performance on the DIARETDB0 dataset.

Model	Accuracy	Sensitivity	Precision	Specificity	F1-Score	AUC
Inception-V3	0.9352	0.9312	0.9232	0.9099	0.9122	0.949
GoogLeNet	0.9205	0.9074	0.9225	0.9222	0.917	0.947
AlexNet	0.913	0.9008	0.9175	0.919	0.91	0.944
ResNet	0.9245	0.9138	0.935	0.9196	0.922	0.955

Table 6. Performance comparison with literature.

Model	Accuracy	Precision	F-1 Score	Number of Images
CNN + SVM [27]	-	0.93	-	35,126
Shallow CNN [28]	0.805	-	0.85	5000
ShallowNet + MI/VI/PI [33]	0.87/0.90/0.92	-	-	35,000
VGG16 without FV [33]	0.93	0.83	-	35,000
LCNN [33]	0.89	-	-	35,000
LC-CNN [3]	0.884	-	-	1748
Am-InceptionV3 [37]	0.944	0.989	0.94	3662
Proposed Method (CNN + SVD + Inception-V3)	0.9792	0.9744	0.9710	10,966

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

