1 s2.0 S2666285X22000152 Main
Article info

Keywords: Algorithm; Classification; Feature extraction; Plant leaf diseases; Segmentation; Training

Abstract

Agricultural production is something on which the economy relies significantly. Leaf diseases in agriculture are a key issue for every nation, as food demand is expanding rapidly with the rise in population. Skin disorders, a class of illness caused by germs or infection, are commonly seen in animals and humans. Early and accurate identification and diagnosis of leaf and skin diseases are vital to keeping them from spreading. Image processing techniques, which involve mathematical equations and mathematical transformations, can be used for disease detection. To the human eye an image is a mixture of RGB colours, and these colours let us extract some features from the image. A computer, however, stores an image in a mathematical format and sees it as numbers. After representing the image as a numeric array or matrix, we apply various transforms that extract specific details from the picture. Before transformation, the image must undergo operations such as feature adjustment, which are also carried out mathematically. The project is implemented using K-Means Clustering and the Support Vector Machine algorithm in MATLAB, through which different types of leaf and skin diseases can be detected and distinguished.
M. Badiger, V. Kumara, S.C.N. Shetty et al. Global Transitions Proceedings 3 (2022) 272–278
∗ Corresponding author.
E-mail address: badiger_manju@yahoo.com (M. Badiger).
https://doi.org/10.1016/j.gltp.2022.03.010

Introduction

To the human eye an image is a mixture of RGB colours, but to digital devices such as computers and cameras an image is a matrix of real numbers. Since modern images are digital and numerical in character, we can apply mathematical transformations and, by adjusting their parameters, extract some of the hidden details in them [1]. Based on those extracted features, further analysis is possible, such as estimating a person's age by processing a face image and extracting skin data. Images are regarded as numbers within a matrix, and the magnitude of these numbers determines the colour of each pixel in the real world. To extract information, the relevant numbers within that matrix are highlighted by applying algorithms that mathematically transform the matrix. However, extracting the features alone does not reveal the nature of the sickness; the collected data must still be analysed, which takes time and involves a human operator. Advancements in image processing technology and algorithms can be used in medical science to determine various diseases and their stages. Some diseases can be identified by visual inspection, and some require both visual inspection and medical testing for confirmation [2]. As computers can adjust the parameters of images, they can accurately mark parts of an image as abnormal based on training data, which saves time. The key feature of AI is that, after training, it learns on its own, so it can offer a dynamic response to new types of data. AI is employed everywhere in today's world, and in the future all AI-based systems may be integrated as one, making the interface simple.

Initially, the Disease Detection System (DDS) is trained by providing numerical data for different disease types. In disease detection mode, the DDS compares sample data with the training data and provides the output that gives the least error between the two. In other words, it returns the result with the highest likelihood of a close approximation. Indian agriculture has a rich history, and it contributes to the country's economic strength. However, as new types of diseases are emerging and tests take a long time and are expensive, many farmers are reluctant to do these tests, and as a result, they may suffer losses [3]. Similarly, skin illnesses are spreading at an alarming rate. Most of us neglect minor skin diseases because of our hectic lifestyles and the initial cost of the test. Later, these minor diseases can cause substantial problems. Image processing-based detectors can therefore be used to determine the type of sickness at an early stage, so that measures or therapy can begin early and health problems are avoided.

Literature survey

Sachin D. Khirade and A.B. Patil [4] proposed a method for detecting leaf disease from images of green leaves. The method is based on K-Means clustering and some auxiliary algorithms. The detection steps are image acquisition, a pre-processing stage that applies some basic transforms, image segmentation carried out by the K-Means clustering algorithm, and classification carried out by an ANN block. The paper also gives some of the mathematics behind the transforms.

Leon Bottou and Yoshua Bengio [5] give details of the popular K-Means clustering algorithm. The paper shows that K-Means can be described either as a gradient descent algorithm or by extending the mathematics of the EM algorithm. They also showed that K-Means minimizes the quantization error, which they attribute to its use of a very fast Newton algorithm.

P.K. Agarwal and C.M. Procopiuc [6] propose an algorithm for solving the K-Centre problem under various metrics. They showed that the algorithm can be extended to other metrics and can also be used to solve the discrete K-Centre problem. They also described a simple (1 + ε)-approximation algorithm for the K-Centre problem.

Ahmed S. Abutaleb [7] extends the entropy-based thresholding technique to the 2D histogram, studying both the grey-level value of individual pixels and the average value of their neighbourhoods. The threshold is therefore a vector with two entries: the first is the pixel's grey level and the second is the average grey level of its neighbourhood. The proposed method works very efficiently when the noise level is small.

Nidhal K. El Abbadi, Nizar Saadi Dahir, Muhsin A. AL-Dhalimi, and Hind Restom [8] proposed a system for detecting skin disease using an ANN. The key features for detection are skin colour and GLCM texture features. Based on training data and colour detection, they successfully detected psoriasis, and the algorithm excluded healthy skin.

S. Arivazhagan, R. Newlin Shebiah, K. Divya, and M.P. Subadevi [9] implemented an automated human skin disease detection system based on texture analysis. The distributions of the skin components melanin and haemoglobin are separated by independent component analysis of the skin colour, and grey-level run-length matrices are used to derive texture features.

S. Kolkur, D. Kalbande, P. Shimpi, C. Bapat, and J. Jatakia [10] implemented a colour-based human skin detection system. Detection is based on the three primary colours (RGB) together with hue, saturation, value, chrominance, and luminance, and the main goal is to classify the pixels of a given image efficiently. The algorithm considers both the primary colours and their combinational ranges, which increases accuracy in recognizing the affected area.

Shima Ramesh and Ramachandra Hebbar [11] note that plant infections are a substantial danger to food security, but that identifying them rapidly remains difficult in many parts of the globe because the necessary infrastructure is lacking. Accurate algorithms for leaf-based image classification have demonstrated outstanding results. Their article uses Random Forest to differentiate between healthy and diseased leaves in the data sets provided. The work comprises several stages: dataset generation, feature extraction, training the classifier, and classification. The datasets of infected and healthy leaves are jointly used to train a Random Forest that categorizes images as infected or healthy. To extract image features, they use the Histogram of Oriented Gradients (HOG). Applying machine learning to the large publicly available data sets offers a clear way to identify plant diseases at a very large scale.

Monzurul Islam et al. [19] note that modern phenotyping and plant disease detection are promising steps towards food security and sustainable agriculture. In particular, image and computer vision-based phenotyping makes it possible to analyse quantitative plant physiology. Manual interpretation, by contrast, demands a significant amount of effort and knowledge of plant diseases, and also requires excessive processing time [20]. Their study describes a technique that combines image processing and machine learning to identify diseases from leaf images. This automated technique identifies diseases on potato plants using a publicly accessible plant image collection named 'Plant Village' [21]. Their segmentation technique and support vector machine classify diseases over 300 images with an accuracy of 95%. The suggested technique thus points towards automated plant disease identification on a large scale.

Proposed methodology

The basic control flow diagram of the disease detection system is shown in Fig 1. The system is mainly used to detect diseases of leaves and skin. The flowchart of the disease detection system comprises the following stages.

Fig. 1. Flow chart of Disease Detection System.

Input image

Image processing requires an image as its primary input, so an appropriate image of suitable size and resolution must be provided. The image can be loaded from the root folder or, for live image processing, an external camera can be interfaced to the system and the captured image loaded as the primary input. If required, an online image library can be used as the image source; the images are stored in cloud storage and can be loaded directly by accessing them. Without an image, the system treats the input as a null matrix.

Pre-processing

The Disease Detection System converts the input image into a two-dimensional matrix expressed in RGB format. Based on the magnitude of
the numbers inside the matrix, it is possible to determine the prevailing colour of the input image and to classify the image type as skin or leaf. If R > G, the image is of the skin type; otherwise, it is of the leaf type. The input image is then converted from the standard RGB format to the LAB colour space. Finally, all images are reduced to a consistent size.

Clustering and segmentation of image

The input image has all of the information required for processing. The underlying difficulty is that the affected area may be anywhere in the image. To detect the damaged area, the K-means algorithm is used: it splits the whole image into small sections and then processes each one. Unaffected areas are excluded from consideration, and affected areas are kept for further analysis. The K-means algorithm is an iterative technique that segments the dataset into K pre-defined, distinct, non-overlapping subgroups.

Extraction of feature & comparison with database

This is one of the most important steps in the flow of disease detection. Feature extraction refers to the process of turning raw image data into numerical features. This lowers the amount of resources required to represent the data while maintaining the information of the original data set, which gives better results. Image components such as colour and intensity are adjusted so that the hidden features of the diseases are highlighted. These extracted features help to detect disease quickly and efficiently when compared with the test database of skin or leaf disease.

Identification and classification of disease

An image of either a diseased leaf or diseased skin is given as input to the disease detection system. An SVM classifier is used for classification. If the input image is of skin disease, the system classifies it as Melanocytic naevus, Basal-cell carcinoma, or Actinic keratosis. If the input image is of leaf disease, the system classifies it as Alternaria Alternata, Anthracnose, Bacterial Blight, Cercospora Leaf Spot, or Healthy Leaf.

Algorithm description

Algorithms are sets of instructions that accomplish tasks like arithmetic, data processing, automated reasoning, and automated decision-making. The majority of machine learning solutions are created and deployed using off-the-shelf machine learning algorithms with small tweaks. Some of the algorithms utilised in the DDS are listed below.

K-Means Clustering

It is a centroid-based iterative algorithm, where K represents the number of clusters. At the beginning, for K = N it creates a centroid CN, and the algorithm starts at some random point Cm, where m is a random number between zero and N. This is the prime technique in our proposed design. The method was first suggested by Hugo Steinhaus in 1956, and the technique was modified and shaped by Stuart Lloyd in 1957. It is widely used in image processing and data mining. In this algorithm, the smaller the index of the cluster, the better the speed and performance. To reduce this index, the error criterion and squared errors are used as the basis of the algorithm. At the beginning of the process, it selects some points in the area of interest; these points are the focal points of the initial clusters. It then assigns the remaining points to a focal point: if the distance vector is minimum, the output is taken as the initial classification; otherwise the initial stage is modified by recalculating the focal point of each cluster. This process repeats until a satisfactory result is obtained. The method is generally simple and reliable [12]. The K-means algorithm depends heavily on the initial points; since they are selected at random, the outcome differs from run to run. To find the extremum, the algorithm uses the gradient method; in other words, it also depends on the target function, and the search direction of the gradient method always follows the direction of energy decrease. Consequently, if the initial points are not selected properly, the algorithm converges to a local minimum [13]. The algorithm uses an objective function given by

\[ J = \sum_{j=1}^{c} \sum_{i=1}^{k} \left( \lVert x_i - v_j \rVert \right)^{2} \tag{1} \]

where
$\lVert x_i - v_j \rVert$ is the Euclidean distance between $x_i$ and $v_j$,
$c$ is the number of clusters,
$k$ is the number of data points.

The algorithm flow is described in the following steps:
1. Initialize the centroids by randomly selecting $c$ cluster centres, then calculate $\lVert x_i - v_j \rVert$ for the different values of $i$ and $j$.
2. Assign each data point to the cluster centre whose distance from the data point is the minimum among all cluster centroids.
3. For the assigned data points, recalculate each centre using the equation

\[ v_j = \frac{1}{c_j} \sum_{x_i \in v_j} x_i \tag{2} \]

where $c_j$ is the number of data points in the $j$-th cluster.
4. Calculate the distance between each recalculated centre and its previous position. If the distance is very small or zero, stop the iteration; otherwise repeat the procedure from step 2.

Otsu threshold algorithm

This algorithm depends mainly on a set threshold value. Parts of the image that fall below the threshold are represented by binary zero, and parts above the threshold are represented by binary one; the algorithm is thus used to transform a greyscale image into a binary image [14]. Each pixel holds a numerical value from 0 to 255 (for an 8-bit image) that indicates its intensity, so threshold values are set to highlight or remove pixels. Once the threshold values are set, the intensity of each pixel can be modified based on that value, which helps to remove portions of the input picture. This feature makes the approach a prominent technique in image processing. It is usually applied when the picture is converted to a greyscale image or to a binary image. The threshold value is determined from the region of interest, which makes it easy to separate the undesired area from the area of interest.

The Otsu algorithm is as follows:
1. Since the output is a two-state binary image, divide the pixels into two clusters.
2. Calculate the mean value of each cluster and square the difference between the two means.
3. If m and n denote the number of pixels in each cluster, multiply this squared difference by both m and n. The threshold that maximizes this product is selected.

Boundary and spot detection algorithm

Boundary detection is a prominent approach in image processing that can be applied to identify an item or type and to segment an image. The boundary or edge can be determined by detecting a change in pixel values across a range of integers. It can be used as a pre-processing technique in disease diagnosis, since separating the affected portion is the initial stage of disease identification. By identifying the edge and deleting the remainder of the image, that section can be extracted. Detection of the affected region is based on the change of colour of the target item [15]. Even though it is a pre-processing tool, plenty of calculations have to be done. The algorithm converts the provided colour picture into three components, Hue, Saturation, and Intensity (the HSI model). With the aid of this enhanced image, the algorithm quickly detects the border and spots of a specified colour in the provided image, which makes it possible to determine the damaged region. The method also helps to separate the background from the target item, and so turns the raw image into a processable image.

GLCM algorithm

Any RGB image can be resolved into the HSI (Hue, Saturation and Intensity) model, and specific features can be extracted by adjusting the RGB values. The grey-level co-occurrence matrix (GLCM) is one of the most widely used algorithms in image processing for extracting the textural aspects of the input image [16]. This is one of the widely used methods
Where

\[ \theta = \cos^{-1}\left\{ \frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}} \right\} \tag{4} \]

\[ S = 1 - \frac{3}{R+G+B}\,\min[R, G, B] \tag{5} \]

\[ I = \frac{1}{3}\,[R+G+B] \tag{6} \]
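As an illustration of Eqs. (4) to (6), the following is a minimal Python sketch of the standard RGB-to-HSI conversion for a single pixel. It is not the authors' MATLAB code. The function name `rgb_to_hsi` is hypothetical, the inputs are assumed to be normalised to the range 0 to 1, and the hue rule (H = θ when B ≤ G, otherwise 360° − θ) is the usual companion definition, assumed here because the hue equation itself does not appear in the text above.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H in degrees, S, I)."""
    total = r + g + b
    intensity = total / 3.0                                  # Eq. (6)
    # Eq. (5); a black pixel (total == 0) is given zero saturation
    saturation = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total

    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        # Grey pixel: the hue angle is undefined, so 0 is returned by convention
        return 0.0, saturation, intensity

    # Clamp to [-1, 1] to guard acos against floating-point rounding
    ratio = max(-1.0, min(1.0, numerator / denominator))
    theta = math.degrees(math.acos(ratio))                   # Eq. (4)

    # Assumed standard hue rule (not stated in the surrounding text)
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

Applying the function to every pixel of an image yields the three HSI planes; a pure red pixel maps to a hue of 0° with full saturation, and any grey pixel maps to zero saturation.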
SVM algorithm
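The classification stage of the system relies on a Support Vector Machine. As a hedged illustration only, the sketch below trains a two-class linear SVM in Python by sub-gradient descent on the regularised hinge loss. It is not the authors' MATLAB implementation, which handles several disease classes. The function names `train_linear_svm` and `predict`, the hyperparameters `lam`, `epochs`, and `lr`, and the idea of feeding it short numeric feature vectors are all assumptions made for this example.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=200, lr=0.1):
    """Fit weights w and bias b by sub-gradient descent on the hinge loss.

    samples: list of equal-length feature tuples (hypothetical image features)
    labels:  list of +1 / -1 class labels
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin:
                # step along the hinge-loss sub-gradient plus the regulariser
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: apply only the regulariser
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

A multi-class system such as the one described here would typically combine several such binary classifiers, for example one per disease class against the rest; that combination step is left out of the sketch.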
Experimental results

Datasets for several leaf conditions, namely Alternaria Alternata, Anthracnose, Bacterial Blight, Cercospora Leaf Spot, and Healthy Leaf, are considered for training and prediction. Fig 2 shows the enhanced version of the input leaf image that is given as input to the model. K-Means Clustering is used to determine the number of clusters, as shown in Fig 3.

Once the clusters are obtained from the k-means algorithm, the cluster number of the affected area is entered in the pop-up window, as shown in Fig 4.
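The clustering and cluster-selection step can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' MATLAB code: it runs the assignment and update iterations of K-means on a list of pixel feature tuples and then reports what share of the pixels falls in the cluster chosen as the affected area. The names `kmeans` and `affected_area_percent` are hypothetical, and the initial centres are passed in by the caller (for example, drawn at random from the pixels) so that a run is reproducible.

```python
def kmeans(points, centres, max_iter=100, tol=1e-9):
    """Cluster feature tuples with Lloyd-style K-means iterations.

    points:  list of equal-length tuples (e.g. colour values per pixel)
    centres: initial cluster centres, one tuple per cluster
    Returns (labels, centres).
    """
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centre by squared Euclidean distance
        labels = [
            min(range(len(centres)),
                key=lambda j: sum((p - q) ** 2 for p, q in zip(x, centres[j])))
            for x in points
        ]
        # Update step: each centre becomes the mean of its assigned points
        new_centres = []
        for j, old in enumerate(centres):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(old)  # an empty cluster keeps its centre
        # Stop when no centre has moved by more than the tolerance
        shift = max(sum((a - b) ** 2 for a, b in zip(o, n))
                    for o, n in zip(centres, new_centres))
        centres = new_centres
        if shift <= tol:
            break
    return labels, centres


def affected_area_percent(labels, cluster):
    """Percentage of pixels assigned to the selected (affected) cluster."""
    return 100.0 * labels.count(cluster) / len(labels)
```

In this sketch the cluster index passed to `affected_area_percent` plays the role of the number typed into the pop-up window, and the returned value corresponds to an "affected area" percentage computed as a simple pixel ratio, which is an assumption about how such a figure could be derived.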
After the cluster number of the affected area is entered, a result window is displayed that contains information such as the type of the input image (leaf or skin), disease type and description, scientific classification, and required treatments (Fig 5).

Similarly, we considered datasets for several skin diseases, namely Melanocytic naevus, Melanoma, Basal-cell carcinoma, Dermatofibroma, and Actinic keratosis. Fig 6 shows the enhanced version of the input image. Fig. 7 shows the K-Means Clustering of the image. Fig. 8 shows the entry of the affected area's cluster number, and Fig. 9 depicts the result window, which displays the kind of input, disease type and description, severity, and required treatments.

Table 1 provides a comparative analysis of different techniques/algorithms for leaf disease detection. From Table 1, we can see that SVM performs better than the other classifiers listed, with the proposed system reaching 96%.

Table 1
Comparison of various detection techniques/algorithms of leaf disease detection.

Reference                          Applied technique          Disease   Accuracy
Asfarian et al., 2013 [17]         Texture Analysis and PNN   Rice      83%
Hu YH et al., 2016 [18]            Hyperspectral Imaging      Potato    95%
Monzurul Islam et al., 2017 [19]   RGB Imaging                Potato    95%
Proposed                           K-Means & SVM              Tomato    96%

Table 2 shows the accuracy analysis of the disease detection system for different types of leaf and skin input images using SVM. The system provides almost 96% accuracy for all types of diseased input images.

Table 2
Accuracy analysis of proposed Disease Detection System for different types of leaf and skin disease using SVM.

Disease type            Affected area   Accuracy
Bacterial Blight        15.0217%        96.3871%
Healthy Leaf            8.8002%         96.2%
Anthracnose             17.4244%        96.3%
Cercospora Leaf Spot    20.0391%        96.3871%
Alternaria Alternata    15.1245%        96.3871%
Melanocytic nevus       15.0015%        96.3871%
Melanoma                52.4064%        96.3871%
Basal-cell carcinoma    15.001%         96.3%
Dermatofibroma          15.0031%        96.2%
Actinic keratosis       32.5603%        96.3871%

Conclusion

Appropriate disease detection and classification is very important to avoid unexpected events. In this paper, we have discussed the functional blocks and algorithms required for image processing. The project was implemented in MATLAB and successfully detected and classified skin and leaf diseases based on their physical appearance using the algorithms discussed above. By varying the training data, the disease detection capability can be extended. The system described above may be upgraded to a real-time video access system that enables uninterrupted plant care.

References

[1] I.H. Sarker, Machine learning: algorithms, real-world applications and research directions, SN Comput. Sci. 2 (2021) 160, doi:10.1007/s42979-021-00592-x.
[2] F. Andreotti, O. Carr, M.A.F. Pimentel, A. Mahdi, M. De Vos, Comparing feature-based classifiers and convolutional neural networks to detect arrhythmia from short segments of ECG, 2017 Comput. Cardiol. (CinC) (2017) 1–4, doi:10.22489/CinC.2017.360-239.
[3] L. Li, S. Zhang, B. Wang, Plant Disease Detection and Classification by Deep Learning - A Review, IEEE Access 9 (2021) 56683–56698, doi:10.1109/ACCESS.2021.3069646.
[4] Sachin D. Khirade, A.B. Patil, Savitribai Phule Pune University, Pune, India: Plant Disease Detection Using Image Processing, IEEE, 2015.
[5] L. Bottou, Y. Bengio, Convergence Properties of the k-means Algorithms, in: G. Tesauro, D. Touretzky (Eds.), Advances in Neural Information Processing Systems, 7, MIT Press, 1995, pp. 585–592.
[6] P.K. Agarwal, C.M. Procopiuc, Exact and Approximation Algorithms for Clustering, in: Proc. Ninth Ann. ACM-SIAM Symp. Discrete Algorithms, Jan. 1998, pp. 658–667.
[7] A.S. Abutaleb, Automatic thresholding of gray-level pictures using two-dimensional entropy, Comput. Vis. Graph. Image Process. 47 (1989) 22–32.
[8] N. Abbadi, N. Saadi Dahir, M. Dhalimi, H. Restom, Psoriasis Detection Using Skin Color and Texture Features, Journal of Computer Science 6 (6) (2010) 648–652, ISSN 1549–3636.
[9] S. Arivazhagan, R. Shebiah, K. Divya, M. Subadevi, Skin disease classification by extracting independent components, J. Emerg. Trends Comput. Inf. Sci. 3 (10) (2012) 1379–1382.
[10] S. Kolkur, D. Kalbande, P. Shimpi, C. Bapat, J. Jatakia, Human skin detection using RGB HSV and YCbCr Color models, Adv. Intell. Syst. Res. 137 (2016) 324–332.
[11] Ramesh, et al., Plant Disease Detection Using Machine Learning, in: 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C), 2018, pp. 41–45, doi:10.1109/ICDI3C.2018.00017.
[12] U. Rahamathunnisa, M.K. Nallakaruppan, A. Anith and S, K.S. Kumar, Vegetable Disease Detection Using K-Means Clustering And Svm, in: 6th International Conference on Advanced Computing & Communication Systems, 2020, pp. 1308–1311, doi:10.1109/ICACCS48705.2020.9074434.
[13] M.B. Eisen, P.T. Spellman, P.O. Brown, D. Botstein, Cluster analysis and display of genome-wide expression patterns, Proc. Nat. Acad. Sci. 95 (1998) 14863–14868.
[14] Z. Ye, L. Ma, W. Zhao, W. Liu, H. Chen, A Multi-level Thresholding Approach Based on Group Search Optimization Algorithm and Otsu, in: 2015 8th International Symposium on Computational Intelligence and Design, ISCID, 2015, pp. 275–278, doi:10.1109/ISCID.2015.26.
[15] G. Shi, J. Suo, C. Liu, K. Wan, X. Lv, Moving target detection algorithm in image sequences based on edge detection and frame difference, in: 2017 IEEE 3rd Information Technology and Mechatronics Engineering Conference, 2017, pp. 740–744, doi:10.1109/ITOEC.2017.8122449.
[16] Kumar Vaibhav, et al., K-mean clustering-based cooperative spectrum sensing in generalized K-𝜇 fading channels, Communication (NCC) 2016 Twenty Second National Conference, 2016.
[17] A. Asfarian, Y. Herdiyeni, A. Rauf, K.H. Mutaqin, Paddy diseases identification with texture analysis using fractal descriptors based on fourier spectrum, in: 2013 International Conference on Computer, Control, Informatics and Its Applications (IC3INA), 2013, pp. 77–81, doi:10.1109/IC3INA.2013.6819152.
[18] Y.H. Hu, X.W. Ping, M.Z. Xu, W.X. Shan, Y. He, Detection of Late Blight Disease on Potato Leaves Using Hyperspectral Imaging Technique, PubMed 36 (2) (2016) 515–519.
[19] M. Islam, Anh Dinh, K. Wahid, P. Bhowmik, Detection of potato diseases using image segmentation and multiclass support vector machine, in: 2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering, CCECE, 2017, pp. 1–4, doi:10.1109/CCECE.2017.7946594.
[20] Manjunatha Badiger, Jose Alex Mathew, Retrospective Review of Activation Functions in Artificial Neural Networks, in: V. Bindhu (Ed.), Proceedings of Third International Conference on Communication, Computing and Electronics Systems, Lecture Notes in Electrical Engineering, 844, Third, Springer, Singapore, 2022, pp. 905–919. 981-16-8862-1_59, In press.
[21] J.P. Nayak, PCB Fault detection using Image processing, IOP Conference Series: Materials Science and Engineering, 255, IOP Publishing Ltd, 2017, 012244.