
TYPE Original Research
PUBLISHED 08 December 2023

DOI 10.3389/fpls.2023.1308528
Plant disease detection model for edge computing devices

Ameer Tamoor Khan 1, Signe Marie Jensen 1*, Abdul Rehman Khan 2 and Shuai Li 3
1 Department of Plant and Environmental Science, University of Copenhagen, Copenhagen, Denmark
2 Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
3 Department of Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland

OPEN ACCESS
EDITED BY
José Dias Pereira, Instituto Politecnico de Setubal (IPS), Portugal
REVIEWED BY
Vítor Viegas, Naval School, Portugal
Jakub Nalepa, Silesian University of Technology, Poland
*CORRESPONDENCE
Signe Marie Jensen
smj@plen.ku.dk
RECEIVED 06 October 2023

ACCEPTED 22 November 2023
PUBLISHED 08 December 2023
CITATION
Khan AT, Jensen SM, Khan AR and Li S (2023) Plant disease detection model for edge computing devices. Front. Plant Sci. 14:1308528. doi: 10.3389/fpls.2023.1308528
COPYRIGHT
© 2023 Khan, Jensen, Khan and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

In this paper, we address the question of achieving high accuracy in deep learning models for agricultural applications through edge computing devices while considering the associated resource constraints. Traditional and state-of-the-art models have demonstrated good accuracy, but their practicality as end-user available solutions remains uncertain due to current resource limitations. One agricultural application for deep learning models is the detection and classification of plant diseases through image-based crop monitoring. We used the publicly available PlantVillage dataset containing images of healthy and diseased leaves for 14 crop species and 6 groups of diseases as example data. The MobileNetV3-small model succeeds in classifying the leaves with a test accuracy of around 99.50%. Post-training optimization using quantization reduced the number of model parameters from approximately 1.5 million to 0.93 million while maintaining the accuracy of 99.50%. The final model is in ONNX format, enabling deployment across various platforms, including mobile devices. These findings offer a cost-effective solution for deploying accurate deep-learning models in agricultural applications.
KEYWORDS
PlantVillage, deep learning, classifier, edge computing, MobileNetV3
1 Introduction
Plant diseases can be a major concern for farmers due to the risk of substantial yield loss.
While applying pesticides can prevent or limit the impact of most plant diseases, their use should
be restricted due to environmental considerations. Early and efficient detection of plant diseases
and their distribution in the field is crucial for effective treatment. The implementation of
automatic plant disease detection systems is, therefore, essential for efficient crop monitoring.
Deep Learning Convolutional Neural Networks (CNNs) and computer vision are two developing
AI technologies that have recently been employed to identify plant leaf diseases automatically.
Already in 1980, Fukushima (1980) presented a visual cortex-inspired multilayer
artificial neural network for image classification. The network showed that the initial
layer detects simpler patterns with a narrow receptive field, while later levels combine
patterns from earlier layers to identify more complex patterns with wider fields. In 2012,
Frontiers in Plant Science 01 frontiersin.org

Khan et al. 10.3389/fpls.2023.1308528
Krizhevsky et al. (2012) developed the AlexNet architecture, which helped them win the ImageNet Large Scale Visual Recognition Challenge. Several CNN (Convolutional Neural Network) designs have been introduced since then Krizhevsky et al. (2012); Fu et al. (2018); Yang et al. (2023); Dutta et al. (2016); Sarda et al. (2021). These models are called “deep learning” architectures due to their 5-200 layers. Early investigations employed manually created characteristics from leaf picture samples. Later, the trends shifted to DCNN (Deep Convolutional Neural Network) architectures capable of effectively classifying data and automatically extracting features. Plant disease picture classification has been used to test a variety of CNN architectures Amara et al. (2017); Sladojevic et al. (2016); Setiawan et al. (2021); Yang et al. (2023); Qiang et al. (2019); Swaminathan et al. (2021); Schuler et al. (2022).
Plant disease diagnosis through image analysis employs various machine learning techniques Ferentinos (2018). These methods identify and classify diseases affecting cucumbers, bananas Fujita et al. (2016), cassavas Amara et al. (2017), tomatoes Ramcharan et al. (2017), and wheat Fuentes et al. (2017). Ramcharan et al. (2017) tested five architectures (AlexNet, AlexNetOWTBn, GoogLeNet, Overfeat, and VGG) on 58 classes of healthy and sick plants. AlexNet achieved 99.06% and VGG 99.48% test accuracy. Despite the large variation in trainable parameters, these designs had test accuracy above 99%. Maeda-Gutiérrez et al. (2020) tested five architectures for tomato illnesses. All architectures tested had accuracies above 99%. However, when tested on field pictures, Ramcharan et al. (2017) encountered shadowing and leaf misalignment. These factors greatly affected classification accuracy.
Amara et al. (2017) classified banana leaf diseases using 60×60 pixel pictures and a simple LeNet architecture. Grayscale images had 85.94%, and RGB images had 92.88% test accuracy. Chromatic information Mohanty et al. (2016) is essential in plant leaf disease classification. Mohanty et al. (2016) used AlexNet and GoogLeNet (Inception V1) designs to study plant leaf diseases and found RGB images to be more accurate than their grayscale counterparts. Likewise, Schuler et al. (2022) split the Inception V3 architecture into two branches, one dealing with the grayscale part of the RGB image and the other branch dealing with the other two channels of the RGB image. The resultant architecture has 5 million trainable parameters and achieved an accuracy of 99.48% on the test dataset.
While these studies demonstrate the effectiveness of deep learning in plant disease classification, they often do not address the critical challenge of deploying these models on resource-constrained edge devices. In contrast, our work not only achieves high accuracy but also emphasizes optimizing deep learning models for such constraints. Recent advancements in the field substantiate this focus. For instance, Hao et al. (2023) discusses system techniques that enhance DL inference throughput on edge devices, a key consideration for real-time applications in agriculture. Similarly, the DeepEdgeSoc framework Al Koutayni et al. (2023) accelerates DL network design for energy-efficient FPGA implementations, aligning with our resource efficiency goal. Moreover, approaches like resource-frugal quantized CNNs Nalepa et al. (2020) and knowledge distillation methods Alabbasy et al. (2023) resonate with our efforts to compress model size while maintaining performance. These studies highlight the importance of balancing computational demands with resource limitations, a core aspect of our research. Thus, our work stands out by not only addressing the accuracy of plant disease detection but also ensuring the practical deployment of these models in real-world agricultural settings where resources are limited.
One major drawback in the broader field is that deep-learning approaches often have computational requirements, i.e., higher memory and computing capacity, which are not always feasible for edge computing devices. Our paper tackles this challenge head-on, focusing on maximizing accuracy while operating within the resource constraints inherent to edge computing devices, thereby significantly enhancing the real-life applicability of deep learning models in agriculture.
The remaining part of the paper is organized as follows: Section 2 will look into the PlantVillage dataset, then we will explore the MobileNetV3-small architecture, model training, and finally, the post-training quantization. Section 3 will discuss the results and the comparison with existing methods. In Section 4, we will discuss the importance of the problem and the relevance of our results. Finally, Section 5 will conclude the paper with final remarks.

2 Materials and methods
2.1 PlantVillage dataset

The present work used the publicly available PlantVillage-Dataset (2016). All images in the PlantVillage database were captured at experimental research facilities connected to American Land Grant Universities. The dataset included 54,309 images of 14 crop species, including tomato, apple, bell pepper, potato, raspberry, soybean, squash, strawberry, and grape. A few sample images of the plants are shown in Figure 1. It could be seen that some samples were healthy,
FIGURE 1
Sample images of the PlantVillage dataset. It is a diverse dataset with 14 plant species, including healthy and infected plants. The dataset includes a
total of 54,309 image samples.

and some were infected. There were 17 fungal infections, 4 bacterial diseases, 2 viral diseases, 1 mite disease, and 1 mold (oomycete). There were images of healthy leaves from 12 crop species, showing no obvious signs of disease. In total, the dataset included 38 classes of healthy and unhealthy crops. A detailed description of the distribution of species and diseases in the dataset is shown in Table 1. It included 14 crop species with 6 types, i.e., fungi, bacteria, mold, virus, mite, and healthy. The dataset is imbalanced and not equally distributed across all 6 types.
To further elaborate on the imbalanced nature of the dataset, t-SNE analysis was performed. t-SNE, or t-Distributed Stochastic Neighbor Embedding, is a machine learning technique used to reduce dimensionality and visualize high-dimensional data. It attempts to represent complex, high-dimensional data in a lower-dimensional space while maintaining data point relationships. The overlap between classes is clearly visible in Figure 2, where the PlantVillage data were reduced to two dimensions.

2.2 MobileNetV3-small

Recent research has focused on deep neural network topologies that balance accuracy and efficiency. Innovative handcrafted structures and algorithmic neural architecture search have advanced this discipline.
SqueezeNet used 1×1 convolutions with squeeze-and-expand modules to reduce parameters Iandola et al. (2016). Recent research has focused on minimizing MAdds (multiply-add operations) and latency instead of parameters. Depthwise separable convolutions boosted
TABLE 1 Distribution of observations in the PlantVillage dataset.
Crop (total)         Fungi  Bacteria  Mold  Virus  Mite  Healthy
Apple (3172)         1521   -         -     -      -     1645
Blueberry (1502)     -      -         -     -      -     1502
Bell Pepper (2475)   -      997       -     -      -     1478
Cherry (1906)        1052   -         -     -      -     854
Corn (3852)          2690   -         -     -      -     1162
Grape (4063)         3640   -         -     -      -     423
Orange (5507)        -      5507      -     -      -     -
Peach (2657)         -      2291      -     -      -     360
Potato (2152)        1000   -         1000  -      -     152
Raspberry (371)      -      -         -     -      -     371
Soybean (5090)       -      -         -     -      -     5090
Squash (1835)        1835   -         -     -      -     -
Strawberry (1565)    1109   -         -     -      -     456
Tomato (18,162)      5127   2127      1910  5730   1676  1592
The values in parentheses represent the total number of images for that crop in the dataset; “-” marks combinations with no images.
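To make the imbalance noted in Section 2.1 concrete, the counts in Table 1 can be summed per disease type. The sketch below is illustrative only: the assignment of each count to a column is inferred from the flattened table and from the disease names in PlantVillage (an assumption, not something the table states explicitly), and the names `TABLE_1` and `type_totals` are hypothetical.

```python
# Counts transcribed from Table 1. The column assignment is an assumption
# inferred from the flattened table; empty cells are simply omitted.
TABLE_1 = {
    "Apple": {"Fungi": 1521, "Healthy": 1645},
    "Blueberry": {"Healthy": 1502},
    "Bell Pepper": {"Bacteria": 997, "Healthy": 1478},
    "Cherry": {"Fungi": 1052, "Healthy": 854},
    "Corn": {"Fungi": 2690, "Healthy": 1162},
    "Grape": {"Fungi": 3640, "Healthy": 423},
    "Orange": {"Bacteria": 5507},
    "Peach": {"Bacteria": 2291, "Healthy": 360},
    "Potato": {"Fungi": 1000, "Mold": 1000, "Healthy": 152},
    "Raspberry": {"Healthy": 371},
    "Soybean": {"Healthy": 5090},
    "Squash": {"Fungi": 1835},
    "Strawberry": {"Fungi": 1109, "Healthy": 456},
    "Tomato": {"Fungi": 5127, "Bacteria": 2127, "Mold": 1910,
               "Virus": 5730, "Mite": 1676, "Healthy": 1592},
}


def type_totals(table):
    """Sum the image counts of every crop for each disease type."""
    totals = {}
    for counts in table.values():
        for kind, n in counts.items():
            totals[kind] = totals.get(kind, 0) + n
    return totals


totals = type_totals(TABLE_1)
largest = max(totals, key=totals.get)
smallest = min(totals, key=totals.get)
imbalance_ratio = totals[largest] / totals[smallest]

# Print the types from most to least frequent.
for kind, n in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{kind:9s}{n:7d}")
```

Under this column assignment, fungal and healthy images dominate, while the mite category is roughly ten times smaller than the fungal one.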
FIGURE 2
Visualization of the 38 classes in the PlantVillage data in two dimensions based on a t-SNE analysis. Each color in the spectrum represents one class
in the PlantVillage dataset.

computational efficiency in MobileNetV1 Howard et al. (2017). MobileNetV2 added a resource-efficient block with inverted residuals and linear bottlenecks to improve efficiency Howard et al. (2018).
Later, MobileNetV3 Howard et al. (2019) extended MobileNetV2’s efficient neural network design. MobileNetV3’s backbone network, “MobileNetV3-Large,” used linear bottlenecks and inverted residual blocks to increase accuracy and efficiency. Hierarchical squeeze-and-excitation (HSqueeze-and-Excitation) blocks adaptively recalibrated feature responses in MobileNetV3. Hard-Swish and Mish activation functions balanced computing efficiency and non-linearity. MobileNetV3 used neural architecture search to find optimal network architectures.
MobileNetV3-small was created for resource-constrained situations. Its tiny, lightweight neural network system is efficient and accurate. MobileNetV3-small achieved this through architectural optimizations, a simplified design, and decreased complexity. A reduced network footprint reduced parameters and operations. MobileNetV3-compact solved several real-world problems with low computing resources or edge device deployment with a compact but efficient architecture. It introduced several key components to optimize performance and achieve high accuracy with fewer parameters.

2.2.1 Initial convolution
An RGB image of size (B,H,W,3), where B is the batch size, H is the height, and W is the width, is used as an input. The image is passed through a standard convolutional layer with a small filter size (e.g., 3x3) and a moderate number of channels (e.g., 16).

2.2.2 Bottleneck residual blocks
MobileNetV3-small uses inverted bottleneck residual blocks, similar to its predecessor, MobileNetV2. The architecture is shown in Figure 3. Each block begins with a depth-wise convolution, which convolves each input channel separately with its small filter (e.g., 3x3), significantly reducing the computational cost. The depth-wise convolution is followed by a point-wise convolution with 1×1 filters to increase the number of channels. A nonlinear activation function (e.g., ReLU) is applied to introduce nonlinearity.

2.2.3 Squeeze-and-excite module
The Squeeze-and-Excite (SE) module is incorporated into the MobileNetV3-small architecture to improve feature representation and adaptively recalibrate channel-wise information. The SE module contains two steps:
• Squeeze: Global average pooling is applied to the feature maps, reducing spatial dimensions to 1×1.
• Excite: Two fully connected (FC) layers are used to learn channel-wise attention weights. These weights are multiplied with the original feature maps to emphasize essential features and suppress less relevant ones.

2.2.4 Stem blocks
MobileNetV3-small introduces stem blocks to further enhance feature extraction at the beginning of the network. The stem block consists of a combination of depth-wise and point-wise convolutions with nonlinear activation.

2.2.5 Classification head
After multiple stacked bottleneck blocks and SE modules, the final feature maps are passed through a classification head to make predictions. Global average pooling is applied to the feature maps to reduce spatial dimensions to 1×1. The output of global average pooling is then fed into a fully connected layer with “softmax” activation to produce K class probabilities, as shown in Figure 4. The overall architecture is shown in Table 2.
The architecture focuses on reducing the number of parameters while maintaining competitive accuracy. The number of parameters in MobileNetV3-small is 1.5 million, which makes it suitable for deployment on resource-constrained devices and applications that require real-time inference.
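The two SE steps described in Section 2.2.3 can be sketched in a few lines of plain Python. This is a minimal illustration, not the PyTorch implementation used in the paper: the function name `squeeze_excite`, the toy weights, the ReLU after the first FC layer, and the sigmoid gate are assumptions made for the example, and biases are omitted.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Apply a squeeze-and-excite step to feature maps shaped [C][H][W].

    w1 is a [C_reduced][C] matrix and w2 a [C][C_reduced] matrix (toy FC
    weights without biases). The activation choices are assumptions.
    """
    # Squeeze: global average pooling reduces each channel to one number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excite, first FC layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excite, second FC layer followed by a sigmoid: one gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Recalibrate: scale every value of a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates


# Two channels of size 2x2; channel 0 has the larger mean activation.
x = [
    [[4.0, 4.0], [4.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
w1 = [[0.5, 0.5]]       # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]    # 1 hidden unit -> 2 channel gates
y, gates = squeeze_excite(x, w1, w2)
print(gates)
```

With these toy weights the first channel receives a gate close to 1 and the second a gate close to 0, which is the "emphasize and suppress" behavior the text describes.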
FIGURE 3
The MobileNetV3 block uses depthwise and pointwise convolutions to collect spatial patterns and integrate features. These blocks balance computing
performance and precision, helping MobileNetV3 interpret complicated visual data.
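The saving from splitting a convolution into depth-wise and point-wise parts can be checked with simple arithmetic. The sketch below counts weights only (biases and batch normalization are ignored), the layer sizes are illustrative values rather than ones measured on the trained model, and the function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k filter spanning all input channels for every output channel.
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


k, c_in, c_out = 3, 16, 24  # illustrative sizes only
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
reduction = standard / separable
print(standard, separable, round(reduction, 2))
```

For these sizes the separable form needs 528 weights instead of 3,456, about 6.5 times fewer.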

FIGURE 4
Overall architecture of MobileNetV3-small: a lightweight neural network design featuring depth-wise convolutions, inverted residuals, and a squeeze-and-excitation module for efficient feature extraction, targeted at mobile and edge devices.
TABLE 2 Specification of MobileNetV3-Small.
Input            Operator             Exp-Size  #Out  SE  NL  Stride
224 × 224 × 3    Conv2d, 3 × 3        -         16    -   HS  2
112 × 112 × 16   BottleNeck, 3 × 3    16        16    ✓   RE  2
56 × 56 × 16     BottleNeck, 3 × 3    72        24    -   RE  2
28 × 28 × 24     BottleNeck, 3 × 3    88        24    -   RE  1
28 × 28 × 24     BottleNeck, 5 × 5    96        40    ✓   HS  2
14 × 14 × 40     BottleNeck, 5 × 5    240       40    ✓   HS  1
14 × 14 × 40     BottleNeck, 5 × 5    240       40    ✓   HS  1
14 × 14 × 40     BottleNeck, 5 × 5    120       48    ✓   HS  1
14 × 14 × 48     BottleNeck, 5 × 5    144       48    ✓   HS  1
14 × 14 × 48     BottleNeck, 5 × 5    288       96    ✓   HS  2
7 × 7 × 96       BottleNeck, 5 × 5    576       96    ✓   HS  1
7 × 7 × 96       BottleNeck, 5 × 5    576       96    ✓   HS  1
7 × 7 × 96       Conv2d, 1 × 1        -         576   ✓   HS  1
7 × 7 × 576      Pool, 7 × 7          -         -     -   -   1
1 × 1 × 576      Conv2d, 1 × 1, NBN   -         1024  -   HS  1
1 × 1 × 1024     Conv2d, 1 × 1, NBN   -         K     -   -   1
Conv2d, 2D convolution; BottleNeck, bottleneck residual block; NBN, no batch normalization; HS, Hard-Swish activation function; RE, Rectified Linear Unit (ReLU) activation function; Pool, pooling layer.
“✓” indicates that a squeeze-excitation (SE) layer is used in that block and “-” indicates that it is not.
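The two nonlinearities in the NL column of Table 2 are cheap to compute. As a brief sketch, the functions below follow the usual definitions, ReLU(x) = max(0, x) and Hard-Swish(x) = x · ReLU6(x + 3) / 6; the function names are chosen here for illustration and are not taken from the paper's code.

```python
def relu(x):
    return max(0.0, x)


def relu6(x):
    # ReLU clipped at 6.
    return min(max(0.0, x), 6.0)


def hard_swish(x):
    # Piecewise-linear approximation of swish; no exponential is needed.
    return x * relu6(x + 3.0) / 6.0


samples = [-4.0, -3.0, -1.0, 0.0, 1.0, 3.0, 5.0]
for x in samples:
    print(f"x={x:5.1f}  ReLU={relu(x):4.1f}  Hard-Swish={hard_swish(x):7.4f}")
```

Hard-Swish is zero for inputs at or below -3, equals the identity for inputs at or above 3, and is slightly negative in between for negative inputs, which distinguishes it from ReLU.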
2.3 Model optimization

Model optimization, or quantization, is an essential deep-learning technique that reduces a neural network’s memory footprint and computational complexity. Quantization enables efficient deployment on resource-constrained devices, such as mobile phones, peripheral devices, and microcontrollers, by converting the weights and activations of a full-precision model into lower-precision representations (e.g., 8-bit integers) Zhu et al. (2016). The procedure entails careful optimization to minimize the impact on model performance while achieving significant gains in model size reduction and faster inference times. Static quantization quantizes model weights and activations during training, whereas dynamic quantization quantizes model weights and activations based on the observed activation range at runtime.
For model quantization, the “PyTorch” built-in quantization tool was used Pytorch (2023). The PyTorch library’s torch.quantization.quantize_dynamic function was used to dynamically quantize particular layers in a given classifier model. The torch.quantization.quantize_dynamic function clones the input

“model” before converting it into a quantized form. It then locates the cloned model’s layers corresponding to the requested classes, such as Linear (fully connected layers) and Conv2d (2D convolutional layers). The weights and activations of each recognized layer are subjected to dynamic quantization. The activations are quantized at runtime depending on the observed dynamic range during inference, whereas the weights are quantized to int8 (8-bit integers). The quantized layers replace the originals in the cloned model, while the other layers are left in their original floating-point format. Compared to the original full-precision model, the quantized model has a smaller memory footprint and better computational efficiency, and it is prepared for inference on hardware or platforms that support integer arithmetic.
While quantization is our chosen method, it is important to acknowledge that there are other effective techniques for compressing deep learning models. These include knowledge distillation, where a smaller model is trained to emulate a larger one Hinton et al. (2015), pruning, which involves removing less important neurons Han et al. (2015), and low-rank factorization, a technique for decomposing weight matrices Jaderberg et al. (2014). Each of these methods offers unique advantages in model compression and can be particularly beneficial in scenarios with limited computational resources. However, for the goals and constraints of our current study, quantization emerged as the most suitable approach.
The above technique was employed to quantize “Linear” and “Conv2d” layers with lower-precision representations, i.e., 8-bit.

2.4 Model training

For the model training, the MobileNetV3-small model from PyTorch, trained on ImageNet data, was employed. The training pipeline was simple as it did not involve any preprocessing of the image data. The model was fed with PlantVillage images of resolution 224×224. The hardware specifications were as follows:
• Processor: 11th Gen Intel(R) Core(TM) i9-11950H @ 2.60 GHz 2.61 GHz
• RAM: 64 GB
• GPU: Intel(R) UHD Graphics & NVIDIA RTX A3000
Although the model was trained on a GPU, the final quantized model was intended for CPU and edge devices. The optimizer parameters were as follows:
• Optimizer: Adam optimizer
• Betas: (0.5,0.99)
• Learning rate: 0.0001
Some additional model-training hyperparameters included:
• Batch Size: 64
• Epochs: 200
• Training Data Percentage: 80%
• Validation & Test Data Percentage: 10% each.

3 Results

The training and testing dataset included samples from all 38 classes. “Cross-entropy” was used as the loss function for the classification. The model’s performance was evaluated based on two key metrics: Accuracy (Equation 1) and F1 score (Equation 4). Accuracy, defined as the proportion of correct predictions to the total number of predictions, reflects the overall effectiveness of the model in classification tasks. In our study, the initial accuracy of the pre-trained model was 97%, which increased to a maximum test accuracy of 99.50% at the 154th epoch. This metric essentially gauges the model’s ability to label classes correctly. On the other hand, the F1 score, a harmonic mean of precision (Equation 2) (the proportion of true positive predictions in the total positive predictions) and recall (Equation 3) (the proportion of true positive predictions in the actual positive cases), measures the model’s ability to accurately identify positive examples while minimizing false positives. This metric is especially useful in understanding the model’s precision and robustness in identifying correct classifications without mistakenly labeling incorrect ones as correct. The trajectory of the model’s accuracy with MobileNetV3-Small is shown in Figure 5. Similarly, the training loss, i.e., cross-entropy loss, rapidly approached 0 and was ultimately reduced to 0 at the 136th epoch. The trajectory of the training loss for MobileNetV3-Small is depicted in Figure 6.

$$\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \tag{1}$$

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \tag{2}$$

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{3}$$

$$\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$

Later, the model was quantized, and the parameters were reduced to 0.9 million without reducing the accuracy of 99.50%. The inference time of the model was 0.01 seconds, and it achieved a frame rate of 100 frames per second (FPS) when running on a CPU.
The higher-dimensional latent space of the model was also visualized using t-SNE Van der Maaten and Hinton (2008). 54,309 images of 38 classes were input to the trained model, and the output from the second-to-last layer of the MobileNetV3-small, which had dimensions of 1024, was obtained. Using t-SNE, the dimensions were reduced to 2, and the results were plotted to see the underlying classification modeling of the model. The results are shown in Figure 7. By forming distant clusters, it can be seen that the model efficiently classified 38 classes of plants.
Finally, the model was compared with other state-of-the-art architectures applied to the PlantVillage dataset. The comparison was based on three parameters, i.e., the number of model parameters, model accuracy, and F1 score. The comparison is shown in Table 3. Among the existing architectures, Schuler et al. (2022) had the highest accuracy and F1 score, and

FIGURE 5
Over 200 training epochs, MobileNetV3-small reached an accuracy of 99.50% at roughly epoch 154. The initial accuracy is approximately 97.0% because a pre-trained model was used.
FIGURE 6
The training loss of MobileNetV3-small over 200 epochs decreases quickly and settles at 0.0 around epoch 136. The low initial loss is a result of using the pre-trained model.
Geetharamani and Pandian (2019) had the fewest parameters (0.2M). The proposed solution had the highest accuracy (99.50%) and F1 score (0.9950). Its number of parameters was 0.9M, which is more than the 0.2M of Geetharamani and Pandian (2019) but roughly five times fewer than the 5M of the model suggested by Schuler et al. (2022).

4 Discussion

Large model sizes can pose significant challenges to their practical application in classification problems within agriculture. Such problems often necessitate real-time or near-real-time solutions, especially when identifying pests and diseases or assessing crop health. Bulky models can slow the processing of data, causing delays that might compromise timely interventions. Deploying these models on edge devices, frequently used in agriculture for on-site analysis, becomes problematic due to their computational and memory constraints. Furthermore, in regions with limited connectivity, transferring data for cloud-based processing by large models can be bandwidth-intensive, leading to additional lags. The energy and financial costs of running extensive models can also be prohibitive for many agricultural applications, especially for small-scale or resource-constrained farmers. Additionally, the adaptability of these models can be limited; training and fine-tuning them to cater to the diverse and evolving classification needs of different agricultural contexts can be

FIGURE 7
The t-SNE visualization of the latent space of the trained MobileNetV3-small model. The output is from the second-last layer with a dimension of 1024, which is reduced to 2 using t-SNE. Each color in the spectrum represents one plant class in the PlantVillage dataset.
TABLE 3 Results comparison on PlantVillage dataset.
Author                            Architecture              Parameters  Accuracy  F1-score
Proposed                          MobileNetV3-small         0.9M        99.50%    0.9950
Schuler et al. (2022)             Inception V3 (Modified)   5M          99.48%    0.9923
Mohanty et al. (2016)             GoogLeNet                 5M          98.37%    0.9836
Mohanty et al. (2016)             AlexNet                   60M         97.82%    0.9782
Toda and Okura (2019)             Inception V3              5M          97.15%    0.9720
Geetharamani and Pandian (2019)   9 layers CNN              0.2M        96.46%    0.9815
Mohanty et al. (2016)             GoogLeNet                 5M          96.21%    0.9621
Mohanty et al. (2016)             AlexNet                   60M         94.52%    0.9449
The bold values correspond to the best value in each column.
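Equations 1 to 4 translate directly into code. The sketch below computes accuracy and per-class precision, recall, and F1 from label lists, then averages F1 over the classes (macro averaging). The macro average is an assumption made for illustration, since the text does not state how the multi-class F1 score in Table 3 was aggregated; the labels are toy values, not model outputs, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    # Equation 1: correct predictions over all predictions.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    # Treat one class as "positive" and all others as "negative".
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0   # Equation 2
    recall = tp / (tp + fn) if tp + fn else 0.0      # Equation 3
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # Equation 4
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    # Assumed aggregation: unweighted mean of the per-class F1 scores.
    classes = sorted(set(y_true))
    scores = [precision_recall_f1(y_true, y_pred, c)[2] for c in classes]
    return sum(scores) / len(scores)


y_true = ["healthy", "healthy", "fungi", "fungi", "virus", "virus"]
y_pred = ["healthy", "fungi", "fungi", "fungi", "virus", "healthy"]
acc = accuracy(y_true, y_pred)
p, r, f1 = precision_recall_f1(y_true, y_pred, "fungi")
print(acc, p, r, f1, macro_f1(y_true, y_pred))
```

In this toy example four of six labels are correct, and the "fungi" class has precision 2/3 and recall 1, giving an F1 score of 0.8.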
challenging. In essence, while large models might boast superior accuracy, their size can often impede their practicality and responsiveness in addressing agricultural classification problems.
Previously proposed state-of-the-art solutions Schuler et al. (2022); Mohanty et al. (2016) for plant disease classifications achieve good accuracy. However, they have practical limitations in size and deployment. To overcome this issue, we proposed a solution with MobileNetV3-small. Its compact and efficient architecture enables rapid data processing, facilitating real-time agricultural interventions, such as pest detection or disease identification. The model’s low power consumption makes it ideal for battery-operated field devices, and its adaptability ensures relevance to diverse agricultural needs. Furthermore, its cost-effectiveness and ease of maintainability make it a practical choice for agricultural scenarios, offering a balance of high performance and resource efficiency.
While MobileNetV3 offers impressive efficiency and is optimized for edge devices, it has certain tradeoffs. The primary disadvantage is that, in pursuit of a lightweight and compact design, it might not always achieve the highest possible accuracy, especially when compared to larger, more complex models designed for high-performance tasks. This reduction in accuracy can be a limitation for applications where even a slight drop in precision can have significant consequences. Additionally, certain customizations or fine-tuning required for specific tasks might not be as straightforward, given its specialized architecture. Thus, while MobileNetV3 is advantageous for many scenarios, it may not be the best fit for situations demanding the utmost accuracy and complex model customizations.
The PlantVillage dataset, while comprehensive, exhibits an unbalanced nature with respect to the number of images available for different plant diseases. Unbalanced data can significantly impact deep learning model performance. Such datasets have extremely skewed class distributions, with one or a few classes having disproportionately more samples. This imbalance causes many issues. Deep learning models trained on unbalanced data tend to focus accuracy on the dominant class over the minority classes, biasing them towards the majority class. As a result, the model’s ability to generalize and forecast underrepresented classes falls, resulting in poor training and evaluation performance. Due to

their rarity, the model may have trouble learning significant current focus on a controlled dataset lays the groundwork for this
patterns from minority classes, making it less likely to recognize expansion. In future work, we aim to test and refine our models
and classify cases from these classes. against the complexity of real-world agricultural scenarios,
MobileNetV3’s efficient and compact design offers a strategic enhancing their generalization capabilities. This step-by-step
advantage in addressing the imbalances inherent in datasets like approach, progressing from controlled conditions to more diverse
PlantVillage. By leveraging transfer learning, a pre-trained datasets, aims to develop robust and adaptable deep-learning
MobileNetV3 is later fine-tuned on PlantVillage classes, harnessing models for effective plant disease detection in practical
generalized features to counteract dataset disparities. Its lightweight nature facilitates rapid training, enabling extensive data augmentation to enhance underrepresented classes. Furthermore, MobileNetV3 can serve as a potent feature extractor, with the derived features being suitable for synthetic sample generation techniques like SMOTE or ADASYN to achieve class balance. The model’s cost-effectiveness allows for swift iterative experiments, incorporating regularization techniques to deter overfitting to dominant classes. Overall, MobileNetV3 presents a versatile toolset for researchers to navigate and mitigate the challenges of unbalanced datasets.
Training MobileNetV3 on the PlantVillage dataset and applying it to new images introduces challenges related to generalization. Absent categories, like healthy orange and squash, might be misclassified into familiar classes the model has seen. Diseases not in the training data, such as brown spots on soybeans, could be wrongly identified as another visually similar ailment or even as a healthy state. The model might also grapple with new images that differ in lighting, resolution, or background, especially if not exposed to such variations during training. The inherent class imbalance in the PlantVillage dataset, if unaddressed, can further bias the model towards overrepresented classes, affecting its performance on new or underrepresented classes. In essence, while MobileNetV3 is efficient, its accuracy on unfamiliar data hinges on the diversity and comprehensiveness of its training data.
Quantization compresses neural models by reducing the bit representation of weights and activations, enhancing memory efficiency and inference speed. “Weight quantization” reduces weight precision after training. This post-training quantization can introduce errors, as the model was not trained to accommodate the reduced precision, and can sometimes lead to a significant drop in model performance. In contrast, “quantization-aware training” adapts the model to the lower precision during training. PyTorch’s torch.quantization.quantize_dynamic is notable, dynamically quantizing mainly the linear layers. This balances reduced model size and computational efficiency, preserving accuracy and making it apt for models with varied layer intensities.

agricultural settings.

5 Conclusion

Traditional and cutting-edge models have shown good accuracy; however, their suitability for on-the-ground applications with constrained resources is often limited. By focusing on maximizing accuracy within resource constraints, we demonstrated the real-life usability of deep learning models in agricultural settings. Using the MobileNetV3-small model with approximately 1.5 million parameters, we achieved a test accuracy of around 99.50%, offering a cost-effective solution for accurate plant disease detection. Furthermore, post-training optimization, including quantization, reduced the model parameters to 0.9 million, enhancing inference efficiency. The final model in ONNX format enables seamless deployment across multiple platforms, including mobile devices. These contributions ensure that deep learning models can be practically and efficiently utilized in real-world agricultural applications, advancing precision farming practices and plant disease detection.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

Author contributions

ATK: Methodology, Software, Writing – original draft. SJ: Supervision, Writing – review & editing. ARK: Conceptualization, Methodology, Validation, Writing – original draft. SL: Formal analysis, Methodology, Validation, Writing – review & editing.
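The post-training weight quantization described in the Discussion (reducing the bit representation of weights after training, at the cost of some error) can be illustrated with a small pure-Python sketch. It is not PyTorch's implementation and not the authors' code: the symmetric per-tensor int8 scheme, the helper names `quantize_int8` and `dequantize`, and the example weights are all assumptions chosen to keep the arithmetic visible.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers.

    Assumed scheme for illustration: one scale for the whole tensor,
    chosen so the largest absolute weight maps to 127.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical weights of one small layer (not taken from the paper).
weights = [0.82, -0.41, 0.003, -1.27, 0.56]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each weight is stored in 8 bits instead of 32, and the small weight 0.003 collapses to zero, which is the kind of error a model not trained for reduced precision has to absorb. In PyTorch itself, the function named in the text is typically called with a model and a set of layer types, for example `torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)`; whether the authors used exactly these arguments is not stated here.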
The proposed pipeline, while efficient in its current application, does have certain limitations. Firstly, the pipeline is optimized for a specific dataset and task; scaling it to handle larger datasets or adapting it to different types of plants and diseases might require additional modifications. Secondly, the maintenance and updating of the model could present minor challenges. Ensuring that the model remains current with the latest data and continuously performs at its peak might necessitate regular updates and maintenance, which can be resource-intensive over time.
As we move forward from this study, we plan to extend our research to include a wider range of real-world datasets, such as those suggested by Tomaszewski et al. (2023) and Ruszczak and Boguszewska-Mańkowska (2022). Our

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Khan et al. 10.3389/fpls.2023.1308528
Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Alabbasy, F. M., Abohamama, A., and Alrahmawy, M. F. (2023). Compressing medical deep neural network models for edge devices using knowledge distillation. J. King Saud. University-Computer Inf. Sci., 101616.
Al Koutayni, M. R., Reis, G., and Stricker, D. (2023). Deepedgesoc: End-to-end deep learning framework for edge iot devices. Internet Things 21, 100665. doi: 10.1016/j.iot.2022.100665
Amara, J., Bouaziz, B., and Algergawy, A. (2017). “A deep learning-based approach for banana leaf diseases classification,” in Datenbanksysteme für Business, Technologie und Web (BTW 2017) Workshopband.
Dutta, A., Gupta, A., and Zisserman, A. (2016). Vgg image annotator (via). Available at: http://www.robots.ox.ac.uk/~vgg/software/via.
Ferentinos, K. P. (2018). Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318. doi: 10.1016/j.compag.2018.01.009
Fu, L., Feng, Y., Majeed, Y., Zhang, X., Zhang, J., Karkee, M., et al. (2018). Kiwifruit detection in field images using faster r-cnn with zfnet. IFAC-PapersOnLine 51, 45–50. doi: 10.1016/j.ifacol.2018.08.059
Fuentes, A., Yoon, S., Kim, S. C., and Park, D. S. (2017). A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors 17, 2022. doi: 10.3390/s17092022
Fujita, E., Kawasaki, Y., Uga, H., Kagiwada, S., and Iyatomi, H. (2016). “Basic investigation on a robust and practical plant diagnostic system,” in 2016 15th IEEE international conference on machine learning and applications (ICMLA) (IEEE). 989–992.
Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybernetics 36, 193–202. doi: 10.1007/BF00344251
Geetharamani, G., and Pandian, A. (2019). Identification of plant leaf diseases using a nine-layer deep convolutional neural network. Comput. Electrical Eng. 76, 323–338. doi: 10.1016/j.compeleceng.2019.04.011
Han, S., Mao, H., and Dally, W. J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv.
Hao, J., Subedi, P., Ramaswamy, L., and Kim, I. K. (2023). Reaching for the sky: Maximizing deep learning inference throughput on edge devices with ai multi-tenancy. ACM Trans. Internet Technol. 23, 1–33. doi: 10.1145/3546192
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. arXiv.
Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., et al. (2019). “Searching for mobilenetv3,” in Proceedings of the IEEE/CVF international conference on computer vision. 1314–1324.
Howard, A., Zhmoginov, A., Chen, L.-C., Sandler, M., and Zhu, M. (2018). Inverted residuals and linear bottlenecks: Mobile networks for classification, detection and segmentation.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv.
Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., and Keutzer, K. (2016). Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5 mb model size. arXiv.
Jaderberg, M., Vedaldi, A., and Zisserman, A. (2014). Speeding up convolutional neural networks with low rank expansions. arXiv. doi: 10.5244/C.28.88
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25.
Maeda-Gutiérrez, V., Galván-Tejada, C. E., Zanella-Calzada, L. A., Celaya-Padilla, J. M., Galván-Tejada, J. I., Gamboa-Rosales, H., et al. (2020). Comparison of convolutional neural network architectures for classification of tomato plant diseases. Appl. Sci. 10, 1245. doi: 10.3390/app10041245
Mohanty, S. P., Hughes, D. P., and Salathé, M. (2016). Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419. doi: 10.3389/fpls.2016.01419
Nalepa, J., Antoniak, M., Myller, M., Lorenzo, P. R., and Marcinkiewicz, M. (2020). Towards resource-frugal deep convolutional neural networks for hyperspectral image segmentation. Microprocessors Microsys. 73, 102994. doi: 10.1016/j.micpro.2020.102994
PlantVillage-Dataset (2016). GitHub. Available at: https://github.com/spMohanty/PlantVillage-Dataset/tree/master.
Pytorch. (2023). Quantization pytorch 2.0 documentation.
Qiang, Z., He, L., and Dai, F. (2019). “Identification of plant leaf diseases based on inception v3 transfer learning and fine-tuning,” in International Conference on Smart City and Informatization (Springer). 118–127.
Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J., and Hughes, D. P. (2017). Deep learning for image-based cassava disease detection. Front. Plant Sci. 8, 1852. doi: 10.3389/fpls.2017.01852
Ruszczak, B., and Boguszewska-Mańkowska, D. (2022). Deep potato–the hyperspectral imagery of potato cultivation with reference agronomic measurements dataset: Towards potato physiological features modeling. Data Brief 42, 108087. doi: 10.1016/j.dib.2022.108087
Sarda, A., Dixit, S., and Bhan, A. (2021). “Object detection for autonomous driving using yolo algorithm,” in 2021 2nd International Conference on Intelligent Engineering and Management (ICIEM) (IEEE). 447–451.
Schuler, J. P. S., Romani, S., Abdel-Nasser, M., Rashwan, H., and Puig, D. (2022). Color-aware two-branch dcnn for efficient plant disease classification. MENDEL 28, 55–62. doi: 10.13164/mendel.2022.1.055
Setiawan, W., Ghofur, A., Rachman, F. H., and Rulaningtyas, R. (2021). Deep convolutional neural network alexnet and squeezenet for maize leaf diseases image classification. Kinetik: Game Technol. Inf. Sys. Comput. Netw. Comput. Electr. Control.
Sladojevic, S., Arsenovic, M., Anderla, A., Culibrk, D., and Stefanovic, D. (2016). Deep neural networks based recognition of plant diseases by leaf image classification. Comput. Intell. Neurosci. 2016. doi: 10.1155/2016/3289801
Swaminathan, A., Varun, C., Kalaivani, S., et al. (2021). Multiple plant leaf disease classification using densenet-121 architecture. Int. J. Electr. Eng. Technol. 12, 38–57.
Toda, Y., and Okura, F. (2019). How convolutional neural networks diagnose plant disease. Plant Phenomics. doi: 10.34133/2019/9237136
Tomaszewski, M., Nalepa, J., Moliszewska, E., Ruszczak, B., and Smykała, K. (2023). Early detection of solanum lycopersicum diseases from temporally-aggregated hyperspectral measurements using machine learning. Sci. Rep. 13, 7671. doi: 10.1038/s41598-023-34079-x
Van der Maaten, L., and Hinton, G. (2008). Visualizing data using t-sne. J. Mach. Learn. Res. 9.
Yang, L., Yu, X., Zhang, S., Long, H., Zhang, H., Xu, S., et al. (2023). Googlenet based on residual network and attention mechanism identification of rice leaf diseases. Comput. Electron. Agric. 204, 107543. doi: 10.1016/j.compag.2022.107543
Zhu, C., Han, S., Mao, H., and Dally, W. J. (2016). Trained ternary quantization. arXiv.


Keywords: PlantVillage, deep learning, classifier, edge computing, MobileNetV3

Frontiers in Plant Science 01 frontiersin.org


TABLE 1 Distribution of observations in the PlantVillage dataset.

Crop (total)        Fungi   Bacteria   Mold   Virus   Mite   Healthy
Blueberry (1502)    -       -          -      -       -      1502
Bell Pepper (2475)  -       997        -      -       -      1478
Cherry (1906)       1052    -          -      -       -      854
Corn (3852)         2690    -          -      -       -      1162
Grape (4063)        3640    -          -      -       -      423
Orange (5507)       -       5507       -      -       -      -
Peach (2657)        -       2291       -      -       -      360
Potato (2152)       1000    -          1000   -       -      152
Raspberry (371)     -       -          -      -       -      371
Soybean (5090)      -       -          -      -       -      5090
Squash (1835)       1835    -          -      -       -      -
Strawberry (1565)   1109    -          -      -       -      456
Tomato (18,162)     5127    2127       1910   5730    1676   1592
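The crop totals in Table 1 make the class imbalance discussed in the text concrete. The sketch below computes inverse-frequency weights from the 13 crop totals listed in this copy of the table. Class weighting is a common mitigation and is shown here only as an illustration; the text itself mentions augmentation, SMOTE/ADASYN, and regularization, and does not state that the authors used class weights. The model's real classes are crop and disease combinations, so using crop totals is a simplifying assumption, and the names `crop_totals` and `weights` are hypothetical.

```python
# Crop totals copied from the row labels of Table 1 (13 rows in this copy).
crop_totals = {
    "Blueberry": 1502,
    "Bell Pepper": 2475,
    "Cherry": 1906,
    "Corn": 3852,
    "Grape": 4063,
    "Orange": 5507,
    "Peach": 2657,
    "Potato": 2152,
    "Raspberry": 371,
    "Soybean": 5090,
    "Squash": 1835,
    "Strawberry": 1565,
    "Tomato": 18162,
}

total = sum(crop_totals.values())
n_groups = len(crop_totals)

# Inverse-frequency weighting: weight = total / (n_groups * count).
# A group of exactly average size gets weight 1.0; rarer groups get more.
weights = {crop: total / (n_groups * count) for crop, count in crop_totals.items()}

# Ratio between the largest and smallest group, a simple imbalance measure.
imbalance_ratio = max(crop_totals.values()) / min(crop_totals.values())

for crop, w in sorted(weights.items(), key=lambda item: item[1], reverse=True):
    print(f"{crop:12s} count={crop_totals[crop]:6d} weight={w:.2f}")
print(f"largest/smallest group ratio: {imbalance_ratio:.1f}")
```

Weights of this kind could be passed to a weighted loss or used to drive a sampler; which option suits a given deployment is left open here.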


TABLE 2 Specification of MobileNetV3-Small.

Input            Operator             Exp-Size   #Out   SE   NL   Stride
112 × 112 × 16   BottleNeck, 3 × 3    16         16     ✓    RE   2
7 × 7 × 96       Conv2d, 1 × 1        -          576    ✓    HS   1
1 × 1 × 1024     Conv2d, 1 × 1, NBN   -          K      -    -    1

Conv2d: 2D convolution; BottleNeck: bottleneck residual block.


TABLE 3 Results comparison on PlantVillage dataset.

Author                  Architecture              Parameters   Accuracy   F1-score
Schuler et al. (2022)   Inception V3 (Modified)   5M           99.48%     0.9923
Mohanty et al. (2016)   GoogLeNet                 5M           98.37%     0.9836
Mohanty et al. (2016)   AlexNet                   60M          97.82%     0.9782
Toda and Okura (2019)   Inception V3              5M           97.15%     0.9720
Mohanty et al. (2016)   GoogLeNet                 5M           96.21%     0.9621
Mohanty et al. (2016)   AlexNet                   60M          94.52%     0.9449

The bold values correspond to the best value in each column.
