Krishi Sahayak Research Paper
Fertilizer Recommendation
1
K. J. Somaiya Institute of Technology, University of Mumbai, India
2
K. J. Somaiya Institute of Technology, University of Mumbai, India
3
K. J. Somaiya Institute of Technology, University of Mumbai, India
4
K. J. Somaiya Institute of Technology, University of Mumbai, India
Corresponding author email: s.halbe@somaiya.edu, deep.lad@somaiya.edu, tirth.bp@somaiya.edu,
pyb@somaiya.edu
ABSTRACT
Krishi Sahayak is an agricultural decision-support system that uses hybrid machine learning
algorithms to deliver personalized crop and fertilizer recommendations, tailored to the specific
farming conditions of individual agricultural settings. The system takes into account a comprehensive set of
environmental parameters. The primary objective of this research is to improve the precision of crop and
fertilizer recommendations for farmers, helping them make well-informed decisions
regarding the most suitable crops for their agricultural contexts. A sequential model is trained using the
Random Forest, XGBoost and LightGBM algorithms, which are widely used for
classification tasks, to perform the recommendations. In addition, Krishi Sahayak improves
accessibility and inclusivity by providing multilingual support, including Hindi and Marathi.
This feature ensures that a broader spectrum of farmers can benefit from data-backed
recommendations. Furthermore, the project acknowledges the current challenges within the field and hints at future
developments, such as emerging materials and sustainable solutions, which have the potential to shape the trajectory
of crop and fertilizer recommendation systems. This research was
enriched by consulting various agriculture experts and corporations, ensuring alignment with real-world farming
practices and needs.
Key words: Sustainability, Random Forest, XGBoost, LightGBM, Crop Recommendation, Fertilizer
Recommendation
A Crop Recommendation System was introduced for Farmers (RSF) that recommends crops based on user
location and agricultural data. It uses various machine learning algorithms such as Logistic Regression, DT,
Linear Regression, SVM, KNN, Naive Bayes and K-Means to enhance crop selection and productivity [1].

Precision agriculture is explored and ensemble techniques are employed such as CHAID, Naive Bayes, KNN
and Random Forest, emphasizing the significance of data mining and ensemble techniques in
decision-making [2]. Various machine learning techniques, like XGBoost, SVM, Naive Bayes,
Random Forest, Logistic Regression and DT, were
investigated to refine crop recommendations. Random Forest performed exceptionally well, achieving
an accuracy rate of 96.34% [3].

A mobile app using GPS and user inputs for crop suggestions is developed, employing SVM, ANN, RF, MLR,
and KNN algorithms. It achieves 95% accuracy with Random Forest in yield prediction and provides 14-day
rainfall forecasts via the Open Weather API. Safety measures include withholding fertilizer when rainfall
exceeds 1.25 mm [4].

A web application is proposed for crop and fertilizer recommendation. Multiple machine learning
algorithms, such as ANN, Random Forest, SVM, KNN, Naïve Bayes and DT, have been utilized for training in
both the crop recommendation system and fertilizer recommendation [5].

A web application-based system is developed for crop as well as fertilizer recommendation, and plant
disease prediction. The MobileNet algorithm is used to identify plant diseases using leaf images, the XGBoost
algorithm is employed to predict appropriate crops by considering soil nutrients and rainfall, and the Random
Forest (RF) algorithm offers recommendations for fertilizers and strategies to enhance soil fertility,
primarily based on soil nutrient data. These proposed models outperform the existing classifiers in terms
of accuracy [6].

The BiLSTM-MANN algorithm is utilized to provide precise crop recommendations. The dataset is trained
using BiLSTM-MANN, MLP and CNN models. Based on different evaluation metrics, the BiLSTM-MERNN model
outperforms the rest of the algorithms in the context of crop recommendation systems [7].

A crop recommendation system is proposed employing the LightGBM algorithm. The dataset undergoes
training with both the LightGBM model and logistic regression. Based on accuracy score, the LightGBM model
provides more accuracy than logistic regression [8].

Random Forest and Logistic Regression are combined for precise crop and fertilizer recommendations. A
user-friendly web application ensures accessibility to farmers. Extensive experimentation reveals that the
hybrid model outperforms individual algorithms like Logistic Regression, SVM, DT and Random Forest [9].

A Crop Prediction System is introduced that uses K-Nearest Neighbors (KNN) for predictions based on soil
characteristics. The KNN classifier achieves 90% precision. Real-time data on soil quality is acquired
using an Arduino Uno and sensors. Future plans include integrating KNN with Geographic Information
Systems (GIS) for even more accurate crop suggestions [10].

Deep neural networks are employed for crop recommendation as well as plant disease detection. An
artificial neural network is employed for crop recommendation, while a 2D CNN can be utilized to create a
system for detecting plant leaf diseases [11].

Crop and fertilizer recommendation and leaf disease prediction are integrated. Random Forest was
identified as the best choice for crop recommendation. Fertilizer recommendation, driven by SVM and RF,
achieves a 100% accuracy rate. For leaf disease prediction, a CNN using the ResNet architecture was
employed, offering a 95% accuracy rate for early disease detection [12].

The challenges posed by climate change are addressed while offering a unified platform for crop recommendation
as well as plant disease identification. In the context of crop recommendations, five distinct machine
learning algorithms were employed, namely Logistic Regression, DT, SVM, multi-layer perceptron, and
Random Forest. The Random Forest algorithm displayed an accuracy rate of 99.31%. For plant disease
identification, training and evaluation were conducted using three different Convolutional Neural Network
(CNN) architectures: VGG16, ResNet50, and EfficientNetV2. Among these, EfficientNetV2 performed best,
with an accuracy rate of 96.06% [13].

The AdaBoost algorithm was introduced for predicting crop yields and recommending fertilizers based on
soil conditions. Additionally, the study recommends fertilizers employing the Random Forest (RF)
algorithm. Further enhancements may explore the use of the Gradient Boost algorithm for prediction
alongside potential algorithms like SVM and Decision Trees, in conjunction with Random Forest, to refine
the prediction model [14].

Farmers were empowered with insights and predictions. The system also includes a harvest prediction
feature. It assists in selecting optimal pesticides, fertilizers, and recommended crops, using various
algorithms like Random Forest, XGBoost, SVM, Logistic Regression, and Naive Bayes [15].

Machine learning algorithms were employed on a well-prepared dataset for crop recommendation. Random
Forest and XGBoost performed the best with accuracies of 98.9% and 98.2% respectively, Logistic
Regression achieved an accuracy of 95.6%, and Decision Trees reached 95.3% accuracy. Future improvements
could incorporate GPS coordinates and government rainfall forecasting for precise crop predictions [16].

These projects collectively highlight the potential of machine learning approaches to revolutionize Indian
agriculture, improving crop selection, yield prediction, and overall productivity while addressing the
challenges faced by farmers.

MATERIALS AND METHODS

Dataset Used

The dataset utilized in this system is compiled through the merging of multiple sources. It incorporates
the open-source Western Maharashtra dataset on crops and fertilizers, supplemented by additional data on
parameters such as sowing season, electrical conductivity, soil type, and soil characteristics sourced from
a booklet provided by a GNFC Soil Testing Lab officer and research papers by PhD agricultural professors.
District Name is also included as a significant parameter, recognizing that geographic location profoundly
influences crop cultivation. This merged dataset combines the parameters needed for a robust Crop and
Fertilizer Recommendation system. It includes data on district names, soil color, nutrient levels
(Nitrogen, Potassium, Phosphorus), pH, rainfall, temperature, crop names, and suitable fertilizers. This
comprehensive dataset amalgamates insights from trusted sources, supporting accurate recommendations for
optimizing crop yields and promoting sustainable agricultural practices.

Methodology applied

The proposed methodology involves
utilizing two datasets from Kaggle and agricultural institutions for crop and fertilizer recommendation.
Data preprocessing includes one-hot encoding categorical variables, normalizing numerical features, and
rigorous data cleaning. The dataset is split into training and testing subsets so that overfitting can be
detected on held-out data.

Our approach features a three-layer sequential machine learning algorithm, integrating Random Forest,
XGBoost, and LightGBM models. These models are trained to predict suitable crops and fertilizers. A
sequential modeling approach is employed, systematically varying the sequence of models to explore
different combinations and optimize predictive performance.

In this sequential model, the first model predicts the probability distribution of each class. This
initial step is aimed at providing insights into the likelihood of each class being the correct label for a
given record. By obtaining probability estimates for each class, the model can capture the uncertainty
inherent in multiclass classification tasks. The second model calculates residuals, identifying areas of
prediction deviation. This step allows for the identification of areas where the initial model's
predictions deviated from the true labels, enabling subsequent models to focus on correcting these
discrepancies. Finally, the third model utilizes residual information and probabilities to predict
output labels. The models with LightGBM in the middle of the sequence are not considered due to
LightGBM's constraint of working with target values having only one dimension. As such, the sequencing
is designed to accommodate this limitation while still leveraging the strengths of all three algorithms in
the ensemble.

Following model training, the Crop and Fertilizer Recommendation System is implemented, deploying
trained models to generate real-time recommendations based on user-input agricultural parameters. A
multilingual user interface is developed for seamless interaction.

Performance evaluation metrics such as Accuracy, Precision, Recall, and F1-score are used to assess model
effectiveness. The approach aims to maximize predictive accuracy and robustness for crop and fertilizer
prediction tasks.

RESULTS AND DISCUSSIONS

Comparing Sequence [RF,XGB,LGBM] for Crop Recommendation Module

The first sequence considered for the Crop Recommendation Module is [RF,XGB,LGBM]. The RF model, as the
initial layer, predicts the probability distribution of each class, offering insights into the likelihood
of each class being the correct label for a given record. Following RF, the XGB model calculates
residuals, pinpointing areas of prediction deviation. Lastly, the LGBM model utilizes residual information
and probabilities from the previous layers to predict output labels. By integrating residual data and
probability estimates, it enhances the overall accuracy of the recommendation system. In the given
sequence, the obtained metric scores for Accuracy, Precision, Recall, and F1_Score all reach a perfect
score of 1.0. This indicates the exceptional performance of the sequential model in accurately predicting
crop recommendations.

Comparing Sequence [XGB,RF,LGBM] for Crop Recommendation Module
The sequence [XGB,RF,LGBM] was employed for the Crop Recommendation Module. In this sequence, the XGB
model starts by predicting the probability distribution of each class, providing insights into the
likelihood of correct labels. Subsequently, the RF model calculates residuals to identify areas of
prediction deviation. Finally, the LGBM model utilizes residual information and probabilities to predict
output labels. The achieved metric scores for Accuracy, Precision, Recall, and F1_Score are 0.9989,
0.9970, 0.9969, and 0.9969 respectively. These scores highlight the strong performance and reliability of
the sequential model in crop recommendation tasks.

Comparing Sequence [LGBM,RF,XGB] for Crop Recommendation Module

Employing the sequence [LGBM,RF,XGB] for the Crop Recommendation Module, the LGBM model initiates the
process by predicting the probability distribution of each class. This initial step provides valuable
insights into the likelihood of each class being the correct label for a given record. Following LGBM, the
RF model calculates residuals to identify areas of prediction deviation, while the XGB model utilizes
residual information and probabilities from the preceding layers to predict output labels.

Comparing Sequence [LGBM,XGB,RF] for Crop Recommendation Module

For the Crop Recommendation Module, the sequence [LGBM,XGB,RF] is implemented. Initially, the LGBM model
predicts the probability distribution of each class, providing crucial insights into the likelihood of
correct labels. Subsequently, the XGB model calculates residuals, pinpointing areas of prediction
deviation, followed by the RF model utilizing residual information and probabilities to predict output
labels. Although the obtained metric scores for Accuracy, Precision, Recall, and F1_Score are marginally
below perfect, with values of 0.9956, 0.9948, 0.9908, and 0.9924 respectively, they still reflect
commendable performance and reliability in crop recommendation tasks.

Sequence         Accuracy   Precision   Recall   F1_Score
[RF,XGB,LGBM]    1.0        1.0         1.0      1.0
[XGB,RF,LGBM]    0.9989     0.9970      0.9969   0.9969
[LGBM,RF,XGB]    1.0        1.0         1.0      1.0
[LGBM,XGB,RF]    0.9956     0.9948      0.9908   0.9924

Table 1: Comparison between various sequence performance for crop recommendation

Comparing Sequence [RF,XGB,LGBM] for
valuable insights into the likelihood of correct labels.
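The Accuracy, Precision, Recall, and F1_Score columns of Table 1 can be computed from true and predicted labels along the following lines. This is a minimal pure-Python sketch, not the authors' evaluation code. It assumes macro-averaging over classes, which the paper does not state; the function name `macro_scores` and the toy crop labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Assumption: every class counts equally (macro average); the
    averaging scheme used in the paper is not specified.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[t] += 1  # class t was the truth but was missed

    precisions, recalls, f1s = [], [], []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with hypothetical labels: 3 of the 4 predictions are correct.
scores = macro_scores(
    ["rice", "rice", "wheat", "maize"],
    ["rice", "wheat", "wheat", "maize"],
)
print(scores)
```

A score of 1.0 in all four columns, as reported for [RF,XGB,LGBM] and [LGBM,RF,XGB], corresponds to every test record being labelled correctly.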
sequential model in fertilizer recommendation tasks.
[RF,XGB,LGBM] 1.0 1.0 1.0 1.0
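The three-stage sequence described under "Methodology applied" (probabilities, then residuals, then final labels) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation. The paper does not say which inputs the second and third stages receive, so the sketch assumes that stage 2 regresses the residual (one-hot label minus predicted probability) on the original features, and that stage 3 is trained on the stage-1 probabilities joined with the predicted residuals. The names `SequentialStack`, `one_hot`, and `residual_targets` are hypothetical. The stage objects are only assumed to follow the scikit-learn style `fit` / `predict` / `predict_proba` convention, with classes encoded as integers 0..K-1.

```python
def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    return [[1.0 if k == y else 0.0 for k in range(n_classes)] for y in labels]


def residual_targets(labels, proba, n_classes):
    """Residual per class: one_hot(label) - predicted probability.

    The result has one column per class, so the middle stage must
    accept a multi-dimensional target.
    """
    return [
        [t - p for t, p in zip(row_true, row_proba)]
        for row_true, row_proba in zip(one_hot(labels, n_classes), proba)
    ]


class SequentialStack:
    """Three-stage sequence: probabilities -> residuals -> final labels."""

    def __init__(self, prob_model, residual_model, final_model, n_classes):
        self.prob_model = prob_model          # stage 1: classifier with predict_proba
        self.residual_model = residual_model  # stage 2: multi-output regressor
        self.final_model = final_model        # stage 3: classifier
        self.n_classes = n_classes

    def _stage3_inputs(self, X):
        proba = self.prob_model.predict_proba(X)
        resid = self.residual_model.predict(X)
        # Assumed layout for stage 3: [probabilities | predicted residuals].
        return [list(p) + list(r) for p, r in zip(proba, resid)]

    def fit(self, X, y):
        # Stage 1 learns class probabilities from the raw features.
        self.prob_model.fit(X, y)
        proba = self.prob_model.predict_proba(X)
        # Stage 2 learns where stage 1 deviates from the true labels.
        self.residual_model.fit(X, residual_targets(y, proba, self.n_classes))
        # Stage 3 maps probabilities plus residual estimates to labels.
        self.final_model.fit(self._stage3_inputs(X), y)
        return self

    def predict(self, X):
        return self.final_model.predict(self._stage3_inputs(X))
```

Under these assumptions, the sequence [RF,XGB,LGBM] would correspond to passing a `RandomForestClassifier`, an `XGBRegressor`, and an `LGBMClassifier` as the three stages, provided the installed regressor accepts a two-dimensional target. The per-class residual target is also consistent with the stated reason for excluding LightGBM from the middle position.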