Prediction of Plantation and Their Profits
ABSTRACT:
This project predicts state-wise plantation production across India and the associated
profits. We build a dashboard that predicts the plantation area in hectares and the profit
from cultivation. The application uses data mining techniques on datasets covering the
years 2002 to 2019 and predicts the result for the year 2018. The datasets cover 36 regions
(29 states and 7 union territories) and their districts. For each district the application
covers the summer, kharif and rabi seasons, and each season has different crops depending
on the district. For a given year, the result is predicted for each crop.
Introduction
Many types of plantation crops are grown across the states of India, and the country is
famous for its spices. Every year the plantation (spice) crop brings profit or loss
depending on market values and other factors. In this project we build an application
that predicts spice production in hectares and market values for the upcoming year.
Data mining is a particular data analysis technique that focuses on modeling and
knowledge discovery for predictive rather than purely descriptive purposes, while business
intelligence covers data analysis that relies heavily on aggregation, focusing on business
information.
Motivation
India is famous for its spice plantations. Different spices are grown in every state, and
profit or loss depends on atmospheric conditions. Our application collects datasets from
the different states and analyzes state-wise plantation and profits. With this data we
predict the current year's yield and profit, which helps the government to concentrate on
particular states.
Existing System:
The government website gives hectare-wise plantation details throughout the states.
The website contains only previous years' datasets, which does not help farmers or
the government to analyze them.
No future prediction is available.
Proposed System:
The model predicts the result in hectares and predicts the price.
The model helps the government to act on the predicted result.
Objective:
Analyze the data for previous years and predict the result based on the other
parameters.
Differentiate plantations state-wise by high and low spice rates.
Parameters:
Season:
Summer
Kharif
Rabi
Whole year
Crop
Ragi
Paddy
Wheat
Jowar
soyabean
Cashewnut
Coconut
Cocoa etc
Literature review:
India is an agricultural country, and farmers are the life-blood of the nation. Yet the
current condition of farmers in India is very poor: today they are not able to enjoy the
yield they produce. Farmers should be introduced to modern farming techniques, because
the welfare of the nation depends on their well-being. Here, ICT plays a very important
role in meeting these challenges. Precision farming and modern society can play important
roles in promoting ICT in agriculture, but adoption is very slow because of a number of
unresolved issues discussed in earlier projects. The paper presents a generic framework
for an e-agricultural system comprising a knowledge management and monitoring system,
and also gives a brief description of the application interface.
India is a country where agriculture and agriculture-related industries are the major
source of living for the people, and agriculture is a major part of the country's economy.
It is also one of the countries that suffer from major natural calamities such as drought
or flood, which damage the crop. This leads to huge financial losses for farmers and, in
turn, to suicides. Predicting the crop yield well in advance of harvest can help farmers
and government organizations to plan appropriately for storing, selling, fixing the
minimum support price, importing/exporting, etc. Predicting a crop well in advance
requires a systematic study of large volumes of data from variables such as soil quality,
pH, EC, N, P, K, etc. Because crop prediction deals with large databases, it is a good
candidate for the application of data mining, through which knowledge is extracted from
large volumes of data. This paper presents a study of the various data mining techniques
used for predicting crop yield. The success of any crop yield prediction system relies
heavily on how accurately the features have been extracted and how appropriately
classifiers have been employed. This paper summarizes the results obtained by the
algorithms that various authors have used for crop yield prediction, with their accuracy
and recommendations.
1. Title: “Data Mining Techniques and Applications to Agricultural Yield Data”.
International Journal of Advanced Research in Computer and Communication
Engineering Vol. 2, Issue 9, September 2013.
Author: D Ramesh, B Vishnu Vardhan.
Summary: In this paper the authors focus on applications of data mining techniques
in the agricultural field. Techniques such as K-Means, K-Nearest Neighbor (KNN),
Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are used in
very recent applications of data mining in agriculture. The paper considers the
problem of predicting yield production. The work aims at finding suitable data
models that achieve high accuracy and high generality in terms of yield prediction
capabilities. For this purpose, different types of data mining techniques were
evaluated on different data sets.
2. Title: “Brief Survey Of Data Mining Techniques Applied To Applications Of
Agriculture”. International Journal of Advanced Research in Computer and
Communication Engineering Vol. 5, Issue 2, February 2016
Author: Ami Mistry, Vinita Shah
Summary: In this paper the authors present some of the most used data mining
techniques in the field of agriculture. In the near future, the penetration of
information technology into agriculture will make this an even more interesting area
of research. The main aim of this work is to improve and substantiate the validity of
yield prediction, which is useful for farmers. Agricultural crop production depends
on various factors such as biology, climate, economy and geography. These factors
have different impacts on agriculture, which can be quantified using appropriate
statistical methodologies. Agronomic traits such as yield can be affected by a large
number of variables. In this survey, the authors analyze data mining methods such as
clustering and classification models to select the most relevant method for the
prospect.
3. Title: “Applying Data Mining Techniques To Predict Annual Yield Of Major Crops
And Recommend Planting Different Crops In Different Districts In Bangladesh”.
Department of Electrical and Computer Engineering, North South University,
Bangladesh.
Author: A.T.M Shakil Ahamed, Navid Tanzeem Mahmood, Nazmul Hossain
Summary: This paper focuses on applying data mining techniques to extract knowledge
from agricultural data in order to estimate crop yield for major cereal crops in major
districts of Bangladesh. The prediction is done using clustering, K-means, the K-NN
algorithm, linear regression and a neural network.
4. Title: “Analysis of Soil Behavior and Prediction of Crop Yield Using Data Mining
Approach”. 2015 International Conference on Computational Intelligence and
Communication Networks.
Author: Monali Paul, Santosh K. Vishvakarma and Ashok Verma
Summary: This work presents a system, which uses data mining techniques in order
to predict the category of the analyzed soil datasets. The category, thus predicted will
indicate the yielding of crops. The problem of predicting the crop yield is formalized
as a classification rule, where Naive Bayes and K-Nearest Neighbor methods are
used.
Datasets
A dataset (or data set) is a collection of data, usually presented in tabular form. Each
column represents a particular variable, and each row corresponds to a given member of the
dataset and lists its values for each of the variables, such as the height and weight of an
object. For the development of the predictive model, the data sets were collected in
secondary form. Secondary data are statistical materials or information not originated or
obtained by the investigator himself, but obtained from someone else's records or a
published source such as the central government agencies.
Data source:
https://data.gov.in/search/site?query=planttion+an+profits
Feasibility Study
The feasibility study helps to find solutions to the problems of the project. The proposed
solution describes what the new system will look like.
Technical Feasibility
The project entitled “Prediction of plantation and Profit” is technically feasible because of
the features mentioned below. The project is developed in Java. The web server used to
develop “Prediction of plantation and Profit” is a local server. The local server
coordinates neatly between the design and coding parts. It provides a Graphical User
Interface to design the application, while the coding is done in Java. At the same time, it
provides a high level of reliability, availability and compatibility.
Economic Feasibility
In economic feasibility, a cost-benefit analysis is done in which the expected costs and
benefits are evaluated. Economic analysis is used to assess the effectiveness of the
proposed system, and its most important part is the cost-benefit analysis. The system
“Prediction of plantation and Profit using Data Mining Techniques” is feasible because it
does not exceed the estimated cost and the estimated benefits are met.
Operational Feasibility
The project entitled “Prediction of plantation and Profit using Data Mining Techniques” is
operationally feasible because of the features mentioned below. The system predicts
plantation area and profits from the historical plantation data, and the records are stored
in the database. The performance of the data mining techniques is compared by execution
time and displayed as a graph.
Behavior Feasibility
The project entitled “Prediction of plantation and Profit using Data Mining Techniques”
is beneficial because it satisfies the objectives when developed and installed.
SOFTWARE REQUIREMENT ANALYSIS
2.2 PURPOSE
The purpose of the Software Requirements Specification for Prediction of Plantation and
Profit is to provide the technical, functional and non-functional features required to
develop the web application. The application is designed to give users flexibility in
viewing the predicted plantation area and profits. In short, the purpose of this SRS
document is to provide a detailed overview of our software product, its parameters and
goals. This document describes the project’s target audience and its user interface,
hardware and software requirements. It defines how our client, team and audience see the
product and its functionality.
Scope
The scope of this system is to present a review of the data mining techniques used for the
Prediction of Plantation and Profit. It is evident from the system that data mining
techniques, such as classification, are highly efficient in the Prediction of Plantation
and Profit.
SOFTWARE ARCHITECTURE:
Technology used
NetBeans :
JAVA Servlets:
Java Servlets are server-side Java program modules that process and answer client requests
and implement the servlet interface. It helps in enhancing Web server functionality with
minimal overhead, maintenance and support. A servlet acts as an intermediary between the
client and the server. As servlet modules run on the server, they can receive and respond to
requests made by the client. Request and response objects of the servlet offer a convenient
way to handle HTTP requests and send text data back to the client. Since a servlet is
integrated with the Java language, it also possesses all the Java features such as high
portability, platform independence, security and Java database connectivity.
Java Server Pages (JSP) is a technology for developing web pages that support dynamic
content. It lets developers insert Java code in HTML pages by making use of special JSP
tags, most of which start with <% and end with %>. A Java Server Pages component is a type
of Java servlet that is designed to fulfill the role of a user interface for a Java web application.
Web developers write JSPs as text files that combine HTML or XHTML code, XML
elements, and embedded JSP actions and commands. Using JSP, you can collect input from
users through Webpage forms, present records from a database or another source, and create
Webpages dynamically.
Highcharts:
Highcharts is a pure JavaScript based charting library meant to enhance web applications by
adding interactive charting capability. Highcharts provides a wide variety of charts. For
example, line charts, spline charts, area charts, bar charts, pie charts and so on.
MySql:
Functional Requirements:
Pre-Processing:
Data pre-processing is a data mining technique used to transform raw data into a useful
and efficient format. The datasets collected from the above website contain some null
values, which have to be filled using pre-processing techniques.
Steps in data pre-processing:
Data cleaning
The data can have many irrelevant and missing parts. Data cleaning handles these,
including missing data and noisy data.
Missing Data
Cleaning Data
Noisy data is meaningless data that cannot be interpreted by machines. It can be
generated by faulty data collection, data entry errors, etc.
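One way the null values mentioned above could be filled is sketched below. This is only an illustration: the report does not say which filling technique is used, so replacing each missing entry with the mean of the available entries is an assumption, as are the class name MissingValueSketch, the method name fillWithMean, the use of NaN to mark a null value, and the sample numbers.

```java
import java.util.Arrays;

final class MissingValueSketch {
    /** Returns a copy of values where each NaN is replaced by the mean of the non-missing entries. */
    static double[] fillWithMean(double[] values) {
        double sum = 0.0;
        int count = 0;
        for (double v : values) {
            if (!Double.isNaN(v)) {
                sum += v;
                count++;
            }
        }
        // If every entry is missing there is no mean to use, so return an unchanged copy.
        if (count == 0) {
            return values.clone();
        }
        double mean = sum / count;
        double[] filled = values.clone();
        for (int i = 0; i < filled.length; i++) {
            if (Double.isNaN(filled[i])) {
                filled[i] = mean;
            }
        }
        return filled;
    }

    public static void main(String[] args) {
        // Hypothetical production values with one missing entry (assumption: NaN marks a null).
        double[] production = {3496, Double.NaN, 3987, 2987};
        System.out.println(Arrays.toString(fillWithMean(production)));
    }
}
```

Mean filling keeps the column average unchanged, which may matter when the same column is later fed to a regression. Other choices, such as dropping the record, are equally plausible.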
Data Model:
After the preprocessing steps, the attributes are selected based on the result. In our
application we used k-means clustering and polynomial regression.
Visualization:
The obtained results are presented as visualizations that show the complete report.
SYSTEM DESIGN
Scope
This Software Design Document is for a base-level system that works as a proof of
concept, providing a base level of functionality to show feasibility for large-scale
production use. The focus of this Software Design Document is on the generation and
modification of the documents. The system will be used in conjunction with other
pre-existing systems and will consist largely of a document interaction facade that
abstracts document interactions and handling of the document objects. This document
provides the design specifications of Plantation and Profits.
Work flow Diagram
The approach we took for our study follows the traditional data analysis steps
Data Preparation
Missing Value Numeric Nominal
Modeling
Visualization/Result Analysis
Work Flow:
Methodology:
DATA PREPARATION
Data preparation was performed before each model construction. All records with missing
values (usually represented by 0 in the dataset) in the chosen attributes were removed. All
numerical values were converted to nominal values according to the data dictionary.
Missing Values:
A missing value occurs when no data value is stored for an observation.
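The two preparation steps described above (dropping records whose value is the missing marker 0, and converting numeric values to nominal ones) can be sketched as follows. The data dictionary is not reproduced in this report, so the single threshold and the labels HIGH and LOW are hypothetical, as are the class and method names and the sample values.

```java
import java.util.ArrayList;
import java.util.List;

final class DataPreparationSketch {
    /** Keeps only the values that are not the missing-value marker 0. */
    static List<Double> dropMissing(double[] values) {
        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            if (v != 0.0) {
                kept.add(v);
            }
        }
        return kept;
    }

    /** Maps a numeric value to a nominal label using a hypothetical threshold. */
    static String toNominal(double value, double threshold) {
        return value >= threshold ? "HIGH" : "LOW";
    }

    public static void main(String[] args) {
        // Hypothetical production values; 0 stands for a missing record.
        double[] production = {3496, 0, 3987, 0, 2987};
        for (double v : dropMissing(production)) {
            System.out.println(v + " -> " + toNominal(v, 3500));
        }
    }
}
```

Treating 0 as missing follows the report's own convention. It would also discard any genuine zero-production record, which is a trade-off to keep in mind.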
Modeling
We first calculated several statistics from the dataset to show the basic characteristics of
the plantation, then applied regression and clustering to find relationships among the
attributes and the patterns.
Result Analysis
The results of our analysis include prediction rules among the variables and a clustering
of the states and districts of India by their plantation rates. We used Highcharts to
visualize these analyses.
Function
Database
Flow
DFD: 0 Level
DFD 0; Clustering
ACTIVITY DIAGRAM
ACTIVITY DIAGRAM:
CLUSTERING:
Sequence Diagram:
USECASE DIAGRAM:
DATABASE Table:
IMPLEMENTATION
4.1 Introduction
The project is implemented using Java, which is an object-oriented programming language.
Object-oriented programming is an approach that modularizes programs by creating
partitioned memory areas for both data and functions, which can be used as templates for
creating copies of such modules on demand.
Both servlet and JSP technologies are used to create the web application. Servlets are
precompiled Java programs that can create dynamic web content. The servlet API contains
many interfaces and classes, such as HttpServlet, ServletRequest and ServletResponse. JSP
is used to create web applications just as servlets are. It can be thought of as an extension
to servlets because it provides more functionality than a servlet. MySQL server is used as
the backend.
4.2 Overview
The K-means clustering algorithm was used to investigate the high- and low-frequency
plantation locations. Further, association rule mining was used to recognize
associations between the various factors related to plantations at places with
varying plantation occurrences.
Result based on parameters
Two algorithms are implemented for plantation prediction. The linear regression
algorithm predicts the result based on the parameters. The application uses results from
all over India, by district and by season. The clustering algorithm differentiates between
districts year-wise.
K-means clustering was implemented to distinguish between high- and low-frequency data
sets for every state. The result helps to find the districts with high and low crop
production within each state.
Parameters:
Season:
Summer
Kharif
Rabi
Whole year
Crop
Ragi
Paddy
Wheat
Jowar
soyabean
Cashewnut
Coconut
Cocoa etc
Procedure:
Step 2: Handle missing data (a missing value is represented as 0 in the dataset).
Step 3: Estimate the mean and the variance of both the input and output variables
from the training data.
Step 6: Calculate the covariance using the means of x and y, then fit the line
Y = b0 + b1(x).
Type of Weather Condition (hot, cloud, heavy rain, light rain, snow).
Type of Location (near bazar, near factory, near school, near temple,
others).
Procedure:
Step 2: Calculate the average value as the total number of plantations divided by
the total number of states.
Step 3: Take that average value as centroid ‘c’ and select ‘C’ cluster centers.
Step 4: Calculate the distance between each data point and the cluster centers.
Step 5: Assign each data point to the cluster center whose distance from it is the
minimum over all the cluster centers.
Step 8: If no data point was reassigned then stop, otherwise repeat from step 5.
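The assign-and-update loop in the steps above can be sketched for the simplest case of one-dimensional values split into a low and a high cluster. This is an illustration and not the project's code. The class name KMeansSketch and the method twoMeans are hypothetical, the sample hectare values are made up, and starting from the smallest and largest value as the two centers is an assumption (the procedure above starts from the average instead).

```java
import java.util.Arrays;

final class KMeansSketch {
    /**
     * Splits one-dimensional values into a low (0) and a high (1) cluster.
     * Returns the cluster index assigned to each input value.
     */
    static int[] twoMeans(double[] values) {
        // Assumption: the smallest and largest values serve as the initial centers.
        double low = Arrays.stream(values).min().orElse(0.0);
        double high = Arrays.stream(values).max().orElse(0.0);
        int[] assignment = new int[values.length];
        Arrays.fill(assignment, -1);
        boolean changed = true;
        while (changed) {
            changed = false;
            // Assignment step: each point joins the nearer center.
            for (int i = 0; i < values.length; i++) {
                int nearest = Math.abs(values[i] - low) <= Math.abs(values[i] - high) ? 0 : 1;
                if (nearest != assignment[i]) {
                    assignment[i] = nearest;
                    changed = true;
                }
            }
            // Update step: move each center to the mean of the points assigned to it.
            double sumLow = 0.0, sumHigh = 0.0;
            int nLow = 0, nHigh = 0;
            for (int i = 0; i < values.length; i++) {
                if (assignment[i] == 0) {
                    sumLow += values[i];
                    nLow++;
                } else {
                    sumHigh += values[i];
                    nHigh++;
                }
            }
            if (nLow > 0) low = sumLow / nLow;
            if (nHigh > 0) high = sumHigh / nHigh;
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical plantation areas in hectares for six districts.
        double[] hectares = {120, 135, 110, 900, 950, 870};
        System.out.println(Arrays.toString(twoMeans(hectares)));
    }
}
```

The loop stops when a full pass reassigns no point, which matches the stopping rule in the last step of the procedure.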
Regression Algorithm
// Initialization
x[] = x-axis values
y[] = y-axis values
n = length of x[]
sumx = 0, sumy = 0, sumx2 = 0
// First pass
for i = 0 while i less than n
    sumx = sumx + x[i]
    sumy = sumy + y[i]
    sumx2 = sumx2 + x[i] * x[i]
for loop ends
xbar = sumx / n
ybar = sumy / n
// Second pass
xxbar = 0, yybar = 0, xybar = 0
for i = 0 while i less than n
    xxbar = xxbar + (x[i] - xbar) * (x[i] - xbar)
    yybar = yybar + (y[i] - ybar) * (y[i] - ybar)
    xybar = xybar + (x[i] - xbar) * (y[i] - ybar)
for loop ends
slope = xybar / xxbar
intercept = ybar - slope * xbar
Input:
Table:
X(year) Y(value)
2008 3496
2009 3500
2010 3987
2011 2987
2012 3019
2013 3999
2014 4015
2015 4786
2016 4018
2017 4445
Sum of x=20125
Sum of y=38252
The deviations use the rounded means: mean of x = 2012, mean of y = 3825.
dx = X - mean of x, dy = Y - mean of y
X(year)  Y(value)  dx    dy     dx*dy   dx^2   dy^2
2008     3496      -4    -329   1316    16     108241
2009     3500      -3    -325   975     9      105625
2010     3987      -2    162    -324    4      26244
2011     2987      -1    -838   838     1      702244
2012     3019      0     -806   0       0      649636
2013     3999      1     174    174     1      30276
2014     4015      2     190    380     4      36100
2015     4786      3     961    2883    9      923521
2016     4018      4     193    772     16     37249
2017     4445      5     620    3100    25     384400
Total              5     2      10114   85     3003536
b1 = 10114 / 85 = 118.99
b0 = (mean of y) - b1 * (mean of x) = 3825 - (118.99 * 2012) = -235582.88
Y = b0 + b1(X)
Y = -235582.88 + 118.99(2020) = 4776.92
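The same two-pass calculation can be sketched in Java using the year and value pairs from the input table. The class name LinearRegressionSketch and the methods fit and predict are hypothetical names chosen for illustration. Note that the code uses the unrounded means (2012.5 and 3825.2), so it gives a slope of about 122.58 instead of the value obtained by hand with rounded means.

```java
final class LinearRegressionSketch {
    /** Returns {intercept b0, slope b1} of the least-squares line y = b0 + b1 * x. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        // First pass: means of x and y.
        double sumX = 0.0, sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double xBar = sumX / n;
        double yBar = sumY / n;
        // Second pass: sums of squared deviations and of cross deviations.
        double xxBar = 0.0, xyBar = 0.0;
        for (int i = 0; i < n; i++) {
            xxBar += (x[i] - xBar) * (x[i] - xBar);
            xyBar += (x[i] - xBar) * (y[i] - yBar);
        }
        double slope = xyBar / xxBar;
        double intercept = yBar - slope * xBar;
        return new double[] {intercept, slope};
    }

    /** Evaluates the fitted line at x. */
    static double predict(double[] coefficients, double x) {
        return coefficients[0] + coefficients[1] * x;
    }

    public static void main(String[] args) {
        double[] years = {2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017};
        double[] values = {3496, 3500, 3987, 2987, 3019, 3999, 4015, 4786, 4018, 4445};
        double[] b = fit(years, values);
        System.out.printf("b0 = %.2f, b1 = %.2f, Y(2020) = %.2f%n", b[0], b[1], predict(b, 2020));
    }
}
```

Because the intercept is a large negative number, rounding the slope before multiplying by the year shifts the prediction noticeably. Keeping full precision until the final step avoids that.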
Output:
//initialize
Cluster2[] length
Cluster1[] A[]
Else
Cluster2[] A[]
// Initialize
6.1 Introduction
Web applications run on devices with limited memory, CPU power and power supply. The
behaviour of the application also depends on external factors such as connectivity and
general system utilization.
Therefore, it is very important to debug, test and optimize the web application. Reasonable
test coverage helps to enhance and maintain it. As it is not possible to test Bootstrap web
applications on all possible device configurations, it is common practice to run them on
typical device configurations. The application should be tested on at least one device with
the lowest possible configuration. In addition, it should be tested on one device with the
highest available configuration (e.g., pixel density, screen resolution) to ensure that it
works well on these devices.
Web application testing is based on unit tests. In general, a unit test is a method whose
statements test a part of the application. Test methods are organized into classes called
test cases, and test cases are grouped into test suites.
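A unit test in the sense described above can be sketched without any testing framework. The helper mean and the test method testMeanOfThreeValues are hypothetical and do not come from the project. A real test case would more likely be written with a framework such as JUnit, which is left out here so that the example is self-contained.

```java
final class MeanTestSketch {
    /** Hypothetical helper under test: arithmetic mean of the given values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** A unit test: one method whose statements check one behaviour of the helper. */
    static void testMeanOfThreeValues() {
        double actual = mean(new double[] {2.0, 4.0, 6.0});
        if (Math.abs(actual - 4.0) > 1e-9) {
            throw new AssertionError("expected 4.0 but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Acts as a one-element test suite: run every test method in turn.
        testMeanOfThreeValues();
        System.out.println("all tests passed");
    }
}
```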
Unit tests that run on a device have access to instrumentation information, such as the
context of the application under test. Use this approach to run unit tests that have web
application dependencies which mock objects cannot easily satisfy.
This type of test verifies that the target app behaves as expected when a user performs a
specific action or enters a specific input in its activities. For example, it allows checking that
the target app returns the correct UI output in response to user interactions in the app’s
activities. UI testing frameworks like Espresso allow programmatically simulating user
actions and testing complex intra-app user interactions.
This type of test verifies the correct behaviour of interactions between different user apps or
between user apps and system apps. For example, you might want to test that the app behaves
correctly when the user performs an action in the Settings menu. UI testing frameworks that
support cross-app interactions, such as UI Automator, allow creating tests for such scenarios.
The following tables show the various test case scenarios that are generated, along with the
required inputs to the given scenarios, the expected outputs, the actual outputs and the
result, that is, whether the test passes or fails.