Car Price Prediction
INTRODUCTION
● Predicting the price of used cars has been studied extensively in prior
research.
● According to the Agency for Statistics of BiH, the percentage of personal
car usage has increased by 2.7% since 2013, and this trend is likely to
continue, with the number of cars growing steadily.
● Typically, the most significant features are present price, brand and model,
age, and mileage. The fuel type and the fuel consumption per mile also strongly
affect the price of a car because fuel prices change frequently. Other
features such as exterior color, type of transmission, safety, and air
conditioning also influence the car price.
BACKGROUND STUDY
● Wu et al. conducted a car price prediction study using a neuro-fuzzy
knowledge-based system. Their prediction model produced results similar to
those of a simple regression model.
● Listian discussed, in her paper, that a regression model that was built using
Support Vector Machines (SVM) can predict the price of a car that has been
leased with better precision than multivariate regression.
● Noor and Jan built a model for car price prediction by using multiple linear
regression.
PROBLEM STATEMENT
A model to predict the price of a used car should be
developed in order to assess its value based on a variety
of characteristics. Several factors affect the price of a
used car, such as company, model, year, transmission,
distance driven, fuel type, seller type, and owner type.
As a result, it is crucial to know the car's actual market
value before purchasing or selling it.
CHALLENGES
● Finding the best regression algorithm, among Linear Regression,
Lasso Regression, Random Forest Regression, Ridge regression, etc.,
for our problem was a challenge.
● The data frame into which we loaded the dataset includes the columns Selling
price, Present price, Kms driven, Owner, Age, Fuel type, Seller type, and
Transmission type.
● Step 3: We created a heat map using Pearson correlation to illustrate how the
features are related.
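The correlation step above can be sketched as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic, and every column name other than Present_Price is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the car dataset; all values are made up for illustration.
rng = np.random.default_rng(0)
n = 200
present_price = rng.uniform(1.0, 30.0, n)
age = rng.integers(1, 15, n)
kms_driven = age * 8000 + rng.normal(0, 5000, n)
selling_price = 0.7 * present_price - 0.3 * age + rng.normal(0, 1.0, n)

df = pd.DataFrame({
    "Selling_Price": selling_price,   # assumed column name
    "Present_Price": present_price,
    "Kms_Driven": kms_driven,         # assumed column name
    "Age": age,                       # assumed column name
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr(method="pearson")
print(corr.round(2))

# To draw the heat map itself (requires seaborn and matplotlib):
#   import seaborn as sns
#   sns.heatmap(corr, annot=True, cmap="RdYlGn")
```

Each cell of the matrix lies between -1 and 1. Values near the extremes indicate a strong linear relationship between the two features.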
SOLUTION APPROACH
● Step 4: The feature importances were obtained using ExtraTreesRegressor.
Present_Price turned out to be the most important feature.
● Step 6: The data has been split into training data (80%) and test data
(20%).
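Steps 4 and 6 might look roughly like the sketch below. It assumes scikit-learn and uses synthetic data in which the target depends mostly on Present_Price. The other column names, the number of trees, and the random seeds are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split

# Synthetic features and target; values are made up for illustration.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "Present_Price": rng.uniform(1.0, 30.0, n),
    "Kms_Driven": rng.uniform(500, 150000, n),  # assumed column name
    "Owner": rng.integers(0, 2, n),             # assumed column name
    "Age": rng.integers(1, 15, n),              # assumed column name
})
y = 0.7 * X["Present_Price"] - 0.2 * X["Age"] + rng.normal(0, 0.5, n)

# Step 4: fit an extra-trees ensemble and read its feature importances.
selector = ExtraTreesRegressor(n_estimators=100, random_state=0)
selector.fit(X, y)
importances = pd.Series(selector.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))

# Step 6: hold out 20% of the rows as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(len(X_train), len(X_test))
```

The importances sum to 1, so each value can be read as that feature's share of the total impurity reduction across the trees.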
● Step 7: After obtaining the best parameters from RandomizedSearchCV, we
train the model using Random Forest Regression.
A random forest is built from decision trees, which split on thresholds of
individual features and are therefore insensitive to feature scale, so we
do not have to scale the values.
● Step 8: The trained model is saved to a pickle file, and the performance
measures of the model are computed.
● When the categorical features were encoded, the "CNG" fuel type, the
"Automatic" transmission type, and the "Dealer" seller type were dropped from
the unique values, so each acts as the baseline category of its feature.
FINAL DATASET
● The feature importances are found using the Extra Trees Regressor.
● This tells us that the error is small, so the predicted values are likely to be accurate.
● In the scatter plot, the test points and the predicted points lie along the line y = x, which means that
the values predicted by the model are close to the test values.
OBSERVATION
● The performance measures are the Root Mean Square Error, the R2 score,
and the Mean Square Error.
● The accuracy of the model is also calculated with sklearn's built-in
functions. Because accuracy is defined only for classification, we
converted the predictions into classes using a cutoff: if the difference
between the test and predicted value is greater than the cutoff (the MSE),
the prediction counts as wrong; otherwise it counts as correct.
● The accuracy computed with "accuracy_score" is 93.4 percent.
● The Root Mean Square Error between the test values and the predicted
values is 1.36.
● The Mean Square Error between the test values and the predicted values
is 1.97.
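The measures above could be computed along these lines. The sketch uses made-up test and predicted values, so it does not reproduce the reported numbers. Scoring the per-prediction labels against an all-ones target is one possible reading of the cutoff rule, not necessarily the exact procedure used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score

# Made-up test targets and predictions (in lakhs), for illustration only.
y_test = np.array([2.5, 4.0, 7.25, 1.1, 9.0, 5.5])
y_pred = np.array([2.8, 3.6, 7.0, 1.3, 6.0, 5.4])

mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

# Cutoff rule: a prediction is "correct" (1) when its absolute error does not
# exceed the cutoff, and "wrong" (0) otherwise. The MSE serves as the cutoff.
cutoff = mse
within_cutoff = (np.abs(y_test - y_pred) <= cutoff).astype(int)

# Score the labels against a target in which every prediction is correct.
accuracy = accuracy_score(np.ones_like(within_cutoff), within_cutoff)

print(f"MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f} accuracy={accuracy:.3f}")
```

With this definition the accuracy is simply the fraction of test predictions whose error stays within the cutoff, so it depends directly on the chosen cutoff value.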
FEATURES
Year : 2007
Present price : 3.5 lakhs
Kms driven : 50000
Owner : 0
Fuel Type : Petrol
Seller Type : Dealer
Transmission type : Manual
PREDICTED PRICE
The price predicted is 2.81 lakhs
RESULTS
The results are demonstrated on an HTML page. The
features are entered and the selling price is
calculated.
FEATURES
Year : 2015
Present price : 59 lakhs
Kms driven : 5000
Owner : 1
Fuel Type : Diesel
Seller Type : Individual
Transmission type : Automatic
PREDICTED PRICE
The price predicted is 22.13 lakhs