Machine Learning Project Car Price Prediction Algorithm

The document summarizes a project that uses multiple linear regression with gradient descent to predict car prices based on features in a dataset. It describes the process of collecting and preprocessing the data, initializing parameters, performing gradient descent to minimize cost, and calculating error metrics. Code is provided to train the model on training and cross-validation sets and predict prices on new test data by normalizing features and calculating the dot product of parameters and features. The model is able to accurately predict car prices based on selected features in the dataset.

Uploaded by

Ruqaiya Ali

National University of Sciences and Technology

School of Mechanical and Manufacturing Engineering

Robotics and Intelligent Machines Engineering
Project Report: Car Price Prediction

Intent:
Train a machine learning algorithm to predict car prices from a selected set of features.

Method:
The machine learning method used for this project, Car Price Prediction, is multiple linear
regression trained with gradient descent.

Process:
1. Packages:
• NumPy
• pandas
• matplotlib.pyplot
2. Data:
The dataset was collected from PakWheels, processed, and saved in CSV (comma-delimited) format.
The data is read with the pandas read_csv() function and then separated into features and
label, stored as ‘x_train’ and ‘y_train’ respectively.
The data() function returns ‘x_train’ and ‘y_train’ to the main() function for further use.
3. Normalization
The feature set is then normalized using the formula
X = (X - MEAN) / (STANDARD DEVIATION)
The normal() function returns ‘x_train’, ‘data_mean’ and ‘data_std’ (the normalized
features, the mean of the features and the standard deviation of the features,
respectively) to the main() function. ‘data_mean’ and ‘data_std’ are reused in
prediction.py to normalize the test input before predicting the car price.
4. Complete Data
The feature set is completed for processing by adding the ‘bias’ unit (a column of ones) to
every row in the data_complete() function, using the numpy.ones() function.
5. Parameter Initialization
The parameter vector ‘theta’ is randomly initialized using numpy.random.rand() and is
returned to the main() function.
6. Gradient Descent
The gradient() function minimizes the cost by updating the parameters theta iteratively.
Each iteration updates the theta values and records the resulting cost, which builds up a
cost history. The last training cost is then multiplied by one million to give the final
cost for the training set.
7. Mean Absolute Error
The mae() function calculates the mean absolute error on the training set, the CV set and
the test set. It is called from the main() function, and the results are displayed to the
user with print().
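
Steps 3 to 6 above can be sketched as follows. The function names normal() and gradient() come from the report, but their signatures, the half mean-squared-error cost, the learning rate alpha, the iteration count, the position of the bias column, and the synthetic data are all assumptions made for illustration; the report does not show the actual code.

```python
import numpy as np

def normal(x):
    """Z-score normalize each feature column and return the statistics used."""
    data_mean = x.mean(axis=0)
    data_std = x.std(axis=0)
    return (x - data_mean) / data_std, data_mean, data_std

def cost(x, y, theta):
    """Half mean squared error (an assumed cost; the report does not state its formula)."""
    m = x.shape[0]
    return float(np.sum((x @ theta - y) ** 2)) / (2 * m)

def gradient(x, y, theta, alpha=0.1, iterations=500):
    """Batch gradient descent; returns the final theta and the cost after each update."""
    m = x.shape[0]
    cost_history = []
    for _ in range(iterations):
        theta = theta - (alpha / m) * (x.T @ (x @ theta - y))
        cost_history.append(cost(x, y, theta))
    return theta, cost_history

# Synthetic stand-in data: 200 rows, 3 features (the real dataset has 9 features).
rng = np.random.default_rng(0)
raw = rng.uniform(0, 100, size=(200, 3))
x_norm, data_mean, data_std = normal(raw)
x = np.hstack([np.ones((x_norm.shape[0], 1)), x_norm])  # bias column placed first (assumption)

true_theta = np.array([[2.0], [0.5], [-1.0], [0.25]])   # hypothetical ground truth
y = x @ true_theta

theta0 = np.random.rand(4, 1)                            # random initialization, as in step 5
theta, history = gradient(x, y, theta0)
print(history[0], history[-1])                           # the cost should fall over the iterations
```

Returning data_mean and data_std alongside the normalized features matters because the same statistics have to be applied to any new input at prediction time.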
Flow of Algorithm:
(Flowchart; the stages it shows are listed below.)
• Retrieve Data
• Data Cleansing
• Define Features and Label
• Separate the data into:
  1. Training set
  2. Cross validation set
  3. Test set
• Normalize the Data
• Append bias units
• Choose a Learning Algorithm
• Train Model:
  1. Generate Random Theta Values
  2. Calculate Hypothesis
  3. Calculate initial Cost Function
  4. Gradient Descent: Update values of theta iteratively to generate cost history
• Predict Car Prices

Code Running Instructions:

The code has been split into two parts: Training and Prediction.

Training:
In the training part, the data is read from a CSV file and separated into a training feature set
and a training label set. For the dataset provided, the dimensions of the feature set and the
label set are (16281, 9) and (16281,) respectively.
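
A minimal sketch of this loading step is shown below. The data() name comes from the report; its signature, the column names, and the three sample rows are hypothetical, and an in-memory string stands in for the real CSV file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file; the real column names are not given in the report.
csv_text = """year,mileage,engine_cc,price
2015,60000,1300,1.85
2018,30000,1800,3.20
2012,95000,1000,0.95
"""

def data(source, label_column="price"):
    """Read the CSV and separate it into a feature matrix and a label vector."""
    frame = pd.read_csv(source)
    x_train = frame.drop(columns=[label_column]).to_numpy(dtype=float)
    y_train = frame[label_column].to_numpy(dtype=float)
    return x_train, y_train

x_train, y_train = data(io.StringIO(csv_text))
print(x_train.shape, y_train.shape)  # (3, 3) (3,)
```

With the real dataset, the same call would be given the path to the CSV file instead of the in-memory buffer.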

The data is then divided to form a second pair of sets, the cross-validation feature set and the
cross-validation label set, with dimensions (5425, 9) and (5425,) respectively.

The last pair of sets, the test feature set and the test label set, is created similarly. The
dimensions of this pair are (5428, 9) and (5428,).
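
The reported sizes correspond to roughly a 60/20/20 division of the rows. The report does not say how the rows were assigned, so the sketch below is only one possible approach: it assumes a shuffled split, and the split() helper, its fractions, and the seed are hypothetical.

```python
import numpy as np

def split(x, y, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the row indices and cut them into training, CV and test parts."""
    m = x.shape[0]
    order = np.random.default_rng(seed).permutation(m)
    n_train = int(round(train_frac * m))
    n_cv = int(round(cv_frac * m))
    train, cv, test = np.split(order, [n_train, n_train + n_cv])
    return (x[train], y[train]), (x[cv], y[cv]), (x[test], y[test])

# Dummy data with nine features, matching the width of the report's feature set.
x = np.arange(900, dtype=float).reshape(100, 9)
y = np.arange(100, dtype=float)

(x_train, y_train), (x_cv, y_cv), (x_test, y_test) = split(x, y)
print(x_train.shape, x_cv.shape, x_test.shape)  # (60, 9) (20, 9) (20, 9)
```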

The sets are then completed by adding the bias units, that is, by appending a column of ones to
each feature set. The resulting dimensions are

Training set: (16281, 10) (16281, 1)
Cross-validation set: (5425, 10) (5425, 1)
Test set: (5428, 10) (5428, 1)
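
The change in shape can be illustrated with a small sketch. The data_complete() name comes from the report, but this signature is an assumption, as is placing the ones in the first column and reshaping the label into a column vector inside the same function.

```python
import numpy as np

def data_complete(x, y):
    """Add a bias column of ones to the features and make the label a column vector."""
    ones = np.ones((x.shape[0], 1))
    return np.hstack([ones, x]), y.reshape(-1, 1)

# Two dummy rows with nine features each.
x = np.arange(18, dtype=float).reshape(2, 9)
y = np.array([1.5, 2.5])

x_full, y_col = data_complete(x, y)
print(x_full.shape, y_col.shape)  # (2, 10) (2, 1)
```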
Random theta values are initialized. The value set of theta is

[[0.49354865]
 [0.62653841]
 [0.11832303]
 [0.0742843 ]
 [0.42119429]
 [0.39886133]
 [0.27029176]
 [0.76941718]
 [0.92276763]
 [0.51739262]]
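
An initialization that produces a column vector of this shape might look like the following sketch. It assumes one parameter per column of the completed feature set (nine features plus the bias unit); the dummy input rows are hypothetical and only show how the hypothesis is formed.

```python
import numpy as np

n_columns = 10                        # nine features plus the bias column
theta = np.random.rand(n_columns, 1)  # uniform random values in [0, 1)

# Four dummy rows: bias unit of one, all features zero.
x = np.hstack([np.ones((4, 1)), np.zeros((4, 9))])
h = x @ theta                         # hypothesis: one predicted value per row
print(theta.shape, h.shape)           # (10, 1) (4, 1)
```

Because numpy.random.rand() draws from the interval [0, 1), every initial parameter falls in that range, which matches the values listed above.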

After completing the sets, the cost function is evaluated before training, after training, and
for the cross-validation set. These values are

Cost before training: 1.4038338473544203

Final Cost for Train Set: 0.28237719451856336

Final Cost for CV Set: 0.2663015740160541

The graph of final cost for training set and cross-validation set is plotted as:

Figure 1: Final cost for training set and cross-validation set

The next task is to calculate the mean absolute error on the three sets created earlier.

This is done using the formula error = np.sum(abs(h - y)) / m, where all the variables used
have already been defined in the code.

The results for these errors are

Mean Absolute Error for Training Set: 0.38244467157992945

Mean Absolute Error for CV Set: 0.3486779215566904

Mean Absolute Error for Test Set: 0.3678959796760039.
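
The formula above can be wrapped in a small function, as in this sketch. The mae() name comes from the report; the signature and the three-row example are assumptions, with h taken to be the hypothesis x @ theta, y the label, and m the number of rows.

```python
import numpy as np

def mae(x, y, theta):
    """Mean absolute error between the hypothesis x @ theta and the labels y."""
    m = y.shape[0]
    h = x @ theta
    return np.sum(np.abs(h - y)) / m

# Hypothetical example: bias column plus one feature.
x = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([[1.0], [2.0], [4.0]])
theta = np.array([[1.0], [1.0]])

error = mae(x, y, theta)  # absolute errors are 0, 0 and 1, so the mean is 1/3
print(error)
```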

Prediction:
The training2 code file is imported into the prediction file so that the values calculated for
all the sets created can be used.

In the prediction() function, an array of all nine features is created, and the data is
normalized using the mean and standard deviation computed in the training2 code file.

This array is appended with ones so that the dot product of the test data with the final value
of theta can be calculated.

The value obtained in the previous step is multiplied by one million to get an appropriate price
for the car whose price needs to be predicted.
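
The prediction steps can be sketched as below. The prediction() name comes from the report, but the signature is an assumption, and the statistics, theta values, and input are hypothetical; the example uses two features instead of nine to keep it short, and places the bias unit first.

```python
import numpy as np

def prediction(features, theta, data_mean, data_std):
    """Normalize one input row, add the bias unit, and scale the result to a price."""
    x = (np.asarray(features, dtype=float) - data_mean) / data_std
    x = np.concatenate([[1.0], x])             # bias unit placed first (assumption)
    return float(x @ theta.ravel()) * 1_000_000

# Hypothetical training statistics and trained parameters for two features.
data_mean = np.array([2015.0, 50000.0])
data_std = np.array([3.0, 20000.0])
theta = np.array([[2.0], [0.5], [-0.3]])

price = prediction([2018, 30000], theta, data_mean, data_std)
print(round(price))  # normalized input is [1, -1], so the result is about 2.8 million
```

The input must be normalized with the training mean and standard deviation rather than statistics computed from the input itself, since a single row has no meaningful spread.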

An example prediction is attached in the following snapshot

Conclusion:
The program predicts car prices by dividing the provided data into training, cross-validation
and test sets, normalizing the features, and fitting a multiple linear regression model with
gradient descent to generate the prediction results.
