ML Assignment2 33418
SAP ID 33418
Assignment No 02
Introduction
• The purpose of this project is to predict housing prices using a Linear Regression model.
Steps of Implementation
1. Importing Dependencies
The first step is to import the necessary libraries: NumPy, Matplotlib, pandas, and scikit-learn (sklearn).
• The dataset (BostonHousing.csv) was uploaded to Google Colab, and the pandas library was used to load
the data.
MACHINE LEARNING
Riphah International University I-14 Islamabad
The pandas.read_csv() function was used to read the CSV file, and df.head() displays the first five rows to show
the dataset structure.
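The loading step can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the real BostonHousing.csv is replaced by a small in-memory stand-in so the example runs anywhere, and the column names (crim, rm, age, medv) are assumptions about the file's layout.

```python
import io

import pandas as pd

# Stand-in for BostonHousing.csv: a few illustrative rows with assumed
# column names, so the example runs without the real file.
csv_text = """crim,rm,age,medv
0.00632,6.575,65.2,24.0
0.02731,6.421,78.9,21.6
0.02729,7.185,61.1,34.7
"""

# pd.read_csv accepts a file path or a file-like object. In the Colab
# notebook this would be pd.read_csv("BostonHousing.csv").
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default (only three exist here).
print(df.head())
print(df.shape)
```

Checking df.shape and df.dtypes right after loading is a quick way to confirm that every column was parsed as a number.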
To understand the relationships between different features, we use Seaborn's pairplot() function to visualize the
dataset.
This creates pairwise plots of all features, allowing us to visually explore potential correlations between variables.
Seaborn makes it easier to observe patterns and distributions.
We separate the dataset into features (X) and the target variable (y). Here, we assume that the last column in the
dataset represents the house prices (target), and the rest are the features.
X includes all columns except the last, which is the target (y). The pandas iloc indexer is used to select the features and
the target from the dataset.
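The positional selection described above can be sketched like this. The tiny DataFrame and its column names are hypothetical. The only thing that matters for the slicing is that the target sits in the last column.

```python
import pandas as pd

# Hypothetical miniature dataset; the last column plays the role of price.
df = pd.DataFrame({
    "rm": [6.5, 6.4, 7.1, 6.9],
    "age": [65.2, 78.9, 61.1, 45.8],
    "medv": [24.0, 21.6, 34.7, 33.4],
})

# iloc indexes by position: ":" keeps every row, ":-1" keeps every column
# except the last, and "-1" picks the last column only.
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

print(X.shape, y.shape)
```

Because the selection is positional, it silently picks the wrong target if the column order changes. Selecting the target by name is a more defensive alternative when the column name is known.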
We split the dataset into training and testing sets using train_test_split() from sklearn.
This step is critical for evaluating the model on data it did not see during training. We use 60% of the data for training and 40% for testing.
Setting random_state fixes the shuffle seed, so the same split is produced on every run.
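A 60/40 split might be written as in the sketch below. The arrays are placeholders, and random_state=1 is an assumed value, since the report does not state which seed was used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 20 samples with 2 features each.
X = np.arange(40).reshape(20, 2)
y = np.arange(20)

# test_size=0.4 holds out 40% of the rows for testing.
# random_state=1 is an assumed seed; any fixed integer makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1
)

print(X_train.shape, X_test.shape)
```

Calling train_test_split() again with the same random_state returns the same rows in each set, which is what makes results comparable between runs.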
We create and train the Linear Regression model on the training dataset.
The LinearRegression() object is created and fitted with the training data using fit(). The model learns the
relationships between the features and the target (house prices) during this training phase.
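What fit() learns can be shown with a small sketch. The training data here is synthetic and noiseless, generated from a known linear rule (an assumption made for illustration), so the fitted coefficients can be compared with the values used to create it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic training data: y = 3*x0 - 2*x1 + 5, with no noise.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
y_train = 3.0 * X_train[:, 0] - 2.0 * X_train[:, 1] + 5.0

# fit() finds the coefficients and intercept that minimize the sum of
# squared differences between predictions and targets (ordinary least squares).
model = LinearRegression()
model.fit(X_train, y_train)

print(model.coef_)       # one weight per feature
print(model.intercept_)  # constant term
```

On the housing data the coefficients will not match any known rule, but coef_ still shows how much the predicted price changes per unit change in each feature.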
After training the model, we make predictions on the test data and evaluate the model's performance.
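The prediction and evaluation step could look like the sketch below. It uses synthetic data, and it assumes the variance score is the value returned by score() (the R² coefficient of determination), because the report does not show which function was called. explained_variance_score() from sklearn.metrics is another possible reading.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: a linear rule plus a small amount of Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

# 60/40 split; random_state=1 is an assumed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1
)

model = LinearRegression().fit(X_train, y_train)

# Predict on the held-out rows only, then compare with the true targets.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)  # average squared error, lower is better
r2 = model.score(X_test, y_test)          # R^2, where 1.0 is a perfect fit

print("Mean Squared Error:", mse)
print("Variance score (R^2):", r2)
```

Mean Squared Error is expressed in squared target units, so taking its square root gives an error figure on the same scale as the prices themselves.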
8. Visualizing the Actual vs Predicted Prices
We create a scatter plot to compare the actual house prices with the predicted values.
This plot allows us to visually assess how well the model performed. The closer the points are to the red dashed
line (which represents perfect predictions, where predicted equals actual), the better the model's performance.
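A plot of this kind can be sketched as follows. The actual and predicted values are synthetic placeholders, and the axis labels and title are assumptions. In the notebook, the test targets and the model's predictions would take their place.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Colab

import matplotlib.pyplot as plt
import numpy as np

# Placeholder values standing in for the test targets and model predictions.
rng = np.random.default_rng(0)
y_test = rng.uniform(10.0, 50.0, 60)
y_pred = y_test + rng.normal(0.0, 3.0, 60)

fig, ax = plt.subplots()
ax.scatter(y_test, y_pred, alpha=0.7)

# Red dashed diagonal: points on this line have predicted == actual.
lo, hi = y_test.min(), y_test.max()
ax.plot([lo, hi], [lo, hi], "r--", label="Perfect prediction")

ax.set_xlabel("Actual price")
ax.set_ylabel("Predicted price")
ax.set_title("Actual vs Predicted Prices")
ax.legend()
```

Points above the diagonal are over-predictions and points below it are under-predictions, so a systematic tilt away from the line suggests bias that a single error score can hide.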
Conclusion
In this project, we successfully built a Linear Regression model using the Boston Housing dataset to predict
house prices. The model's performance was evaluated using Mean Squared Error and Variance score, and the
actual vs predicted prices were visualized using a scatter plot. We observed that the model was able to capture the
general trends in the data, though some predictions deviated from actual prices. Future improvements could
include using more advanced models or tuning the existing model for better accuracy.