Linear Regression Mca Lab - Jupyter Notebook

This document discusses performing linear regression on a Boston housing dataset using Python. It loads the dataset, splits it into training and test sets, trains two linear regression models with different independent variables ('crim' and 'lstat'), predicts values for the test sets, and calculates the mean squared error for each model. The models aim to predict the median house price ('medv') using other features from the dataset.

Uploaded by

Smriti Sharma

28/08/2023, 13:22 linear regression mca lab - Jupyter Notebook

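In [1]:  #import pandas to load the dataset


import pandas as pd

In [2]:  #read the CSV file into a DataFrame named boston
#raw string (r"...") so the backslashes in the Windows path are not treated as escape sequences


boston=pd.read_csv(r"D:\smriti iitr and tarot material\data science and python course docs\data files for panda\Boston.csv")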

In [3]:  boston.head()
Out[3]:
   Unnamed: 0     crim    zn  indus  chas    nox     rm   age     dis  rad  tax  ptratio   black  lstat  medv
0           1  0.00632  18.0   2.31     0  0.538  6.575  65.2  4.0900    1  296     15.3  396.90   4.98  24.0
1           2  0.02731   0.0   7.07     0  0.469  6.421  78.9  4.9671    2  242     17.8  396.90   9.14  21.6
2           3  0.02729   0.0   7.07     0  0.469  7.185  61.1  4.9671    2  242     17.8  392.83   4.03  34.7
3           4  0.03237   0.0   2.18     0  0.458  6.998  45.8  6.0622    3  222     18.7  394.63   2.94  33.4
4           5  0.06905   0.0   2.18     0  0.458  7.147  54.2  6.0622    3  222     18.7  396.90   5.33  36.2
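
The `Unnamed: 0` column in the output above looks like a row index that was saved into the CSV along with the data. A minimal sketch, assuming that is the case, of reading such a column as the index with `index_col=0`. The inline CSV text is a small stand-in for Boston.csv, not the real file.

```python
import io
import pandas as pd

# Stand-in for Boston.csv: the first column has an empty header, as happens
# when a DataFrame is written to CSV together with its index.
csv_text = """,crim,lstat,medv
1,0.00632,4.98,24.0
2,0.02731,9.14,21.6
3,0.02729,4.03,34.7
"""

# Default read: pandas names the header-less column "Unnamed: 0"
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the row index instead
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

Reading the file this way keeps the row counter out of the feature columns, so it cannot be picked as a predictor by mistake.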

In [4]:  #different factors determine the house price



#goal: predict the median value of homes (medv)
#medv is the dependent variable

#any other column, such as crim, can serve as the independent variable

In [5]:  y= boston[["medv"]]

In [6]:  x = boston[["crim"]]

In [7]:  #import linear regression module from sklearn



#scikit-learn is a free machine learning library for Python

#on PyPI the package is named scikit-learn; in code it is imported as sklearn



In [8]:  from sklearn.linear_model import LinearRegression

In [9]:  #before building the model, split the dataset into training and test data

from sklearn.model_selection import train_test_split

In [10]:  #before building the model, split the dataset into training and test data

In [11]:  x_train, x_test, y_train , y_test =train_test_split(x,y,test_size= 0.3)



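#x is the independent variable, y is the dependent variable, test_size is the fraction held out for testing

#30% of observations go to the test set, the rest to the training set

#x_test is the test set of the independent variable, x_train is its training set
#(train_test_split returns 4 results, so we unpack them into 4 objects)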
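
The split above is random, so each run of the cell selects different rows. A minimal sketch of making it repeatable with the `random_state` parameter of `train_test_split`. The toy arrays and the seed value 42 are assumptions for illustration, not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for x and y (10 observations, 1 feature)
x_demo = np.arange(10).reshape(-1, 1)
y_demo = 2 * np.arange(10)

# random_state=42 is an arbitrary seed; any fixed integer gives a repeatable split
split_a = train_test_split(x_demo, y_demo, test_size=0.3, random_state=42)
split_b = train_test_split(x_demo, y_demo, test_size=0.3, random_state=42)

x_train_a, x_test_a, y_train_a, y_test_a = split_a
x_train_b, x_test_b, y_train_b, y_test_b = split_b

print(len(x_train_a), len(x_test_a))        # sizes of the training and test parts
print(np.array_equal(x_test_a, x_test_b))   # same seed, same test rows
```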

In [12]:  lr = LinearRegression()

#instantiate LinearRegression and store the instance in lr

In [13]:  lr.fit (x_train, y_train)



#model fit on training data

Out[13]: LinearRegression()
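
With a single feature, fitting by ordinary least squares amounts to finding one slope and one intercept. The sketch below computes them directly in plain Python to show what the fit step estimates. `fit_line` and the toy points are hypothetical names and values, not part of the notebook. On the fitted sklearn model, the corresponding values are available as `lr.coef_` and `lr.intercept_`.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy points lying exactly on y = 2x + 1
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)
```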

#now predict values on x_test



#train on the training set, predict on the test set

In [14]:  y_pred=lr.predict(x_test)

In [15]:  y_test.head()

Out[15]:
medv

216 23.3

293 23.9

51 20.5

2 34.7

151 19.6

In [16]:  y_pred[0:5]

#the predictions differ from the actual values above (residual errors)

Out[16]: array([[23.80451205],
[23.78965233],
[23.80540644],
[23.81185568],
[23.22266879]])


In [17]:  #to measure the residual error, import a metric from sklearn.metrics

In [18]:  from sklearn.metrics import mean_squared_error

In [19]:  mean_squared_error(y_test, y_pred) #pass the actual values and the predicted values (stored in y_pred)

Out[19]: 81.57800347209668
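
The mean squared error is the average of the squared differences between actual and predicted values. A minimal plain-Python sketch of that calculation follows. The `mse` helper and the toy numbers are hypothetical. The last lines take the square root of the value in Out[19], which puts the error back in the units of medv (thousands of dollars).

```python
import math


def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy values, not taken from the notebook
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
toy_mse = mse(actual, predicted)   # (1 + 0 + 4) / 3
print(toy_mse)

# Root of the MSE reported in Out[19], in the same units as medv
rmse_crim = math.sqrt(81.57800347209668)
print(round(rmse_crim, 2))   # about 9.03
```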

In [20]:  x = boston[["lstat"]]

In [21]:  x_train, x_test, y_train , y_test =train_test_split(x,y,test_size= 0.3)

In [22]:  lr2 = LinearRegression()

In [23]:  lr2.fit (x_train, y_train)

Out[23]: LinearRegression()

In [24]:  y_pred=lr2.predict(x_test)

In [25]:  y_test.head()

Out[25]:
medv

217 28.7

346 17.2

278 29.1

324 25.0

396 12.5

In [26]:  y_pred[0:5]

Out[26]: array([[25.58101115],
[22.70333202],
[27.9951715 ],
[29.02843213],
[16.23338229]])

In [27]:  mean_squared_error(y_test, y_pred)

Out[27]: 34.66868885942193
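
In [11] and In [21] each call train_test_split without a fixed seed, so the two MSE values above are computed on different test rows. A hedged sketch of comparing single-feature models on one shared split follows. `compare_features`, the seed, and the synthetic DataFrame are assumptions for illustration; with the real data, `boston` would be passed in place of `demo`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def compare_features(df, features, target="medv", seed=0):
    """Fit one single-feature model per column on the same split; return {feature: test MSE}."""
    # Split the rows once so every model sees identical train and test rows
    train, test = train_test_split(df, test_size=0.3, random_state=seed)
    scores = {}
    for col in features:
        model = LinearRegression().fit(train[[col]], train[target])
        pred = model.predict(test[[col]])
        scores[col] = mean_squared_error(test[target], pred)
    return scores


# Synthetic stand-in for the boston DataFrame (random numbers, not real data):
# medv depends on lstat by construction and not on crim
rng = np.random.default_rng(0)
demo = pd.DataFrame({"crim": rng.normal(size=200), "lstat": rng.normal(size=200)})
demo["medv"] = 3.0 * demo["lstat"] + rng.normal(scale=0.5, size=200)

scores = compare_features(demo, ["crim", "lstat"])
print(scores)
```

Sharing the split means any difference in MSE reflects the choice of feature and not the luck of which rows landed in the test set.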
