
Simple Linear Regression - Assign2

A food delivery service recorded delivery time and time taken for orders to be sorted to improve services. A simple linear regression model was built with delivery time as the target variable. Log, exponential, and polynomial transformations were applied and RMSE and correlation values were recorded for each. The log transformation model had the lowest RMSE and was selected as best. The final model was fitted on training and test split data, and the final RMSE value was reported.

Uploaded by

Sravani Adapa
Copyright: © All Rights Reserved

Simple Linear Regression With scikit-learn

There are five basic steps when you’re implementing linear regression:

1. Import the packages and classes you need.
2. Provide data to work with and, where needed, apply appropriate transformations.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.

These steps are more or less general for most of the regression approaches and implementations.
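The five steps above can be sketched end to end as follows. This is a minimal illustration, assuming made-up data that lies exactly on the line y = 2x + 1; the arrays x and y are not the assignment's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # Step 1: imports

# Step 2: made-up data lying exactly on y = 2x + 1 (illustrative only).
x = np.array([1, 2, 3, 4, 5], dtype=float).reshape(-1, 1)  # 2-D: one column
y = np.array([3, 5, 7, 9, 11], dtype=float)

# Step 3: create the model and fit it.
model = LinearRegression().fit(x, y)

# Step 4: check the fit (R^2, intercept, slope).
r_sq = model.score(x, y)
print(r_sq, model.intercept_, model.coef_)

# Step 5: predict the output for a new input.
y_new = model.predict(np.array([[6.0]]))
print(y_new)
```

Because the data is exactly linear, the fitted slope and intercept recover 2 and 1 up to floating-point error.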

Problem Statement: -

A food delivery service recorded the data of delivery time taken and the time taken for the
deliveries to be sorted by the restaurants in order to improve their delivery services.
Approach – A Simple Linear regression model needs to be built with target variable
‘Delivery.Time’. Apply necessary transformations and record the RMSE values, Correlation
coefficient values for different transformation models.

Step 1: Import packages and classes

The first step is to import the package numpy and the class LinearRegression from sklearn.linear_model:

import numpy as np
from sklearn.linear_model import LinearRegression
You now have all the functionality you need to implement linear regression.

The fundamental data type of NumPy is the array type called numpy.ndarray. The rest of this article
uses the term array to refer to instances of the type numpy.ndarray.
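As a small illustration of working with arrays, the sketch below builds a one-dimensional array and reshapes it into the single-column, two-dimensional form that scikit-learn estimators expect for their inputs. The name sort_time and its values are hypothetical.

```python
import numpy as np

# Hypothetical sorting times; illustrative values only.
sort_time = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
print(type(sort_time), sort_time.shape)  # a 1-D numpy.ndarray with 5 elements

# scikit-learn expects inputs of shape (n_samples, n_features),
# so a single regressor is reshaped into one column.
X = sort_time.reshape(-1, 1)
print(X.shape)
```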

The class sklearn.linear_model.LinearRegression will be used to perform linear and polynomial regression and make predictions accordingly.

Step 2: Provide data

The second step is to define the data to work with: the inputs (regressors, 𝑥) and the output (response, 𝑦).
The file calories_consumed.csv is imported.

Exploratory data analysis is then performed on the data.
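A first pass at exploratory data analysis might look like the sketch below. It assumes pandas is available and uses a small in-memory table in place of the imported file; the column names sort_time and delivery_time and all values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the imported file: column names and values
# are assumptions made for illustration, not the assignment's data.
df = pd.DataFrame({
    "sort_time": [2, 3, 4, 5, 6, 7, 8, 9, 10, 10],
    "delivery_time": [8.0, 9.5, 12.0, 13.5, 15.0, 17.8, 18.1, 19.0, 21.0, 24.0],
})

# Summary statistics (count, mean, std, quartiles) for each column.
print(df.describe())

# Number of missing values per column.
missing = df.isna().sum()
print(missing)

# Pearson correlation between the input and the output.
corr = df["sort_time"].corr(df["delivery_time"])
print(corr)
```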

Step 3: Create a model and fit it

The next step is to create a linear regression model and fit it using the existing data.

Let’s create an instance of the class LinearRegression, which will represent the regression model:

Simple linear regression


model = LinearRegression()
This statement creates the variable model as an instance of LinearRegression. You can provide several
optional parameters to LinearRegression, for example fit_intercept.
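To show the effect of one optional parameter, the sketch below fits the same made-up data twice: once with the defaults and once with fit_intercept=False, which forces the line through the origin. The data (y = 2x + 1) is illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on y = 2x + 1.
x = np.array([1, 2, 3, 4], dtype=float).reshape(-1, 1)
y = np.array([3, 5, 7, 9], dtype=float)

# Default: the intercept is estimated from the data.
default_model = LinearRegression().fit(x, y)

# fit_intercept=False: the intercept is fixed at 0, so the slope changes.
origin_model = LinearRegression(fit_intercept=False).fit(x, y)

print(default_model.intercept_, default_model.coef_)
print(origin_model.intercept_, origin_model.coef_)
```

Without an intercept, the least squares slope is sum(x*y) / sum(x*x), which for this data is 70/30.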

statsmodels.formula.api is imported to build an ordinary least squares (OLS) model of the data:

import statsmodels.formula.api as smf
model1 = smf.ols('calories ~ weight', data=cal_data).fit()
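The formula interface can be tried on a self-contained example as sketched below. The DataFrame, its column names (sort_time, delivery_time), and its values are assumptions for illustration; np.polyfit is used only as an independent cross-check of the fitted coefficients.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: names and values are illustrative assumptions.
df = pd.DataFrame({
    "sort_time": [2, 3, 4, 5, 6, 7, 8, 9, 10, 10],
    "delivery_time": [8.0, 9.5, 12.0, 13.5, 15.0, 17.8, 18.1, 19.0, 21.0, 24.0],
})

# "output ~ input" fits output = Intercept + slope * input by OLS.
model1 = smf.ols("delivery_time ~ sort_time", data=df).fit()
print(model1.params)    # Intercept and sort_time coefficient
print(model1.rsquared)  # coefficient of determination

# Predicted values, e.g. for drawing the regression line.
pred = model1.predict(df)

# Cross-check with a degree-1 least squares polynomial fit.
slope_check, intercept_check = np.polyfit(df["sort_time"], df["delivery_time"], 1)
```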

The regression line is plotted after obtaining the predicted values.

After the scatter plot is drawn, the root mean squared error (RMSE) is calculated.
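The RMSE is the square root of the mean of the squared differences between the actual and predicted values. A minimal sketch, with a hypothetical helper named rmse and made-up numbers:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Errors are 1, 0 and -2, so the RMSE is sqrt((1 + 0 + 4) / 3).
value = rmse([3, 5, 7], [2, 5, 9])
print(value)
```

A lower RMSE means the predictions are closer to the observed values, in the same units as the output.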

To reduce the error and obtain a better-fitting line, transformations are applied to the data.

Log transformation

In the log transformation, the transformation is applied to the x data:

# x = log(sort_time), y = time

A scatter plot is drawn.

The correlation coefficient between the transformed input and the output is then obtained.

model2 is built on the transformed data.

The new regression line is plotted.

The new RMSE is calculated.
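The log-transformation steps might be sketched as follows, here with scikit-learn rather than the statsmodels formula interface. The data is made up (a logarithmic trend plus small fixed offsets), and the names sort_time, delivery_time, and rmse_log are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: logarithmic trend plus small fixed offsets (illustrative only).
sort_time = np.arange(1, 9, dtype=float)
noise = np.array([0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.2, -0.2])
delivery_time = 6.0 + 9.0 * np.log(sort_time) + noise

# Transform the input only: x = log(sort_time), y = time.
log_x = np.log(sort_time)

# Correlation between the transformed input and the output.
corr_log = np.corrcoef(log_x, delivery_time)[0, 1]

# Fit on the transformed input, then compute RMSE on the original output scale.
model2 = LinearRegression().fit(log_x.reshape(-1, 1), delivery_time)
pred2 = model2.predict(log_x.reshape(-1, 1))
rmse_log = float(np.sqrt(np.mean((delivery_time - pred2) ** 2)))
print(corr_log, rmse_log)
```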

Exponential transformation

In the exponential transformation, the transformation is applied to the y data:

# x = sort_time, y = log(time)

A scatter plot is drawn.

The correlation coefficient between the input and the transformed output is then obtained.

model3 is built on the transformed data.

The new regression line is plotted.

The new RMSE is calculated.
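A sketch of the exponential model is shown below. Since the model is fitted to log(time), the predictions are passed through exp before computing RMSE so that the error is on the same scale as the other models; this back-transformation is an assumption about how the comparison should be made, not something the write-up states. The data is made up and exactly exponential.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data following log(y) = 1 + 0.2 * x exactly (illustrative only).
sort_time = np.arange(1, 9, dtype=float)
delivery_time = np.exp(1.0 + 0.2 * sort_time)

# Transform the output only: x = sort_time, y = log(time).
X = sort_time.reshape(-1, 1)
log_y = np.log(delivery_time)

# Correlation between the input and the transformed output.
corr_exp = np.corrcoef(sort_time, log_y)[0, 1]

# Fit on log(y), then map predictions back to the original scale with exp.
model3 = LinearRegression().fit(X, log_y)
pred3 = np.exp(model3.predict(X))
rmse_exp = float(np.sqrt(np.mean((delivery_time - pred3) ** 2)))
print(model3.intercept_, model3.coef_, rmse_exp)
```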


Polynomial transformation

x = sort_time, x^2 = sort_time*sort_time, y = log(time)

PolynomialFeatures is imported from sklearn.preprocessing to build the polynomial regression.

The new regression line is plotted.

The RMSE is obtained from this regression model.
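The polynomial model can be sketched with PolynomialFeatures as below. The data is made up so that log(time) is an exact quadratic in sort_time; include_bias=False is chosen here because LinearRegression already estimates the intercept. As with the exponential model, predictions are mapped back with exp before computing RMSE, which is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up data: log(y) = 1 + 0.3 * x - 0.01 * x^2 exactly (illustrative only).
sort_time = np.arange(1, 11, dtype=float)
delivery_time = np.exp(1.0 + 0.3 * sort_time - 0.01 * sort_time ** 2)

# Build the columns [x, x^2]; the intercept is left to LinearRegression.
X = sort_time.reshape(-1, 1)
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

# Fit log(y) on [x, x^2], then return predictions to the original scale.
model4 = LinearRegression().fit(X_poly, np.log(delivery_time))
pred4 = np.exp(model4.predict(X_poly))
rmse_poly = float(np.sqrt(np.mean((delivery_time - pred4) ** 2)))
print(model4.intercept_, model4.coef_, rmse_poly)
```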

The best model is chosen by comparing the RMSE values of all the transformations above.

The models and their respective RMSE values are tabulated.

From these observations, the log model is taken as the best.
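Tabulating the RMSE values and picking the smallest could be done as in the sketch below. The numbers are placeholders chosen only to make the example run; they are not the assignment's results.

```python
# Placeholder RMSE values for illustration only; NOT the assignment's results.
rmse_by_model = {
    "simple": 3.2,
    "log": 2.7,
    "exponential": 2.9,
    "polynomial": 3.0,
}

# Print the models from lowest to highest RMSE.
for name, value in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name:<12} {value:.2f}")

# The model with the lowest RMSE is selected.
best_model = min(rmse_by_model, key=rmse_by_model.get)
print(best_model)
```

This comparison is only meaningful when every RMSE is computed on the same scale of the output.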

Step 4: Get results

Once you have your model fitted, you can get the results to check whether the model works
satisfactorily and interpret it.

The summary of the final model is:


The final model is fitted on the train and test split data, and the predictions are examined.

The final RMSE value is:
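One way to carry out the train and test split is sketched below, assuming scikit-learn's train_test_split with an 80/20 split and a fixed random_state for repeatability. The data is made up, and fitting on the training part while reporting RMSE on both parts is an assumption about the procedure.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data: logarithmic trend plus small fixed offsets (illustrative only).
sort_time = np.arange(1, 11, dtype=float)
noise = np.array([0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.2, -0.2, 0.1, -0.1])
delivery_time = 6.0 + 9.0 * np.log(sort_time) + noise

# Log model: the input is log(sort_time).
X = np.log(sort_time).reshape(-1, 1)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, delivery_time, test_size=0.2, random_state=0
)

# Fit on the training rows only.
final_model = LinearRegression().fit(X_train, y_train)

# RMSE on the rows used for fitting and on the held-out rows.
rmse_train = float(np.sqrt(np.mean((y_train - final_model.predict(X_train)) ** 2)))
rmse_test = float(np.sqrt(np.mean((y_test - final_model.predict(X_test)) ** 2)))
print(rmse_train, rmse_test)
```

A test RMSE much larger than the training RMSE would suggest the model does not generalize well to unseen orders.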
