Linear Regression-Part 2

Linear regression models the relationship between one or more independent variables (x) and a dependent variable (y). The linear regression equation is f(x) = b0 + b1x, where b0 is the y-intercept and b1 is the slope of the line. The goal of linear regression is to choose values for b0 and b1 that minimize the sum of squared residuals between the actual y-values and the predicted y-values from the linear model. The residuals are the vertical distances between the data points and the linear regression line, representing the error in the predictions. Linear regression aims to minimize these errors by fitting the "line of best fit" through the data points.


Linear regression

Part 2
Linear Regression
Continue
• When implementing simple linear regression, you typically start with a given set of input-output (𝑥-𝑦) pairs.
• These pairs are your observations, shown as green circles in the figure.
• For example, the leftmost observation has the input 𝑥 = 5 and the actual output, or response, 𝑦 = 5. The next one has 𝑥 = 15 and 𝑦 = 20, and so
on.
• The estimated regression function, represented by the black line, has the equation
𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥.
• Your goal is to calculate the optimal values of the predicted weights 𝑏₀ and 𝑏₁ that minimize SSR (the sum of squared residuals) and determine the estimated regression function.
• The value of 𝑏₀, also called the intercept, shows the point where the estimated regression line crosses the 𝑦 axis.
• It’s the value of the estimated response 𝑓(𝑥) for 𝑥 = 0.
• The value of 𝑏₁ determines the slope of the estimated regression line.
• The predicted responses, shown as red squares, are the points on the regression line that correspond to the input
values.
• For example, for the input 𝑥 = 5, the predicted response is 𝑓(5) = 8.33, which the leftmost red square represents.
• The vertical dashed grey lines represent the residuals, which can be calculated as
𝑦ᵢ - 𝑓(𝑥ᵢ) = 𝑦ᵢ - 𝑏₀ - 𝑏₁𝑥ᵢ for 𝑖 = 1, …, 𝑛.

• They’re the distances between the green circles and red squares.
• When you implement linear regression, you’re actually trying to minimize these distances and make the red squares as close to the
predefined green circles as possible.
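
A minimal sketch in Python of what these bullets describe: fit the estimated regression function, compute the predicted responses and the residuals, and sum the squared residuals. Only the two observations quoted above and the prediction 𝑓(5) = 8.33 come from the slide; the remaining data values are illustrative assumptions.

import numpy as np

# Illustrative data: only (5, 5), (15, 20) and f(5) = 8.33 are stated on the slide.
x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([5, 20, 14, 32, 22, 38], dtype=float)

# np.polyfit with deg=1 returns the least squares slope and intercept (highest power first).
b1, b0 = np.polyfit(x, y, deg=1)

y_pred = b0 + b1 * x          # predicted responses (the "red squares")
residuals = y - y_pred        # vertical distances between observations and the line
ssr = np.sum(residuals ** 2)  # the sum of squared residuals that the fit minimizes

print(f"f(x) = {b0:.2f} + {b1:.2f}x")        # approximately f(x) = 5.63 + 0.54x for this data
print(f"f(5) = {b0 + b1 * 5:.2f}, SSR = {ssr:.2f}")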
linear regression equation
Example
Line of Best Fit

• The linear regression model has to find the line of best fit.
• We know the equation of a line is y = mx + c.
• There are infinitely many possible values of m and c; which one should we choose?
• Out of all possible lines, how do we find the best fit line?
• The line of best fit is calculated by using a cost function: the least sum of squares of errors.
• The line of best fit will have the least sum of squared errors.
Cost Function

• For all possible lines, calculate the sum of squares of errors. The line
which has the least sum of squares of errors is the best fit line.
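
A small sketch of this idea: evaluate the sum of squared errors for a grid of candidate (m, c) pairs and keep the smallest. The data points and the candidate grid below are purely illustrative; in practice the closed-form formulas on the next slide give the exact minimizer without a search.

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)   # illustrative observations

best = None
for m in np.linspace(0.0, 2.0, 41):          # candidate slopes
    for c in np.linspace(-2.0, 4.0, 61):     # candidate intercepts
        sse = np.sum((y - (m * x + c)) ** 2)
        if best is None or sse < best[0]:
            best = (sse, m, c)

sse, m, c = best
print(f"best candidate: y = {m:.2f}x + {c:.2f}, SSE = {sse:.2f}")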
The Least Squares Regression Line
• Definition
• Given a collection of pairs (x,y) of numbers (in which not all the x-values are the same), there
is a line that best fits the data in the sense of minimizing the sum of the
squared errors.
• It is called the least squares regression line.
• Its slope 𝑏₁ and y-intercept 𝑏₀ are computed using the formulas:

𝑏₁ = SSxy / SSxx and 𝑏₀ = ȳ − 𝑏₁x̄

where SSxy = Σ𝑥𝑦 − (Σ𝑥)(Σ𝑦)/𝑛, SSxx = Σ𝑥² − (Σ𝑥)²/𝑛, 𝑛 is the number of pairs, and x̄ and ȳ are the means of the 𝑥-values and 𝑦-values.
Example
• Find the least squares regression line for the five-point data set:
X 2 2 6 8 10
Y 0 1 2 3 3

1. Show the data in tabular form, together with the columns 𝑥𝑦 and 𝑥² needed by the formulas:

x:   2    2    6    8    10     (Σx = 28)
y:   0    1    2    3    3      (Σy = 9)
xy:  0    2    12   24   30     (Σxy = 68)
x²:  4    4    36   64   100    (Σx² = 208)

Continue

The least squares regression line for these data is

ŷ = 0.34375𝑥 − 0.125

since SSxy = 68 − (28)(9)/5 = 17.6 and SSxx = 208 − (28)²/5 = 51.2, so 𝑏₁ = 17.6/51.2 = 0.34375 and 𝑏₀ = 9/5 − 0.34375(28/5) = −0.125.

Continue
• The computations for measuring how well it fits the sample data are given below:

x     y     ŷ         y − ŷ      (y − ŷ)²
2     0     0.5625    −0.5625    0.3164
2     1     0.5625     0.4375    0.1914
6     2     1.9375     0.0625    0.0039
8     3     2.6250     0.3750    0.1406
10    3     3.3125    −0.3125    0.0977

The sum of the squared errors (SSE) is the sum of the numbers in the last column, which is 0.75.
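
These numbers can be checked with a short NumPy sketch of the same computation:

import numpy as np

x = np.array([2, 2, 6, 8, 10], dtype=float)
y = np.array([0, 1, 2, 3, 3], dtype=float)
n = len(x)

ss_xx = np.sum(x * x) - np.sum(x) ** 2 / n          # 51.2
ss_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n   # 17.6

b1 = ss_xy / ss_xx             # slope = 0.34375
b0 = y.mean() - b1 * x.mean()  # intercept = -0.125

sse = np.sum((y - (b0 + b1 * x)) ** 2)
print(b1, b0, sse)             # 0.34375 -0.125 0.75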

Residual sum of squares (RSS), also known as SSE, measures the difference between the observed (actual) values of the variable and the estimated values, i.e. what they should be according to the line of regression.

Here ŷ denotes the value estimated by the regression line.


Performance metric for Regression Line
• An important element of any machine learning model is evaluating the accuracy of the model.
• The metrics below can be used to evaluate the accuracy of the model:
• Mean squared error (MSE)
• Mean absolute error (MAE)
• Root mean squared error (RMSE)
• R-squared, or coefficient of determination
Mean Absolute error (MAE)
• Mean absolute error (MAE) is a loss
function used for regression.
• Use MAE when you are doing regression
and don't want outliers to play a big role.
• The mean absolute error represents the average of the absolute differences between the actual and predicted values in the dataset.
• The loss is the mean over the absolute differences between true and predicted values; deviations in either direction from the true value are treated the same way.
• It measures the average of the absolute residuals in the dataset.
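• For 𝑛 observations with actual values 𝑦ᵢ and predictions 𝑓(𝑥ᵢ), MAE can be written as MAE = (1/𝑛) Σ |𝑦ᵢ − 𝑓(𝑥ᵢ)|, with the sum taken over 𝑖 = 1, …, 𝑛.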
Mean Squared Error (MSE)
• Mean Squared Error represents the average of the squared difference
between the original and predicted values in the data set.
• It measures the variance of the residuals.

Variance measures how far each number in the set is from the mean (average), and thus from every other number in the set.
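• In the same notation, MSE can be written as MSE = (1/𝑛) Σ (𝑦ᵢ − 𝑓(𝑥ᵢ))², with the sum taken over 𝑖 = 1, …, 𝑛.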
Root Mean Squared Error (RMSE)
• Root Mean Squared Error is the square root of Mean Squared error.
• It measures the standard deviation of residuals.

• A standard deviation (or σ) is a measure of how dispersed the data is in relation to the mean.
• Low standard deviation means data are clustered around the mean,
• High standard deviation indicates data are more spread out.
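• In the same notation, RMSE is simply RMSE = √MSE = √((1/𝑛) Σ (𝑦ᵢ − 𝑓(𝑥ᵢ))²).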
The coefficient of determination or R-squared
• The coefficient of determination or R-squared represents the
proportion of the variance in the dependent variable which is
explained by the linear regression model.
• It is a scale-free score, i.e. regardless of whether the values are small or large, the value of R-squared is at most one.
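• With ȳ the mean of the actual values, R-squared can be written as R² = 1 − Σ (𝑦ᵢ − 𝑓(𝑥ᵢ))² / Σ (𝑦ᵢ − ȳ)².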
Continue
• Lower values of MAE, MSE, and RMSE imply higher accuracy of a regression model.
• However, a higher value of R-squared is considered desirable.
• Both RMSE and R-squared quantify how well a linear regression model fits a dataset.
• The RMSE tells how well a regression model can predict the value of the response variable in absolute terms.
• R-squared tells how well the predictor variables can explain the variation in the response variable.
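
As a sketch, all four metrics can be computed for the five-point example above with plain NumPy; scikit-learn's mean_absolute_error, mean_squared_error, and r2_score functions would give the same numbers.

import numpy as np

x = np.array([2, 2, 6, 8, 10], dtype=float)
y_true = np.array([0, 1, 2, 3, 3], dtype=float)
y_pred = -0.125 + 0.34375 * x     # the least squares line found earlier

mae = np.mean(np.abs(y_true - y_pred))
mse = np.mean((y_true - y_pred) ** 2)
rmse = np.sqrt(mse)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(f"MAE = {mae:.3f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}, R² = {r2:.3f}")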
Polynomial Regression

• You can regard polynomial regression as a generalized case of linear regression. You assume the polynomial
dependence between the output and inputs and, consequently, the polynomial estimated regression function.
• In other words, in addition to linear terms like 𝑏₁ 𝑥₁, your regression function 𝑓 can include nonlinear terms such as
𝑏₂𝑥₁², 𝑏₃𝑥₁³, or even 𝑏₄𝑥₁𝑥₂, 𝑏₅𝑥₁²𝑥₂.
• The simplest example of polynomial regression has a single independent variable, and the estimated regression
function is a polynomial of degree two: 𝑓(𝑥) = 𝑏₀ + 𝑏₁ 𝑥 + 𝑏₂ 𝑥².
• Now, remember that you want to calculate 𝑏₀, 𝑏₁, and 𝑏₂ to minimize SSR. These are your unknowns!
• Keeping this in mind, compare the previous regression function with the function 𝑓( 𝑥₁, 𝑥₂) = 𝑏₀ + 𝑏₁ 𝑥₁ + 𝑏₂ 𝑥₂, used for
linear regression. They look very similar and are both linear functions of the unknowns 𝑏₀, 𝑏₁, and 𝑏₂. This is why
you can solve the polynomial regression problem as a linear problem with the term 𝑥² regarded as an input
variable.
• In the case of two variables and the polynomial of degree two, the regression function has this form: 𝑓( 𝑥₁, 𝑥₂) = 𝑏₀
+ 𝑏₁𝑥₁ + 𝑏₂𝑥₂ + 𝑏₃𝑥₁² + 𝑏₄𝑥₁𝑥₂ + 𝑏₅𝑥₂².
• The procedure for solving the problem is identical to the previous case. You apply linear regression for five inputs:
𝑥₁, 𝑥₂, 𝑥₁², 𝑥₁𝑥₂, and 𝑥₂². As the result of regression, you get the values of six weights that minimize SSR: 𝑏₀, 𝑏₁, 𝑏₂,
𝑏₃, 𝑏₄, and 𝑏₅.
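
A minimal sketch of this procedure with scikit-learn (assumed available), using a single input, a degree-two polynomial, and illustrative data:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape(-1, 1)  # one input variable (illustrative)
y = np.array([15, 11, 2, 8, 25, 32])                  # illustrative responses

# Build the design matrix with columns x and x²; include_bias=False because
# LinearRegression estimates the intercept b0 itself.
x_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

model = LinearRegression().fit(x_poly, y)
print("b0:", model.intercept_)    # intercept
print("b1, b2:", model.coef_)     # weights for x and x²
print("R²:", model.score(x_poly, y))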
Polynomial Regression
Polynomial regression is needed when no single straight line fits the relationship between the variables. So instead of looking like a line, the fitted function looks like a nonlinear curve.
Linear vs Polynomial
Consider an example of weight loss.

• In the case of multiple linear regression, you are interested in how multiple different variables impact weight loss, like hours spent at the gym, sugar intake, and so on.

• In the case of polynomial regression, you are interested in how multiple different powers of one variable impact it (𝑥, 𝑥², 𝑥³, and so on, where 𝑥 is the sugar intake, for example).
Why Polynomial?
• Polynomial regression is useful in many cases.
• Since a relationship between the independent and dependent variables isn’t
required to be linear, you get more freedom in the choice of datasets and
situations you can be working with.
• So this method can be applied when simple linear regression underfits the data.
• Polynomial regression is a simple yet powerful tool for predictive analytics. It allows the user to consider non-linear relations between variables and reach conclusions that can be estimated with high accuracy.
• This type of regression can help the user predict disease spread rates, calculate fair compensation, or implement preventative road safety regulation software.
Underfitting and Overfitting

• Underfitting 
• occurs when a model can’t accurately capture the dependencies among data, usually as a
consequence of its own simplicity.
• It often yields a low 𝑅² with known data and bad generalization capabilities when applied to new data.
• Overfitting 
• A model learns the existing data too well.
• Complex models, which have many features or terms, are often prone to overfitting.
• When applied to known data, such models usually yield high 𝑅².
• However, they often don’t generalize well and have significantly lower 𝑅² when used with
new data.
Continue
• The left plot shows a linear regression line that has a low 𝑅².
• It might also be important that a straight line can't take into account the fact that the actual response increases as 𝑥 moves away from twenty-five and toward zero.
• This is likely an example of underfitting.

• The right plot illustrates polynomial regression with the degree equal to two.
• In this instance, this might be the optimal degree for modeling this data.
• The model has a value of 𝑅² that's satisfactory in many cases and shows trends nicely.
Continue
• The left plot presents polynomial regression with the degree equal to three.
• The value of 𝑅² is higher than in the preceding cases.
• This model behaves better with known data than the previous ones.
• However, it shows some signs of overfitting, especially for the input values close to sixty, where the line starts decreasing, although the actual data doesn't show that.

• In the right plot, you can see the perfect fit: six points and the polynomial line of degree five (or higher) yield 𝑅² = 1.
• Each actual response equals its corresponding prediction.
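
This behaviour can be reproduced with a short sketch (illustrative data): fit polynomials of increasing degree to six points and compare 𝑅² on the known data; the degree-five fit passes through every observation.

import numpy as np

x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([15, 11, 2, 8, 25, 32], dtype=float)   # illustrative responses

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

for degree in (1, 2, 5):
    coeffs = np.polyfit(x, y, deg=degree)   # least squares polynomial fit
    y_fit = np.polyval(coeffs, x)
    print(degree, round(r_squared(y, y_fit), 4))   # degree 5 interpolates, so R² = 1.0

On new, unseen inputs the degree-five model would typically score much lower, which is the overfitting described above.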
Exercise in class
1. The pricing schedule for labor on a service call by an elevator repair
company is $150 plus $50 per hour on site.
• Write down the linear equation that relates the labor cost y to the number of
hours x that the repairman is on site.
• Calculate the labor cost for a service call that lasts 2.5 hours.
2. The cost of a telephone call made through a leased line service is
2.5 cents per minute.
• Write down the linear equation that relates the cost y (in cents) of a call to its
length x.
• Calculate the cost of a call that lasts 23 minutes.