Week-4 BA Linear Regression
Fall 2021
Week-04
Making Numerical Predictions
Simple Linear Regression
Chap 13-4
Simple Linear Regression Model
Simple: One (independent) variable
Linear: Relationship between variables is described by a linear function
The change of one variable causes the other variable to change
A dependence of one variable on the other
Introduction to Linear Regression
(cont.)
How well a set of data points fits a straight line can be measured by calculating the vertical distance (error) between each data point and the line.
The total error between the data points and the line is obtained by squaring each distance and then summing the squared values.
Regression equation: the line that minimizes the sum of squared errors.
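A minimal sketch of the squared-error idea, using the salary data and the fitted line that appear later in this deck. The function name `sum_squared_errors` and the alternative line Y = 5.00 + 5.50 X are illustrative assumptions, not part of the slides; the alternative is there only to show that a different line gives a larger total.

```python
# Salary example data from these slides: X = experience, Y = salary.
xs = [2, 3, 5, 13, 8, 16, 11, 1, 9]
ys = [15, 28, 42, 64, 50, 90, 58, 8, 54]


def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances between the points and Y = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Fitted line from the slides versus an arbitrary (hypothetical) alternative.
sse_fitted = sum_squared_errors(9.18, 4.80, xs, ys)
sse_other = sum_squared_errors(5.0, 5.5, xs, ys)

print(f"SSE for Y = 9.18 + 4.80 X: {sse_fitted:.2f}")
print(f"SSE for Y = 5.00 + 5.50 X: {sse_other:.2f}")
```

Any line other than the least-squares line should produce a larger sum of squared errors on the same data.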
Regression Equation
Calculating a and b
n = 9 (number of data points)
X     Y     X^2    XY
2     15
3     28
5     42
13    64
8     50
16    90
11    58
1     8
9     54
Avg(X) =      Avg(Y) =      Sum(X^2) =      Sum(XY) =
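The slides do not show the formula itself at this point. The block below is the standard textbook computational form of the least-squares estimates, given here as an assumption about what the worksheet is meant to use; it relies only on the totals the worksheet asks for (n, Avg(X), Avg(Y), Sum(X2), Sum(XY)).

```latex
b = \frac{\sum XY - n\,\bar{X}\,\bar{Y}}{\sum X^2 - n\,\bar{X}^2},
\qquad
a = \bar{Y} - b\,\bar{X}
```

Here $\bar{X}$ and $\bar{Y}$ are Avg(X) and Avg(Y), b is the slope, and a is the intercept.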
Example Solution
Finding the regression equation
n = 9
X     Y     X^2    XY
2     15    4      30
3     28    9      84
5     42    25     210
13    64    169    832
8     50    64     400
16    90    256    1440
11    58    121    638
1     8     1      8
9     54    81     486
Avg(X) = 7.56     Sum(X^2) = 730
Avg(Y) = 45.44    Sum(XY) = 4128
b = 4.80
a = 45.44 – (4.80 x 7.56) = 9.18 (using unrounded values of b and the averages; the rounded figures shown give 9.15)
Y = 9.18 + 4.80 X
Sal = 9.18 + 4.80 (Exp)
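The arithmetic in this solution can be checked with a few lines of Python. This is a sketch that assumes the standard computational formulas b = (Sum(XY) - n Avg(X) Avg(Y)) / (Sum(X2) - n Avg(X)^2) and a = Avg(Y) - b Avg(X); the variable names are illustrative.

```python
# Data from the example table: X = experience, Y = salary.
xs = [2, 3, 5, 13, 8, 16, 11, 1, 9]
ys = [15, 28, 42, 64, 50, 90, 58, 8, 54]

n = len(xs)
avg_x = sum(xs) / n
avg_y = sum(ys) / n
sum_x2 = sum(x * x for x in xs)              # Sum(X2) column total
sum_xy = sum(x * y for x, y in zip(xs, ys))  # Sum(XY) column total

# Slope and intercept from the column totals (assumed textbook formulas).
b = (sum_xy - n * avg_x * avg_y) / (sum_x2 - n * avg_x ** 2)
a = avg_y - b * avg_x

print(f"Avg(X) = {avg_x:.2f}, Avg(Y) = {avg_y:.2f}")
print(f"Sum(X2) = {sum_x2}, Sum(XY) = {sum_xy}")
print(f"b = {b:.2f}, a = {a:.2f}")
```

Carrying full precision through the calculation and rounding only at the end is what reproduces a = 9.18.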
Interpretation of Results: Example
Interpreting the slope
Y = 9.18 + 4.80 X
Sal = 9.18 + 4.80 (Exp)
Making predictions
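A short sketch of both steps, using the fitted equation Sal = 9.18 + 4.80 (Exp). The slides do not state the units of Sal or Exp, so the code leaves them unlabeled; `predict_salary` is a hypothetical helper name. The slope says that each additional unit of Exp raises the predicted Sal by 4.80 units, and the prediction error (residual) is the observed value minus the predicted value.

```python
def predict_salary(exp, a=9.18, b=4.80):
    """Predicted Sal for a given Exp, using the fitted equation from the slides."""
    return a + b * exp


# Slope interpretation: one more unit of Exp changes the prediction by b.
step = predict_salary(6) - predict_salary(5)
print(f"Change in predicted Sal per unit of Exp: {step:.2f}")

# A few observed (Exp, Sal) pairs taken from the example table.
observed = [(5, 42), (13, 64), (9, 54)]
residuals = []
for exp, sal in observed:
    predicted = predict_salary(exp)
    residual = sal - predicted  # prediction error = observed - predicted
    residuals.append(residual)
    print(f"Exp = {exp}: predicted = {predicted:.2f}, residual = {residual:.2f}")
```

A positive residual means the line under-predicts that observation; a negative residual means it over-predicts.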
Measures of Variation:
The Sum of Squares
Total Sample Variability = Explained Variability + Unexplained Variability
$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
[Figure: plot of Y against X, marking $\bar{Y}$ and $X_i$]
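The decomposition of total variability into explained and unexplained parts can be checked numerically on the salary data. This sketch fits the line with the deviation form of the slope formula, which is algebraically equivalent to the computational form; all variable names are illustrative.

```python
# Salary example data: X = experience, Y = salary.
xs = [2, 3, 5, 13, 8, 16, 11, 1, 9]
ys = [15, 28, 42, 64, 50, 90, 58, 8, 54]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept (deviation form).
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x
fitted = [a + b * x for x in xs]

sst = sum((y - mean_y) ** 2 for y in ys)               # total variability
ssr = sum((f - mean_y) ** 2 for f in fitted)           # explained variability
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variability

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
```

The identity SST = SSR + SSE holds exactly only for the least-squares line; for any other line the cross term does not vanish.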
Measures of Variation
The Sum of Squares: Example
Excel Output for Salary example
[Figure: Excel regression output with the SSR, SSE, and SST values marked]
Recall:
The coefficient of determination $r^2$ is the proportion of variability in the response variable “explained” by the regression.
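In terms of the sums of squares, this definition is usually written as follows. The slide states it only in words, so the formula below is the standard form rather than a quotation from the deck.

```latex
r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST},
\qquad 0 \le r^2 \le 1
```

A value near 1 means the regression accounts for most of the variability in Y; a value near 0 means it accounts for very little.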
Important concepts
Calculating ‘a’ and ‘b’ using formula, & writing the regression equation.
Interpreting the slope (with + or – sign).
Using the equation, making a prediction & calculating prediction errors (residuals)
From regression output: R-sq, SSR, SSE, SST, F-statistic and its significance, Slope and intercept values.
Calculating & Interpreting the r-squared value
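The F-statistic in regression output is built from the same sums of squares. A hedged sketch for the one-variable case, assuming the usual definition F = MSR / MSE with MSR = SSR / 1 and MSE = SSE / (n - 2); `f_statistic` is a hypothetical helper name, and the data are the salary example from earlier in the deck.

```python
def f_statistic(ssr, sse, n):
    """F = MSR / MSE for simple regression (1 and n - 2 degrees of freedom)."""
    msr = ssr / 1
    mse = sse / (n - 2)
    return msr / mse


# Salary example data: X = experience, Y = salary.
xs = [2, 3, 5, 13, 8, 16, 11, 1, 9]
ys = [15, 28, 42, 64, 50, 90, 58, 8, 54]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares fit, then the two sums of squares that F needs.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x
ssr = sum((a + b * x - mean_y) ** 2 for x in xs)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

f_value = f_statistic(ssr, sse, n)
print(f"F = {f_value:.2f} with 1 and {n - 2} degrees of freedom")
```

The significance value shown next to F in the output is the probability of seeing an F this large if the slope were zero; computing it requires the F distribution, which this sketch leaves out.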
Scatterplot
Car   Odometer   Price
1     37388      14636
2     44758      14122
3     45833      14016
4     30862      15590
5     31705      15568
6     34010      14718
...   ...        ...
6.30
Excel Output of car price example
Data for 7 Stores:
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
QUESTIONS
1. What is the correlation coefficient? SSR, SST, SSE?
2. What is the regression equation?
3. Interpret the slope.
4. Interpret R-sq value.
5. Interpret the F-statistic.
6. What’s the prediction error for Store 3 and 6?
7. Consider a store with 2000 square feet area. Predict Annual Sales for this case…
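One way to check hand calculations for these questions is a short script. This is a sketch in plain Python that assumes the least-squares formulas in deviation form; the variable names are illustrative, and interpreting the numbers is still left to the reader.

```python
import math

# Store data from the table: square feet and annual sales ($000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(annual_sales) / n

sxx = sum((x - mean_x) ** 2 for x in square_feet)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(square_feet, annual_sales))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in square_feet]

sst = sum((y - mean_y) ** 2 for y in annual_sales)
ssr = sum((f - mean_y) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(annual_sales, fitted))
r_squared = ssr / sst
r = math.copysign(math.sqrt(r_squared), slope)  # correlation takes the slope's sign
f_value = ssr / (sse / (n - 2))

# Prediction errors (observed - predicted) for stores 3 and 6 (list index = store - 1).
errors = {store: annual_sales[store - 1] - fitted[store - 1] for store in (3, 6)}
prediction_2000 = intercept + slope * 2000

print(f"Sales = {intercept:.3f} + {slope:.3f} (Square Feet)")
print(f"r = {r:.4f}, R-sq = {r_squared:.4f}, F = {f_value:.2f}")
print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}")
print(f"Prediction errors: {errors}")
print(f"Predicted sales at 2000 sq ft: {prediction_2000:.1f} ($000)")
```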