QAM Chapter 4
Regression Models
To accompany
Quantitative Analysis for Management, Eleventh Edition,
by Render, Stair, and Hanna
Power Point slides created by Brian Peterson
4.1 Introduction
4.2 Scatter Diagrams
4.3 Simple Linear Regression
4.4 Measuring the Fit of the Regression
Model
4.5 Using Computer Software for Regression
4.6 Assumptions of the Regression Model
Figure 4.1
Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall 4-10
Simple Linear Regression
Regression models are used to test if there is a
relationship between variables.
There is some random error that cannot be
predicted.
$$Y = \beta_0 + \beta_1 X + \epsilon$$
where
$Y$ = dependent variable (response)
$X$ = independent variable (predictor or explanatory)
$\beta_0$ = intercept (value of $Y$ when $X = 0$)
$\beta_1$ = slope of the regression line
$\epsilon$ = random error
$$\hat{Y} = b_0 + b_1 X$$
where
$\hat{Y}$ = predicted value of $Y$
$b_0$ = estimate of $\beta_0$, based on sample results
$b_1$ = estimate of $\beta_1$, based on sample results
Y = Sales
X = Area payroll
The line chosen in Figure 4.1 is the one that
minimizes the errors.
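$$\bar{X} = \frac{\sum X}{n} = \text{average (mean) of } X \text{ values}$$
$$\bar{Y} = \frac{\sum Y}{n} = \text{average (mean) of } Y \text{ values}$$
$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$$
$$b_0 = \bar{Y} - b_1 \bar{X}$$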
Triple A Construction
Regression calculations for Triple A Construction
Y      X     (X – X̄)²         (X – X̄)(Y – Ȳ)
6      3     (3 – 4)² = 1     (3 – 4)(6 – 7) = 1
8      4     (4 – 4)² = 0     (4 – 4)(8 – 7) = 0
9      6     (6 – 4)² = 4     (6 – 4)(9 – 7) = 4
5      4     (4 – 4)² = 0     (4 – 4)(5 – 7) = 0
4.5    2     (2 – 4)² = 4     (2 – 4)(4.5 – 7) = 5
9.5    5     (5 – 4)² = 1     (5 – 4)(9.5 – 7) = 2.5
ΣY = 42    ΣX = 24    Σ(X – X̄)² = 10    Σ(X – X̄)(Y – Ȳ) = 12.5
Ȳ = 42/6 = 7    X̄ = 24/6 = 4
Table 4.2
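$$\bar{X} = \frac{\sum X}{6} = \frac{24}{6} = 4$$
$$\bar{Y} = \frac{\sum Y}{6} = \frac{42}{6} = 7$$
$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{12.5}{10} = 1.25$$
$$b_0 = \bar{Y} - b_1 \bar{X} = 7 - (1.25)(4) = 2$$
Therefore $\hat{Y} = 2 + 1.25X$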
Triple A Construction
Regression calculations
sales = 2 + 1.25(payroll)
If the payroll next year is $600 million:
$$\hat{Y} = 2 + 1.25(6) = 9.5$$
or $950,000
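The least-squares calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's software: the function name `fit_simple_regression` and the variable names are assumptions made for the example, and the units in the comments are inferred from the prediction on the slide (6 corresponds to $600 million, 9.5 to $950,000).

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Triple A Construction data from Table 4.2
payroll = [3, 4, 6, 4, 2, 5]        # X (assumed units: $100 millions)
sales = [6, 8, 9, 5, 4.5, 9.5]      # Y (assumed units: $100,000s)

b0, b1 = fit_simple_regression(payroll, sales)
predicted_sales = b0 + b1 * 6       # payroll of $600 million

print(b0, b1, predicted_sales)      # 2.0 1.25 9.5
```

The same function works for any paired X and Y lists of equal length, provided the X values are not all identical (otherwise the slope denominator is zero).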
Measuring the Fit
of the Regression Model
Regression models can be developed
for any variables X and Y.
How do we know the model is actually
helpful in predicting Y based on X?
We could just take the average error, but
the positive and negative errors would
cancel each other out.
Three measures of variability are:
SST – Total variability about the mean.
SSE – Variability about the regression line.
SSR – Total variability that is explained by
the model.
Measuring the Fit
of the Regression Model
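Sum of the squares total:
$$SST = \sum (Y - \bar{Y})^2$$
Sum of the squared error:
$$SSE = \sum e^2 = \sum (Y - \hat{Y})^2$$
Sum of squares due to regression:
$$SSR = \sum (\hat{Y} - \bar{Y})^2$$
An important relationship:
$$SST = SSR + SSE$$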
Measuring the Fit
of the Regression Model
Sum of Squares for Triple A Construction
Y     X     (Y – Ȳ)²         Ŷ                     (Y – Ŷ)²    (Ŷ – Ȳ)²
6     3     (6 – 7)² = 1     2 + 1.25(3) = 5.75    0.0625      1.563
SST = 22.5
Sum of the squared error: SSE = 6.875
SSR = 15.625
$$SSE = \sum e^2 = \sum (Y - \hat{Y})^2$$
$$SSR = \sum (\hat{Y} - \bar{Y})^2$$
An important relationship:
$$SST = SSR + SSE$$
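As a check on the three sums of squares, here is a short Python sketch that recomputes them for the Triple A Construction data using the fitted line Ŷ = 2 + 1.25X. The variable names are illustrative assumptions; the data and coefficients come from the slides.

```python
# Triple A Construction data and fitted coefficients
y = [6, 8, 9, 5, 4.5, 9.5]
x = [3, 4, 6, 4, 2, 5]
b0, b1 = 2.0, 1.25

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]          # predicted values

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variability
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained (error)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by model

print(sst, sse, ssr)    # 22.5 6.875 15.625
```

The printed values satisfy SST = SSR + SSE (22.5 = 15.625 + 6.875), which holds whenever the line is the least-squares fit with an intercept.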
Measuring the Fit
of the Regression Model
Deviations from the Regression Line and from the Mean
Figure 4.2
$$r = \pm\sqrt{r^2}$$
$$r = \sqrt{0.6944} = 0.8333$$
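The step from sums of squares to r can be sketched as follows. This is a minimal example assuming the Triple A values SSR = 15.625 and SST = 22.5; the helper name `correlation_from_r_squared` is hypothetical, and it takes the sign of r from the sign of the slope, since the square root alone cannot tell a positive from a negative relationship.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """Return r, using the slope's sign to pick the sign of the square root."""
    r = math.sqrt(r_squared)
    return r if slope >= 0 else -r


ssr, sst = 15.625, 22.5
r_squared = ssr / sst                       # coefficient of determination
r = correlation_from_r_squared(r_squared, slope=1.25)

print(round(r_squared, 4), round(r, 4))     # 0.6944 0.8333
```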
Program 4.1A
Using Computer Software
for Regression
Data Input for Regression in Excel
Program 4.1B
Using Computer Software
for Regression
Excel Output for the Triple A Construction Example
Program 4.1C
Assumptions of the Regression Model
Figure 4.4A
Figure 4.4B
Figure 4.4C
$$s^2 = MSE = \frac{SSE}{n - k - 1}$$
where
$n$ = number of observations in the sample
$k$ = number of independent variables
$$MSR = \frac{SSR}{k}$$
where
$k$ = number of independent variables in the model
The F statistic is:
$$F = \frac{MSR}{MSE}$$
This describes an F distribution with:
degrees of freedom for the numerator = df1 = $k$
degrees of freedom for the denominator = df2 = $n - k - 1$
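The MSE, MSR, and F formulas above translate directly into code. The sketch below is illustrative only: the function name `f_statistic` and the dictionary keys are assumptions, and the inputs are the Triple A Construction values (SSR = 15.625, SSE = 6.875, n = 6 observations, k = 1 independent variable).

```python
def f_statistic(ssr, sse, n, k):
    """Compute MSR, MSE, F, and both degrees of freedom for a regression."""
    df1 = k                 # numerator degrees of freedom
    df2 = n - k - 1         # denominator degrees of freedom
    msr = ssr / df1
    mse = sse / df2
    return {"df1": df1, "df2": df2, "MSR": msr, "MSE": mse, "F": msr / mse}


result = f_statistic(ssr=15.625, sse=6.875, n=6, k=1)
print(result["MSE"], round(result["F"], 4))   # 1.71875 9.0909
```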
Testing the Model for Significance
If there is very little error, the MSE would
be small and the F-statistic would be large
indicating the model is useful.
If the F-statistic is large, the significance
level (p-value) will be low, indicating it is
unlikely this would have occurred by
chance.
So when the F-value is large, we can reject
the null hypothesis and accept that there is
a linear relationship between X and Y, and
that the values of the MSE and $r^2$ are
meaningful.
Steps in a Hypothesis Test
1. Specify null and alternative hypotheses:
$$H_0: \beta_1 = 0$$
$$H_1: \beta_1 \neq 0$$
df1 = k = 1
df2 = n – k – 1 = 6 – 1 – 1 = 4
The value of F associated with a 5% level of
significance and with degrees of freedom 1
and 4 is found in Appendix D.
$F_{0.05,1,4} = 7.71$
$F_{\text{calculated}} = 9.09$
Reject $H_0$ because 9.09 > 7.71
Triple A Construction
We can conclude there is a
statistically significant
relationship between X and
Y.
The $r^2$ value of 0.69 means
about 69% of the variability
in sales (Y) is explained by
local payroll (X).
Figure 4.5 (F distribution: 0.05 rejection region to the right of F = 7.71; calculated F = 9.09)
Analysis of Variance (ANOVA) Table
             DF          SS     MS                       F          SIGNIFICANCE
Regression   k           SSR    MSR = SSR/k              MSR/MSE    P(F > MSR/MSE)
Residual     n - k - 1   SSE    MSE = SSE/(n - k - 1)
Total        n - 1       SST
Table 4.4
ANOVA for Triple A Construction
Program 4.1C
(partial)
P(F > 9.0909) = 0.0394
Because this probability is less than 0.05, we reject
the null hypothesis of no linear relationship and
conclude there is a linear relationship between X
and Y.
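The significance value reported by Excel can be approximated without a statistics library. The sketch below is one possible approach, not how Excel computes it: it relies on the standard fact that an F statistic with 1 numerator degree of freedom is the square of a t statistic with df2 degrees of freedom, and it integrates the t density numerically with Simpson's rule. The function names and the `steps` parameter are assumptions for illustration, and the method only applies when there is a single independent variable.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_test_p_value_one_predictor(f_stat, df2, steps=2000):
    """Approximate P(F > f_stat) for an F(1, df2) distribution.

    Uses P(F > f) = P(|T| > sqrt(f)) and Simpson's rule on [0, sqrt(f)].
    `steps` must be even.
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_pdf(0.0, df2) + t_pdf(upper, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df2)
    central_mass = 2 * (h / 3) * total      # P(-upper < T < upper)
    return 1 - central_mass


p_value = f_test_p_value_one_predictor(9.0909, df2=4)
print(round(p_value, 4))                    # 0.0394
```

Because the result is below 0.05, it leads to the same decision as comparing 9.09 with the critical value 7.71.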
Multiple Regression Analysis
Multiple regression models are
extensions to the simple linear model
and allow the creation of models with
more than one independent variable.
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \epsilon$$
where
$Y$ = dependent variable (response variable)
$X_i$ = ith independent variable (predictor or explanatory variable)
$\beta_0$ = intercept (value of $Y$ when all $X_i = 0$)
$\beta_i$ = coefficient of the ith independent variable
$k$ = number of independent variables
$\epsilon$ = random error
Multiple Regression Analysis
To estimate these values, a sample is taken and the
following equation is developed:
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$
where
$\hat{Y}$ = predicted value of $Y$
$b_0$ = sample intercept (and is an estimate of $\beta_0$)
$b_i$ = sample coefficient of the ith variable (and is an estimate of $\beta_i$)
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$
where
$\hat{Y}$ = predicted value of dependent variable (selling price)
$b_0$ = Y intercept
$X_1$ and $X_2$ = values of the two independent variables (square footage and age), respectively
$b_1$ and $b_2$ = slopes for $X_1$ and $X_2$, respectively
She selects a sample of houses that have sold
recently and records the data shown in Table 4.5
Jenny Wilson Real Estate Data
SELLING PRICE ($)   SQUARE FOOTAGE   AGE   CONDITION
95,000              1,926            30    Good
119,000             2,069            40    Excellent
124,800             1,720            30    Excellent
135,000             1,396            15    Good
142,000             1,706            32    Mint
145,000             1,847            38    Mint
159,000             1,950            27    Mint
165,000             2,323            30    Excellent
182,000             2,285            26    Mint
183,000             3,752            35    Good
200,000             2,300            18    Good
211,000             2,525            17    Good
215,000             3,800            40    Excellent
219,000             1,740            12    Mint
Table 4.5
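To illustrate how the two-variable model could be estimated from the Table 4.5 data, here is a rough pure-Python sketch. It is not the Excel procedure used in the slides: it solves the least-squares normal equations for two predictors on mean-centered data, and the function name `fit_two_predictor_regression` and the list names are assumptions made for this example. Coefficients are left unrounded and no expected values are claimed here; compare them with the software output in Program 4.2B.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of Y-hat = b0 + b1*X1 + b2*X2 via centered sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12             # zero if X1 and X2 are collinear
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Jenny Wilson Realty data from Table 4.5
price = [95000, 119000, 124800, 135000, 142000, 145000, 159000,
         165000, 182000, 183000, 200000, 211000, 215000, 219000]
sqft = [1926, 2069, 1720, 1396, 1706, 1847, 1950,
        2323, 2285, 3752, 2300, 2525, 3800, 1740]
age = [30, 40, 30, 15, 32, 38, 27, 30, 26, 35, 18, 17, 40, 12]

b0, b1, b2 = fit_two_predictor_regression(sqft, age, price)
residuals = [p - (b0 + b1 * s + b2 * a) for p, s, a in zip(price, sqft, age)]
print(b0, b1, b2)
```

A positive b1 and a negative b2 would indicate that, in this sample, price rises with square footage and falls with age.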
Jenny Wilson Realty
Input Screen for the Jenny Wilson Realty Multiple
Regression Example
Program 4.2A
Jenny Wilson Realty
Output for the Jenny Wilson Realty Multiple
Regression Example
$$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
Figure 4.6A: Linear relationship
Figure 4.6B: Nonlinear relationship
$$X_2 = (\text{weight})^2$$
This gives us a model that can be solved with
linear regression software:
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$
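The idea of adding a squared term and then running ordinary linear regression can be sketched as below. The data here are hypothetical, generated exactly from Y = 50 - 8(weight) + 0.5(weight)² purely to show that the transformed model recovers a curved relationship; they are not the data behind Program 4.5. The helper `fit_two_predictor_regression` is an illustrative assumption, not a library function.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of Y-hat = b0 + b1*X1 + b2*X2 via centered sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Hypothetical data (NOT from the slides): an exact quadratic in weight
weight = [1, 2, 3, 4, 5]
y = [50 - 8 * w + 0.5 * w ** 2 for w in weight]

x1 = weight                          # X1 = weight
x2 = [w ** 2 for w in weight]        # X2 = (weight)^2, the transformed variable

b0, b1, b2 = fit_two_predictor_regression(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))   # 50.0 -8.0 0.5
```

The model is still linear in the coefficients b0, b1, and b2, which is why standard linear regression software can fit it even though the relationship between Y and weight is curved.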
Program 4.5
A better model with a
smaller significance level (p-value) for
the F-test and a larger
adjusted $r^2$ value
Cautions and Pitfalls