Industrial Engineering (Simple Linear)
Learning Objectives
In this chapter, you learn:
How to use regression analysis to predict the value of a dependent variable based on an independent variable
The meaning of the regression coefficients b0 and b1
How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated
To make inferences about the slope and correlation coefficient
To estimate mean values and predict individual values
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.
Introduction to
Regression Analysis
Types of Relationships
[Figure: scatter plots of Y against X. Panels: Linear relationships; Curvilinear relationships]
Types of Relationships
(continued)
[Figure: scatter plots of Y against X. Panels: Strong relationships; Weak relationships]
Types of Relationships
(continued)
[Figure: scatter plots of Y against X. Panel: No relationship]
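$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent variable, $\beta_0$ is the population Y intercept, $\beta_1$ is the Population Slope Coefficient, $X_i$ is the Independent Variable, and $\varepsilon_i$ is the Random Error term
Linear component: $\beta_0 + \beta_1 X_i$
Random Error component: $\varepsilon_i$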
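(continued)
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
[Figure: the regression line plotted against X, with Intercept = $\beta_0$ and Slope = $\beta_1$. At a given $X_i$, the Observed Value of Y for $X_i$ differs from the Predicted Value of Y for $X_i$ by $\varepsilon_i$, the Random Error for this $X_i$ value]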
$\hat{Y}_i = b_0 + b_1 X_i$
where $\hat{Y}_i$ is the estimated (predicted) value of Y for observation i, $b_0$ is the Estimate of the regression intercept, $b_1$ is the Estimate of the regression slope, and $X_i$ is the Value of X for observation i
Interpretation of the
Slope and the Intercept
House Price (Y)   Square Feet (X)
245               1400
312               1600
279               1700
308               1875
199               1100
219               1550
405               2350
324               2450
319               1425
255               1700
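A minimal sketch of the least squares estimates for the house price data listed above, in plain Python. It assumes that each pair of numbers is (house price Y, square feet X); the function name `least_squares` is illustrative and not taken from the slides.

```python
# House price data: square feet is X, house price is Y (assumed pairing).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def least_squares(x, y):
    """Return (b0, b1), the intercept and slope that minimize the sum of squared errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)                        # sum of squares of X
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # cross products
    b1 = sxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(square_feet, house_price)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```

The two estimates should agree with the Intercept and Square Feet coefficients in the regression output (98.24833 and 0.10977).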
Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10

ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580
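Predictor     Coef      SE Coef   T      P
Constant      98.25     58.03     1.69   0.129
Square Feet   0.10977   0.03297   3.33   0.010

Analysis of Variance
Source           DF   SS      MS      F       P
Regression       1    18935   18935   11.08   0.010
Residual Error   8    13666   1708
Total            9    32600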
[Figure: the house price data plotted against square feet with the fitted regression line]
Slope = 0.10977
Intercept = 98.248
Do not try to extrapolate beyond the range of observed Xs
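The warning against extrapolation can be built into a prediction helper. The sketch below is illustrative only: the function name `predict_price` and the choice to raise an error are assumptions, the coefficients are those reported in the regression output, and the range 1100 to 2450 is the smallest and largest square footage in the data.

```python
# Coefficients as reported in the regression output.
B0, B1 = 98.24833, 0.10977
# Smallest and largest observed X (square feet) in the sample.
X_MIN, X_MAX = 1100, 2450


def predict_price(square_feet):
    """Predict house price from square feet, refusing to extrapolate."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(
            f"{square_feet} is outside the observed range [{X_MIN}, {X_MAX}]"
        )
    return B0 + B1 * square_feet


print(round(predict_price(2000), 2))  # 2000 is an illustrative X inside the range
try:
    predict_price(3000)               # outside the observed range
except ValueError as err:
    print("refused:", err)
```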
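Measures of Variation
SST = Total Sum of Squares: $SST = \sum (Y_i - \bar{Y})^2$
SSR = Regression Sum of Squares: $SSR = \sum (\hat{Y}_i - \bar{Y})^2$
SSE = Error Sum of Squares: $SSE = \sum (Y_i - \hat{Y}_i)^2$
where:
$\bar{Y}$ = mean value of the dependent variable
$Y_i$ = observed value of the dependent variable
$\hat{Y}_i$ = predicted value of Y for the given $X_i$ value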
Measures of Variation
(continued)
SST = total sum of squares (Total Variation): measures the variation of the $Y_i$ values around their mean $\bar{Y}$
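Measures of Variation
(continued)
[Figure: Y plotted against X with the fitted line and a horizontal line at $\bar{Y}$. At a given $X_i$, the vertical gap between the observed $Y_i$ and the fitted $\hat{Y}_i$ is the term that enters $SSE = \sum (Y_i - \hat{Y}_i)^2$]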
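As a check on the three sums of squares, the sketch below computes them for the house price data in plain Python and confirms that they add up. The variable names are illustrative, and the data pairing (house price as Y, square feet as X) is an assumption.

```python
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n

# Least squares slope and intercept, then the fitted values.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in square_feet]

sst = sum((y - y_bar) ** 2 for y in house_price)                         # total
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)                   # regression
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(house_price, predicted))  # error

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The values should match the SS column of the ANOVA table (18934.9348, 13665.5652 and 32600.5000).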
Coefficient of Determination, $r^2$
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
note: $0 \le r^2 \le 1$
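A short sketch of the arithmetic, using the SSR and SST values from the ANOVA table. The adjusted $r^2$ line uses the standard formula $1 - (1 - r^2)\frac{n-1}{n-k-1}$ with $k = 1$ predictor; that formula is not stated in these slides and is included here as an assumption, only to show where the Adjusted R Square entry comes from.

```python
import math

ssr = 18934.9348   # regression sum of squares (ANOVA table)
sst = 32600.5000   # total sum of squares (ANOVA table)
n = 10             # observations
k = 1              # one independent variable (simple linear regression)

r_squared = ssr / sst
multiple_r = math.sqrt(r_squared)                       # slope is positive here
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"r^2 = {r_squared:.5f}")
print(f"Multiple R = {multiple_r:.5f}")
print(f"Adjusted r^2 = {adjusted_r_squared:.5f}")
```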
Examples of Approximate $r^2$ Values
[Figure: two plots of Y against X, each labeled $r^2 = 1$]
$r^2 = 1$
Examples of Approximate $r^2$ Values
[Figure: plots of Y against X]
$0 < r^2 < 1$
Examples of Approximate $r^2$ Values
$r^2 = 0$
No linear relationship between X and Y
$r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082$
This is the R Square value in the Excel regression output: 58.08% of the variation in house prices is explained by variation in square feet.
$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
Where
SSE = error sum of squares
n = sample size
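The formula can be checked against the ANOVA table with a few lines of Python. This is a sketch; the function name `standard_error_of_estimate` is illustrative.

```python
import math


def standard_error_of_estimate(sse, n):
    """S_YX = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))


# SSE and n taken from the regression output for the house price data.
s_yx = standard_error_of_estimate(sse=13665.5652, n=10)
print(f"S_YX = {s_yx:.5f}")
```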
$S_{YX} = 41.33032$
This is the Standard Error value in the Excel regression output.
[Figure: two plots of observations around a regression line. Panels: small $S_{YX}$; large $S_{YX}$]
Assumptions of Regression
L.I.N.E
Linearity
The relationship between X and Y is linear
Independence of Errors
Error values are statistically independent
Normality of Error
Error values are normally distributed for any given value of X
Equal Variance (also called homoscedasticity)
The probability distribution of the errors has constant variance
Residual Analysis
The residual for observation i is the difference between its observed and predicted values:
$e_i = Y_i - \hat{Y}_i$
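A minimal sketch that computes the predicted values and residuals for the house price data. The variable names are illustrative, and the pairing of house price as Y with square feet as X is an assumption.

```python
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * x for x in square_feet]
residuals = [y - y_hat for y, y_hat in zip(house_price, predicted)]  # e_i = Y_i - Yhat_i

for i, (y_hat, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{i:2d}  {y_hat:10.5f}  {e:10.5f}")
```

Plotting `residuals` against `square_feet` gives the kind of residual plot used to examine the assumptions.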
[Figure: residuals plotted against X. Panels: Not Linear; Linear]
[Figure: residuals plotted against X. Panel title recovered: Independent]
[Figure: plot of the residuals for checking normality; axis label: Residual]
[Figure: residuals plotted against x. Panels: Non-constant variance; Constant variance]
Observation   Predicted Y   Residuals
1             251.92316     -6.923162
2             273.87671     38.12329
3             284.85348     -5.853484
4             304.06284     3.937162
5             218.99284     -19.99284
6             268.38832     -49.38832
7             356.20251     48.79749
8             367.17929     -43.17929
9             254.6674      64.33264
10            284.85348     -29.85348
Measuring Autocorrelation:
The Durbin-Watson Statistic
Autocorrelation
$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
Decision regions for D on the scale from 0 to 2 (H0: no positive autocorrelation): reject H0 if $D < d_L$; Inconclusive if $d_L \le D \le d_U$; Do not reject H0 if $D > d_U$
(continued)
Is there autocorrelation?
Excel/PHStat output:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals   3296.18
Sum of Squared Residuals                 3279.98
Durbin-Watson Statistic                  1.00494

$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \frac{3296.18}{3279.98} = 1.00494$
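The statistic is simple to compute from a list of residuals in time order. The sketch below is illustrative: the residuals behind the Excel/PHStat example are not listed in the slides, so the function is exercised on a small hypothetical sequence and then on the two sums reported in the output, with the critical values dL = 1.29 and dU = 1.45 quoted in this section. Both function names are assumptions.

```python
def durbin_watson(residuals):
    """D = sum over i=2..n of (e_i - e_{i-1})^2, divided by sum over i=1..n of e_i^2."""
    numerator = sum((curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def positive_autocorrelation_decision(d, d_lower, d_upper):
    """Decision rule for the test of positive autocorrelation."""
    if d < d_lower:
        return "reject H0"
    if d <= d_upper:
        return "inconclusive"
    return "do not reject H0"


# Hypothetical residuals that alternate in sign (not from the slides).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# The two sums reported in the Excel/PHStat output.
d = 3296.18 / 3279.98
print(round(d, 5), positive_autocorrelation_decision(d, 1.29, 1.45))
```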
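(continued)
With $d_L = 1.29$ and $d_U = 1.45$: reject H0 if $D < 1.29$; Inconclusive if $1.29 \le D \le 1.45$; Do not reject H0 if $D > 1.45$
Since $D = 1.00494 < d_L = 1.29$, reject H0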
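$S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}}$
where:
$S_{b_1}$ = estimate of the standard error of the slope
$S_{YX} = \sqrt{\frac{SSE}{n-2}}$ = standard error of the estimate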
Test statistic
$t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}$
d.f. = n - 2
where:
$b_1$ = regression slope coefficient
$\beta_1$ = hypothesized slope
$S_{b_1}$ = standard error of the slope
From the regression output (Square Feet row): $b_1 = 0.10977$ and $S_{b_1} = 0.03297$
$t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938$
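H0: $\beta_1 = 0$
H1: $\beta_1 \ne 0$
d.f. = 10 - 2 = 8
$\alpha/2 = .025$ in each tail; critical values $\pm t_{\alpha/2} = \pm 2.3060$
Reject H0 if $t_{STAT} < -2.3060$ or $t_{STAT} > 2.3060$; otherwise do not reject H0
$t_{STAT} = 3.329 > 2.3060$
Decision: Reject H0
There is sufficient evidence that square footage affects house price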
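A sketch of the same test in Python. It takes $b_1$ and $S_{YX}$ from the regression output and the critical value 2.3060 from the text, and computes SSX from the square feet column; the variable names are illustrative.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

b1 = 0.10977          # slope from the regression output
s_yx = 41.33032       # standard error of the estimate from the output
t_critical = 2.3060   # two-tail critical value for alpha = .05 and 8 d.f. (from the text)

n = len(square_feet)
x_bar = sum(square_feet) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)

s_b1 = s_yx / math.sqrt(ssx)   # standard error of the slope
t_stat = (b1 - 0) / s_b1       # hypothesized slope is 0 under H0
reject_h0 = abs(t_stat) > t_critical

print(f"S_b1 = {s_b1:.5f}, t_STAT = {t_stat:.3f}, reject H0: {reject_h0}")
```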
H0: $\beta_1 = 0$
H1: $\beta_1 \ne 0$
From Excel output:
              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet   0.10977        0.03297          3.32938   0.01039
p-value = 0.01039 (Square Feet row; P = 0.010 in the Predictor table)
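F Test statistic:
$F_{STAT} = \frac{MSR}{MSE}$
where
$MSR = \frac{SSR}{k}$
$MSE = \frac{SSE}{n - k - 1}$
k = number of independent variables (k = 1 in simple linear regression)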
$F_{STAT} = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848$
The p-value for the F-Test is the Significance F value in the Excel regression output, 0.01039 (P = 0.010 in the Analysis of Variance table).
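H0: $\beta_1 = 0$
H1: $\beta_1 \ne 0$
$\alpha = .05$
df1 = 1, df2 = 8
Critical Value: $F_{.05} = 5.32$
Test Statistic: $F_{STAT} = \frac{MSR}{MSE} = 11.08$
Reject H0 if $F_{STAT} > 5.32$; otherwise do not reject H0
Decision: Reject H0 at $\alpha = 0.05$
Conclusion: There is sufficient evidence that square footage affects house price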
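The F statistic can be reproduced from the sums of squares in the ANOVA table. This is a sketch with an illustrative function name; the critical value 5.32 is the one given in the text, and the last line checks the relationship $F = t^2$, which holds in simple linear regression.

```python
def f_statistic(ssr, sse, n, k=1):
    """Return (MSR, MSE, F_STAT) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse


msr, mse, f_stat = f_statistic(ssr=18934.9348, sse=13665.5652, n=10)
f_critical = 5.32   # F(.05) with 1 and 8 degrees of freedom, from the text

print(f"MSR = {msr:.4f}, MSE = {mse:.4f}, F_STAT = {f_stat:.4f}")
print("Reject H0" if f_stat > f_critical else "Do not reject H0")
print(f"t_STAT squared = {3.32938 ** 2:.4f}")   # compare with F_STAT
```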
Confidence interval estimate of the slope:
$b_1 \pm t_{\alpha/2} S_{b_1}$
d.f. = n - 2
              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580
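A short sketch of the interval arithmetic, using the slope and its standard error from the output and the critical value 2.3060 (8 degrees of freedom) quoted in the text. The function name is illustrative.

```python
def slope_confidence_interval(b1, s_b1, t_critical):
    """Return (lower, upper) for b1 +/- t_critical * S_b1."""
    margin = t_critical * s_b1
    return b1 - margin, b1 + margin


lower, upper = slope_confidence_interval(b1=0.10977, s_b1=0.03297, t_critical=2.3060)
print(f"95% confidence interval for the slope: ({lower:.5f}, {upper:.5f})")
```

The limits should agree with the Lower 95% and Upper 95% entries for Square Feet. Because the interval does not contain 0, it is consistent with rejecting H0: $\beta_1 = 0$ at the .05 level.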
Hypotheses
H0: $\rho = 0$ (no correlation between X and Y)
H1: $\rho \ne 0$ (correlation exists)
Test statistic
$t_{STAT} = \frac{r - \rho}{\sqrt{\frac{1 - r^2}{n - 2}}}$ (with n - 2 degrees of freedom)
where
$r = +\sqrt{r^2}$ if $b_1 > 0$
$r = -\sqrt{r^2}$ if $b_1 < 0$
H0: $\rho = 0$ (No correlation)
H1: $\rho \ne 0$ (correlation exists)
$\alpha = .05$, df = 10 - 2 = 8
$t_{STAT} = \frac{r - \rho}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\frac{1 - .762^2}{10 - 2}}} = 3.329$
d.f. = 10 - 2 = 8
$\alpha/2 = .025$ in each tail; critical values $\pm t_{\alpha/2} = \pm 2.3060$
Reject H0 if $t_{STAT} < -2.3060$ or $t_{STAT} > 2.3060$; otherwise do not reject H0
$t_{STAT} = 3.329 > 2.3060$
Decision: Reject H0
Conclusion: There is evidence of a linear association at the 5% level of significance
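A minimal sketch of the correlation test. It takes $r^2$ and the sign of $b_1$ from the regression output; the function name `correlation_t_stat` is illustrative, and 2.3060 is the critical value given in the text.

```python
import math


def correlation_t_stat(r_squared, b1, n, rho=0.0):
    """Return (r, t_STAT); r takes the sign of the slope b1."""
    r = math.copysign(math.sqrt(r_squared), b1)
    t_stat = (r - rho) / math.sqrt((1 - r_squared) / (n - 2))
    return r, t_stat


r, t_stat = correlation_t_stat(r_squared=0.58082, b1=0.10977, n=10)
print(f"r = {r:.5f}, t_STAT = {t_stat:.3f}")
print("Reject H0" if abs(t_stat) > 2.3060 else "Do not reject H0")
```

The statistic equals the t statistic for the slope, so the two tests lead to the same decision here.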
[Figure: the fitted line $\hat{Y} = b_0 + b_1 X_i$ plotted against X, with the Prediction Interval for an individual Y, given $X_i$, drawn around it]
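Confidence interval estimate for the mean value of Y, given $X_i$:
$\hat{Y} \pm t_{\alpha/2} S_{YX} \sqrt{h_i}$
Size of interval varies according to distance away from mean, $\bar{X}$
$h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{SSX} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}$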
Prediction interval estimate for an individual Y, given $X_i$:
$\hat{Y} \pm t_{\alpha/2} S_{YX} \sqrt{1 + h_i}$
$\hat{Y} \pm t_{0.025} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 37.12$
$\hat{Y} \pm t_{0.025} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 102.28$
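The two half-widths can be reproduced with a short Python sketch. One caveat: the extracted slides do not state which $X_i$ the example uses. The value 2000 square feet below is an assumption, chosen because it reproduces the half-widths 37.12 and 102.28 quoted above. $S_{YX}$ comes from the regression output and 2.3060 is the critical value for 8 degrees of freedom given in the text.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

s_yx = 41.33032      # standard error of the estimate (regression output)
t_critical = 2.3060  # t(0.025) with n - 2 = 8 degrees of freedom (from the text)
x_i = 2000           # ASSUMPTION: X value not stated in the extracted slides

n = len(square_feet)
x_bar = sum(square_feet) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)

h_i = 1 / n + (x_i - x_bar) ** 2 / ssx
ci_half_width = t_critical * s_yx * math.sqrt(h_i)      # mean of Y given X_i
pi_half_width = t_critical * s_yx * math.sqrt(1 + h_i)  # individual Y given X_i

print(f"h_i = {h_i:.5f}")
print(f"confidence interval half-width = {ci_half_width:.2f}")
print(f"prediction interval half-width = {pi_half_width:.2f}")
```

The prediction interval is always the wider of the two because of the extra 1 under the square root, and both widen as $X_i$ moves away from $\bar{X}$.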
Check the "confidence and prediction interval for X=" box and enter the X-value and confidence level desired
(continued)
[Figure: output showing the Input values, $\hat{Y}$, the Confidence Interval Estimate for $\mu_{Y|X=X_i}$, and the Prediction Interval Estimate for $Y_{X=X_i}$]
Chapter Summary