SBE11 CH 16
John Loucks
St. Edward's University
© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 16
Regression Analysis: Model Building
General Linear Model
Determining When to Add or Delete Variables
Variable Selection Procedures
Multiple Regression Approach to Experimental Design
Autocorrelation and the Durbin-Watson Test
General Linear Model
$y = \beta_0 + \beta_1 x_1 + \epsilon$
Modeling Curvilinear Relationships
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon$
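A curvilinear model of this form is still linear in its parameters, so ordinary least squares applies once the squared term is treated as a second predictor column. The following is a minimal sketch under stated assumptions: the data are made up for illustration (not from the slides), and `solve` and `fit_ols` are hypothetical helper names that solve the normal equations directly.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients for design-matrix rows (intercept column included)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data generated exactly from y = 2 + 3*x1 - 0.5*x1**2 (no error term).
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [2 + 3 * x - 0.5 * x ** 2 for x in x1]

# Each row holds the constant, x1, and x1 squared.
design = [[1.0, x, x ** 2] for x in x1]
b0, b1, b2 = fit_ols(design, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the illustrative data contain no error term, the fitted coefficients should recover the generating values up to floating-point rounding.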
Interaction
Transformations Involving the Dependent Variable
Nonlinear Models That Are Intrinsically Linear
Determining When to Add or Delete Variables
Variable Selection Procedures
Variable Selection: Stepwise Regression
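Stepwise regression flowchart:
Start: compute the F statistic and p-value for each independent variable in the model.
Any p-value > alpha to remove? If yes, the independent variable with the largest p-value is removed from the model, and the procedure starts again.
If no, compute the F statistic and p-value for each independent variable not in the model.
Any p-value < alpha to enter? If yes, the independent variable with the smallest p-value is entered into the model, and the procedure starts again. If no, stop.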
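The decision logic of the stepwise flowchart can be sketched as a single function that returns the next action. This is only an illustration: the function name `stepwise_step`, its parameters, and the p-values in the usage lines are hypothetical, and in practice the p-values would be recomputed by refitting the model after every action.

```python
def stepwise_step(p_in_model, p_not_in_model, alpha_remove=0.05, alpha_enter=0.05):
    """Return the next action for one pass of the stepwise procedure.

    p_in_model:     {variable: p-value} for variables currently in the model
    p_not_in_model: {variable: p-value} each candidate would have if entered
    """
    # Removal check comes first: drop the variable with the largest p-value.
    if p_in_model:
        worst = max(p_in_model, key=p_in_model.get)
        if p_in_model[worst] > alpha_remove:
            return ("remove", worst)
    # Entry check: add the candidate with the smallest p-value.
    if p_not_in_model:
        best = min(p_not_in_model, key=p_not_in_model.get)
        if p_not_in_model[best] < alpha_enter:
            return ("enter", best)
    # Nothing to remove and nothing to enter.
    return ("stop", None)


# Hypothetical p-values, for illustration only.
print(stepwise_step({"x1": 0.01, "x2": 0.30}, {"x3": 0.02}))   # ('remove', 'x2')
print(stepwise_step({"x1": 0.01}, {"x3": 0.02, "x4": 0.40}))   # ('enter', 'x3')
print(stepwise_step({"x1": 0.01, "x3": 0.02}, {"x4": 0.40}))   # ('stop', None)
```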
Variable Selection: Forward Selection
Variable Selection: Backward Elimination
Partial Data
Row  Segment of City  Selling Price ($000)  House Size (00 sq. ft.)  Bedrooms  Bathrooms  Garage Size (cars)
2    Northwest        290                   21                       4         2          2
3    South            95                    11                       2         1          0
4    Northeast        170                   19                       3         2          2
5    Northwest        375                   38                       5         4          3
6    West             350                   24                       4         3          2
7    South            125                   10                       2         2          0
8    West             310                   31                       4         4          2
9    West             275                   25                       3         2          2
Note: Rows 10-26 are not shown.
Regression Output
             Coeffic.   Std. Err.   t Stat    P-value
Intercept    -59.416    54.6072     -1.0881   0.28951
House Size   6.50587    3.24687     2.0037    0.05883
Bedrooms     29.1013    26.2148     1.1101    0.28012
Bathrooms    26.4004    18.8077     1.4037    0.17574
Cars         -10.803    27.329      -0.3953   0.6968
Variable to be removed: Cars (greatest p-value, 0.6968 > .05)
Regression Output
             Coeffic.   Std. Err.   t Stat    P-value
Intercept    -47.342    44.3467     -1.0675   0.29785
House Size   6.02021    2.94446     2.0446    0.05363
Bedrooms     23.0353    20.8229     1.1062    0.28113
Bathrooms    27.0286    18.3601     1.4721    0.15581
Variable to be removed: Bedrooms (greatest p-value, 0.28113 > .05)
Regression Output
             Coeffic.   Std. Err.   t Stat    P-value
Intercept    -12.349    31.2392     -0.3953   0.69642
House Size   7.94652    2.38644     3.3299    0.00304
Bathrooms    30.3444    18.2056     1.6668    0.10974
Variable to be removed: Bathrooms (greatest p-value, 0.10974 > .05)
Regression Output
             Coeffic.   Std. Err.   t Stat    P-value
Intercept    -9.8669    32.3874     -0.3047   0.76337
House Size   11.3383    1.29384     8.7633    8.7E-09
Greatest p-value (House Size, 8.7E-09) is < .05, so no further variable is removed.
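The sequence of removals in this backward-elimination example can be replayed from the p-values reported in the four regression outputs. The sketch below does not refit any model; it only applies the removal rule to the numbers copied from the slides, and the variable names `outputs`, `removed`, and `final_model` are illustrative.

```python
ALPHA = 0.05

# P-values copied from the four regression outputs, in order. The intercept
# is omitted because only independent variables are candidates for removal.
outputs = [
    {"House Size": 0.05883, "Bedrooms": 0.28012, "Bathrooms": 0.17574, "Cars": 0.6968},
    {"House Size": 0.05363, "Bedrooms": 0.28113, "Bathrooms": 0.15581},
    {"House Size": 0.00304, "Bathrooms": 0.10974},
    {"House Size": 8.7e-09},
]

removed = []
final_model = None
for p_values in outputs:
    # Find the variable with the greatest p-value in this output.
    worst = max(p_values, key=p_values.get)
    if p_values[worst] <= ALPHA:
        # Every remaining variable is significant, so elimination stops.
        final_model = sorted(p_values)
        break
    removed.append(worst)

print(removed)      # ['Cars', 'Bedrooms', 'Bathrooms']
print(final_model)  # ['House Size']
```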
Variable Selection: Best-Subsets Regression
Variable-Selection Procedures
Minitab Output
The regression equation
Score = 74.678 - .0398(Drive) - 6.686(Fair) - 10.342(Green) + 9.858(Putt)
Predictor   Coef      Stdev    t-ratio   p
Constant    74.678    6.952    10.74     .000
Drive       -.0398    .01235   -3.22     .004
Fair        -6.686    1.939    -3.45     .003
Green       -10.342   3.561    -2.90     .009
Putt        9.858     3.180    3.10      .006
s = .2691   R-sq = 72.4%   R-sq(adj) = 66.8%
Analysis of Variance
SOURCE       DF   SS        MS       F       P
Regression   4    3.79469   .94867   13.10   .000
Error        20   1.44865   .07243
Total        24   5.24334
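The summary statistics in the Minitab output follow from the sums of squares and degrees of freedom in the analysis of variance table. As a quick check, the sketch below recomputes them using the standard definitions; the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the analysis of variance table.
ssr, sse, sst = 3.79469, 1.44865, 5.24334
df_reg, df_err, df_tot = 4, 20, 24

msr = ssr / df_reg                            # mean square due to regression
mse = sse / df_err                            # mean square error
f_stat = msr / mse                            # overall F statistic
r_sq = ssr / sst                              # coefficient of determination
r_sq_adj = 1 - (1 - r_sq) * df_tot / df_err   # adjusted for number of predictors
s = math.sqrt(mse)                            # standard error of the estimate

# These should agree with the reported F = 13.10, R-sq = 72.4%,
# R-sq(adj) = 66.8%, and s = .2691 up to rounding.
print(round(f_stat, 2), round(r_sq, 3), round(r_sq_adj, 3), round(s, 4))
```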
Multiple Regression Approach to Experimental Design
The use of dummy variables in a multiple regression
equation can provide another approach to solving
analysis of variance and experimental design
problems.
We will use the results of multiple regression to
perform the ANOVA test on the difference in the
means of three populations.
Example: Reed Manufacturing
Janet Reed would like to know if there is any
significant difference in the mean number of hours
worked per week for the department managers at
her three manufacturing plants (in Buffalo,
Pittsburgh, and Detroit).
A simple random sample of five managers from
each of the three plants was taken and the number
of hours worked by each manager for the previous
week is shown on the next slide.
We begin by defining two dummy variables, A and
B, that will indicate the plant from which each sample
observation was selected.
In general, if there are k populations, we need to
define k – 1 dummy variables.
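The k – 1 rule can be illustrated with a small encoder that treats the first category as the baseline, which gets all zeros. This is a sketch only: the function name `dummy_code` is hypothetical, and the choice of Buffalo as the baseline is an assumption made to match the coding used in this example.

```python
def dummy_code(category, categories):
    """Encode one category as k - 1 indicator values; the first category is the baseline."""
    return [1 if category == c else 0 for c in categories[1:]]


plants = ["Buffalo", "Pittsburgh", "Detroit"]   # baseline listed first (assumed)
for plant in plants:
    print(plant, dummy_code(plant, plants))
# Buffalo [0, 0]
# Pittsburgh [1, 0]
# Detroit [0, 1]
```

With three plants the encoder yields two values per observation, corresponding to the dummy variables A and B.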
Input Data (A and B are the dummy variables; y is hours worked)
Buffalo       Pittsburgh    Detroit
A  B  y       A  B  y       A  B  y
0  0  48      1  0  73      0  1  51
0  0  54      1  0  63      0  1  63
0  0  57      1  0  66      0  1  61
0  0  54      1  0  64      0  1  54
0  0  62      1  0  74      0  1  56
Estimate of E(y) by plant:
Buffalo      b0 = 55
Pittsburgh   b0 + b1 = 55 + 13 = 68
Detroit      b0 + b2 = 55 + 2 = 57
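One way to confirm that b0 = 55, b1 = 13, and b2 = 2 are the least squares estimates for the input data is to check the normal equations: the residuals must sum to zero against the constant and against each dummy variable. The sketch below performs that check; the plant labels in the comments assume that A = 1 marks Pittsburgh and B = 1 marks Detroit.

```python
# (A, B, y) rows from the input data.
data = [
    (0, 0, 48), (0, 0, 54), (0, 0, 57), (0, 0, 54), (0, 0, 62),   # Buffalo
    (1, 0, 73), (1, 0, 63), (1, 0, 66), (1, 0, 64), (1, 0, 74),   # Pittsburgh
    (0, 1, 51), (0, 1, 63), (0, 1, 61), (0, 1, 54), (0, 1, 56),   # Detroit
]
b0, b1, b2 = 55, 13, 2

# Residual for each observation under the fitted equation b0 + b1*A + b2*B.
residuals = [y - (b0 + b1 * a + b2 * b) for a, b, y in data]

# Least squares requires the residuals to be orthogonal to every column
# of the design matrix: the constant, A, and B.
checks = [
    sum(residuals),
    sum(a * e for (a, _, _), e in zip(data, residuals)),
    sum(b * e for (_, b, _), e in zip(data, residuals)),
]
print(checks)  # [0, 0, 0]
```

The same numbers are the sample means of the three plants (55, 68, and 57), which is why the coefficients on the dummy variables read as differences from the Buffalo mean.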
Next, we observe that if there is no difference in
the means:
E(y) for the Pittsburgh plant – E(y) for the Buffalo plant = 0
E(y) for the Detroit plant – E(y) for the Buffalo plant = 0
Because β0 equals E(y) for the Buffalo plant and
β0 + β1 equals E(y) for the Pittsburgh plant, the first
difference is equal to (β0 + β1) - β0 = β1.
Because β0 + β2 equals E(y) for the Detroit plant, the
second difference is equal to (β0 + β2) - β0 = β2.
We would conclude that there is no difference in the
three means if β1 = 0 and β2 = 0.
The null hypothesis for a test of the difference of
means is
H0: β1 = β2 = 0
ANOVA Table Produced by Excel
At a .05 level of significance, the critical value of
F with k – 1 = 3 – 1 = 2 numerator d.f. and n_T – k =
15 – 3 = 12 denominator d.f. is 3.89.
Because the observed value of F (9.55) is greater than
the critical value of 3.89, we reject the null hypothesis.
Alternatively, we reject the null hypothesis because
the p-value of .003 < α = .05.
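The observed F statistic can be reproduced from the sample of hours worked at the three plants. The sketch below computes the between-plant and within-plant sums of squares and compares the resulting F with the critical value of 3.89 quoted above, which is taken from the text rather than computed. The variable names are illustrative.

```python
hours = {
    "Buffalo":    [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit":    [51, 63, 61, 54, 56],
}

k = len(hours)                                   # number of populations
n_t = sum(len(v) for v in hours.values())        # total sample size
grand_mean = sum(sum(v) for v in hours.values()) / n_t

# Sum of squares due to treatments (between plants) and due to error (within plants).
sstr = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in hours.values())
sse = sum((y - sum(v) / len(v)) ** 2 for v in hours.values() for y in v)

f_stat = (sstr / (k - 1)) / (sse / (n_t - k))
F_CRITICAL = 3.89   # critical value quoted in the text for 2 and 12 d.f.

print(round(f_stat, 2), f_stat > F_CRITICAL)  # 9.55 True
```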
Autocorrelation and the Durbin-Watson Test