STA302 Week11 Full
1/49
Last Week
2/49
Week 11- Learning objectives & Outcomes
(Handwritten notes, partially recovered:)
m1: Y = β0 + β1 X1 + ε
m2: Y = β0 + β1 X1 + β2 X2 + ε
SSTO has the same form under m1 and m2 (it is fixed for a given Y), so comparing the two fits gives the extra sum of squares SSR(X2|X1).
3/49
Review on Extra Sum of Squares
4/49
Review on Extra Sum of Squares
(Handwritten notes, partially recovered:)
m1: Y = β0 + β1 X2 + β2 X3 + ε
m2: Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + ε
SSTO is the same under m1 and m2, so
SSR(X1, X4 | X2, X3) = SSR(X1, X2, X3, X4) − SSR(X2, X3) = SSE(X2, X3) − SSE(X1, X2, X3, X4)
Sequential SS (type I SS)
• SSR (Type I SS) decomposition
Order X1, X2, X3:

Source          SS               df
X1              SSR(X1)          1
X2 | X1         SSR(X2|X1)       1    (extra sum of squares given previous X's)
X3 | X1, X2     SSR(X3|X1,X2)    1
(X1, X2, X3)    SSR(X1,X2,X3)    3

SSR(X1, X2) = SSR(X1) + SSR(X2|X1) = SSR(X2) + SSR(X1|X2)   (same total)

Order X2, X1, X3:

Source          SS               df
X2              SSR(X2)          1
X1 | X2         SSR(X1|X2)       1
X3 | X1, X2     SSR(X3|X1,X2)    1
(X1, X2, X3)    SSR(X1,X2,X3)    3
Y = β0 + β1 X1 + . . . + βp−1 Xp−1 + ε

• SSR = Σi (Ŷi − Ȳ)² = SSR(X1, . . . , Xp−1)
• SSE = Σi (Yi − Ŷi)² = SSE(X1, . . . , Xp−1)
• Extra Sum of Squares
• Break down the SSR into contributions from the different X's sequentially:
• SSR(X1), SSR(X2|X1), SSR(X3|X1, X2), . . .
• SSR(X2|X1) = SSR(X1, X2) − SSR(X1) = SSE(X1) − SSE(X1, X2)
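The decomposition in the last bullet can be checked numerically. A minimal sketch in Python/numpy (simulated data; the course itself uses R, so this is only an illustrative cross-check):

```python
import numpy as np

# Simulated data; any data set works, the identity is algebraic.
rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=n)
X2 = 0.5 * X1 + rng.normal(size=n)   # correlated with X1 on purpose
Y = 1 + 2 * X1 - X2 + rng.normal(size=n)

def sse(*cols):
    """SSE from OLS of Y on an intercept plus the given predictor columns."""
    Z = np.column_stack([np.ones(n), *cols])
    beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ beta
    return resid @ resid

ssto = sse()                                  # intercept-only SSE equals SSTO
ssr_x1 = ssto - sse(X1)                       # SSR(X1)
ssr_x2_given_x1 = sse(X1) - sse(X1, X2)       # SSR(X2|X1) = SSE(X1) - SSE(X1,X2)
ssr_both = ssto - sse(X1, X2)                 # SSR(X1, X2)

# The sequential pieces add up to the joint SSR:
assert abs(ssr_x1 + ssr_x2_given_x1 - ssr_both) < 1e-8
```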
7/49
Extended ANOVA Table
Source of Variance   SS               df     MS
X1                   SSR(X1)          1      MSR(X1)
X2 | X1              SSR(X2|X1)       1      MSR(X2|X1)
X3 | X1, X2          SSR(X3|X1,X2)    1      MSR(X3|X1,X2)
Error                SSE(X1,X2,X3)    n−4    MSE(X1,X2,X3)
Total                SSTO             n−1

This is the default anova() output format in R (Type I SS).
8/49
F test in anova output (type I SS)
• Type I SS: variables are added in order; the sequential SSR's sum to the total SSR.
• The F-tests test each variable given the variables previously entered into the model.
9/49
Type III/II SS
Source           SS                df
X1 | X2, X3      SSR(X1|X2,X3)     1
X2 | X1, X3      SSR(X2|X1,X3)     1    (each Xk given all the rest of the X's)
X3 | X1, X2      SSR(X3|X1,X2)     1
(X1, X2, X3)     SSR(X1,X2,X3)     3

Note: Σk SSR(Xk | all others) ≠ SSR(X1, X2, X3) in general.
• Does not depend on variable order
• Type II SS are pretty much the same as Type III, except they ignore
interaction terms.
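The note that the marginal (Type III) pieces need not sum to SSR can be checked numerically; a small Python/numpy sketch with deliberately correlated predictors (simulated data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
X1 = rng.normal(size=n)
X2 = 0.8 * X1 + 0.6 * rng.normal(size=n)   # correlated predictors
Y = 1 + X1 + X2 + rng.normal(size=n)

def sse(*cols):
    """SSE from OLS of Y on an intercept plus the given predictor columns."""
    Z = np.column_stack([np.ones(n), *cols])
    b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    r = Y - Z @ b
    return r @ r

ssr_joint = sse() - sse(X1, X2)          # SSR(X1, X2)
ssr_1_given_2 = sse(X2) - sse(X1, X2)    # Type III SS for X1
ssr_2_given_1 = sse(X1) - sse(X1, X2)    # Type III SS for X2

# With correlated X's, the marginal (Type III) pieces do not add to SSR:
assert abs((ssr_1_given_2 + ssr_2_given_1) - ssr_joint) > 1e-6
```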
10/49
Type I vs Type III
• Estimates using Type III SS tell us how much of the residual variability in Y can be accounted for by X1 after having accounted for everything else, how much can be accounted for by X2 after having accounted for everything else, and so on.
11/49
7.2 Use of Extra Sums of Squares in Tests for
Regression Coefficients
12/49
Partial F test: Test whether several βk = 0
• Consider a regression model, which we call the Full model:

Y = β0 + β1 X1 + . . . + βq Xq + βq+1 Xq+1 + . . . + βq+p Xq+p + ε

• We want to test the null hypothesis that some of the βk are zero:

H0 : βq+1 = . . . = βq+p = 0

• Test statistic:

F* = [(SSER − SSEF)/(dfR − dfF)] / (SSEF/dfF)

• Reduced model (under H0):

Y = β0 + β1 X1 + . . . + βq Xq + ε
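A sketch of the partial F test in Python (numpy + scipy, simulated data with hypothetical names; shown only to make the formula concrete — the course examples use R):

```python
import numpy as np
from scipy import stats

# Simulated example: q = 1 real predictor, p = 2 extra predictors under test.
rng = np.random.default_rng(2)
n = 40
X = rng.normal(size=(n, 3))
Y = 1 + 0.8 * X[:, 0] + rng.normal(size=n)   # X2, X3 truly have zero coefficients

def fit_sse(cols):
    """Return (SSE, residual df) for OLS of Y on an intercept plus cols."""
    Z = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    r = Y - Z @ b
    return r @ r, n - Z.shape[1]

sse_F, df_F = fit_sse([X[:, 0], X[:, 1], X[:, 2]])   # full model
sse_R, df_R = fit_sse([X[:, 0]])                     # reduced: H0 drops X2, X3

F = ((sse_R - sse_F) / (df_R - df_F)) / (sse_F / df_F)
p_value = stats.f.sf(F, df_R - df_F, df_F)           # upper-tail probability
```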
14/49
Partial F test: Test whether several βk = 0 (contd.)
• The extra sum of squares is obtained as

SSR(Xq+1, . . . , Xq+p | X1, . . . , Xq) = SSE(X1, . . . , Xq) − SSE(X1, . . . , Xq, Xq+1, . . . , Xq+p)

• Alternatively, with dfF = n − (q + p + 1) and dfR = n − (q + 1), so that p = dfR − dfF:

F* = [(SSER − SSEF)/p] / [SSEF/(n − (q + p + 1))]
• Full model:

Yi = β0 + β1 X1 + β2 X2 + β3 X3 + ε

• Test:

H0 : β3 = 0    Ha : β3 ≠ 0

• Reduced model under H0:

Yi = β0 + β1 X1 + β2 X2 + ε
16/49
Body fat Example: Testing a single β3 = 0 (contd.)
17/49
Body fat Example: Testing a single β3 = 0 (contd.)
• Test statistic:

F* = MSR(X3|X1, X2)/MSEF = 11.54/6.15 = 1.876423

• Decision: F* = 1.876 ≤ 4.494 = F(1 − 0.05; 1, 16), so we fail to reject H0.
18/49
Body fat Example: Testing a single βk = 0 (contd.)
body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
n <- dim(body)[1]
fmod <- lm(Y~. ,data=body)   # full model: Y = b0 + b1*X1 + b2*X2 + b3*X3 + e
anova(fmod)                  # Type I SS

## Analysis of Variance Table
##
## Response: Y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## X1         1 352.27  352.27 57.2768 1.131e-06 ***
## X2         1  33.17   33.17  5.3931   0.03373 *
## X3         1  11.55   11.55  1.8773   0.18956
## Residuals 16  98.40    6.15
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

# Method I: compute the partial F statistic directly from the two SSEs
rmod <- lm(Y~X1+X2,data=body)        # reduced model
SSEf = deviance(fmod)
SSEr = deviance(rmod)
Ft <- ((SSEr-SSEf)/1)/(SSEf/(n-4))   # (SSEr-SSEf)/(dfR-dfF) over SSEf/dfF
Ft

## [1] 1.877289

pf(Ft,1,n-4,lower.tail=F)            # p-value

## [1] 0.1895628
19/49
Body fat Example: Testing a single β3 = 0 (contd.)
body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
n <- dim(body)[1]
fmod <- lm(Y~. ,data=body)      # full model
rmod <- lm(Y~X1+X2,data=body)   # reduced model
anova(rmod,fmod)
(Gives the same result: p-value = 0.1896.)
20/49
Body fat Example:Testing —1 = —3 = 0
• Full model:

Yi = β0 + β1 X1 + β2 X2 + β3 X3 + ε

• Reduced model (under H0):

Yi = β0 + β2 X2 + ε

• Test statistic:

F* = [(SSER − SSEF)/(dfR − dfF)] / (SSEF/dfF) = [SSR(X1, X3 | X2)/2] / MSEF

## [1] 1.22098
anova(rmod,fmod)
## Analysis of Variance Table
##
## Model 1: Y ~ X2
## Model 2: Y ~ X2 + X1 + X3
##   Res.Df     RSS Df Sum of Sq     F Pr(>F)
## 1     18 113.424
## 2     16  98.405  2    15.019 1.221  0.321

22/49
Comments
• F* is the general linear test statistic.
• F* = (t*)² when Xk is the last predictor in the full model, using Type I SS.
• F* = (t*)² for every k when using Type III SS (refer to slides 25-26).
• The latter formula using R² is not appropriate when the full and reduced models do not contain β0.
23/49
Show
For a given Y, SSTO is the same for the full model and the reduced model, and SSE = (1 − R²) · SSTO. Substituting SSER = (1 − R²R) · SSTO and SSEF = (1 − R²F) · SSTO into the test statistic:

F* = [(SSER − SSEF)/(dfR − dfF)] / (SSEF/dfF)
   = [(R²F − R²R)/(dfR − dfF)] / [(1 − R²F)/dfF]
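A numeric check that the two forms of F* agree, as a Python/numpy sketch on simulated data (both models here contain β0, as required):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
X = rng.normal(size=(n, 3))
Y = 1 + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
ssto = ((Y - Y.mean()) ** 2).sum()   # same SSTO for both models

def fit(cols):
    """Return (SSE, R^2, residual df) for OLS of Y on an intercept plus cols."""
    Z = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    r = Y - Z @ b
    sse = r @ r
    return sse, 1 - sse / ssto, n - Z.shape[1]

sse_F, R2_F, df_F = fit([X[:, 0], X[:, 1], X[:, 2]])   # full model
sse_R, R2_R, df_R = fit([X[:, 0], X[:, 1]])            # reduced model

F_sse = ((sse_R - sse_F) / (df_R - df_F)) / (sse_F / df_F)
F_r2 = ((R2_F - R2_R) / (df_R - df_F)) / ((1 - R2_F) / df_F)
assert abs(F_sse - F_r2) < 1e-8    # the two expressions coincide
```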
24/49
Body Fat example: F* = (t*)² using Type III SS
body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
fmod <- lm(Y~X1+X2+X3,data=body)
summary(fmod)
##
## Call:
## lm(formula = Y ~ X1 + X2 + X3, data = body)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7263 -1.6111 0.3923 1.4656 4.1277
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 117.085 99.782 1.173 0.258
## X1 4.334 3.016 1.437 0.170
## X2 -2.857 2.582 -1.106 0.285
## X3 -2.186 1.595 -1.370 0.190
##
## Residual standard error: 2.48 on 16 degrees of freedom
## Multiple R-squared: 0.8014, Adjusted R-squared: 0.7641
## F-statistic: 21.52 on 3 and 16 DF, p-value: 7.343e-06
(summary(fmod)$coef[,"t value"])^2   # t*^2: same as the F values in the next slide

## (Intercept)          X1          X2          X3
##    1.376868    2.065734    1.224212    1.877289
25/49
Body Fat example: F* = (t*)² using Type III SS
body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
fmod <- lm(Y~X1+X2+X3,data=body)
library(car)
Anova(fmod,type=3)
## Anova Table (Type III tests)
##
## Response: Y
## Sum Sq Df F value Pr(>F)
## (Intercept) 8.468 1 1.3769 0.2578
## X1 12.705 1 2.0657 0.1699
## X2 7.529 1 1.2242 0.2849
## X3 11.546 1 1.8773 0.1896
## Residuals 98.405 16
sqrt(Anova(fmod,type=3)[1:4,3])
26/49
7.3 Summary of Tests concerning Regression
coefficients
27/49
Summary
28/49
Summary (contd.)
• partial F test:
29/49
Summary (contd.)
Full: Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε    (dfF = n − 4)

• H0 : β1 = 2β2, Ha : β1 ≠ 2β2
• Reduced: Y = β0 + βc (2X1 + X2) + β3 X3 + ε    (dfR = n − 3)
• dfR − dfF = (n − 3) − (n − 4) = 1

• H0 : β1 = 3; β3 = 5, Ha : not both equalities in H0 hold
• Reduced: Y − 3X1 − 5X3 = β0 + β2 X2 + ε    (dfR = n − 2)
• The general F* test statistic ~ F(2, n−4), since dfR − dfF = (n − 2) − (n − 4) = 2
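The first hypothesis above (H0: β1 = 2β2) can be tested by fitting the reduced model with the combined column 2X1 + X2; a Python/numpy sketch on simulated data (illustrative only, generated so that H0 actually holds):

```python
import numpy as np

# Simulated data in which beta1 = 2*beta2 is true (beta1 = 2, beta2 = 1).
rng = np.random.default_rng(3)
n = 60
X1, X2, X3 = rng.normal(size=(3, n))
Y = 1 + 2.0 * X1 + 1.0 * X2 + 0.5 * X3 + rng.normal(size=n)

def fit_sse(cols):
    """Return (SSE, residual df) for OLS of Y on an intercept plus cols."""
    Z = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    r = Y - Z @ b
    return r @ r, n - Z.shape[1]

sse_F, df_F = fit_sse([X1, X2, X3])        # full model, df = n - 4
sse_R, df_R = fit_sse([2 * X1 + X2, X3])   # reduced under H0, df = n - 3

F = ((sse_R - sse_F) / (df_R - df_F)) / (sse_F / df_F)
assert df_R - df_F == 1    # one constraint imposed, so F ~ F(1, n-4) under H0
```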
30/49
7.4 Coefficient of Partial Determination
31/49
Coefficient of Partial Determination
• Coefficient of determination
R² = SSR/SSTO = 1 − SSE/SSTO
32/49
Coefficient of Partial Determination (contd.)
• Full model: Y = —0 + —1 X1 + —2 X2 + —3 X3 + ‘
• Reduced model: Y = —0 + —1 X2 + —2 X3 + ‘
R²_{Y1|23} = [SSE(X2, X3) − SSE(X1, X2, X3)] / SSE(X2, X3) = SSR(X1|X2, X3) / SSE(X2, X3)

(here SSER = SSE(X2, X3) and SSEF = SSE(X1, X2, X3))

• Measures the relative reduction in the variation of Y after introducing X1 into the model already containing X2 and X3.
• Takes values in [0, 1].
• R²_{Y1|23} equals the R² from regressing the residuals of the reduced model on the residuals of the fit E(X1) = β0 + β1 X2 + β2 X3.
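The last bullet (partial R² as a residual-on-residual R²) can be verified numerically; a Python/numpy sketch on simulated data (the course uses R, so this is only an illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
X1, X2, X3 = rng.normal(size=(3, n))
Y = 1 + X1 + 0.5 * X2 - X3 + rng.normal(size=n)

def resid(y, cols):
    """Residuals from OLS of y on an intercept plus the given columns."""
    Z = np.column_stack([np.ones(n)] + cols)
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ b

e_y = resid(Y, [X2, X3])     # residuals of the reduced model Y ~ X2 + X3
e_x1 = resid(X1, [X2, X3])   # residuals of X1 ~ X2 + X3

sse_R = e_y @ e_y                           # SSE(X2, X3)
e_full = resid(Y, [X1, X2, X3])
sse_F = e_full @ e_full                     # SSE(X1, X2, X3)
R2_partial = (sse_R - sse_F) / sse_R        # R^2_{Y1|23}

# R^2 from regressing e_y on e_x1 (both residual series are mean zero):
e_2nd = resid(e_y, [e_x1])
R2_resid = 1 - (e_2nd @ e_2nd) / sse_R
assert abs(R2_partial - R2_resid) < 1e-8    # the two quantities coincide
```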
33/49
Coefficient of Partial Determination (contd.)
• More generally
34/49
7.6 Multicollinearity and its effect
35/49
What is multicollinearity
36/49
Uncorrelated predictors
## [1] 0

(X1 and X2 are uncorrelated.)
anova(lm(Y~X1+X2))
37/49
Uncorrelated predictors
X1 = c(rep(4,4),rep(6,4)) # crew size
X2=c(2,2,3,3,2,2,3,3) # bonus pay
Y=c(42,39,48,51,49,53,61,60) # crew productivity
cor(X1,X2)
## [1] 0
anova(lm(Y~X1))
anova(lm(Y~X2))
X2 = 5 + 0.5 X1
E(Y) = β0 + β1 X1 + β2 (5 + 0.5X1) = (β0 + 5β2) + (β1 + 0.5β2) X1
39/49
Predictors are perfectly correlated
• The perfect relation between X1 and X2 did not inhibit our ability to obtain a good fit to the data.
• Since many different response functions provide the same good fit, we cannot interpret any one set of regression coefficients as reflecting the effects of the different predictor variables.
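The crew-size example makes this concrete; a Python/numpy sketch (the X1 values follow the slide, the two coefficient vectors are hypothetical illustrations):

```python
import numpy as np

# Crew-size example from the slides: X2 is an exact linear function of X1.
X1 = np.array([4., 4., 4., 4., 6., 6., 6., 6.])
X2 = 5 + 0.5 * X1
Z = np.column_stack([np.ones(8), X1, X2])

# The normal-equations matrix X'X is singular (rank 2 < 3 columns):
assert np.linalg.matrix_rank(Z.T @ Z) == 2

# Two different coefficient vectors giving identical fitted values:
b_a = np.array([0.0, 1.0, 0.0])    # E(Y) = X1
b_b = np.array([-5.0, 0.5, 1.0])   # E(Y) = -5 + 0.5*X1 + (5 + 0.5*X1) = X1
assert np.allclose(Z @ b_a, Z @ b_b)
```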
40/49
Collinearity and its effects: body fat data
41/49
Collinearity and its effects: body fat data
Estimated coefficients across nested models:

Model                    b1             b2
M1: Y ~ X1               0.8572 (***)                  ← strong evidence
M2: Y ~ X2                              −0.8565 (***)  ← strong evidence
M3: Y ~ X1 + X2          0.2224        0.6594 (*)      ← moderate evidence
M4: Y ~ X1 + X2 + X3     4.334         −2.857          ← neither significant
42/49
Collinearity and its effects: body fat data
43/49
Collinearity and its effects: body fat data
44/49
Collinearity and its effects: body fat data
45/49
Summary of Multicollinearity
• When X's are collinear, Xk = Σ_{j≠k} aj Xj
• Different sets of parameters give identical mean response functions
• The marginal contribution of each X depends on which of the other variables are already in the model
• The X T X matrix is not invertible.
46/49
Summary of Multicollinearity
• Effects of multicollinearity
• The b's are highly correlated: cor(bk, bj) ≈ 1
• The b's have high variance: s(bk) is large
• Individual estimates appear insignificant
47/49
VIF: Diagnostic for multicollinearity
VIFk = [r_XX⁻¹]_{k,k},  k = 1, . . . , p − 1

• Alternative definition

VIFk = 1/(1 − Rk²)

• Rk² is the R² for regressing Xk on the other predictors: E(Xk) = Σ_{j≠k} βj Xj
• VIFk measures how much larger S²(bk) is compared to a model with uncorrelated predictors.
Use of VIF
• If X's are linearly independent, then VIFk ≈ 1
• If X's are collinear, then VIFk ≫ 1
• Rule of thumb: if max_{k=1,...,p−1} (VIFk) > 10, then multicollinearity is a concern.
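Both definitions of VIFk can be checked against each other; a Python/numpy sketch on simulated, deliberately collinear data (illustrative; the course would use R, e.g. car::vif):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
X1 = rng.normal(size=n)
X2 = 0.9 * X1 + 0.3 * rng.normal(size=n)   # nearly collinear with X1
X3 = rng.normal(size=n)
X = np.column_stack([X1, X2, X3])

# Definition 1: diagonal of the inverse of the predictors' correlation matrix
r_xx = np.corrcoef(X, rowvar=False)
vif_inv = np.diag(np.linalg.inv(r_xx))

# Definition 2: VIF_k = 1 / (1 - Rk^2), Rk^2 from regressing Xk on the rest
def r2_on_rest(k):
    y = X[:, k]
    Z = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ b
    return 1 - (e @ e) / ((y - y.mean()) ** 2).sum()

vif_r2 = np.array([1 / (1 - r2_on_rest(k)) for k in range(3)])
assert np.allclose(vif_inv, vif_r2)         # the two definitions agree
assert vif_inv[0] > 5 and vif_inv[2] < 2    # X1 inflated, X3 not
```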
48/49
Practice problems after Week 11 lectures
49/49