
STA302/1001 - Methods of Data Analysis I

(Week 11 lecture notes)

Wei (Becky) Lin

Nov. 21, 2016

Last Week

• Geometric perspective of least squares regression
• F test for regression coefficients
• Coefficient of multiple determination
• Inferences about regression parameters
• Interval estimation of βk and E(Yh)
• Extra sums of squares

Week 11- Learning objectives & Outcomes

• Review of extra sums of squares
• Type I and Type III SS
• Use of extra sums of squares in tests for regression coefficients
• Coefficient of partial determination
• Summary of tests concerning regression coefficients
• Multicollinearity and its effects
Review on Extra Sum of Squares

• Extra Sum of Squares measures the marginal decrease in SSE (equivalently, the marginal increase in SSR) when one or several predictor variables are added to the regression model, given that other variables are already in the model.
• Extra: when predictors are added, SSE goes down by some amount and SSR goes up by the same amount, since SSTO = SSR + SSE is fixed.
• Examples:
• SSR(X1 , X2 , X3 ) is the total variation explained by X1 , X2 , and X3 in a
model
• SSR(X1 |X2 ) is the additional variation explained by X1 added to a
model already containing X2 .
• SSR(X1 , X4 |X2 , X3 ) is the additional variation explained by X1 and X4
when they are added to a model already containing X2 and X3 .
• Variables listed after the bar ( | ) are those already in the model.
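These identities are easy to verify numerically. A small sketch (not from the lecture; it uses R's built-in mtcars data, with disp and wt standing in for generic predictors X1 and X2) checking that SSR(X2|X1) = SSR(X1, X2) − SSR(X1) = SSE(X1) − SSE(X1, X2):

```r
# Verify SSR(X2|X1) two ways on the built-in mtcars data.
fit1  <- lm(mpg ~ disp, data = mtcars)        # model with X1 only
fit12 <- lm(mpg ~ disp + wt, data = mtcars)   # model with X1 and X2

ssr <- function(m) sum((fitted(m) - mean(mtcars$mpg))^2)   # SSR = sum (Yhat - Ybar)^2
extra_via_ssr <- ssr(fit12) - ssr(fit1)
extra_via_sse <- deviance(fit1) - deviance(fit12)          # deviance() = SSE for lm
all.equal(extra_via_ssr, extra_via_sse)                    # TRUE: both give SSR(X2|X1)
```

The same quantity appears as the sequential (Type I) Sum Sq for wt in anova(fit12).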

Review on Extra Sum of Squares

Example (two models, same SSTO):

m1: Y = β0 + β2 X2 + β3 X3 + ε, giving SSR(X2 , X3 ) and SSE(X2 , X3 )

m2: Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + ε, giving SSR(X1 , X2 , X3 , X4 ) and SSE(X1 , X2 , X3 , X4 )

Since SSTO is the same under m1 and m2:

SSR(X1 , X4 |X2 , X3 ) = SSR(X1 , X2 , X3 , X4 ) − SSR(X2 , X3 )
                       = SSE(X2 , X3 ) − SSE(X1 , X2 , X3 , X4 )
Sequential SS (Type I SS)
• SSR (Type I SS) decomposition: each extra sum of squares is conditional on the X's entered before it.

      SSR                  df
      SSR(X1)               1
      SSR(X2 |X1)           1
      SSR(X3 |X1 , X2)      1
      SSR(X1 , X2 , X3)     3

• With the variables entered in another order:

      SSR                  df
      SSR(X2)               1
      SSR(X1 |X2)           1
      SSR(X3 |X1 , X2)      1
      SSR(X1 , X2 , X3)     3

• Either way, SSR(X1 , X2 ) = SSR(X1 ) + SSR(X2 |X1 ) = SSR(X2 ) + SSR(X1 |X2 ).
• Depends on variable order.
Extra Sum of Squares (Type I SS)

Y = β0 + β1 X1 + . . . + βp−1 Xp−1 + ε

• SSR = Σi (Ŷi − Ȳ)² = SSR(X1 , . . . , Xp−1 )
• SSE = Σi (Yi − Ŷi)² = SSE(X1 , . . . , Xp−1 )
• Extra sum of squares:
• Break down the SSR into contributions from different X's sequentially:
• SSR(X1 ), SSR(X2 |X1 ), SSR(X3 |X1 , X2 ), . . .
• SSR(X2 |X1 ) = SSR(X1 , X2 ) − SSR(X1 ) = SSE(X1 ) − SSE(X1 , X2 )

• SSR = SSR(X1 ) + SSR(X2 |X1 ) + . . . + SSR(Xp−1 |X1 , . . . , Xp−2 )

• Degrees of freedom of an extra SSR equal the number of extra variables
• e.g. df of SSR(X3 |X1 ) = 1, df of SSR(X2 , X3 |X1 ) = 2

• Contributions are not additive:

SSR(X1 , X2 , X3 ) ≠ SSR(X1 ) + SSR(X2 ) + SSR(X3 )
Extended ANOVA Table

Source of Variance    SS                  df     MS
X1                    SSR(X1)             1      MSR(X1)
X2 |X1                SSR(X2 |X1)         1      MSR(X2 |X1)
X3 |X1 , X2           SSR(X3 |X1 , X2)    1      MSR(X3 |X1 , X2)
Error                 SSE(X1 , X2 , X3)   n−4    MSE(X1 , X2 , X3)
Total                 SSTO                n−1

This is the default anova() output in R.
F test in anova output (type I SS)

• Type I SS: variables are added in order; the sequential SSR terms sum to SSR.
• The F-tests test each variable given the variables already entered before it.

Type III/II SS

• SSR (Type III SS) decomposition: refers to variable added last.


• These extra sums of squares do NOT add up to SSR.
• The F-tests test each variable given that all of the other variables are already in the model.

      SSR                  df
      SSR(X1 |X2 , X3)      1
      SSR(X2 |X1 , X3)      1
      SSR(X3 |X1 , X2)      1
      SSR(X1 , X2 , X3)     3

• Note that Σk SSR(Xk | all the rest) ≠ SSR.
• Does not depend on variable order
• Type II SS are pretty much the same as Type III, except they ignore
interaction terms.

Type I vs Type III

• Estimates using Type I SS tell us how much of the variation in Y can be explained by X1 , how much of the residual variability (SSE(X1 )) can be explained by X2 , how much of the remaining residual (SSE(X1 , X2 )) can be explained by X3 , and so on, in order.

• Estimates using Type III SS tell us how much of the residual variability in Y can be accounted for by X1 after having accounted for everything else, how much can be accounted for by X2 after having accounted for everything else, and so on.

7.2 Use of Extra Sums of Squares in Tests for
Regression Coefficients

Partial F test: Test whether several βk = 0
• Consider a regression model, which we call the Full model:

Y = β0 + β1 X1 + . . . + βq Xq + βq+1 Xq+1 + . . . + βq+p Xq+p + ε

• We want to test the null hypothesis that some of the βk are zero:

H0 : βq+1 = . . . = βq+p = 0

• The alternative hypothesis is

Ha : not all of βq+1 , . . . , βq+p equal zero

• The general linear test approach:

F* = [(SSER − SSEF )/(dfR − dfF )] / [SSEF /dfF ]

• Decision: reject H0 in favor of Ha at significance level α if

F* ≥ F(1−α; dfR − dfF , dfF )
Partial F test: Test whether several βk = 0 (contd.)

• Full model:

Y = β0 + β1 X1 + . . . + βq Xq + βq+1 Xq+1 + . . . + βq+p Xq+p + ε

• Reduced model (under H0 ):

Y = β0 + β1 X1 + . . . + βq Xq + ε

• From the Full model we get SSR(X1 , . . . , Xq+p ) and SSE(X1 , . . . , Xq+p ) = SSEF .

• From the Reduced model we get SSR(X1 , . . . , Xq ) and SSE(X1 , . . . , Xq ) = SSER .
Partial F test: Test whether several βk = 0 (contd.)
• The extra sum of squares is obtained as

SSR(Xq+1 , . . . , Xq+p |X1 , . . . , Xq ) = SSER − SSEF
   = SSE(X1 , . . . , Xq ) − SSE(X1 , . . . , Xq+p )

• Alternatively,

SSR(Xq+1 , . . . , Xq+p |X1 , . . . , Xq ) = SSRF − SSRR
   = SSR(X1 , . . . , Xq+p ) − SSR(X1 , . . . , Xq )

• Hence, with dfR = n − (q + 1) and dfF = n − (q + p + 1), so that dfR − dfF = p, the test statistic is

F* = [(SSER − SSEF )/p] / [SSE(X1 , . . . , Xq+p )/(n − p − q − 1)]
   = [SSR(Xq+1 , . . . , Xq+p |X1 , . . . , Xq )/p] / [SSE(X1 , . . . , Xq+p )/(n − p − q − 1)]

• Decision: reject H0 and conclude Ha at significance level α if F* ≥ F(1−α; p, n−p−q−1).
Body fat example: Testing a single β3 = 0

Full model: Yi = β0 + β1 X1 + β2 X2 + β3 X3 + ε

• Test:
H0 : β3 = 0    Ha : β3 ≠ 0
• Reduced model under H0 :

Reduced model: Yi = β0 + β1 X1 + β2 X2 + ε

• The test of whether or not β3 = 0 is a marginal test, given that X1 , X2 are already in the model.
Body fat example: Testing a single β3 = 0 (contd.)

• SSEF = SSE(X1 , X2 , X3 ), df = n − 4
• SSER = SSE(X1 , X2 ), df = n − 3
• The general linear test statistic:

F* = [(SSER − SSEF )/(dfR − dfF )] / [SSEF /dfF ]
   = {[SSE(X1 , X2 ) − SSE(X1 , X2 , X3 )]/[(n − 3) − (n − 4)]} / [SSE(X1 , X2 , X3 )/(n − 4)]
   = [SSR(X3 |X1 , X2 )/1] / [SSE(X1 , X2 , X3 )/(n − 4)]
   = MSR(X3 |X1 , X2 )/MSEF
Body fat example: Testing a single β3 = 0 (contd.)

• Test statistic:
F* = MSR(X3 |X1 , X2 )/MSEF = 11.546/6.150 = 1.8773
• Decision: F* = 1.877 ≤ 4.494 = F(0.95; 1, 16), so we fail to reject H0 .
Body fat example: Testing a single βk = 0 (contd.)

body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
n <- dim(body)[1]
fmod <- lm(Y~. ,data=body)   # full model: Y = b0 + b1*X1 + b2*X2 + b3*X3 + e

anova(fmod)   # Type I SS

## Analysis of Variance Table
##
## Response: Y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## X1         1 352.27  352.27 57.2768 1.131e-06 ***
## X2         1  33.17   33.17  5.3931   0.03373 *
## X3         1  11.55   11.55  1.8773   0.18956
## Residuals 16  98.40    6.15
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

rmod <- lm(Y~X1+X2,data=body)   # reduced model under H0: b3 = 0 vs Ha: b3 != 0

# Method I: general linear test computed by hand
SSEf = deviance(fmod)
SSEr = deviance(rmod)
Ft <- ((SSEr-SSEf)/1)/(SSEf/(n-4))   # F* = [(SSEr - SSEf)/(dfR - dfF)]/(SSEf/dfF)
Ft

## [1] 1.877289

pf(Ft,1,n-4,lower.tail=F)   # p-value

## [1] 0.1895628
Body fat example: Testing a single β3 = 0 (contd.)

body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
n <- dim(body)[1]
fmod <- lm(Y~. ,data=body)        # full model
rmod <- lm(Y~X1+X2,data=body)     # reduced model
anova(rmod,fmod)

## Analysis of Variance Table
##
## Model 1: Y ~ X1 + X2
## Model 2: Y ~ X1 + X2 + X3
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1     17 109.951
## 2     16  98.405  1    11.546 1.8773 0.1896

Method II: anova() on the nested models gives the same F* = 1.8773 and p-value 0.1896.
Body fat example: Testing β1 = β3 = 0

H0 : β1 = β3 = 0    Ha : not both β1 , β3 equal zero.

• Full model:
Yi = β0 + β1 X1 + β2 X2 + β3 X3 + ε
• Reduced model (under H0 ):

Yi = β0 + β2 X2 + ε

• Test statistic:

F* = {[SSE(X2 ) − SSE(X1 , X2 , X3 )]/[(n − 2) − (n − 4)]} / [SSE(X1 , X2 , X3 )/(n − 4)]
   = [SSR(X1 , X3 |X2 )/2] / [SSEF /(n − 4)]
   = MSR(X1 , X3 |X2 )/MSEF
Body fat example: Testing β1 = β3 = 0

body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
n <- dim(body)[1]; fmod <- lm(Y~X2+X1+X3,data=body)
anova(fmod)

## Analysis of Variance Table
##
## Response: Y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## X2         1 381.97  381.97 62.1052 6.735e-07 ***
## X1         1   3.47    3.47  0.5647    0.4633
## X3         1  11.55   11.55  1.8773    0.1896
## Residuals 16  98.40    6.15
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

# SSR(X1, X3 | X2) = SSR(X1|X2) + SSR(X3|X1,X2) = 3.47 + 11.55 = 15.02
sum(anova(fmod)[2:3,2])/2/anova(fmod)[4,3]   # F* = [SSR(X1,X3|X2)/2]/MSEF

## [1] 1.22098

rmod <- lm(Y~X2,data=body)
anova(rmod,fmod)   # same F* = [(SSEr - SSEf)/(dfR - dfF)]/(SSEf/dfF)

## Analysis of Variance Table
##
## Model 1: Y ~ X2
## Model 2: Y ~ X2 + X1 + X3
##   Res.Df     RSS Df Sum of Sq     F Pr(>F)
## 1     18 113.424
## 2     16  98.405  2    15.019 1.221  0.321
Comments

• Testing whether a single βk equals zero:
  • the t* test statistic
  • the F* general linear test statistic
  • F* = (t*)² when Xk is the last predictor in the full model using Type I SS (see slides 25-26)
  • F* = (t*)² for every k when using Type III SS (see slides 25-26)
• Testing whether several βk equal zero:
  • the F* general linear test statistic (partial F test)
• The general linear test statistic can be expressed in terms of the coefficients of multiple determination R²:

F* = [(SSER − SSEF )/(dfR − dfF )] / [SSEF /dfF ] = [(R²F − R²R )/(dfR − dfF )] / [(1 − R²F )/dfF ]

• The latter formula using R² is not appropriate when the full and reduced models do not contain β0 .
Show

F* = [(SSER − SSEF )/(dfR − dfF )] / [SSEF /dfF ] = [(R²F − R²R )/(dfR − dfF )] / [(1 − R²F )/dfF ]

For a given Y , SSTO is the same for the full model and the reduced model. Since R² = 1 − SSE/SSTO, we have SSER = SSTO·(1 − R²R ) and SSEF = SSTO·(1 − R²F ). Substituting,

F* = {[SSTO·(1 − R²R ) − SSTO·(1 − R²F )]/(dfR − dfF )} / {SSTO·(1 − R²F )/dfF }
   = [(R²F − R²R )/(dfR − dfF )] / [(1 − R²F )/dfF ]
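The identity can also be checked numerically. A sketch using R's built-in mtcars data (not the body-fat data from these slides), comparing the R² form, the SSE form, and anova()'s partial F:

```r
# Numerical check of the R^2 form of the partial F statistic.
full <- lm(mpg ~ wt + hp + disp, data = mtcars)
red  <- lm(mpg ~ wt, data = mtcars)

R2F <- summary(full)$r.squared;  R2R <- summary(red)$r.squared
dfF <- df.residual(full);        dfR <- df.residual(red)

F_r2  <- ((R2F - R2R)/(dfR - dfF)) / ((1 - R2F)/dfF)            # R^2 form
F_sse <- ((deviance(red) - deviance(full))/(dfR - dfF)) /
         (deviance(full)/dfF)                                    # SSE form
c(F_r2, F_sse, anova(red, full)$F[2])                            # all three agree
```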
Body Fat example: F* = (t*)² using Type III SS

body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
fmod <- lm(Y~X1+X2+X3,data=body)
summary(fmod)

##
## Call:
## lm(formula = Y ~ X1 + X2 + X3, data = body)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.7263 -1.6111  0.3923  1.4656  4.1277
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  117.085     99.782   1.173    0.258
## X1             4.334      3.016   1.437    0.170
## X2            -2.857      2.582  -1.106    0.285
## X3            -2.186      1.595  -1.370    0.190
##
## Residual standard error: 2.48 on 16 degrees of freedom
## Multiple R-squared: 0.8014, Adjusted R-squared: 0.7641
## F-statistic: 21.52 on 3 and 16 DF, p-value: 7.343e-06

(summary(fmod)$coef[,"t value"])^2   # (t*)^2: same as the Type III F values on the next slide

## (Intercept)          X1          X2          X3
##    1.376868    2.065734    1.224212    1.877289
Body Fat example: F* = (t*)² using Type III SS

body=read.table("/Users/Wei/TA/Teaching/0-STA302-2016F/Week10-Nov14/bodyfat.txt",header=T)
fmod <- lm(Y~X1+X2+X3,data=body)
library(car)
Anova(fmod,type=3)   # Type III SS

## Anova Table (Type III tests)
##
## Response: Y
##             Sum Sq Df F value Pr(>F)
## (Intercept)  8.468  1  1.3769 0.2578
## X1          12.705  1  2.0657 0.1699
## X2           7.529  1  1.2242 0.2849
## X3          11.546  1  1.8773 0.1896
## Residuals   98.405 16

sqrt(Anova(fmod,type=3)[1:4,3])   # square roots recover the |t*| values

## [1] 1.173400 1.437266 1.106441 1.370142
7.3 Summary of Tests concerning Regression
coefficients

Summary

• Test whether all βk = 0:
  • overall F test:

    F* = MSR/MSE ~ F(p−1, n−p)

• Test whether a single βk = 0:
  • partial F test:

    F* = MSR(Xk |X1 , . . . , Xk−1 , Xk+1 , . . . , Xp−1 )/MSE ~ F(1, n−p)

  • F* = (t*)², where t* = bk /s{bk }: true for the last predictor using Type I SS, and true for any k using Type III SS.
Summary (contd.)

• Test whether some βk = 0:

H0 : βq = βq+1 = . . . = βp−1 = 0,    Ha : not all βk in H0 equal zero.

• partial F test:

F* = [(SSER − SSEF )/(dfR − dfF )] / [SSEF /dfF ] = [(R²F − R²R )/(dfR − dfF )] / [(1 − R²F )/dfF ]
   = MSR(Xq , . . . , Xp−1 |X1 , . . . , Xq−1 )/MSEF ~ F(p−q, n−p)
Summary (contd.)

• Other tests using the general linear test. Full model:

Full: Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε,    dfF = n − 4

• H0 : β1 = 2β2 ,  Ha : β1 ≠ 2β2
  • Reduced: Y = β0 + βc (2X1 + X2 ) + β3 X3 + ε,  dfR = n − 3
  • The general F* test statistic ~ F(1, n−4), since dfR − dfF = (n − 3) − (n − 4) = 1.
• H0 : β1 = 3, β3 = 5,  Ha : not both equalities in H0 hold
  • Reduced: Y − 3X1 − 5X3 = β0 + β2 X2 + ε,  dfR = n − 2
  • The general F* test statistic ~ F(2, n−4), since dfR − dfF = (n − 2) − (n − 4) = 2.
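Both tests are easy to run in R. A sketch with mtcars standing in for the body-fat data (wt, hp, disp are stand-ins for X1, X2, X3, not the lecture's variables):

```r
full <- lm(mpg ~ wt + hp + disp, data = mtcars)            # dfF = n - 4

# H0: beta1 = 2*beta2 -> reduced model uses the combined predictor
# 2*X1 + X2; anova() gives the partial F with numerator df = 1.
red1 <- lm(mpg ~ I(2*wt + hp) + disp, data = mtcars)
anova(red1, full)

# H0: beta1 = 3, beta3 = 5 -> move the hypothesized terms to the left
# side; responses differ, so compute F* by hand (numerator df = 2).
red2  <- lm(I(mpg - 3*wt - 5*disp) ~ hp, data = mtcars)
Fstar <- ((deviance(red2) - deviance(full))/2) / (deviance(full)/df.residual(full))
pf(Fstar, 2, df.residual(full), lower.tail = FALSE)        # p-value
```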
7.4 Coefficient of Partial Determination

Coefficient of Partial Determination

• Coefficient of determination:

R² = SSR/SSTO = 1 − SSE/SSTO

• the percentage of the total variation in Y that is explained by the model.
• Partial determination: the amount of the remaining variation explained by a variable, given the other variables already in the model.
• Coefficient of partial determination:

R²Y1|23 = [SSE(X2 , X3 ) − SSE(X1 , X2 , X3 )]/SSE(X2 , X3 ) = SSR(X1 |X2 , X3 )/SSE(X2 , X3 )
Coefficient of Partial Determination (contd.)

• Full model: Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε
• Reduced model: Y = β0 + β1 X2 + β2 X3 + ε

R²Y1|23 = [SSE(X2 , X3 ) − SSE(X1 , X2 , X3 )]/SSE(X2 , X3 ) = SSR(X1 |X2 , X3 )/SSE(X2 , X3 )
        = (SSER − SSEF )/SSER

• Measures the relative reduction in the variation of Y when X1 is introduced into a model already containing X2 , X3 .
• Takes values in [0, 1].
• R²Y1|23 = the R² of regressing the residuals of the reduced model on the residuals of E(X1 ) = β0 + β1 X2 + β2 X3 .
Coefficient of Partial Determination (contd.)

• R²Y1|23 = the R² of regressing the residuals of the reduced model on the residuals of E(X1 ) = β0 + β1 X2 + β2 X3 :
  • Regress Y on X2 , X3 to get Ŷi (X2 , X3 ) and ei (Y |X2 , X3 )
  • Regress X1 on X2 , X3 to get X̂i (X2 , X3 ) and ei (X1 |X2 , X3 )
  • The R² between ei (Y |X2 , X3 ) and ei (X1 |X2 , X3 ) is the same as R²Y1|23 .
• Added-variable plot (partial regression plot): shows the strength of the relationship between Y and X1 adjusted for X2 , X3 :

ei (Y |X2 , X3 ) vs ei (X1 |X2 , X3 )

• More generally,

R²Yp,...,m|1,2,...,p−1 = SSR(Xp , . . . , Xm |X1 , . . . , Xp−1 )/SSE(X1 , . . . , Xp−1 ) = (SSER − SSEF )/SSER

• Coefficient of partial correlation:

rYk|1,...,p−1 = sign(bk ) · sqrt(R²Yk|1,...,p−1 )
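The residual-regression identity can be sketched in a few lines of R, using mtcars in place of the body-fat data (mpg, wt, hp, disp standing in for Y, X1, X2, X3):

```r
fitY  <- lm(mpg ~ hp + disp, data = mtcars)   # Y on X2, X3 (reduced model)
fitX1 <- lm(wt  ~ hp + disp, data = mtcars)   # X1 on X2, X3
full  <- lm(mpg ~ wt + hp + disp, data = mtcars)

# (1) directly: [SSE(X2,X3) - SSE(X1,X2,X3)] / SSE(X2,X3)
R2_partial <- (deviance(fitY) - deviance(full)) / deviance(fitY)

# (2) R^2 from regressing one set of residuals on the other
R2_resid <- summary(lm(resid(fitY) ~ resid(fitX1)))$r.squared
all.equal(R2_partial, R2_resid)               # TRUE: the two agree

# the added-variable plot is: plot(resid(fitX1), resid(fitY))
```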
7.6 Multicollinearity and its effect

What is multicollinearity

• Multicollinearity (also called collinearity or intercorrelation): the predictor variables are correlated among themselves.
• Uncorrelated predictor variables: the marginal reduction in SSE from adding a predictor is exactly the same whether or not the other predictors are already in the model.
• e.g. if X1 , X2 are uncorrelated, then
SSR(X1 |X2 ) = SSR(X1 ), SSR(X2 |X1 ) = SSR(X2 )

Uncorrelated predictors

X1 = c(rep(4,4),rep(6,4)) # crew size


X2=c(2,2,3,3,2,2,3,3) # bonus pay
Y=c(42,39,48,51,49,53,61,60) # crew productivity
cor(X1,X2)   # X1 and X2 are uncorrelated

## [1] 0

anova(lm(Y~X1+X2))

## Analysis of Variance Table
##
## Response: Y
##           Df  Sum Sq Mean Sq F value    Pr(>F)
## X1         1 231.125 231.125  65.567 0.0004657 ***
## X2         1 171.125 171.125  48.546 0.0009366 ***
## Residuals  5  17.625   3.525
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Here SSR(X2 |X1 ) = 171.125.

Uncorrelated predictors
X1 = c(rep(4,4),rep(6,4)) # crew size
X2=c(2,2,3,3,2,2,3,3) # bonus pay
Y=c(42,39,48,51,49,53,61,60) # crew productivity
cor(X1,X2)

## [1] 0

anova(lm(Y~X1))

## Analysis of Variance Table


##
## Response: Y
## Df Sum Sq Mean Sq F value Pr(>F)
## X1 1 231.12 231.125 7.347 0.03508 *
## Residuals 6 188.75 31.458
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

anova(lm(Y~X2))

## Analysis of Variance Table
##
## Response: Y
##           Df Sum Sq Mean Sq F value  Pr(>F)
## X2         1 171.12 171.125  4.1276 0.08846 .
## Residuals  6 248.75  41.458
## ---
## Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Because X1 and X2 are uncorrelated, SSR(X2 ) = SSR(X2 |X1 ) = 171.125, just as on the previous slide.
Predictors are perfect correlated

X2 = 5 + 0.5 X1
E(Y ) = β0 + β1 X1 + β2 (5 + 0.5 X1 ) = (β0 + 5β2 ) + (β1 + 0.5β2 ) X1
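A tiny simulated illustration (not from the slides): when X2 is an exact linear function of X1, lm() flags X2 as aliased and returns an NA coefficient, reflecting that many coefficient sets give the same fit.

```r
set.seed(1)
X1 <- 1:10
X2 <- 5 + 0.5 * X1               # perfectly correlated with X1
Y  <- 2 + 3 * X1 + rnorm(10)
fit <- lm(Y ~ X1 + X2)
coef(fit)                        # coefficient of X2 is NA (aliased)
```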

Predictors are perfect correlated

• When two predictor variables are perfectly correlated, many different response functions lead to the same fitted values for the observations.

• The perfect relation between X1 and X2 does not inhibit our ability to obtain a good fit to the data.

• Since many different response functions provide the same good fit, we cannot interpret any one set of regression coefficients as reflecting the effects of the different predictor variables.

Collinearity and its effects: body fat data

Collinearity and its effects: body fat data

• Collinearity effect on regression coefficients:

Variables in model    b1 (coef of X1)    b2 (coef of X2)
X1                    0.8572 (***)       -
X2                    -                  0.8565 (***)
X1 , X2               0.2224             0.6594 (*)
X1 , X2 , X3          4.334              -2.857

(Stars refer to the coefficient's p-value being < 0.05: *** strong evidence, * moderate evidence.)

M1: Y ~ X1    M2: Y ~ X2    M3: Y ~ X1 + X2    M4: Y ~ X1 + X2 + X3
Collinearity and its effects: body fat data

• Collinearity effect on s(bk )

Variables in model s(b1 ) s(b2 )


X1 0.1288 -
X2 - 0.1100
X1 , X2 0.3034 0.2912
X1 , X2 , X3 3.016 2.582

The high degree of multicollinearity among the predictor variables is responsible for the inflated variability of the estimated regression coefficients.

Collinearity and its effects: body fat data

• Collinearity effect on fitted values and predictions

Variables in model MSE


X1 7.95
X1 , X2 6.47
X1 , X2 , X3 6.15

• Estimated means and predicted values are not affected.
Collinearity and its effects: body fat data

• Collinearity effect on simultaneous tests of βk :
  • It is possible that when individual t tests are performed, neither β1 nor β2 is significant; however, when the F test for both is performed, the result may still be significant.
• Need for more powerful diagnostics for multicollinearity.
Summary of Multicollinearity

• When the X's are orthogonal, Xk'Xj = 0, so X^T X is a diagonal matrix:
  • parameter estimates are independent; s²(b) = MSE (X^T X)⁻¹ is also diagonal.
  • the marginal contribution of each X is additive:

SSR(X1 , . . . , Xp−1 ) = SSR(X1 ) + . . . + SSR(Xp−1 )

  • Type I and Type III SS are equivalent.
• When the X's are collinear, Xk = Σj≠k aj Xj :
  • different sets of parameters give an identical mean response function
  • the marginal contribution of each X depends on which of the other variables are already in the model
  • the X^T X matrix is not invertible.
Summary of Multicollinearity

• Effects of multicollinearity:
  • b's are highly correlated: cor(bk , bj ) ≈ 1
  • b's have high variance: s(bk ) is large
  • individual estimates appear insignificant
  • signs of b's may be contrary to intuition
• Problems with parameter interpretation:
  • bk is the rate of change in E(Y) per unit change in Xk , keeping the other X's fixed, but the other X's change with Xk .
• Inference for E(Y) and predictions remains valid.
• Type I and Type III SS are different, except for the last Type I SS.
VIF: Diagnostic for multicollinearity

• VIF: Variance Inflation Factors

VIFk = [r_XX⁻¹]kk ,   k = 1, . . . , p − 1

• the k-th diagonal element of r_XX⁻¹, the inverse of the correlation matrix of the X's
• Alternative definition:

VIFk = 1/(1 − R²k )

• R²k is the R² from regressing Xk on the other predictors: E(Xk ) = Σj≠k βj Xj
• VIFk measures how much larger s²(bk ) is compared with a model of uncorrelated X's.

Use of VIF
• If the X's are linearly independent, then VIFk ≈ 1
• If the X's are collinear, then VIFk >> 1
• Rule of thumb: if maxk=1,...,p−1 (VIFk ) > 10, multicollinearity is a problem.
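Both definitions are easy to compute by hand. A sketch on the mtcars predictors (stand-ins, not the body-fat variables), checking VIFk = 1/(1 − R²k) against the diagonal of the inverse correlation matrix:

```r
xnames <- c("wt", "hp", "disp")

# VIF via the auxiliary regressions of each Xk on the other predictors
vif_by_hand <- sapply(xnames, function(k) {
  r2 <- summary(lm(reformulate(setdiff(xnames, k), response = k),
                   data = mtcars))$r.squared
  1 / (1 - r2)
})

# VIF via the diagonal of the inverse correlation matrix of the X's
vif_by_inv <- diag(solve(cor(mtcars[, xnames])))

rbind(vif_by_hand, vif_by_inv)   # the two definitions agree
```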
Practice problems after Week 11 lectures

• Keep trying all the Ch. 7 problems that we covered today:
  • 7.2, 7.3, 7.7, 7.8, 7.12, 7.15, 7.20, 7.22, 7.23, 7.27, 7.31.
• Upcoming topics:
  • model selection
  • final review
