Multivariate Regression
Introduction
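• The population regression model of a dependent variable, Y, on a set of k independent variables, X1, X2, ..., Xk, is given by:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$$
[Figure: a simple regression line, $\hat{y} = b_0 + b_1 x$, shown next to a regression plane over the $x_1$ and $x_2$ axes, $y = b_0 + b_1 x_1 + b_2 x_2$.]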
Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE
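The decomposition above can be checked numerically. The following is a minimal sketch, assuming a least-squares fit with an intercept and two predictors; the data values and the helper names (`solve_linear_system`, `fit_least_squares`, `sums_of_squares`) are made up for illustration and do not come from the slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def sums_of_squares(rows, y, b):
    """Return (SST, SSR, SSE) for fitted coefficients b."""
    y_bar = sum(y) / len(y)
    y_hat = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], r)) for r in rows]
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total deviation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # regression deviation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # error deviation
    return sst, ssr, sse


# Hypothetical (x1, x2) observations and responses, for illustration only.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

b = fit_least_squares(rows, y)
sst, ssr, sse = sums_of_squares(rows, y, b)
print("coefficients:", b)
print("SST =", sst, " SSR + SSE =", ssr + sse)
```

The two printed totals agree up to floating-point rounding. The identity relies on the model including an intercept and being fitted by least squares.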
11-3 The F Test of a Multiple Regression Model
A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_1$: Not all the $\beta_i$ $(i = 1, 2, \ldots, k)$ are equal to 0
Source       SS    df            MS                         F
Regression   SSR   k             MSR = SSR / k              F = MSR / MSE
Error        SSE   n - (k+1)     MSE = SSE / (n - (k+1))
Total        SST   n - 1         MST = SST / (n - 1)
Analysis of Variance Table
Source   SS        df   MS       F        F Critical   p-value
Regn.    32.9463   4    8.2366   73.059   2.5201       0.0000
Error    6.98978   62   0.1127
Total    39.9361   66   0.6051

$s = 0.3358$, $R^2 = 0.8250$, adjusted $R^2 = 0.8137$
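The entries of this table follow from SSR, SSE, and the degrees of freedom alone. A short sketch of that arithmetic, using only the numbers printed in the table (the sample size n = 67 is inferred from the total degrees of freedom, 66 = n - 1; the variable names are illustrative):

```python
# Values copied from the ANOVA table above.
ssr = 32.9463
sse = 6.98978
k = 4      # regression degrees of freedom
n = 67     # inferred: total df = n - 1 = 66

df_error = n - (k + 1)
sst = ssr + sse
msr = ssr / k            # mean square regression
mse = sse / df_error     # mean square error
f_ratio = msr / mse      # F statistic with (k, n - (k+1)) df
s = mse ** 0.5           # standard error of the estimate
r2 = ssr / sst           # multiple coefficient of determination

print("df error:", df_error)
print("MSR:", round(msr, 4), "MSE:", round(mse, 4))
print("F:", round(f_ratio, 3), "s:", round(s, 4), "R^2:", round(r2, 4))
```

The computed F ratio can then be compared with the critical value in the table (2.5201); since 73.059 is far larger, the null hypothesis that all slopes are zero is rejected.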
The multiple coefficient of determination, $R^2$, measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination
[Figure: SST partitioned into SSR and SSE.]
$$SST = SSR + SSE, \qquad R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
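The slide title mentions the adjusted coefficient of determination, but no formula for it survives in the text. The standard definition, stated here as an assumption about what the slide showed, replaces each sum of squares with its mean square, so that adding a predictor is penalized through the degrees of freedom:

```latex
\bar{R}^2 \;=\; 1 - \frac{SSE / \bigl(n - (k+1)\bigr)}{SST / (n - 1)}
          \;=\; 1 - \frac{MSE}{MST}
% Check against the ANOVA table values (n = 67, k = 4):
% 1 - (6.98978 / 62) / (39.9361 / 66) \approx 0.8137
```

With the ANOVA table values this reproduces the reported adjusted $R^2$ of 0.8137, which supports reading the formula this way.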
Hypothesis tests about individual regression slope parameters:
(1) $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
(2) $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
...
(k) $H_0: \beta_k = 0$, $H_1: \beta_k \neq 0$
Test statistic for test i:
$$t_{(n-(k+1))} = \frac{b_i - 0}{s(b_i)}$$
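As a small illustration of the test statistic, the sketch below computes t and its degrees of freedom for one slope. The estimate and standard error are invented values, the function name `slope_t_statistic` is hypothetical, and the critical value is an approximate two-tailed 5% figure for 62 degrees of freedom as one would read it from a t table.

```python
def slope_t_statistic(b_i, s_b_i, n, k):
    """Return (t, df) for testing H0: beta_i = 0 against H1: beta_i != 0."""
    df = n - (k + 1)             # error degrees of freedom
    t = (b_i - 0.0) / s_b_i      # estimate minus hypothesized value, over its standard error
    return t, df


# Hypothetical estimate and standard error, for illustration only.
t, df = slope_t_statistic(b_i=0.75, s_b_i=0.25, n=67, k=4)

critical = 2.0  # assumption: approximate two-tailed 5% critical value for 62 df
reject_h0 = abs(t) > critical
print("t =", t, "df =", df, "reject H0:", reject_h0)
```

Each of the k slopes gets its own test of this form, all sharing the same n - (k+1) degrees of freedom.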
Regression Results for Individual Parameters
[Figure: residual plots. Panels show residuals (vertical axis, centered at 0) plotted against x or y, and residuals plotted against time.]
Normal Probability Plot of the Residuals
[Figure: normal probability plots of the residuals, with panels labeled "Positively Skewed" and "Negatively Skewed".]
Multicollinearity
[Figure: panels showing $x_1$ and $x_2$ at different degrees of collinearity.]
Some degree of collinearity: problems with regression depend on the degree of collinearity.
A high degree of negative collinearity also causes problems with regression.
Effects of Multicollinearity
• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may be different from what is expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates or signs.
• In some cases, the F ratio may be significant while the t ratios are not.
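One common way to quantify the variance inflation in the first bullet is the variance inflation factor (VIF), a diagnostic the slides do not name and which is introduced here as an assumption. For a model with exactly two predictors, the VIF of each equals 1 / (1 - r^2), where r is the correlation between x1 and x2. A minimal sketch with made-up predictor values:

```python
def pearson_r(u, v):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values, for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2_nearly_collinear = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]  # roughly 2 * x1
x2_weakly_related = [3, 1, 4, 1, 5, 2]

vif_high = vif_two_predictors(x1, x2_nearly_collinear)
vif_low = vif_two_predictors(x1, x2_weakly_related)
print("VIF, nearly collinear:", vif_high)
print("VIF, weakly related:", vif_low)
```

A VIF near 1 means little inflation. Very large values indicate that the coefficient variances are many times what they would be with uncorrelated predictors.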
Solutions to the Multicollinearity Problem
• Drop a collinear variable from the regression
• Change in sampling plan to include elements outside the multicollinearity range
• Transformations of variables
• Ridge regression
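To give a feel for the last option, here is a minimal sketch of ridge regression with two predictors. It assumes the usual formulation in which a penalty `lam` is added to the diagonal of the centered cross-product matrix, leaving the intercept unpenalized. The function name, the data, and the choice `lam = 1.0` are all hypothetical. In practice predictors are usually standardized first, a step omitted here for brevity.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Return (b0, b1, b2); lam = 0 reduces to ordinary least squares."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Center the data so the intercept is not penalized.
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    # Penalized cross-product matrix: X'X + lam * I.
    s11 = sum(a * a for a in c1) + lam
    s22 = sum(a * a for a in c2) + lam
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    # Solve the 2x2 system by Cramer's rule.
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Hypothetical data with nearly collinear predictors, for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
y = [3.2, 5.8, 9.1, 12.1, 14.8, 18.2]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)
ridge = ridge_two_predictors(x1, x2, y, lam=1.0)
print("least squares:", ols)
print("ridge (lam = 1):", ridge)
```

The penalty shrinks the slope vector toward zero, trading a small bias for lower variance when the predictors are nearly collinear.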