
Multivariate Regression


Introduction
• The population regression model of a dependent variable, Y, on a
set of k independent variables, X1, X2, . . . , Xk is given by:

Y = β0 + β1X1 + β2X2 + β3X3 + . . . + βkXk + ε


Y = the value of the dependent (response) variable
β0 = the regression constant
β1 = the partial regression coefficient of independent variable 1
β2 = the partial regression coefficient of independent variable 2
βk = the partial regression coefficient of independent variable k
k = the number of independent variables
ε = the error of prediction
Model Assumptions

1. ε ~ N(0, σ²), independent of other errors.
2. The variables Xi are uncorrelated with the error term.
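As a rough illustration (not part of the original slides), the sketch below fits such a model by ordinary least squares with numpy on synthetic data; every name and number in it is hypothetical.

```python
import numpy as np

# A minimal sketch: fit Y = b0 + b1*X1 + b2*X2 + e by least squares
# on synthetic data; all values here are arbitrary illustrations.
rng = np.random.default_rng(0)
n, k = 100, 2
X = rng.normal(size=(n, k))            # independent variables X1, X2
beta = np.array([1.5, 0.8, -0.4])      # arbitrary "true" b0, b1, b2
eps = rng.normal(scale=0.5, size=n)    # errors: iid N(0, sigma^2)
y = beta[0] + X @ beta[1:] + eps

# Design matrix with a leading column of ones for the constant b0
X_design = np.column_stack([np.ones(n), X])
b_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(b_hat)    # least-squares estimates of b0, b1, b2
```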
Simple and Multiple Least-Squares Regression

In a simple regression model, the least-squares estimators minimize the
sum of squared errors from the estimated regression line, ŷ = b0 + b1x.
In a multiple regression model, the least-squares estimators minimize the
sum of squared errors from the estimated regression plane,
ŷ = b0 + b1x1 + b2x2.
Example 11-2
• Data given in Table 11-6 (pages 512-513)
• Dependent Variable:
– Exports: US exports to Singapore in billions of Singapore
dollars
• Independent variables:
– M1: Money supply figures in billions of Singapore dollars
– Lend: minimum Singapore bank lending rate in %
– Price: An index of local prices where the base year is 1974
– Exchange: The exchange rate of Singapore dollars per US
dollar.
Decomposition of the Total Deviation in a
Multiple Regression Model
Error deviation: Y − Ŷ
Regression deviation: Ŷ − Ȳ
Total deviation: Y − Ȳ

Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE
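Continuing the hypothetical sketch above, the decomposition can be checked numerically; SST, SSR, and SSE are computed from the fitted values:

```python
import numpy as np

# Sketch: verify SST = SSR + SSE
# (y, X_design, and b_hat are assumed from the previous snippet).
y_hat = X_design @ b_hat            # fitted values
y_bar = y.mean()

SST = np.sum((y - y_bar) ** 2)      # total sum of squares
SSR = np.sum((y_hat - y_bar) ** 2)  # regression sum of squares
SSE = np.sum((y - y_hat) ** 2)      # error sum of squares

print(np.isclose(SST, SSR + SSE))   # True (up to floating-point error)
```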
11-3 The F Test of a Multiple Regression
Model
A statistical test for the existence of a linear relationship between Y
and any or all of the independent variables X1, X2, . . . , Xk:

H0: β1 = β2 = . . . = βk = 0
H1: Not all the βi (i = 1, 2, . . . , k) are equal to 0

Source of     Sum of    Degrees of
Variation     Squares   Freedom      Mean Square              F Ratio
Regression    SSR       k            MSR = SSR / k            F = MSR / MSE
Error         SSE       n − (k+1)    MSE = SSE / (n − (k+1))
Total         SST       n − 1        MST = SST / (n − 1)
Analysis of Variance Table (Example 11-2)

Source       SS         df   MS       F        F Critical   p-value
Regression   32.9463     4   8.2366   73.059   2.5201       0.0000
Error         6.98978   62   0.1127
Total        39.9361    66   0.6051

s = 0.3358    R² = 0.8250    Adjusted R² = 0.8137
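As an illustrative check (not from the slides), the F ratio can be reproduced from the table's sums of squares and degrees of freedom:

```python
# Sketch: reproduce the F ratio from the ANOVA table above.
SSR, SSE = 32.9463, 6.98978
k, n = 4, 67                   # df_error = n - (k+1) = 62, df_total = 66

MSR = SSR / k                  # about 8.2366
MSE = SSE / (n - (k + 1))      # about 0.1127
F = MSR / MSE                  # about 73.06
print(MSR, MSE, F)
```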
                     

F Distribution with 2 and 7 Degrees of Freedom

[Figure: F(2, 7) density with α = 0.01, critical point F0.01 = 9.55, and
test statistic F = 86.34]

The test statistic, F = 86.34, is greater than the critical point of
F(2, 7) for any common level of significance (p-value ≈ 0), so the null
hypothesis is rejected, and we might conclude that the dependent variable
is related to one or more of the independent variables.
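As a hedged aside, the critical point and p-value quoted above can be reproduced with scipy.stats (assuming scipy is available):

```python
from scipy import stats

# Sketch: the F(2, 7) critical point at alpha = 0.01 and the p-value
# of the test statistic F = 86.34 quoted above.
F_crit = stats.f.ppf(0.99, dfn=2, dfd=7)    # about 9.55
p_value = stats.f.sf(86.34, dfn=2, dfd=7)   # essentially 0
print(F_crit, p_value)
```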
How Good is the Regression?

The mean square error is an unbiased estimator of the variance of the
population errors ε, denoted by σ²:

MSE = SSE / (n − (k+1)) = Σ(y − ŷ)² / (n − (k+1))

Standard error of estimate: s = √MSE
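Continuing the earlier synthetic sketch, MSE and s follow directly from the residuals:

```python
import numpy as np

# Sketch: MSE and the standard error of estimate
# (y, y_hat, n, and k as defined in the earlier snippets).
SSE = np.sum((y - y_hat) ** 2)
MSE = SSE / (n - (k + 1))    # unbiased estimator of sigma^2
s = np.sqrt(MSE)             # standard error of estimate
print(MSE, s)
```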

The multiple coefficient of determination, R², measures the proportion of
the variation in the dependent variable that is explained by the
combination of the independent variables in the multiple regression model:

R² = SSR / SST = 1 − SSE / SST
Decomposition of the Sum of Squares and the
Adjusted Coefficient of Determination
SST = SSR + SSE

R² = SSR / SST = 1 − SSE / SST

The adjusted multiple coefficient of determination, R̄², is the coefficient
of determination with the SSE and SST divided by their respective degrees
of freedom:

R̄² = 1 − (SSE / (n − (k+1))) / (SST / (n − 1))
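As an illustrative check against the Example 11-2 ANOVA figures, both coefficients can be computed directly:

```python
# Sketch: R^2 and adjusted R^2 from the Example 11-2 ANOVA figures.
SSR, SSE, SST = 32.9463, 6.98978, 39.9361
n, k = 67, 4

R2 = SSR / SST                                         # about 0.8250
R2_adj = 1 - (SSE / (n - (k + 1))) / (SST / (n - 1))   # about 0.8137
print(R2, R2_adj)
```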
Tests of the Significance of Individual
Regression Parameters

Hypothesis tests about individual regression slope parameters:

(1) H0: β1 = 0,  H1: β1 ≠ 0
(2) H0: β2 = 0,  H1: β2 ≠ 0
. . .
(k) H0: βk = 0,  H1: βk ≠ 0

Test statistic for test i: t(n − (k+1)) = (bi − 0) / s(bi)
Regression Results for Individual Parameters

Variable    Coefficient (b)   s(b)       t           p-value
Intercept   −4.01546          2.766401   −1.45151    0.151679
M1           0.368456         0.063848    5.7708     2.71E-07
Lend         0.004702         0.049222    0.095531   0.924201
Price        0.036511         0.009326    3.914914   0.000228
Exchange     0.267896         1.17544     0.227911   0.820465
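As an illustrative check (not from the slides), the t statistic and p-value for M1 can be reproduced from the reported b and s(b):

```python
from scipy import stats

# Sketch: t statistic and two-sided p-value for M1, from the estimates
# in the table above; df = n - (k+1) = 62 for Example 11-2.
b, s_b, df = 0.368456, 0.063848, 62
t = b / s_b                       # about 5.77
p = 2 * stats.t.sf(abs(t), df)    # about 2.7e-07
print(t, p)
```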


Residual Analysis and Checking for Model Inadequacies

[Figure: four residual plots, residuals vs. x or ŷ]

• Homoscedasticity: residuals appear completely random; no indication of
model inadequacy.
• Heteroscedasticity: the variance of the residuals increases as x changes.
• Residuals exhibit a linear trend with time.
• A curved pattern in the residuals, resulting from an underlying nonlinear
relationship.
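A minimal sketch of such a diagnostic plot, continuing the earlier synthetic example (matplotlib assumed available):

```python
import matplotlib.pyplot as plt

# Sketch: plot residuals against fitted values to look for the patterns
# described above (y and y_hat as in the earlier snippets).
residuals = y - y_hat
plt.scatter(y_hat, residuals)
plt.axhline(0, color="gray", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.title("Residuals vs. fitted values")
plt.show()
```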
Normal Probability Plot of the Residuals

[Figure: four normal probability plots showing departures from normality]

• Flatter than normal
• More peaked than normal
• Positively skewed
• Negatively skewed
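A minimal sketch of such a plot, using scipy's probplot on the residuals from the previous snippet:

```python
from scipy import stats
import matplotlib.pyplot as plt

# Sketch: a normal probability (Q-Q) plot of the residuals; systematic
# curvature or S-shapes indicate the departures from normality listed above.
stats.probplot(residuals, dist="norm", plot=plt)
plt.show()
```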
Multicollinearity

[Figure: geometric view of the information carried by the X variables]

• Orthogonal X variables provide information from independent sources: no
multicollinearity.
• Perfectly collinear X variables provide identical information content: no
regression is possible.
• Some degree of collinearity: problems with the regression depend on the
degree of collinearity.
• A high degree of negative collinearity also causes problems with the
regression.
Effects of Multicollinearity

• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may be different from what is
expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates
or signs.
• In some cases, the F ratio may be significant while the t ratios are not.
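One common diagnostic, not mentioned on the slide itself, is the variance inflation factor; a sketch using statsmodels on the earlier synthetic design matrix:

```python
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Sketch: variance inflation factors as a multicollinearity diagnostic,
# computed on the design matrix X_design from the earlier snippets.
# VIFs well above 10 are a common rule-of-thumb warning sign.
vifs = [variance_inflation_factor(X_design, i)
        for i in range(1, X_design.shape[1])]   # skip the constant column
print(vifs)
```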
Solutions to the Multicollinearity Problem

• Drop a collinear variable from the regression.
• Change the sampling plan to include elements outside the
multicollinearity range.
• Transformations of variables.
• Ridge regression.
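A minimal sketch of the last option using scikit-learn's Ridge on the earlier synthetic data; the penalty value is an arbitrary illustrative choice:

```python
from sklearn.linear_model import Ridge

# Sketch: ridge regression shrinks the coefficient estimates toward zero,
# trading a little bias for lower variance under collinearity.
# X and y are assumed from the earlier synthetic snippets; alpha = 1.0
# is an arbitrary illustrative penalty weight.
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)
print(ridge.intercept_, ridge.coef_)
```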
