Multiple Regression Analysis (MLR)

Multiple regression analysis models a dependent variable (y) as a linear function of multiple independent variables (x1, x2, ..., xk). The key parameters are the intercept (β0) and the slope parameters (β1, ..., βk). Ordinary least squares estimation minimizes the sum of squared residuals to estimate these parameters. The R-squared statistic measures the proportion of the variance in y explained by the model. Omitted variable bias can occur if an important independent variable is excluded from the model.


Multiple Regression Analysis

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$

1. Estimation

Economics 20 - Prof. Anderson


Parallels with Simple Regression
$\beta_0$ is still the intercept
$\beta_1$ to $\beta_k$ are all called slope parameters
u is still the error term (or disturbance)
Still need to make a zero conditional mean
assumption, so now assume that
$E(u \mid x_1, x_2, \ldots, x_k) = 0$
Still minimizing the sum of squared
residuals, so we have k + 1 first order conditions
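In matrix form, those k + 1 first order conditions are the normal equations $X'X\hat\beta = X'y$. Below is a minimal sketch in Python/NumPy on simulated data; the sample size, coefficients, and variable names are invented for illustration, not taken from the slides.

```python
# Minimal sketch: OLS for y = b0 + b1*x1 + b2*x2 + u by solving the
# normal equations X'X b = X'y (the k+1 first order conditions).
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + u            # true parameters: 1, 2, -3

X = np.column_stack([np.ones(n), x1, x2])    # design matrix with intercept
beta_hat = np.linalg.solve(X.T @ X, X.T @ y) # solves the first order conditions
print(beta_hat)                              # close to [1, 2, -3]
```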
Interpreting Multiple Regression

$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \cdots + \hat\beta_k x_k$, so

$\Delta\hat{y} = \hat\beta_1 \Delta x_1 + \hat\beta_2 \Delta x_2 + \cdots + \hat\beta_k \Delta x_k$,

so holding $x_2, \ldots, x_k$ fixed implies that

$\Delta\hat{y} = \hat\beta_1 \Delta x_1$, that is, each $\hat\beta_j$ has

a ceteris paribus interpretation

A Partialling Out Interpretation
Consider the case where k = 2, i.e.
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$, then

$\hat\beta_1 = \dfrac{\sum \hat{r}_{i1}\, y_i}{\sum \hat{r}_{i1}^2}$, where the $\hat{r}_{i1}$ are
the residuals from the estimated
regression $\hat{x}_1 = \hat\gamma_0 + \hat\gamma_2 x_2$
Partialling Out continued
The previous equation implies that regressing y
on x1 and x2 gives the same effect of x1 as
regressing y on the residuals from a regression of
x1 on x2
This means only the part of xi1 that is
uncorrelated with xi2 is being related to yi, so
we're estimating the effect of x1 on y after x2
has been partialled out
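A quick numerical check of this claim, as a sketch on simulated data (the coefficients and the correlation between x1 and x2 are made up for illustration):

```python
# Sketch of partialling out: the multiple-regression slope on x1 equals the
# slope from regressing y on the residuals of x1 after regressing x1 on x2.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)           # x1 correlated with x2
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.solve(X.T @ X, X.T @ y)

# Step 1: regress x1 on (1, x2) and keep the residuals r1
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)

# Step 2: slope of y on r1 (r1 is orthogonal to the constant and x2)
b1_partial = (r1 @ y) / (r1 @ r1)
print(b_full[1], b1_partial)                 # the two estimates coincide
```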
Simple vs Multiple Reg Estimate

Compare the simple regression $\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$
with the multiple regression $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$
Generally, $\tilde\beta_1 \neq \hat\beta_1$ unless:
$\hat\beta_2 = 0$ (i.e. no partial effect of x2), OR
x1 and x2 are uncorrelated in the sample

Goodness-of-Fit
We can think of each observation as being made
up of an explained part and an unexplained part,
$y_i = \hat{y}_i + \hat{u}_i$. We then define the following:

$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)

$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)

$\sum \hat{u}_i^2$ is the residual sum of squares (SSR)

Then SST = SSE + SSR


Goodness-of-Fit (continued)
How do we think about how well our
sample regression line fits our sample data?
Can compute the fraction of the total sum
of squares (SST) that is explained by the
model; call this the R-squared of the regression:

$R^2 = SSE/SST = 1 - SSR/SST$
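A short sketch verifying the decomposition SST = SSE + SSR and both expressions for R² on simulated data (all numbers invented):

```python
# Sketch: decompose SST = SSE + SSR and compute R-squared two ways.
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat                          # explained part
u_hat = y - y_hat                             # unexplained part (residuals)

sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)

print(np.isclose(sst, sse + ssr))             # True: SST = SSE + SSR
print(sse / sst, 1 - ssr / sst)               # equal expressions for R^2
```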
Goodness-of-Fit (continued)
We can also think of $R^2$ as being equal to
the squared correlation coefficient between
the actual $y_i$ and the fitted values $\hat{y}_i$:

$R^2 = \dfrac{\left(\sum (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right)^2}{\left(\sum (y_i - \bar{y})^2\right)\left(\sum (\hat{y}_i - \bar{\hat{y}})^2\right)}$
More about R-squared
$R^2$ can never decrease when another
independent variable is added to a
regression, and usually will increase
Because $R^2$ will usually increase with the
number of independent variables, it is not a
good way to compare models

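A small demonstration of the first point: adding a regressor that is pure noise still (weakly) raises R². The data and the helper function below are illustrative.

```python
# Sketch: adding an irrelevant regressor cannot lower R-squared.
import numpy as np

def r_squared(X, y):
    """R^2 from the OLS regression of y on the columns of X."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    ssr = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - ssr / sst

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
noise = rng.normal(size=n)                    # unrelated to y by construction

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, noise])
print(r_squared(X_small, y), r_squared(X_big, y))  # second is weakly larger
```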
Assumptions for Unbiasedness
Population model is linear in parameters:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$
We can use a random sample of size n,
$\{(x_{i1}, x_{i2}, \ldots, x_{ik}, y_i) : i = 1, 2, \ldots, n\}$, from the
population model, so that the sample model
is $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i$
$E(u \mid x_1, x_2, \ldots, x_k) = 0$, implying that all of the
explanatory variables are exogenous
None of the x's is constant, and there are no
exact linear relationships among them
Too Many or Too Few Variables
What happens if we include variables in
our specification that don't belong?
There is no effect on our parameter
estimates, and OLS remains unbiased
What if we exclude a variable from our
specification that does belong?
OLS will usually be biased
Omitted Variable Bias
Suppose the true model is given as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, but we
estimate $y = \tilde\beta_0 + \tilde\beta_1 x_1 + \tilde{u}$, then

$\tilde\beta_1 = \dfrac{\sum (x_{i1} - \bar{x}_1)\, y_i}{\sum (x_{i1} - \bar{x}_1)^2}$
Omitted Variable Bias (cont)
Recall the true model, so that
$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i$, so the
numerator becomes

$\sum (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)$
$= \beta_1 \sum (x_{i1} - \bar{x}_1)^2 + \beta_2 \sum (x_{i1} - \bar{x}_1)\, x_{i2} + \sum (x_{i1} - \bar{x}_1)\, u_i$
Omitted Variable Bias (cont)

$\tilde\beta_1 = \beta_1 + \beta_2 \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} + \dfrac{\sum (x_{i1} - \bar{x}_1)\, u_i}{\sum (x_{i1} - \bar{x}_1)^2}$

since $E(u_i) = 0$, taking expectations we have

$E(\tilde\beta_1) = \beta_1 + \beta_2 \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$
Omitted Variable Bias (cont)

Consider the regression of x2 on x1:
$\tilde{x}_2 = \tilde\delta_0 + \tilde\delta_1 x_1$, then

$\tilde\delta_1 = \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$

so $E(\tilde\beta_1) = \beta_1 + \beta_2 \tilde\delta_1$
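A simulation sketch of this result (the coefficients and the x1-x2 relationship are invented): with $\beta_2 < 0$ and Corr(x1, x2) > 0 the short regression should show negative bias, matching the direction-of-bias table on the next slide.

```python
# Sketch: omitted variable bias. Omitting x2 shifts the slope on x1 by
# about beta2 * delta1, where delta1 is the slope of x2 regressed on x1.
import numpy as np

rng = np.random.default_rng(4)
n = 100_000                                   # large n, so averages are tight
beta0, beta1, beta2 = 1.0, 2.0, -3.0
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)            # Corr(x1, x2) > 0
y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

x1c = x1 - x1.mean()
b1_short = (x1c @ y) / (x1c @ x1c)            # short regression: y on x1 only
delta1 = (x1c @ x2) / (x1c @ x1c)             # slope of x2 on x1

print(b1_short)                               # near 2 + (-3)(0.5) = 0.5
print(beta1 + beta2 * delta1)                 # the bias formula's prediction
```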
Summary of Direction of Bias
                  Corr(x1, x2) > 0    Corr(x1, x2) < 0
$\beta_2 > 0$     Positive bias       Negative bias
$\beta_2 < 0$     Negative bias       Positive bias

Omitted Variable Bias Summary
Two cases where the bias is equal to zero:
$\beta_2 = 0$, that is, x2 doesn't really belong in the model
x1 and x2 are uncorrelated in the sample
If the correlation between x2 and x1 and the
correlation between x2 and y run in the same
direction, the bias will be positive
If they run in opposite directions, the bias
will be negative
The More General Case
Technically, we can only sign the bias in the
more general case if all of the included x's
are uncorrelated
Typically, then, we work through the bias
assuming the x's are uncorrelated, as a
useful guide even if this assumption is not
strictly true
Variance of the OLS Estimators
Now we know that the sampling
distribution of our estimate is centered
around the true parameter
Want to think about how spread out this
distribution is
Much easier to think about this variance
under an additional assumption, so
assume $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$
(homoskedasticity)
Variance of OLS (cont)
Let x stand for (x1, x2, ..., xk)
Assuming that $\mathrm{Var}(u \mid x) = \sigma^2$ also implies
that $\mathrm{Var}(y \mid x) = \sigma^2$
The 4 assumptions for unbiasedness, plus
this homoskedasticity assumption, are
known as the Gauss-Markov assumptions
Variance of OLS (cont)
Given the Gauss-Markov assumptions,

$\mathrm{Var}(\hat\beta_j) = \dfrac{\sigma^2}{SST_j \left(1 - R_j^2\right)}$, where

$SST_j = \sum \left(x_{ij} - \bar{x}_j\right)^2$ and $R_j^2$ is the $R^2$
from regressing $x_j$ on all other x's
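To make the formula concrete, the sketch below evaluates each component ($\sigma^2$, $SST_j$, $R_j^2$) on simulated data for j = 1 and checks the result against the Monte Carlo variance of $\hat\beta_1$; every number here is invented.

```python
# Sketch: Var(beta1_hat) = sigma^2 / (SST_1 * (1 - R_1^2)), checked by
# redrawing the errors many times with the x's held fixed.
import numpy as np

rng = np.random.default_rng(5)
n, sigma2 = 200, 4.0
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)            # collinearity inflates R_1^2

x1c = x1 - x1.mean()
sst1 = x1c @ x1c                              # total sample variation in x1
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r2_1 = 1 - (r1 @ r1) / sst1                   # R^2 from x1 on the other x's

var_formula = sigma2 / (sst1 * (1 - r2_1))

X = np.column_stack([np.ones(n), x1, x2])
solve_mat = np.linalg.solve(X.T @ X, X.T)     # maps y to beta_hat
b1_draws = []
for _ in range(5000):
    y = 1.0 + 2.0 * x1 - 3.0 * x2 + np.sqrt(sigma2) * rng.normal(size=n)
    b1_draws.append((solve_mat @ y)[1])
print(var_formula, np.var(b1_draws))          # the two agree closely
```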
Components of OLS Variances
The error variance: a larger $\sigma^2$ implies a
larger variance for the OLS estimators
The total sample variation: a larger $SST_j$
implies a smaller variance for the estimators
Linear relationships among the
independent variables: a larger $R_j^2$ implies a
larger variance for the estimators

Misspecified Models

Consider again the misspecified model
$\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$, so that

$\mathrm{Var}(\tilde\beta_1) = \dfrac{\sigma^2}{SST_1}$

Thus, $\mathrm{Var}(\tilde\beta_1) < \mathrm{Var}(\hat\beta_1)$ unless $x_1$ and
$x_2$ are uncorrelated, in which case the two are the same
Misspecified Models (cont)
While the variance of the estimator is
smaller for the misspecified model, unless
$\beta_2 = 0$ the misspecified model is biased
As the sample size grows, the variance of
each estimator shrinks to zero, making the
variance difference less important
Estimating the Error Variance
We don't know what the error variance, $\sigma^2$,
is, because we don't observe the errors, $u_i$
What we observe are the residuals, $\hat{u}_i$
We can use the residuals to form an
estimate of the error variance
Error Variance Estimate (cont)
$\hat\sigma^2 = \dfrac{\sum \hat{u}_i^2}{n - k - 1} = \dfrac{SSR}{df}$

thus, $se(\hat\beta_j) = \dfrac{\hat\sigma}{\left[SST_j \left(1 - R_j^2\right)\right]^{1/2}}$

df = n - (k + 1), or df = n - k - 1
df (i.e. degrees of freedom) is the (number
of observations) minus the (number of estimated
parameters)
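Putting the two formulas together in code, as a sketch on simulated data (coefficients and sample size are illustrative):

```python
# Sketch: estimate sigma^2 by SSR/(n - k - 1), then form se(beta1_hat).
import numpy as np

rng = np.random.default_rng(6)
n, k = 200, 2
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + 2.0 * rng.normal(size=n)  # true sigma^2 = 4

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ b
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)    # SSR / df

# se(beta1_hat) = sqrt(sigma2_hat / (SST_1 * (1 - R_1^2)))
x1c = x1 - x1.mean()
sst1 = x1c @ x1c
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r2_1 = 1 - (r1 @ r1) / sst1
se_b1 = np.sqrt(sigma2_hat / (sst1 * (1 - r2_1)))
print(sigma2_hat, se_b1)
```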
The Gauss-Markov Theorem
Given our 5 Gauss-Markov assumptions, it
can be shown that OLS is "BLUE":
Best
Linear
Unbiased
Estimator
Thus, if the assumptions hold, use OLS