5 - 7. MR - Estimation
Learning Objectives
• Multiple regression allows us to explicitly control for many factors that affect the dependent variable.
• Hence, it is more amenable to ceteris paribus analysis.
• Naturally, if we add more factors to our model, then more of the variation in y can be
explained.
• It can also incorporate fairly general functional form relationships.
• Wage Example.
– wage = β0 + β1 educ + β2 exper + u.
– Wage is determined by the two independent variables.
– Other unobserved factors are contained in u.
– MR effectively takes exper out of u and puts it explicitly in the equation.
– We still have to make assumptions about how u is related to educ and exper (the zero conditional mean assumption).
– However, we can be confident of one thing.
– Because the equation contains exper explicitly, we will be able to measure the effect of educ on wage, holding exper fixed.
– In a simple regression analysis - which puts exper in the error term - we would have to
assume that exper is uncorrelated with educ, a tenuous assumption.
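A minimal numpy sketch of estimating such a wage equation. The data below are simulated purely for illustration; the coefficient values (0.6, 0.1) and noise level are made-up assumptions, not estimates from any real wage data set.

```python
import numpy as np

# Simulated illustration only: coefficients and data are made up.
rng = np.random.default_rng(0)
n = 500
educ = rng.uniform(8, 20, n)            # years of education
exper = rng.uniform(0, 30, n)           # years of experience
u = rng.normal(0, 2, n)                 # unobserved factors
wage = 1.0 + 0.6 * educ + 0.1 * exper + u

# Multiple regression of wage on educ and exper (intercept included):
X = np.column_stack([np.ones(n), educ, exper])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)
print(beta_hat)   # [beta0_hat, beta1_hat (educ), beta2_hat (exper)]
```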
• Test score example: avgscore = β0 + β1 expend + β2 avginc + u.
– avginc: average family income.
– By including avginc explicitly in the model, we are able to control for its effect on
avgscore.
– In simple regression, avginc would be included in u, which would likely be correlated
with expend, causing the OLS estimator of β1 to be biased.
– The method of OLS chooses the estimates to minimize the sum of squared residuals.
– ∑ᵢ₌₁ⁿ ûi² = ∑ᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi1 − β̂2 xi2 − ... − β̂k xik)².
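A sketch of the same minimization in numpy via the normal equations (X'X)β̂ = X'y, with simulated data used only for illustration:

```python
import numpy as np

def ols(X, y):
    # Solve the normal equations (X'X) b = X'y; the solution is the
    # coefficient vector that minimizes the sum of squared residuals.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Simulated data purely for illustration.
rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5, 0.3]) + rng.normal(size=n)

b = ols(X, y)
ssr = np.sum((y - X @ b) ** 2)
# Any other coefficient vector gives a (weakly) larger SSR:
ssr_other = np.sum((y - X @ (b + 0.01)) ** 2)
print(ssr <= ssr_other)   # True
```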
– College GPA example: colGPA = β0 + β1 hsGPA + β2 ACT + u.
– The ceteris paribus interpretation may make it seem that we actually went out and sampled people with the same high school GPA but possibly different ACT scores.
– If we could collect a sample of individuals with the same high school GPA, then we could perform a simple regression analysis relating colGPA to ACT.
– MR effectively allows us to mimic this situation without restricting the values of any
independent variables.
– MR allows us to keep other factors fixed in nonexperimental environments.
• Fitted values
– ŷi = β̂0 + β̂1 xi1 + β̂2 xi2 + ... + β̂k xik.
• Residuals
– ûi = yi − ŷi .
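A short self-contained sketch (simulated data again) computing fitted values and residuals and checking two algebraic consequences of the OLS first-order conditions:

```python
import numpy as np

# Simulated data for illustration only.
rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat          # fitted values
u_hat = y - y_hat             # residuals

# First-order conditions imply: residuals sum to zero (intercept included)
# and are orthogonal to every regressor.
print(np.isclose(u_hat.sum(), 0.0))
print(np.allclose(X.T @ u_hat, 0.0))
```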
Comparison of SR & MR
• Equations
– SR: ỹ = β̃0 + β̃1 x1 .
– MR: ŷ = β̂0 + β̂1 x1 + β̂2 x2 .
• Relationship between β̃1 and β̂1 (checked numerically in the sketch after this list).
– β̃1 = β̂1 + β̂2 δ̃1 .
– δ̃1 is the slope coefficient from the simple regression of x2 on x1 .
• Two cases when they are equal.
– The partial effect of x2 on ŷ is zero in the sample. That is, β̂2 = 0.
– x1 and x2 are uncorrelated in the sample. That is, δ̃1 = 0.
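A numerical check of the relationship β̃1 = β̂1 + β̂2 δ̃1 on simulated data; the data-generating values below are arbitrary and used only to illustrate the algebra:

```python
import numpy as np

# Simulated data with correlated regressors; values are arbitrary.
rng = np.random.default_rng(3)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)      # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def coef(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_mr = coef(np.column_stack([ones, x1, x2]), y)   # multiple regression
b_sr = coef(np.column_stack([ones, x1]), y)       # simple regression of y on x1
d = coef(np.column_stack([ones, x1]), x2)         # regression of x2 on x1

# beta_tilde_1 = beta_hat_1 + beta_hat_2 * delta_tilde_1 holds exactly in sample.
print(np.isclose(b_sr[1], b_mr[1] + b_mr[2] * d[1]))   # True
```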
Goodness of Fit
• A way of measuring how well the independent variables explain the dependent variable.
• For a sample, we can define the following:
– Total Sum of Squares (SST) = ∑ᵢ₌₁ⁿ (yi − ȳ)².
– Explained Sum of Squares (SSE) = ∑ᵢ₌₁ⁿ (ŷi − ȳ)².
– Residual Sum of Squares (SSR) = ∑ᵢ₌₁ⁿ ûi².
– R² = SSE/SST = 1 − SSR/SST.
– In the college GPA example, R² ≈ 0.176: hsGPA and ACT together explain about 17.6% of the variation in colGPA.
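A small sketch computing SST, SSE, SSR, and R² on simulated data (the 17.6% figure from the college GPA example cannot be reproduced here; the numbers below are made up for illustration):

```python
import numpy as np

# Simulated data for illustration; not the college GPA data.
rng = np.random.default_rng(4)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.5]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)

r2 = sse / sst
print(np.isclose(sst, sse + ssr))      # SST = SSE + SSR (intercept included)
print(r2, 1 - ssr / sst)               # two equivalent expressions for R^2
```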
• Assumptions under which the OLS estimators are unbiased for the population parameters:
– Assumption MLR 1: Linear in Parameters.
– Assumption MLR 2: Random Sampling.
– Assumption MLR 3: No Perfect Collinearity.
– Assumption MLR 4: Zero Conditional Mean.
Expected Value of the OLS Estimators
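A minimal Monte Carlo sketch of what unbiasedness means: under a data-generating process satisfying MLR 1 through MLR 4, the OLS estimates average out to the true parameters across repeated samples. All numbers below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
true_beta = np.array([1.0, 2.0, -0.5])
n, reps = 100, 2000
estimates = np.empty((reps, 3))

for r in range(reps):
    # Each replication draws a fresh random sample (MLR 2) from a linear
    # model (MLR 1) with E(u|x) = 0 (MLR 4) and non-collinear regressors (MLR 3).
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ true_beta + rng.normal(size=n)
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

# Averages of the estimates are close to the true parameters.
print(estimates.mean(axis=0), true_beta)
```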
• Assumption MLR 5: Homoscedasticity.
– Var(u|x1, x2, ..., xk) = σ².
– Formulas are simplified.
– OLS has an important efficiency property.
– Assumptions MLR.1 through MLR.5 are collectively known as the Gauss-Markov assumptions.
– We use x (bold x) to denote all independent variables.
– Var(y|x) = σ².
Variance of OLS Estimators
Variance: Components of OLS Variances
• Var(β̂j) = σ² / [SSTj(1 − Rj²)], where SSTj = ∑ᵢ₌₁ⁿ (xij − x̄j)² is the total sample variation in xj and Rj² is the R-squared from regressing xj on all other independent variables.
• However, when β2 ≠ 0, there are two favorable reasons for including x2 in the model (see the simulation sketch below).
– The bias in β̃1 does not shrink as the sample size increases, but the variance can be reduced by increasing the sample size.
– σ² effectively increases when x2 is dropped from the equation, because the effect of x2 becomes part of the error.
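A Monte Carlo sketch of the two points above, using simulated, illustrative values only: with β2 ≠ 0 and x1, x2 correlated, the simple-regression estimator β̃1 stays biased as n grows, while its sampling variance shrinks.

```python
import numpy as np

rng = np.random.default_rng(6)
beta1, beta2 = 2.0, 3.0                      # true values; beta2 != 0

def simulate(n, reps=2000):
    # Return simple-regression estimates of beta1 when x2 is omitted.
    tilde = np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.5 * x1 + rng.normal(size=n)   # x2 correlated with x1
        y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1])  # x2 omitted
        tilde[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return tilde

for n in (50, 500, 5000):
    est = simulate(n)
    # Bias stays near beta2 * delta1 = 3.0 * 0.5 = 1.5; variance shrinks with n.
    print(n, est.mean() - beta1, est.var())
```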
Variance: Estimating σ²
• σ̂² = SSR/(n − k − 1), where n − k − 1 is the degrees of freedom.
• se(β̂j) = σ̂ / [SSTj(1 − Rj²)]^(1/2). This formula is invalid in the presence of heteroscedasticity.
• Thus, heteroscedasticity does not cause bias in β̂j, but it does lead to bias in the usual formula for Var(β̂j), which then invalidates the standard errors.
• Standard errors can also be written as:
– se(β̂j) = σ̂ / [√n · sd(xj) · √(1 − Rj²)].
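A sketch (simulated data, illustrative numbers only) that computes se(β̂j) from the SSTj and Rj² formula above and checks that it matches the diagonal of σ̂²(X'X)⁻¹:

```python
import numpy as np

# Simulated homoscedastic data for illustration only.
rng = np.random.default_rng(7)
n = 400
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)             # regressors are correlated
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

k = 2                                          # number of slope coefficients
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / (n - k - 1)       # sigma_hat^2 = SSR / (n - k - 1)

# se(beta_hat_1) from the formula sigma_hat / sqrt(SST_1 * (1 - R_1^2)),
# where R_1^2 comes from regressing x1 on the other regressors (here x2).
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
g = np.linalg.solve(Z.T @ Z, Z.T @ x1)
r1_sq = 1 - np.sum((x1 - Z @ g) ** 2) / sst1
se_formula = np.sqrt(sigma2_hat) / np.sqrt(sst1 * (1 - r1_sq))

# Same quantity from the matrix form sigma_hat^2 * (X'X)^(-1).
se_matrix = np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1])
print(np.isclose(se_formula, se_matrix))       # True
```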
• The following theorem justifies the use of the OLS method rather than a variety of competing estimators.
• Gauss-Markov Theorem.
– Under Assumptions MLR 1 through MLR 5, the OLS estimator β̂j of βj is the best linear unbiased estimator (BLUE), i.e., efficient.
– Best: Having the smallest variance.
– Linear: If it can be expressed as a linear function of the dependent variable.
– Unbiased: E(β̂j ) = βj .
– Estimator: It is a rule that can be applied to any sample of data to produce an estimate.