Module 5
Linear Regression with a
Single Regressor:
Inference
Department of Economics, HKUST
Instructor: Junlong Feng
Fall 2022
Menu of Module 5
I. Hypothesis testing
II. Confidence interval
III. Two-sample mean differential
IV. Variance estimation and heteroscedasticity
I. Hypothesis testing
Now suppose we want to test the null $H_0: \beta_1 = \beta_{1,0}$ against $H_1: \beta_1 \neq \beta_{1,0}$. We can use exactly the same idea.
• $\hat{\beta}_1$ is approximately $N(\beta_1, \sigma_{\hat{\beta}_1}^2)$.
• Standardize: $\frac{\hat{\beta}_1 - \beta_1}{\sigma_{\hat{\beta}_1}}$ is approximately $N(0,1)$.
• Under the null $H_0: \beta_1 = \beta_{1,0}$, with size $\alpha$,
$$\Pr\left(\left|\frac{\hat{\beta}_1 - \beta_{1,0}}{\sigma_{\hat{\beta}_1}}\right| \le z_{1-\alpha/2}\right) \approx 1 - \alpha.$$
• Reject if $\left|\frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}\right| > z_{1-\alpha/2}$. Do not reject if $\left|\frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}\right| \le z_{1-\alpha/2}$.
• $\frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$ is a t-statistic.
• The rejection rule is a two-sided t-test.
• With critical value $z_{1-\alpha/2}$, the size (the significance level) of the test is controlled at $\alpha$.
• The asymptotic power is 1, just like the t-test for the population mean with the sample average as the estimator.
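The rejection rule above can be sketched in a few lines of Python. This is only an illustration: the numbers in the usage line are hypothetical, and `z_crit` stands for the critical value $z_{1-\alpha/2}$ (1.96 for $\alpha = 0.05$).

```python
def two_sided_t_test(beta_hat, beta_null, se, z_crit=1.96):
    """Return the t-statistic and whether to reject H0: beta = beta_null
    in a two-sided test with critical value z_crit."""
    t_stat = (beta_hat - beta_null) / se
    return t_stat, abs(t_stat) > z_crit

# Hypothetical numbers: estimate 1.5, SE 0.5, testing H0: beta = 0
t, reject = two_sided_t_test(1.5, 0.0, 0.5)
print(t, reject)  # 3.0 True: |t| exceeds 1.96, reject at the 5% level
```

Changing `z_crit` to 2.58 gives the 1% test used in the example that follows.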
Example:
$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \cdot STR$
• Convention: numbers in parentheses and below the estimated parameter are the
standard errors.
• $\hat{\beta}_0 = 698.9$; $SE(\hat{\beta}_0) = 10.4$.
• $\hat{\beta}_1 = -2.28$; $SE(\hat{\beta}_1) = 0.52$.
• Consider a two-sided test for $H_0: \beta_1 = 0$.
• Interpretation of the null: since $\beta_1$ is the marginal average causal effect of STR on test score, $\beta_1 = 0$ means STR has no causal effect on test score on average.
• t-statistic: $\left|\frac{-2.28}{0.52}\right| = 4.38 > 2.58 = z_{1-0.01/2}$. So reject the null at the 1% significance level.
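The arithmetic here can be checked directly (2.58 is the two-sided 1% critical value $z_{0.995}$):

```python
t_stat = abs(-2.28 / 0.52)   # t-statistic for H0: beta_1 = 0
print(round(t_stat, 2))      # 4.38, which exceeds 2.58, so reject at 1%
```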
Example:
$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \cdot STR$
• We can also test hypotheses for $\beta_0$.
• E.g. $H_0: \beta_0 = 690$ vs. $H_1: \beta_0 \neq 690$.
• t-statistic: $\frac{698.9 - 690}{10.4} = 0.86$.
• Smaller than any commonly used critical value.
• Do not reject at $\alpha = 0.1, 0.05, 0.01$.
• $p = 2\Phi(-0.86) = 0.39$.
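The two-sided p-value $2\Phi(-|t|)$ can be reproduced with the standard library's error function; a minimal sketch:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

t = (698.9 - 690) / 10.4          # t-statistic from the example
p = 2 * normal_cdf(-abs(t))       # two-sided p-value
print(round(t, 2), round(p, 2))   # 0.86 0.39
```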
Output from R
• The reported t-values and p-values are all for $H_0: \text{parameter} = 0$ against a two-sided alternative.
II. Confidence interval
Example:
$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \cdot STR$
• A 95% confidence interval for $\beta_1$: $[-2.28 - 1.96 \times 0.52,\ -2.28 + 1.96 \times 0.52] = [-3.30, -1.26]$.
• 0 is not in the confidence interval, so $H_0: \beta_1 = 0$ can be rejected if the alternative is two-sided.
• Cannot reject any null inside the interval.
• A 95% confidence interval for $\beta_0$: $[698.9 - 1.96 \times 10.4,\ 698.9 + 1.96 \times 10.4] = [678.52, 719.28]$.
• 690 is in the confidence interval.
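Both intervals follow the generic recipe $\hat{\beta} \pm z_{1-\alpha/2} \cdot SE(\hat{\beta})$; a minimal sketch:

```python
def conf_int(beta_hat, se, z_crit=1.96):
    """Two-sided confidence interval: beta_hat +/- z_crit * SE."""
    return (beta_hat - z_crit * se, beta_hat + z_crit * se)

lo, hi = conf_int(-2.28, 0.52)     # 95% CI for the slope
print(round(lo, 2), round(hi, 2))  # -3.3 -1.26
```

Passing `z_crit=1.64` instead gives a 90% interval, as used for the intercept below.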
From the R output, you are able to compute confidence intervals for any given coverage probability.
• Example: a 90% CI for $\beta_0$, i.e., the intercept, is $[698.93 - 1.64 \cdot 10.36,\ 698.93 + 1.64 \cdot 10.36]$.
III. Two-sample mean differential
We didn’t talk about the two-sample mean testing problem. That is not for lack of importance, but because the conventional approach is not convenient.
• The two-sample mean testing problem is crucial for causal inference.
• Suppose you randomize a treatment variable $D \in \{0, 1\}$ in an i.i.d. sample.
• $D_i = 1$ means individual $i$ receives the treatment.
• $D_i = 0$ means individual $i$ does not receive the treatment.
• Examples of $D$ include vaccines, vouchers, draft (military service), etc.
• Let the outcome for $i$ be $Y_i$. By the randomness of $D_i$,
$$ATE = E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0].$$
• $H_0: ATE = 0$ is equivalent to $H_0: E[Y_i \mid D_i = 1] = E[Y_i \mid D_i = 0]$.
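Regressing $Y$ on a binary $D$ recovers exactly the two-sample mean difference: the OLS slope equals $\bar{Y}_{D=1} - \bar{Y}_{D=0}$. A sketch on simulated data (the sample size and effect size are hypothetical):

```python
import random

random.seed(0)
n = 1000
D = [random.randint(0, 1) for _ in range(n)]       # randomized treatment
Y = [2.0 * d + random.gauss(0, 1) for d in D]      # hypothetical ATE = 2

# OLS slope: sample cov(D, Y) / sample var(D)
d_bar = sum(D) / n
y_bar = sum(Y) / n
slope = sum((d - d_bar) * (y - y_bar) for d, y in zip(D, Y)) / \
        sum((d - d_bar) ** 2 for d in D)

# Two-sample mean difference
y1 = [y for d, y in zip(D, Y) if d == 1]
y0 = [y for d, y in zip(D, Y) if d == 0]
diff = sum(y1) / len(y1) - sum(y0) / len(y0)

print(abs(slope - diff) < 1e-10)  # True: the two estimators coincide
```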
Example: In the STR-Test score data, construct 𝐷 = 1 if 𝑆𝑇𝑅 < 20 and 𝐷 = 0 otherwise.
• Interpretation: $\hat{\beta}_0$: sample average of test scores for the group $D = 0$. $\hat{\beta}_1$: sample mean difference of test scores between the two groups. $\hat{\beta}_0 + \hat{\beta}_1$: sample average of test scores for the group $D = 1$.
• For the null $H_0: \beta_1 = 0$, the t-value is 4.04, significant at all commonly adopted levels.
• The means of the two subsamples $STR < 20$ and $STR \geq 20$ are thus significantly different at all commonly adopted levels.
• 95% confidence interval for $\beta_1$: $[7.37 - 1.96 \cdot 1.82,\ 7.37 + 1.96 \cdot 1.82]$.
IV. Variance estimation and heteroscedasticity
• However, in practice we use $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$ to replace $\sigma_{\hat{\beta}_0}$ and $\sigma_{\hat{\beta}_1}$.
Under homoscedasticity, $\sigma_{\hat{\beta}_1}^2 = \frac{\sigma_u^2}{n \sigma_X^2}$. A corresponding consistent estimator is
$$SE(\hat{\beta}_1) = \sqrt{\frac{1}{n} \cdot \frac{\frac{1}{n}\sum_i \hat{u}_i^2}{\frac{1}{n}\sum_i (X_i - \bar{X})^2}}.$$
• Much simpler than the general case without assuming homoscedasticity.
• Incorrect when homoscedasticity fails.
• When homoscedasticity fails, this simpler formula of $SE(\hat{\beta}_1)$ is still a consistent estimator of $\frac{\sigma_u^2}{n \sigma_X^2}$, but $\frac{\sigma_u^2}{n \sigma_X^2}$ is no longer equal to $\sigma_{\hat{\beta}_1}^2$. Then a t-test based on this simpler $SE(\hat{\beta}_1)$ no longer has the desired size control.
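The homoscedasticity-only formula can be sketched as follows on simulated data (the true coefficients $\beta_0 = 1$, $\beta_1 = 2$ and the sample size are hypothetical; the errors are drawn homoscedastic so the formula applies):

```python
import math, random

random.seed(1)
n = 500
X = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]        # homoscedastic errors
Y = [1.0 + 2.0 * x + e for x, e in zip(X, u)]     # hypothetical beta0=1, beta1=2

# OLS fit, then residuals
x_bar = sum(X) / n
y_bar = sum(Y) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / \
     sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar
resid = [y - b0 - b1 * x for x, y in zip(X, Y)]

# SE(b1) = sqrt( (1/n) * mean(u_hat^2) / mean((X - X_bar)^2) )
s2_u = sum(r ** 2 for r in resid) / n
s2_x = sum((x - x_bar) ** 2 for x in X) / n
se_homo = math.sqrt(s2_u / (n * s2_x))
print(b1, se_homo)
```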
When $E[u_i^2 \mid X_i] \neq E[u_i^2]$ with positive probability, we say the error is heteroscedastic.
• Homo- and heteroscedasticity concern whether the conditional variance of $u_i$ depends on $X_i$.
• Our three assumptions for OLS to be unbiased, consistent, and asymptotically normal do not involve them.
• The only thing that matters is whether you can use the simpler formula for the SE.
• The simpler formula can only be used under homoscedasticity.
• The more complicated one can be used in both scenarios because we derived it without assuming either homo- or heteroscedasticity.
• For this reason, the more complicated formula is called the heteroscedasticity-robust standard error.
In the past, people tested for homoscedasticity first, and if homoscedasticity could not be rejected, they used the simpler formula for the SE.
This is NOT necessary, because the homoscedasticity tests are known to have many problems (low power, etc.).
Why don’t we just use a universal formula that applies to both cases and stay agnostic about the conditional variance? It doesn’t affect the unbiasedness and consistency of OLS anyway.
Just using the heteroscedasticity-robust SE (the more complicated version) without worrying about heteroscedasticity is today’s standard.
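A sketch comparing the two formulas on data where the error variance grows with $|X|$ (an HC0-type robust estimator; all numbers are hypothetical):

```python
import math, random

random.seed(2)
n = 2000
X = [random.gauss(0, 1) for _ in range(n)]
# Heteroscedastic errors: standard deviation grows with |X|
u = [random.gauss(0, 1) * (0.5 + abs(x)) for x in X]
Y = [1.0 + 2.0 * x + e for x, e in zip(X, u)]

x_bar = sum(X) / n
y_bar = sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sxx
b0 = y_bar - b1 * x_bar
resid = [y - b0 - b1 * x for x, y in zip(X, Y)]

# Simpler formula: valid only under homoscedasticity
se_homo = math.sqrt((sum(r ** 2 for r in resid) / n) / sxx)

# Heteroscedasticity-robust formula (HC0): valid in both scenarios
se_robust = math.sqrt(sum(((x - x_bar) * r) ** 2
                          for x, r in zip(X, resid)) / sxx ** 2)

print(se_homo, se_robust)  # the two disagree here because the errors are heteroscedastic
```

On this design the robust SE is noticeably larger; using the simpler formula would understate the uncertainty and distort the size of the t-test.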