Multiple Regression Inference
Inference
In this section, we will learn how to use the OLS estimates to test various
hypotheses about the coefficients. This is called inference. Inference is crucial
for economic analysis as it allows us to test economic theories and evaluate the
impact of policies.
Throughout this chapter, we will maintain the classical linear regression
assumptions 6-10. In addition, we make the following assumption:
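$$ y \mid X \sim N(X\beta, \sigma^2 I_n) $$
or
$$ y_i \mid X \sim N(x_i\beta, \sigma^2) $$
and
$$ \hat\beta_j \mid X \sim N\left(\beta_j, \frac{\sigma^2}{SST_j(1-R_j^2)}\right). $$
We can standardize the distribution of $\hat\beta_j$:
$$ \frac{\hat\beta_j - \beta_j}{sd(\hat\beta_j)} \sim N(0,1) \tag{50} $$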
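where $sd(\hat\beta_j) = \sqrt{Var(\hat\beta_j)}$ is the standard deviation of $\hat\beta_j$. However, in practice
$\sigma^2$ is unknown, so the standard deviation is replaced by its estimate, the standard error:
$$ se(\hat\beta_j) = \sqrt{\widehat{Var}(\hat\beta_j)}. \tag{51} $$
The resulting statistic is
$$ \frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \tag{52} $$
and, under the assumptions above, it has a t-distribution with $n-k-1$ degrees of freedom:
$$ \frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim t_{n-k-1} \tag{53} $$
The t-distribution with $p$ degrees of freedom is defined as
$$ t_p = \frac{Y}{\sqrt{(X_1^2 + \dots + X_p^2)/p}} $$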
where $Y \sim N(0,1)$ and $X_i \sim N(0,1)$, $i = 1, \dots, p$, all independent of each other.
As you know, the distribution of the sum of squares of $p$ independent $N(0,1)$
variables is called the $\chi^2_p$ distribution with $p$ degrees of freedom. Hence, the $t_p$
distribution is the ratio of a standard normal variable $N(0,1)$ and the square root
of a $\chi^2_p$ variable divided by its degrees of freedom, which is independent of the numerator:
$$ t_p \sim \frac{N(0,1)}{\sqrt{\chi^2_p/p}} $$
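The definition of $t_p$ can be checked by simulation. The sketch below is only an illustration: the function names `draw_t` and `tail_share`, the number of draws, and the seed are assumptions made for this example. It builds $t_p$ draws directly from independent standard normal variables and reports how often $|t_p|$ exceeds a given cutoff.

```python
import math
import random


def draw_t(p, rng):
    """One draw of t_p, built as Y / sqrt((X_1^2 + ... + X_p^2) / p)."""
    y = rng.gauss(0.0, 1.0)  # numerator: standard normal
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))  # chi-square with p df
    return y / math.sqrt(chi2 / p)


def tail_share(p, cutoff, draws=100_000, seed=0):
    """Share of simulated t_p draws with |t| > cutoff (illustrative settings)."""
    rng = random.Random(seed)
    hits = sum(abs(draw_t(p, rng)) > cutoff for _ in range(draws))
    return hits / draws


if __name__ == "__main__":
    # With few degrees of freedom the tails are heavier than the normal's,
    # so the cutoff 1.96 is exceeded more often than 5% of the time.
    print("p = 5,   cutoff 1.96 :", tail_share(5, 1.96))
    print("p = 5,   cutoff 2.571:", tail_share(5, 2.571))
    print("p = 200, cutoff 1.96 :", tail_share(200, 1.96))
```

For small $p$ the share beyond 1.96 should be noticeably above 5%, while for large $p$ it should be close to 5%, reflecting that $t_p$ approaches $N(0,1)$ as $p$ grows.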
Thus, the intuition for the last theorem is as follows: the random variable in
(52) can be re-written as:
$$ T = \frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} = \frac{(\hat\beta_j - \beta_j)/sd(\hat\beta_j)}{se(\hat\beta_j)/sd(\hat\beta_j)}. $$
The numerator has the standard normal distribution, while the denominator,
$se(\hat\beta_j)/sd(\hat\beta_j) = \hat\sigma/\sigma$, is the square root of a sum of independent squared
$N(0,1)$ variables divided by its degrees of freedom. The number of degrees of
freedom is not $n$ but $n-k-1$, to account for the fact that we have
used $k+1$ degrees of freedom (or data points) to estimate $k+1$ coefficients.
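The role of the degrees of freedom can be illustrated with a small Monte Carlo experiment. The sketch below makes several assumptions of its own: a simple regression ($k = 1$) with simulated data, normal errors with $\sigma = 1$, a tiny sample of $n = 5$, and hypothetical helper names `t_stat_slope` and `rejection_rate`. It computes $T = (\hat\beta_1 - \beta_1)/se(\hat\beta_1)$ many times and counts how often $|T|$ exceeds a cutoff.

```python
import math
import random


def t_stat_slope(n, beta0, beta1, rng):
    """Simulate y = beta0 + beta1*x + u, u ~ N(0,1); return (b1 - beta1) / se(b1)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    xbar = sum(x) / n
    ybar = sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    # OLS slope and intercept for the simple regression
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)  # k = 1, so n - k - 1 = n - 2
    se_b1 = math.sqrt(sigma2_hat / sst_x)  # estimated sd of b1
    return (b1 - beta1) / se_b1


def rejection_rate(cutoff, n=5, reps=20_000, seed=1):
    """Share of simulated samples in which |T| > cutoff (illustrative settings)."""
    rng = random.Random(seed)
    hits = sum(abs(t_stat_slope(n, 1.0, 0.5, rng)) > cutoff for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    # 3.182 is the tabulated two-sided 5% critical value of t with 3 df.
    print("cutoff 3.182 (t, 3 df):", rejection_rate(3.182))
    print("cutoff 1.96 (normal)  :", rejection_rate(1.96))
```

With $n = 5$ the statistic has $n - k - 1 = 3$ degrees of freedom, so the share beyond the $t_3$ cutoff should be near 5%, while the normal cutoff 1.96 is exceeded considerably more often.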
$$ \begin{aligned} H_0 &: \beta_j = 0 \\ H_1 &: \beta_j \neq 0 \end{aligned} \tag{54} $$
If the hypothesis is valid then $x_j$ has no effect (no significance) on the expected
value of $y$ when all other factors are accounted for. $H_0$ is called the null
hypothesis or the null. Don't write $H_0: \hat\beta_j = 0$. It is a mistake since we are making
inferences about the true parameter, not its random estimate. In addition, we
need to formulate an alternative hypothesis $H_1$, the conclusion that we would
make if $H_0$ is not true. The alternative in (54) is called a two-sided alternative.
Though the true parameter $\beta_j$ is not known, we use its estimate $\hat\beta_j$ to check if
the hypothesis is true.
Suppose we want to test the hypothesis that experience has no effect on wage
once education is accounted for:
$$ \begin{aligned} H_0 &: \beta_2 = 0 \\ H_1 &: \beta_2 \neq 0 \end{aligned} $$
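Suppose for a moment that the true value of $\beta_2 = 0$. Then, $\hat\beta_2$ will be close
to 0 with a high probability, and so will the statistic:
$$ T = \frac{\hat\beta_2 - \beta_2}{se(\hat\beta_2)} \overset{H_0}{=} \frac{\hat\beta_2}{se(\hat\beta_2)} \tag{55} $$
If $H_0$ is not true, i.e., $\beta_2 \neq 0$, $\hat\beta_2$ will be large in absolute value with a high
probability, and $T$ will fall into the region on the x-axis corresponding to the shaded area, see graph.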
In this case, we would reject $H_0$. But how "large" should $\hat\beta_2$ be to reject
the null hypothesis? In other words, we need to specify the rejection region for
the hypothesis. In our example, this amounts to choosing the cut-off or critical
value of the t-distribution, $c$, such that if $|T| > c$ we reject $H_0$ in favor of $H_1$,
and if $|T| \le c$, we don't reject $H_0$. [Note on terminology: It is better to say not
reject instead of accept $H_0$.] The probability that $T$ falls in the rejection region,
$$ \alpha = \Pr(|T| > c \mid H_0), \tag{56} $$
is called the significance level of the test. Note that the conditioning on $H_0$
is added to stress that the distribution of $T$ is derived under the assumption
that $H_0$ is valid. Recall from the stat course that $\alpha$ is also called the probability of Type I error: the
probability of rejecting $H_0$ when it is in fact true. Note that (56) is equivalent
to
$$ \Pr(T > c) = \frac{\alpha}{2} \ \text{ and } \ \Pr(T < -c) = \frac{\alpha}{2}, \quad \text{or} \quad \Pr(|T| \le c) = 1 - \alpha. \tag{57} $$
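In practice the critical value $c$ is read from a table or computed numerically. As a rough sketch using only the Python standard library, the code below integrates the t density with Simpson's rule and finds $c$ by bisection so that $\Pr(|T| \le c) = 1 - \alpha$. The function names `t_pdf`, `t_cdf` and `t_critical`, and the number of integration steps, are choices made for this illustration; a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Pr(T <= x): Simpson's rule on [0, |x|] plus the mass 0.5 below zero."""
    if x < 0:
        return 1.0 - t_cdf(-x, df, steps)  # symmetry of the t density
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-sided critical value c with Pr(|T| > c) = alpha, for 0 < alpha < 1."""
    target = 1 - alpha / 2  # Pr(T <= c) implied by equal tails of alpha/2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains c
        lo, hi = hi, 2 * hi
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for df in (5, 30, 137):
        print(df, round(t_critical(0.05, df), 3))
```

The computed values shrink toward the standard normal value 1.96 as the degrees of freedom grow, which is why 1.96 is often used as an approximation in large samples.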
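Consider the following estimated equation for college GPA (standard errors in parentheses):

$$ \widehat{colGPA} = \underset{(0.33)}{1.39} + \underset{(0.094)}{0.412}\, hsGPA + \underset{(0.011)}{0.015}\, ACT - \underset{(0.026)}{0.083}\, skipped $$
$$ n = 141, \quad R^2 = 0.234 $$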
where $hsGPA$ is the high school GPA, $ACT$ is the score on the ACT exam, and
$skipped$ is the number of skipped classes. The standard errors of the coefficient
estimates are reported in parentheses. We want to test if high school GPA has
any effect on college GPA.
1. The first step is to formulate the hypothesis in mathematical terms:
$$ \begin{aligned} H_0 &: \beta_1 = 0 \\ H_1 &: \beta_1 \neq 0 \end{aligned} $$
So, the null $H_0$ is that $hsGPA$ has no effect on $colGPA$. The alternative is
two-sided, i.e., we allow both positive and negative correlation between $hsGPA$
and $colGPA$.
2. The second step is to construct a test statistic and establish its
distribution. In general, a statistic is a function of the sample (data) that does not
involve any unknown parameters. A test statistic should meet two criteria: (i) it
should allow us to check the validity of $H_0$, and (ii) it should have a known (tabulated)
distribution. In this example, a convenient test statistic is the t-statistic:
$$ T = \frac{\hat\beta_1 - \beta_1}{se(\hat\beta_1)} \overset{H_0}{=} \frac{\hat\beta_1}{se(\hat\beta_1)} \sim t_{137} $$
because, first, it mimics $H_0$ and allows us to check its validity, and second, it has a known
t-distribution for which tables exist. In our example, the number of degrees of freedom of
the $T$ statistic is $n - k - 1 = 141 - 4 = 137$.
3. The third step is to choose a rejection region or rejection rule. For
example, let $\alpha = 0.05$ (or 5%) so that the rejection rule under the two-sided
alternative is
$$ \text{reject } H_0 \text{ if } |T| > c $$
The 5% critical value of the t-distribution with 137 degrees of freedom for
the two-sided test is approximately 1.98. With this many degrees of freedom it is
very close to the standard normal value, which we use here: $c = 1.96$.
4. The final step is to compute the value of the test statistic and check
the rejection rule.
$$ |T| = \left| \frac{\hat\beta_1}{se(\hat\beta_1)} \right| = \frac{0.412}{0.094} = 4.38 > c = 1.96 $$
Thus, we reject $H_0$, and conclude that $\hat\beta_1$ is statistically significant at the 5%
(and hence also at the 10%) significance level, and that high school GPA has a
positive, statistically significant effect on college GPA. Note that the two-sided test of
statistical significance is included in all regression packages.
Let's now test the significance of the coefficient on the ACT score at the 5% significance
level. The T test is now:
$$ T = \frac{0.015}{0.011} = 1.36 < 1.96 $$
Hence, we cannot reject the null, and the coefficient on ACT, $\hat\beta_2$, is not significant
at the 5% level. This means that even though the estimate $\hat\beta_2 = 0.015$ is positive, the
data are consistent with a true $\beta_2 = 0$: a positive estimate of this size could easily
have arisen by chance (because of randomness of the sample).
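The four steps of the test are mechanical enough to write down as a short routine. The sketch below applies them to the coefficient estimates and standard errors reported in the college GPA equation, with the critical value $c = 1.96$ used in the text. The helper name `t_test` and the dictionary layout are assumptions made for this illustration.

```python
def t_test(estimate, std_error, critical_value, null_value=0.0):
    """Return the t-statistic and whether H0: beta = null_value is rejected
    against the two-sided alternative (reject if |T| > critical_value)."""
    t = (estimate - null_value) / std_error
    return t, abs(t) > critical_value


# (estimate, standard error) pairs taken from the estimated equation above
coefficients = {
    "hsGPA": (0.412, 0.094),
    "ACT": (0.015, 0.011),
    "skipped": (-0.083, 0.026),
}
C = 1.96  # 5% two-sided critical value used in the text

for name, (b, se) in coefficients.items():
    t, reject = t_test(b, se, C)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name:8s} T = {t:6.2f}  ->  {decision}")
```

The same function can be used with a different `null_value` or a different critical value, for example a cutoff for another significance level.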