Lecture 11 Slides - Testing Restrictions
In which you …
Tests of Restrictions
It is often useful to test whether it is worth adding (or removing) a set of variables, i.e. to test a subset of the coefficients, rather than testing that all of them (except the constant) are zero as with the original F test
Y = b0 + b1X1 + b2X2 + b3X3 + u
Test: b1 = 0 & b2 = 0 & b3 = 0
The test works by estimating the model with and without the variables in question and looking to see whether the RSS is significantly different in the two specifications
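It can be shown that the test becomes
$$F = \frac{(R^2_{unrestricted} - R^2_{restricted})/J}{(1 - R^2_{unrestricted})/(N - K_{unrestricted})} \sim F(J,\ N - K_{unrestricted})$$
where
J = number of variables to be tested
N = number of observations, K_unrestricted = number of estimated coefficients (including the constant) in the unrestricted model
restricted = values from the model with the tested variables set to zero (i.e. excluded from the regression specification), e.g. Y = b0 + b1X1 + u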
Intuition: under the null that the extra variables have no explanatory power, we would not expect the RSS (or the R2) from the two models to differ much
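The R2 form of the test can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are made up for this sketch. As a check it uses the special case of the original F test (all slopes zero, so the restricted R2 is 0), with the sums of squares taken from the production function regression reported later in these slides (N = 27, K = 3).

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n_obs, k_unrestricted, n_restrictions):
    """F statistic for J zero restrictions, computed from the two R-squared values."""
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1.0 - r2_unrestricted) / (n_obs - k_unrestricted)
    return numerator / denominator


# Overall F test as a special case: dropping both slopes leaves only the
# constant, so the restricted R-squared is zero.
model_ss, total_ss = 14.2115637, 15.0631975   # from the logo/logl/logk regression
r2_full = model_ss / total_ss
overall_f = f_stat_from_r2(r2_full, 0.0, n_obs=27, k_unrestricted=3, n_restrictions=2)
print(round(overall_f, 2))   # close to the reported F(2, 24) = 200.25
```

The statistic would then be compared with the critical value of F(J, N - K_unrestricted) from tables.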
To test whether the coefficient on the union dummy variable is significantly different from zero, look at the estimated t value, or run the equivalent F test of the single restriction
. test union=0
( 1) union = 0
F( 1, 6021) = 37.82
Prob > F = 0.0000
(which is just the square of the t value:
$$F = \frac{(\hat{\beta}_i - \beta_i^0)^2}{Var(\hat{\beta}_i)}$$
)
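As a quick worked check (not part of the original slides), the relation between the two statistics can be applied to the output above. With a single restriction the F statistic is the squared t value, so the reported F(1, 6021) = 37.82 implies

$$|t| = \sqrt{F} = \sqrt{37.82} \approx 6.15$$

for the union coefficient, which should agree (up to rounding) with the t value in the regression output.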
To test whether the variables union and public are (jointly) insignificant, i.e. they do not contribute to explaining the dependent variable, omit union and public from the model and compare the RSS
(Intuitively: if the RSS is significantly different between the 2 models, this suggests the omitted variables do contribute something to explaining the behaviour of the dependent variable)
Computing the F statistic from the two regressions gives F = 20.2
( 1) union = 0
( 2) public = 0
F( 2, 6021) = 20.21
Prob > F = 0.0000
So reject the null that the union and public sector variables jointly have no
explanatory power in the model
Note that the t value on the public sector dummy indicates that the effect
of this variable is not statistically different from zero, yet the combined F
test has rejected the null that both variables have no explanatory power.
The joint null is rejected as soon as at least one of the coefficients (here union) is non-zero.
(Technically, the F test for joint restrictions is a “less powerful” test of single
restrictions than the t test.)
This test is essentially a test of (linear) restrictions. In the above case the restriction was that the coefficients on a subset of variables were zero
Y = b0 + b1X1 + b2X2 + b3X3 + u
Test: b2 = 0 & b3 = 0
Other important uses of this test include restrictions on combinations of coefficients. For example, taking logs, the production function
$$y = A L^{\alpha} K^{\beta}$$
becomes
$$\ln y = \ln A + \alpha \ln L + \beta \ln K \qquad (1)$$
and we can test the null H0: α+β=1 by imposing the restriction that α+β=1 in (1)
against an unrestricted version that does not impose the constraint.
Example: using the data set prodfn.dta, which contains information on the output,
labour input and capital stock of 27 firms
The unrestricted regression (i.e. not constraining the coefficients to sum to one) is
. reg logo logl logk
      Source |       SS       df       MS              Number of obs =      27
-------------+------------------------------           F(  2,    24) =  200.25
       Model |  14.2115637     2  7.10578187           Prob > F      =  0.0000
    Residual |   .85163374    24  .035484739           R-squared     =  0.9435
-------------+------------------------------           Adj R-squared =  0.9388
       Total |  15.0631975    26   .57935375           Root MSE      =  .18837
------------------------------------------------------------------------------
        logo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        logl |   .6029994    .125954     4.79   0.000     .3430432    .8629556
        logk |   .3757102    .085346     4.40   0.000     .1995648    .5518556
       _cons |   1.170644    .326782     3.58   0.002     .4961988    1.845089
The following produces a restricted OLS regression with the coefficients on logl and logk
constrained to add to one (constraint 1 must already have been defined as logl + logk = 1)
. cnsreg logo logl logk, constraint(1)
F( 1, 24) = 0.12
Prob > F = 0.7366
So estimated F < Fcritical at the 5% level
So do not reject (accept) the null H0: α+β=1
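A short worked version of this test may help; it is a sketch of the standard substitution argument rather than something taken from the slides. Starting from the log production function $\ln y = \ln A + \alpha \ln L + \beta \ln K + u$ and imposing $\beta = 1 - \alpha$ gives the restricted model

$$\ln y - \ln K = \ln A + \alpha(\ln L - \ln K) + u$$

There is J = 1 restriction and the unrestricted model has K = 3 coefficients with N = 27, so

$$F = \frac{(RSS_{restricted} - RSS_{unrestricted})/1}{RSS_{unrestricted}/(27 - 3)} \sim F(1,\ 24)$$

The 5% critical value of F(1, 24) from tables is 4.26, and the estimated value 0.12 is well below it, which is why the null is not rejected.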
Question: how do we get the unrestricted RSS if the regression is split across
two sub-samples?
It can be shown that the unrestricted RSS in this case equals the sum of the
RSS from the two sub-regressions (2) & (3)
So that
$$F = \frac{(RSS_{restricted} - RSS_{unrestricted})/J}{RSS_{unrestricted}/(N - K_{unrestricted})}$$
becomes
$$F = \frac{(RSS_{pooled} - (RSS_{(2)} + RSS_{(3)}))/J}{(RSS_{(2)} + RSS_{(3)})/(N - 2J)} \sim F(J,\ N - 2J)$$
where J is again the number of variables restricted (in this case the
entire set of rhs variables including the constant, Y = β0 + β1X1 + β2X2 + u)
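Reject the null that the coefficients are the same in both sub-samples if the estimated F exceeds the critical value
(because then the RSS from the two unrestricted regressions is
significantly different from the RSS in the pooled (restricted) sample)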
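The Chow calculation described above can be sketched as a small Python function. The function and argument names are invented for this sketch, and the RSS values in the usage lines are hypothetical numbers chosen only to show the arithmetic; they are not results from any data set in these slides.

```python
def chow_f_stat(rss_pooled, rss_sub1, rss_sub2, n_obs, k_per_regression):
    """Chow test F statistic and its degrees of freedom.

    rss_pooled       : RSS from the single (restricted) regression on all data
    rss_sub1/2       : RSS from the two sub-sample regressions
    k_per_regression : coefficients per regression, including the constant
    """
    # Unrestricted RSS is the sum of the RSS from the two sub-regressions.
    rss_unrestricted = rss_sub1 + rss_sub2
    # The unrestricted model estimates 2*k coefficients in total.
    df_denominator = n_obs - 2 * k_per_regression
    numerator = (rss_pooled - rss_unrestricted) / k_per_regression
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator, (k_per_regression, df_denominator)


# Hypothetical RSS values, for illustration only.
f_stat, dof = chow_f_stat(100.0, 30.0, 50.0, n_obs=45, k_per_regression=2)
print(f_stat, dof)
```

The returned degrees of freedom indicate which F table entry to compare the statistic against.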
Example 1: Chow Test for Structural Break in Time Series Data
[Figure: plot of cons (vertical axis, roughly 200000 to 500000) against year (horizontal axis, 55 to 100), with each point labelled by its year, 55 to 99]
The graph suggests that the relationship between consumption and income changes over the
sample period
(the slope is steeper in the 2nd period).
Try splitting the sample before and after 1990.
It looks as though the coefficients are different across periods, but the standard error for the second-period
estimate is much larger. (Why?)
Compare with the regression pooled over both periods (restricting the coefficients to be the same
in both periods).
. reg cons income
  Source |       SS       df       MS                  Number of obs =      45
---------+------------------------------               F(  1,    43) = 5969.79
   Model |  4.7072e+11     1  4.7072e+11               Prob > F      =  0.0000
Residual |  3.3905e+09    43  78849774.6               R-squared     =  0.9928
---------+------------------------------               Adj R-squared =  0.9927
   Total |  4.7411e+11    44  1.0775e+10               Root MSE      =  8879.7
------------------------------------------------------------------------------
    cons |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   .9172948   .0118722     77.264   0.000       .8933523    .9412372
   _cons |   13496.16   4025.456      3.353   0.002        5378.05    21614.26
Important: with this form of the test there are twice as many coefficients in the
unrestricted regressions (income and the constant for the period 1955-89, and a
different estimate for income and the constant for the period 1990-99),
so k = 2*2 = 4 and the denominator degrees of freedom are N - k = 45 - 4 = 41.
From tables, F critical at the 5% level is 3.00 (the large-sample value for 2 restrictions; with 2 and 41 degrees of freedom it is about 3.2). Therefore reject the null that the coefficients are
the same in both time periods. Hence the mpc is not constant over time.
Example 2: Chow Test of Structural Break – Cross Section Data
Suppose we wish to test whether the estimated OLS coefficients are the same for men
and women in ps4data.dta
This is like asking whether the estimated intercepts and slopes are different
across the 2 sub-samples
The restricted regression is obtained by pooling all observations on men & women
and running a single OLS regression
. reg lhwage age edage union public
Women
. reg lhwage age edage union public if female==1
The resulting Chow statistic is F = 175.4
Note that the denominator degrees of freedom are N - 2*5,
because in the unrestricted regression there are 2*5 estimated parameters (5 for
men and 5 for women)
Reject the null that the coefficients are the same for men and women