Section and Solution
Andrew Dustan
Section Handout 13
where y is the dummy variable. This is called the linear probability model.
Solution: Use the logit or probit model. These models are specifically made for binary dependent variables and
always result in $0 < \hat{P}(y = 1 \mid x) < 1$. Let's leave the technicalities aside and look at a graph of a case where the LPM goes
wrong and the logit works:
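[Figure: two panels, each plotting $P(y = 1 \mid x)$ (vertical axis, from -0.5 to 1) against $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ (horizontal axis). Left panel: LPM, a straight line. Right panel: logit, a curve that stays between 0 and 1.]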
This is the main feature of a logit/probit that distinguishes it from the LPM: the predicted probability of $y = 1$ is
never below 0 or above 1, and the shape is always like the one on the right rather than a straight line.
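The contrast can be checked numerically. The sketch below is a minimal illustration rather than part of the handout: it treats the index $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ as a single number z, takes the LPM prediction to be z itself, and passes the same z through the logistic function $1/(1 + e^{-z})$ that the logit uses. The index values are made up for illustration.

```python
import math

def logistic(z):
    """Logit link: maps any real index z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical index values (not taken from the handout's data).
index_values = [-4.0, -0.5, 0.0, 0.5, 1.5, 4.0]

# The LPM predicts P(y = 1 | x) = z directly; the logit predicts logistic(z).
lpm_predictions = list(index_values)
logit_predictions = [logistic(z) for z in index_values]

for z, p_lpm, p_logit in zip(index_values, lpm_predictions, logit_predictions):
    flag = "" if 0.0 <= p_lpm <= 1.0 else "  <- not a valid probability"
    print(f"z = {z:5.1f}   LPM: {p_lpm:5.2f}   logit: {p_logit:.3f}{flag}")
```

Whenever the index falls below 0 or above 1, the LPM value is flagged, while the logit value stays strictly inside the unit interval.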
2. Marginal Effects for Logit (or Probit)
We talked about how to estimate the logit using "maximum likelihood" in lecture, which is fairly complicated—
much more complicated than OLS. Moreover, the results from the estimation are not easy to interpret.
What we want are results that look like those from OLS or the LPM: the marginal effect of changing x on
$P(y = 1 \mid x)$, the probability of getting $y = 1$.
"Problem": the marginal effect is different depending on what the x values are. Look again at the graph:
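[Figure: logit curve, plotting $P(y = 1 \mid x)$ (vertical axis, from 0 to 1) against $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ (horizontal axis).]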
How much does $P(y = 1 \mid x)$ change as we increase $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ (i.e. how big are marginal effects) when:
We compromise by finding the marginal effect for the "average" person (or other unit of observation) in the data, i.e. the marginal
effect when $x_1 = \bar{x}_1, \ldots, x_k = \bar{x}_k$. This is what the Stata command "mfx" does.
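For a logit, the marginal effect of $x_j$ is $\beta_j \Lambda(z)\,[1 - \Lambda(z)]$, where z is the index and $\Lambda$ is the logistic function. This is a standard result, not something derived in this handout. The sketch below evaluates it at a low, a middle, and a high index value to show why no single number describes the effect everywhere. The coefficient `beta_j = 1.0` and the three index values are hypothetical.

```python
import math

def logistic(z):
    """Logit link: maps the index z to P(y = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_marginal_effect(beta_j, z):
    """dP(y = 1 | x)/dx_j for a logit, evaluated at index value z."""
    p = logistic(z)
    return beta_j * p * (1.0 - p)

beta_j = 1.0  # hypothetical coefficient, for illustration only

# Evaluate the same coefficient's marginal effect at three points on the curve.
effects = {z: logit_marginal_effect(beta_j, z) for z in (-3.0, 0.0, 3.0)}
for z, effect in effects.items():
    print(f"index z = {z:4.1f}   marginal effect = {effect:.4f}")
```

The effect is largest where the curve is steepest (z = 0, predicted probability 0.5) and shrinks toward zero in both tails. Evaluating at the sample means, as "mfx" does, picks one point on this curve.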
Example: Probability of a male adult being arrested, as a function of income (in $100) and minority status:
. logit arrest minority inc86
------------------------------------------------------------------------------
      arrest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    minority |   .5853512   .0886866     6.60   0.000     .4115286    .7591738
       inc86 |  -.0074475   .0008404    -8.86   0.000    -.0090947   -.0058003
       _cons |  -.8499352    .069239   -12.28   0.000    -.9856411   -.7142294
------------------------------------------------------------------------------
The signs of these coefficients tell us something: minorities are more likely to be arrested, and higher income
lowers the probability of being arrested. How big are these effects? Run "mfx" to find out:
. mfx
Practice:
1. For males with the average level of income in this sample ($5497 in 1986 dollars), how much more likely are
minorities to be arrested? (Notice that for dummy variables, Stata calculates the change from going from 0 to 1.)
11.7 percentage points
2. For males with the average level of income in this sample, how does a $1000 increase in income affect the
predicted probability of being arrested?
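The arithmetic behind these practice questions can be sketched from the reported coefficients alone. The code below is an illustration under stated assumptions, not a reproduction of the "mfx" output: it plugs the coefficients from the "logit arrest minority inc86" table into the logistic function at the average income (54.97, since inc86 is measured in $100). For question 2 it holds minority at 0 and at 1, because the sample mean of minority, which "mfx" would use, is not reported here.

```python
import math

def logistic(z):
    """Logit link: maps the index z to P(y = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients copied from the "logit arrest minority inc86" output above.
B_CONS, B_MINORITY, B_INC86 = -0.8499352, 0.5853512, -0.0074475

MEAN_INC86 = 54.97  # $5497 expressed in units of $100, as inc86 is measured

def p_arrest(minority, inc86):
    """Predicted probability of arrest for the given regressor values."""
    return logistic(B_CONS + B_MINORITY * minority + B_INC86 * inc86)

# Question 1: switch minority from 0 to 1 at average income.
minority_effect = p_arrest(1, MEAN_INC86) - p_arrest(0, MEAN_INC86)
print(f"minority 0 -> 1 at mean income: {minority_effect:+.4f}")

# Question 2 (illustration only): raise income by $1000, i.e. inc86 by 10,
# holding minority fixed at 0 or at 1 rather than at its unreported mean.
income_effects = {
    m: p_arrest(m, MEAN_INC86 + 10) - p_arrest(m, MEAN_INC86) for m in (0, 1)
}
for m, change in income_effects.items():
    print(f"inc86 +10 with minority = {m}: {change:+.4f}")
```

The first number rounds to 0.117, in line with the 11.7 answer to question 1. The two income changes are both negative and give only a rough sense of the answer to question 2, since "mfx" differentiates at the sample mean of minority.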
To perform the likelihood ratio test, estimate the restricted (fewer variables) and unrestricted (more variables)
models and then construct the test statistic:
$$LR = 2\left(\log \mathcal{L}_{ur} - \log \mathcal{L}_{r}\right)$$
where $\mathcal{L}_{ur}$ is the likelihood from the unrestricted model and $\mathcal{L}_{r}$ is from the restricted model. Under the null hypothesis, the test statistic is
approximately distributed $\chi^2_q$, where q is the number of restrictions, just like in the F-test. If LR is higher than the critical
value, we reject the null hypothesis. The procedure mirrors the F-test but uses the $\chi^2$ table instead of the F table.
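The test statistic itself is a one-line calculation. The sketch below assumes hypothetical log-likelihood values, since the Stata output shown in this handout does not include them. The small lookup table holds the standard 5% critical values of the $\chi^2$ distribution for 1, 2, and 3 degrees of freedom.

```python
def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (log L_ur - log L_r)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# 5% critical values of the chi-square distribution, keyed by q.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}

def reject_null(lr, q):
    """Reject H0 at the 5% level when LR exceeds the chi-square(q) critical value."""
    return lr > CHI2_CRITICAL_5PCT[q]

# Hypothetical log-likelihoods, chosen only to illustrate the arithmetic.
loglik_ur = -1500.0  # unrestricted model (more variables)
loglik_r = -1504.2   # restricted model (fewer variables)

lr = lr_statistic(loglik_ur, loglik_r)
print(f"LR = {lr:.2f}, reject H0 with q = 2: {reject_null(lr, 2)}")
```

Because the unrestricted model can never fit worse than the restricted one, LR is never negative. A large value means the extra variables improve the fit by more than chance alone would explain.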
Practice:
We can add two variables to the arrest model: total time spent in prison in the past, and average sentence length
from previous sentences (if any):
. logit arrest minority inc86 tottime avgsen
------------------------------------------------------------------------------
      arrest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    minority |   .5956365   .0891583     6.68   0.000     .4208894    .7703836
       inc86 |  -.0075452   .0008458    -8.92   0.000    -.0092029   -.0058875
     tottime |   -.035892     .02659    -1.35   0.177    -.0880074    .0162233
      avgsen |   .0332144   .0334359     0.99   0.321    -.0323187    .0987474
       _cons |  -.8407443   .0696835   -12.07   0.000    -.9773215   -.7041672
------------------------------------------------------------------------------
Do these new variables help to predict arrest, after controlling for minority status and income?
Step:
1: Write hypotheses $H_0$: $\beta_{\text{tottime}} = \beta_{\text{avgsen}} = 0$
$H_1$: not $H_0$
4: Reject/fail to reject $LR = 2.66 < 5.99$ (the 5% critical value of $\chi^2_2$), so fail to reject the null hypothesis
5: Conclude We have no evidence that total time spent in prison and average sentence length from
previous sentences help to predict arrest, after controlling for minority
status and income.
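The comparison in step 4 can be checked without a table, because with two restrictions the $\chi^2$ distribution has a closed-form tail probability, $P(X > x) = e^{-x/2}$. The sketch below uses that special case. It applies only to 2 degrees of freedom, and the function names are illustrative.

```python
import math

def chi2_sf_2df(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

def chi2_critical_2df(alpha):
    """Value c with P(X > c) = alpha, for 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

lr = 2.66  # LR statistic used in step 4

critical_5pct = chi2_critical_2df(0.05)
p_value = chi2_sf_2df(lr)

print(f"5% critical value: {critical_5pct:.2f}")
print(f"p-value for LR = {lr}: {p_value:.3f}")
print("reject H0" if lr > critical_5pct else "fail to reject H0")
```

The critical value reproduces the 5.99 from the table, and the p-value is well above 0.05, which matches the decision to fail to reject.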