Lecture 7 Probit
ECMT-7302
Econometrics II
MA Eco. 2022, Fall 2023
Instructor: Sunaina Dhingra
Lectures: Wednesday (11.20-12.50 pm) & Thursday (9.40-11.10 am)
Lecture Meeting Mode: In person (Classroom: T4-F99)
Office Hours: Wednesday 1-2.30 pm & by appointment (in FOB, Office No. 1B in south on 7th Floor)
Email-id: sunaina@jgu.edu.in
Lecture Material: Slides and textbooks
Credits: 4.5
• Assume that in the two-variable model Yi = β1 + β2 Xi + ui the Yi are normally
and independently distributed with mean = β1 + β2 Xi and variance = σ2.
• The joint probability density function of Y1, Y2, ... , Yn , given the preceding
mean and variance, can be written as
• But in view of the independence of the Y’s, this joint probability density
function can be written as a product of n individual density functions as
• Where
• which is the density function of a normally distributed variable with the given mean and
variance.
• Substituting Equation (2) for each Yi into Equation (1) gives
• If Y1, Y2, . . . , Yn are known or given, but β1, β2, and σ2 are not known, the function in
Equation (3) is called a likelihood function, denoted by LF(β1, β2, σ2), and written as
• The method of maximum likelihood (ML) consists of estimating the unknown parameters (β1, β2, and σ2) in such a manner
that the probability of observing the given Y’s is as high (or maximum) as possible.
• Therefore, we find the maximum of the function in Equation (4) using differential calculus.
• For differentiation it is easier to express Equation (4) in log form as follows.
(Note: ln = natural log.)
• Differentiating Equation (5) partially with respect to β1, β2, and σ2, we obtain
• After simplifying, Eqs. (9) and (10) yield
• which are precisely the normal equations of the least-squares theory obtained by OLS
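The equivalence between ML and OLS for β1 and β2 can be checked numerically. The sketch below is illustrative only: the data are simulated, and the "true" parameter values, sample size, and seed are arbitrary assumptions. It solves the two normal equations, takes RSS/n as the ML estimate of σ2 (the standard result for this model), and confirms that moving any parameter away from these values lowers the log-likelihood.

```python
import math
import random

# Simulated data; the parameter values, n, and seed are arbitrary assumptions.
random.seed(0)
n = 200
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 0.5 * xi + random.gauss(0, 2.0) for xi in x]


def log_likelihood(b1, b2, sigma2):
    """Normal log-likelihood of the model Yi = b1 + b2*Xi + ui."""
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return (-0.5 * n * math.log(sigma2)
            - 0.5 * n * math.log(2 * math.pi)
            - 0.5 * rss / sigma2)


# OLS estimates: the solution of the two normal equations.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b2_ols = sxy / sxx
b1_ols = y_bar - b2_ols * x_bar

# ML estimate of sigma^2: residual sum of squares divided by n (not n - 2).
residuals = [yi - b1_ols - b2_ols * xi for xi, yi in zip(x, y)]
sigma2_ml = sum(e ** 2 for e in residuals) / n

ll_at_ols = log_likelihood(b1_ols, b2_ols, sigma2_ml)

# Log-likelihood on a small grid of perturbed parameter values.
perturbed = [
    log_likelihood(b1_ols + d1, b2_ols + d2, sigma2_ml * s)
    for d1 in (-0.1, 0.0, 0.1)
    for d2 in (-0.05, 0.0, 0.05)
    for s in (0.9, 1.0, 1.1)
    if (d1, d2, s) != (0.0, 0.0, 1.0)
]

print("log-likelihood at OLS / ML estimates:", ll_at_ols)
print("largest log-likelihood among perturbed values:", max(perturbed))
```

Every perturbed value should come out below the value at the OLS estimates, which is what "OLS solves the ML first-order conditions" means in practice.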
• The ML estimator of σ2 is biased. The magnitude of this bias can be easily determined
as follows.
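The displayed equations for this derivation did not carry over into these notes. A reconstruction under the stated assumptions (Yi normal and independent, with mean β1 + β2 Xi and variance σ2) is sketched below. The equation numbers are inferred from the references in the text, and the tilde notation for ML estimators and the hat notation for residuals are assumptions.

```latex
\begin{align*}
f(Y_1,\dots,Y_n \mid \beta_1+\beta_2X_i,\ \sigma^2)
  &= f(Y_1 \mid \beta_1+\beta_2X_1,\ \sigma^2)\cdots f(Y_n \mid \beta_1+\beta_2X_n,\ \sigma^2) \tag{1}\\
f(Y_i) &= \frac{1}{\sigma\sqrt{2\pi}}
  \exp\left\{-\frac{1}{2}\,\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}\right\} \tag{2}\\
f(Y_1,\dots,Y_n \mid \beta_1+\beta_2X_i,\ \sigma^2)
  &= \frac{1}{\sigma^n(\sqrt{2\pi})^n}
  \exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}\right\} \tag{3}\\
LF(\beta_1,\beta_2,\sigma^2)
  &= \frac{1}{\sigma^n(\sqrt{2\pi})^n}
  \exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2}\right\} \tag{4}\\
\ln LF &= -\frac{n}{2}\ln\sigma^2-\frac{n}{2}\ln(2\pi)
  -\frac{1}{2}\sum_{i=1}^{n}\frac{(Y_i-\beta_1-\beta_2X_i)^2}{\sigma^2} \tag{5}\\
\frac{\partial \ln LF}{\partial \beta_1}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(Y_i-\beta_1-\beta_2X_i) \tag{6}\\
\frac{\partial \ln LF}{\partial \beta_2}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(Y_i-\beta_1-\beta_2X_i)X_i \tag{7}\\
\frac{\partial \ln LF}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum_{i=1}^{n}(Y_i-\beta_1-\beta_2X_i)^2 \tag{8}
\end{align*}

% Setting (6)-(8) to zero at the ML estimates:
\begin{align*}
\frac{1}{\tilde\sigma^2}\sum_{i=1}^{n}(Y_i-\tilde\beta_1-\tilde\beta_2X_i) &= 0 \tag{9}\\
\frac{1}{\tilde\sigma^2}\sum_{i=1}^{n}(Y_i-\tilde\beta_1-\tilde\beta_2X_i)X_i &= 0 \tag{10}\\
-\frac{n}{2\tilde\sigma^2}+\frac{1}{2\tilde\sigma^4}\sum_{i=1}^{n}(Y_i-\tilde\beta_1-\tilde\beta_2X_i)^2 &= 0 \tag{11}
\end{align*}

% Simplifying (9) and (10) gives the normal equations; (11) gives the ML variance estimator:
\begin{align*}
\sum Y_i &= n\tilde\beta_1+\tilde\beta_2\sum X_i \\
\sum Y_iX_i &= \tilde\beta_1\sum X_i+\tilde\beta_2\sum X_i^2 \\
\tilde\sigma^2 &= \frac{1}{n}\sum_{i=1}^{n}(Y_i-\tilde\beta_1-\tilde\beta_2X_i)^2
  = \frac{1}{n}\sum_{i=1}^{n}\hat u_i^2
\end{align*}

% Bias of the ML variance estimator, using E(\sum \hat u_i^2) = (n-2)\sigma^2:
\begin{align*}
E(\tilde\sigma^2) = \frac{1}{n}E\Big(\sum_{i=1}^{n}\hat u_i^2\Big)
  = \frac{n-2}{n}\,\sigma^2 = \sigma^2-\frac{2}{n}\,\sigma^2
\end{align*}
```

The last line shows that the ML estimator understates σ2 by (2/n)σ2 on average, a bias that vanishes as n grows.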
Limited Dependent Variable Models
• Logit and Probit models for binary response
• Choices for the link function
• Interpretation of coefficients in Logit and Probit models
• Maximum likelihood estimation of Logit and Probit models
Probit and Logit Regression
• Instead, we want:
STATA Example: HMDA data
. probit deny p_irat, r;
Iteration 0:   log likelihood =  -872.0853    (We’ll discuss this later)
Iteration 1:   log likelihood =  -835.6633
Iteration 2:   log likelihood = -831.80534
Iteration 3:   log likelihood = -831.79234
Probit estimates                                  Number of obs   =       2380
                                                  Wald chi2(1)    =      40.68
                                                  Prob > chi2     =     0.0000
Log likelihood = -831.79234                       Pseudo R2       =     0.0462
------------------------------------------------------------------------------
             |               Robust
        deny |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      p_irat |   2.967908   .4653114     6.38   0.000     2.055914    3.879901
       _cons |  -2.194159   .1649721   -13.30   0.000    -2.517499    -1.87082
------------------------------------------------------------------------------
Pr(deny = 1 | P/I ratio) = Φ(-2.19 + 2.97 × P/I ratio)
                              (.16)   (.47)
STATA Example: HMDA data, ctd.
Pr(deny = 1 | P/I ratio) = Φ(-2.19 + 2.97 × P/I ratio)
                              (.16)   (.47)
• Positive coefficient: Does this make sense?
• Standard errors have the usual interpretation
• Predicted probabilities:
Pr(deny = 1 | P/I ratio = .3) = Φ(-2.19 + 2.97 × .3)
= Φ(-1.30) = .097
• Effect of change in P/I ratio from .3 to .4:
Pr(deny = 1 | P/I ratio = .4) = Φ(-2.19 + 2.97 × .4)
= Φ(-1.00) = .159
• The predicted probability of denial rises from .097 to .159, an increase of
6.2 percentage points (from 9.7% to 15.9%).
• Because the probit regression function is nonlinear, the effect of
a change in X depends on the starting value of X.
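The predicted probabilities above can be reproduced without a normal table. The following is a minimal sketch using only the Python standard library; the function and variable names are illustrative, and the coefficients are the rounded slide values (-2.19 and 2.97). Because the slide rounds z to two decimals before the table lookup, the computed values can differ from it in the third decimal.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Rounded probit estimates taken from the slide.
b0, b1 = -2.19, 2.97


def pr_deny(pi_ratio):
    """Predicted probability of denial at a given P/I ratio."""
    return std_normal_cdf(b0 + b1 * pi_ratio)


p_03 = pr_deny(0.3)
p_04 = pr_deny(0.4)
p_05 = pr_deny(0.5)

# The same 0.1 increase in the P/I ratio, from two different starting values.
effect_03_to_04 = p_04 - p_03
effect_04_to_05 = p_05 - p_04

print(f"Pr(deny | P/I = .3) = {p_03:.3f}")
print(f"Pr(deny | P/I = .4) = {p_04:.3f}")
print(f"Effect of .3 -> .4: {effect_03_to_04:.3f}")
print(f"Effect of .4 -> .5: {effect_04_to_05:.3f}")
```

The second effect is larger than the first, which illustrates that the effect of a change in X depends on where X starts.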
Probit regression with multiple regressors
Pr(Y = 1|X1, X2) = Φ (β0 + β1X1 + β2X2)
• The model is best interpreted by computing predicted probabilities and the
effect of a change in a regressor.
• Φ is the standard normal cumulative distribution function.
• The predicted probability that Y = 1, given values of X1, X2 is calculated by
computing the z-value, z = β0 + β1X1 + β2X2 and then looking up this z-value
in the normal distribution table (Appendix Table 1).
• z = β0 + β1X1 + β2X2 is the “z-value” or “z-index” of the probit model.
• β1 is the effect on the z-value of a unit change in X1, holding X2 constant.
• The effect on the predicted probability of a change in a regressor is
computed by
• (1) computing the predicted probability for the initial value of the regressors,
• (2) computing the predicted probability for the new or changed value of the
regressors, and
• (3) taking their difference.
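The three steps can be sketched as a small function. This is an illustration rather than a prescribed implementation: the function names are made up for this example, and the coefficients and regressor values in the usage lines are hypothetical numbers chosen only to show the mechanics.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predicted_probability(beta, x):
    """Phi(beta0 + beta1*x1 + beta2*x2 + ...); beta[0] is the intercept."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return std_normal_cdf(z)


def effect_of_change(beta, x_initial, x_new):
    """Change in the predicted probability when regressors move from x_initial to x_new."""
    p_initial = predicted_probability(beta, x_initial)  # step (1)
    p_new = predicted_probability(beta, x_new)          # step (2)
    return p_new - p_initial                            # step (3)


# Hypothetical coefficients (beta0, beta1, beta2) and regressor values.
beta_example = [-1.0, 0.5, 0.8]
effect = effect_of_change(beta_example, x_initial=[1.0, 0.0], x_new=[2.0, 0.0])
print(f"Effect of raising X1 from 1 to 2 with X2 = 0: {effect:.4f}")
```

Here the z-value moves from -0.5 to 0, so the effect is Φ(0) - Φ(-0.5), roughly 0.19.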
STATA Example: HMDA data
. probit deny p_irat black, r;
Iteration 0:   log likelihood =  -872.0853
Iteration 1:   log likelihood = -800.88504
Iteration 2:   log likelihood =  -797.1478
Iteration 3:   log likelihood = -797.13604
Probit estimates                                  Number of obs   =       2380
                                                  Wald chi2(2)    =     118.18
                                                  Prob > chi2     =     0.0000
Log likelihood = -797.13604                       Pseudo R2       =     0.0859
------------------------------------------------------------------------------
             |               Robust
        deny |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      p_irat |   2.741637   .4441633     6.17   0.000     1.871092    3.612181
       black |   .7081579   .0831877     8.51   0.000      .545113    .8712028
       _cons |  -2.258738   .1588168   -14.22   0.000    -2.570013   -1.947463
------------------------------------------------------------------------------
We’ll go through the estimation details later…
STATA Example, ctd.: Predicted probit probabilities
. probit deny p_irat black, r;
Probit estimates                                  Number of obs   =       2380
                                                  Wald chi2(2)    =     118.18
                                                  Prob > chi2     =     0.0000
Log likelihood = -797.13604                       Pseudo R2       =     0.0859
------------------------------------------------------------------------------
             |               Robust
        deny |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      p_irat |   2.741637   .4441633     6.17   0.000     1.871092    3.612181
       black |   .7081579   .0831877     8.51   0.000      .545113    .8712028
       _cons |  -2.258738   .1588168   -14.22   0.000    -2.570013   -1.947463
------------------------------------------------------------------------------
. sca z1 = _b[_cons]+_b[p_irat]*.3+_b[black]*0;
. display "Pred prob, p_irat=.3, white: " normprob(z1);
Pred prob, p_irat=.3, white: .07546603
NOTE
_b[_cons] is the estimated intercept (-2.258738)
_b[p_irat] is the coefficient on p_irat (2.741637)
sca creates a new scalar which is the result of a calculation
display prints the indicated information to the screen
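For readers without Stata, the scalar calculation above can be mirrored in Python. This is a sketch under stated assumptions: the coefficients are typed in from the printed table (so they carry only the displayed precision), the variable names are illustrative, and the second probability simply repeats the calculation with black = 1. In the Stata commands, normprob() evaluates the standard normal CDF; here that role is played by a helper built on math.erf.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Coefficients copied from the printed probit table (displayed precision only).
b_cons = -2.258738
b_pirat = 2.741637
b_black = 0.7081579

# Same z-value as the Stata scalar z1: p_irat = .3, black = 0.
z_white = b_cons + b_pirat * 0.3 + b_black * 0
p_white = std_normal_cdf(z_white)

# Same applicant characteristics, but with black = 1.
z_black = b_cons + b_pirat * 0.3 + b_black * 1
p_black = std_normal_cdf(z_black)

print(f"Pred prob, p_irat=.3, white: {p_white:.6f}")
print(f"Pred prob, p_irat=.3, black: {p_black:.6f}")
print(f"Difference: {p_black - p_white:.3f}")
```

The first printed value should agree with the Stata result (.07546603) up to the rounding of the displayed coefficients.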
STATA Example, ctd.
Pr (deny = 1|P/I, black)
= Φ(-2.26 + 2.74×P/I ratio + .71×black)
(.16) (.44) (.08)
• Is the coefficient on black statistically significant?
• Estimated effect of race for P/I ratio = .3:
Pr (deny = 1|.3,1)= Φ(-2.26+2.74×.3+.71×1) = .233