Lecture 7 - Binary

The document discusses binary response models, focusing on the decision-making process for COVID-19 vaccine uptake using survey data. It explains the use of Linear Probability Models (LPM), Logit, and Probit models for analyzing binary dependent variables, highlighting their advantages and disadvantages. Additionally, it covers estimation methods, hypothesis testing, and the interpretation of coefficients in these models.

BINARY RESPONSE MODELS
Nguyen Quang
quangn@ueh.edu.vn
BINARY DEPENDENT VARIABLE

• Sometimes the dependent variable under consideration is binary:
§ Whether a loan application is approved
§ Whether a borrower can repay a loan
§ Whether a person has a credit card
§ ...

EXAMPLE: COVID-19 VACCINE PURCHASE
DATA IS MADE AVAILABLE BY EEPSEA

• Problem: the decision to get a hypothetical COVID-19 vaccine for oneself
• Data: a survey of 377 individuals in HCMC in 2020
• Data file: EMP4.xlsx
• Dependent variable:
§ dself : 1 = respondent decides to get the vaccine for oneself
• Regressors:
§ efficacy80 : 1 = the vaccine's efficacy is 80%, 0 = 50%
§ duration3 : 1 = effectiveness duration is 3 years, 0 = 1 year
§ priceUS (USD per 2-dose vaccine): price of the vaccine
§ pbenefit : 1 = respondent was provided information on the externality of vaccination

§ hhincomeUS (USD/month): total monthly household income
§ hhsize (members): household size
§ age (years): age of respondent
§ male : gender of respondent, 1 = male, 0 = female
§ edu (categorical): educational attainment, 1 = under primary school, 2 = primary, 3 = secondary, 4 = high school, 5 = college, 6 = university or higher.
§ risk (ordinal): perceived risk of COVID-19 infection, 1 = “Very unlikely”, 2 = “Unlikely”, 3 = “Neither”, 4 = “Likely”, 5 = “Very likely”.

DATA PREPARATION

SUMMARY STATISTICS

BIVARIATE ANALYSIS: T-TEST FOR EQUAL MEANS

CHI-SQUARED TEST

OLS WITH BINARY DEP VAR: THE LINEAR PROBABILITY MODEL

If $y$ is a binary variable (0/1) and we apply OLS, the model is called the Linear Probability Model (LPM).

DISADVANTAGES OF LPM

• It assumes that $\Pr(y = 1)$ is linearly related to $X$, regardless of the initial value of $X$.
• Fitted values of $\Pr(y = 1)$ may fall outside [0, 1].
• It violates the assumption that $\varepsilon$ is normally distributed.
• $\varepsilon$ has unequal variance (heteroskedasticity), which makes the usual hypothesis tests unreliable.
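
The out-of-range and unequal-variance problems can be seen in a minimal sketch. It assumes a small made-up data set (not the EEPSEA survey) and a hand-written one-regressor OLS fit; the names `ols_fit`, `x`, and `y` are illustrative only.

```python
# Toy illustration of the LPM: OLS on a 0/1 outcome with one regressor.
# The data are made up for illustration; they are NOT the survey data.

def ols_fit(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]   # binary outcome

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]

# Fitted "probabilities" that fall outside the unit interval.
out_of_range = [p for p in fitted if p < 0 or p > 1]

# For a binary outcome, Var(error | x) = p(1 - p), so it changes with x.
variances = [p * (1 - p) for p in fitted if 0 <= p <= 1]

print("fitted values:", [round(p, 3) for p in fitted])
print("outside [0, 1]:", [round(p, 3) for p in out_of_range])
print("implied error variances:", [round(v, 3) for v in variances])
```

With these toy numbers the fitted line dips below 0 at the smallest `x` and rises above 1 at the largest, and the implied error variance differs across observations.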

THE LOGIT MODEL

• The logit model assumes that $u_i$ follows a logistic distribution.
• Probability that $Y_i = 1$:

$$\Pr(Y_i = 1) = P_i = \frac{1}{1 + e^{-Z_i}}, \qquad Z_i = \beta X_i$$

• Since $-\infty < Z_i < +\infty$, we have $0 < \Pr(Y_i = 1) < 1$.
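
As a quick numerical check of the bound, the sketch below evaluates the logistic function over a range of index values. The function name `logistic` and the chosen values of Z are assumptions made for illustration.

```python
import math

def logistic(z):
    """Logistic CDF 1 / (1 + e^{-z}), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Illustrative index values Z_i, from very negative to very positive.
z_values = [-30.0, -5.0, -1.0, 0.0, 1.0, 5.0, 30.0]
probs = [logistic(z) for z in z_values]

for z, p in zip(z_values, probs):
    print(f"Z = {z:6.1f}  ->  P = {p:.6f}")
```

However large or small the index gets, the probability stays strictly between 0 and 1, which is what the LPM cannot guarantee.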

THE LOGIT MODEL (Optional)

• The odds are the ratio between the probability of the event (here, default) and the probability of the non-event (non-default):

$$\frac{P_i}{1 - P_i} = \frac{1 + e^{\beta X_i}}{1 + e^{-\beta X_i}} = e^{\beta X_i}$$

• Taking the log of both sides, we obtain the logit:

$$L_i = \ln\frac{P_i}{1 - P_i} = \beta X_i$$

• The LPM assumes that $P_i$ is linearly related to $X_i$, whereas the logit model assumes that the logit is linearly related to $X_i$.
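
The step from the logistic probability to the odds can be written out as follows. This is a sketch of the intermediate algebra, using the same symbols as the slide.

```latex
1 - P_i = 1 - \frac{1}{1 + e^{-\beta X_i}}
        = \frac{e^{-\beta X_i}}{1 + e^{-\beta X_i}}
        = \frac{1}{1 + e^{\beta X_i}}
\quad\Longrightarrow\quad
\frac{P_i}{1 - P_i} = \frac{1 + e^{\beta X_i}}{1 + e^{-\beta X_i}} = e^{\beta X_i}
\quad\Longrightarrow\quad
\ln\frac{P_i}{1 - P_i} = \beta X_i
```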

PROPERTIES OF LOGIT MODEL

• $P_i$ varies from 0 to 1, while the logit $L_i$ varies from $-\infty$ to $+\infty$.
• Although $L_i$ is a linear function of $X_i$, the probability $P_i$ is not.
• Interpretation of estimated coefficients:
§ $\beta_j$ is the change in the log-odds when $X_j$ increases by 1 unit,
§ the sign of $\beta_j$ shows the direction of the change in $P_i$.
• In the LPM, the marginal effect of $X_j$ is constant. In the logit model, the marginal effect of $X_j$ changes with the value of $X$.

ESTIMATION METHOD

• Maximum Likelihood (ML)
• ML seeks the $\beta_j$ that maximize the log-likelihood $\log L$:

$$\log L = \sum_{i=1}^{n} \left[ Y_i \log P_i + (1 - Y_i) \log (1 - P_i) \right]$$

where $P_i = \frac{1}{1 + e^{-\beta X_i}}$ and $Y_i$ is the observed choice.
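
The maximization can be sketched with Newton-Raphson iterations on a toy data set. The slides do not say which optimizer is used, so Newton-Raphson is an assumption here, as are the made-up data, the intercept term, and the names `fit_logit` and `log_likelihood`.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, x, y):
    """log L = sum[ Y_i log P_i + (1 - Y_i) log(1 - P_i) ]."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with an intercept and one regressor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score (g0, g1) and information matrix entries (h00, h01, h11).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta_new = beta + H^{-1} g, with the 2x2 inverse by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up, non-separable toy data (NOT the survey data).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logit(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"log L at estimates = {log_likelihood(b0, b1, x, y):.4f}")
print(f"log L at zero      = {log_likelihood(0.0, 0.0, x, y):.4f}")
```

The log-likelihood at the estimates is higher than at the starting values, as it should be for a maximizer. In practice a statistical package would perform this step.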

INTERPRETATION OF THE COEFFICIENTS

• $\beta_j$ is the change in the log-odds when $X_j$ increases by 1 unit:

$$L_i = \ln\frac{P_i}{1 - P_i} = \beta X_i + u_i$$

• The sign of $\beta_j$ shows the direction of the change in $P_i$:

$$P_i = \frac{1}{1 + e^{-\beta X_i}}$$

• The coefficient $\beta$ by itself only indicates the direction of the effect on $P_i$. It does not give the magnitude of the effect, which depends on the value of $X$.
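
The distinction between direction and magnitude can be illustrated with a small sketch. The coefficient and intercept values are hypothetical, chosen only to show that a one-unit increase in X multiplies the odds by the same factor everywhere while the change in the probability depends on the starting point.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

beta = 0.8          # hypothetical slope coefficient
intercept = -2.0    # hypothetical intercept

def odds(x):
    """Odds P / (1 - P) at regressor value x."""
    p = logistic(intercept + beta * x)
    return p / (1 - p)

changes = []
for x in [0.0, 2.5, 5.0]:
    p_before = logistic(intercept + beta * x)
    p_after = logistic(intercept + beta * (x + 1))
    odds_factor = odds(x + 1) / odds(x)
    changes.append(p_after - p_before)
    print(f"x = {x}: odds factor = {odds_factor:.4f}, change in P = {p_after - p_before:.4f}")

print(f"exp(beta) = {math.exp(beta):.4f}")
```

The odds factor equals exp(beta) at every starting value, but the change in P is largest near P = 0.5 and smaller in the tails.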

HYPOTHESIS TESTING AFTER LOGIT: LIKELIHOOD RATIO TEST

The procedure:
• Estimate the full model:

$$P_i = \frac{1}{1 + e^{-\beta X_i}}$$

then obtain its log-likelihood value $LL_{full}$. (Note that log-likelihood = −deviance/2.)
• Suppose we test the null hypothesis $H_0: \beta_1 = \beta_2 = 0$ (the test may involve one or more coefficients).
• Imposing the null hypothesis on the full model gives the restricted model, in which the variables with coefficients $\beta_1$ and $\beta_2$ are removed.
• Estimate the restricted model to obtain its log-likelihood value $LL_{restricted}$.
• The test statistic $2(LL_{full} - LL_{restricted})$ follows a $\chi^2$ distribution with df = number of coefficients tested.
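
The last step of the procedure can be sketched numerically. The two log-likelihood values are hypothetical placeholders (not results from the survey data), and `chi2_sf_even_df` is an illustrative helper that uses the closed-form chi-squared tail probability available when df is even, which covers the two-coefficient test described here.

```python
import math

def chi2_sf_even_df(stat, df):
    """Upper-tail probability of a chi-squared variable, for even df only.

    For df = 2k the survival function has the closed form
    exp(-s/2) * sum_{j < k} (s/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only handles positive even df")
    half = stat / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term
        term *= half / (j + 1)
    return math.exp(-half) * total

ll_full = -180.4         # hypothetical log-likelihood of the full model
ll_restricted = -186.9   # hypothetical log-likelihood of the restricted model
df = 2                   # two coefficients tested: beta_1 = beta_2 = 0

lr_stat = 2 * (ll_full - ll_restricted)
p_value = chi2_sf_even_df(lr_stat, df)

print(f"LR statistic = {lr_stat:.3f}, df = {df}, p-value = {p_value:.5f}")
```

A small p-value leads to rejecting the null hypothesis that the tested coefficients are jointly zero. For odd df a statistical library would be needed for the tail probability.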

LOGIT REGRESSION – OVERALL SIGNIFICANCE

HYPOTHESIS TESTING AFTER LOGIT: WALD CHI-SQUARED TEST

MARGINAL EFFECTS

• The marginal effect tells us how much $P$ changes when $X$ increases by 1 unit:

$$\frac{\partial P_i}{\partial X_i} = \frac{\partial}{\partial X_i}\left(\frac{1}{1 + e^{-\beta X_i}}\right) = \beta \, \frac{e^{-\beta X_i}}{\left(1 + e^{-\beta X_i}\right)^2}$$

• The marginal effect in the logit model is not constant. It varies with $X$.
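
A minimal sketch of this formula follows, with a hypothetical coefficient value. It compares the analytic derivative with a finite-difference approximation and with the equivalent form beta * P * (1 - P).

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect(beta, x):
    """Analytic dP/dX = beta * e^{-beta x} / (1 + e^{-beta x})^2."""
    e = math.exp(-beta * x)
    return beta * e / (1.0 + e) ** 2

beta = 0.5   # hypothetical coefficient
h = 1e-6     # step for the central finite difference

for x in [-4.0, 0.0, 4.0]:
    analytic = marginal_effect(beta, x)
    numeric = (logistic(beta * (x + h)) - logistic(beta * (x - h))) / (2 * h)
    p = logistic(beta * x)
    print(f"x = {x:5.1f}: dP/dX = {analytic:.5f}, "
          f"finite difference = {numeric:.5f}, beta*P*(1-P) = {beta * p * (1 - p):.5f}")
```

The effect peaks where P = 0.5 (here at x = 0, where it equals beta/4) and shrinks as P moves towards 0 or 1.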

PARTIAL EFFECTS AFTER LOGIT - FOR THE AVERAGE OBSERVATION

AVERAGE PARTIAL EFFECTS AFTER LOGIT

PREDICTED PROBABILITY

MARGINAL EFFECTS AT SPECIFIC POINTS

ROBUST STANDARD ERRORS

LOGISTIC REGRESSION WITH ODDS RATIO

THE PROBIT MODEL

• In the LOGIT model, $u$ follows a logistic distribution:

$$\Pr(Y_i = 1) = P_i = \frac{1}{1 + e^{-\beta X_i}}$$

• In the PROBIT model, $u$ follows a normal distribution:

$$\Pr(Y_i = 1) = P_i = \Phi(\beta X_i) = \int_{-\infty}^{\beta X_i} \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2} \, dz$$

where $\Phi$ is the cumulative distribution function (CDF) of the standard normal distribution.
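
The probit probability can be sketched with the standard library only. The coefficient and regressor values are hypothetical. `normal_cdf` uses the error function, and `normal_cdf_numeric` is an illustrative trapezoidal approximation of the integral, included only to show that both give the same number.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf_numeric(z, lower=-10.0, steps=20000):
    """Trapezoidal approximation of the normal density integrated up to z."""
    width = (z - lower) / steps

    def density(t):
        return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

    area = 0.5 * (density(lower) + density(z))
    for k in range(1, steps):
        area += density(lower + k * width)
    return area * width

beta, x_i = 0.7, 1.5   # hypothetical coefficient and regressor value
p_probit = normal_cdf(beta * x_i)
p_numeric = normal_cdf_numeric(beta * x_i)

print(f"P(Y = 1) via erf:         {p_probit:.6f}")
print(f"P(Y = 1) via integration: {p_numeric:.6f}")
```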

ESTIMATING PROBIT MODEL IN R

OVERALL SIGNIFICANCE AFTER PROBIT

TEST FOR JOINT SIGNIFICANCE

PARTIAL EFFECTS AFTER PROBIT – FOR THE AVERAGE OBS

AVERAGE PARTIAL EFFECTS AFTER PROBIT

PREDICT PROBABILITY AFTER PROBIT

MARGINAL EFFECTS AT A SPECIFIC DATA POINT

LOGIT OR PROBIT?

• $P_i$ approaches 0 and 1 more slowly in the logit model than in the probit model.
• There is no compelling reason to choose one model over the other.
• However, the logit model is often preferred because its marginal effects are simpler to compute.
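
The first point can be checked with a short sketch that compares the tails of the two CDFs. To put them on a comparable scale, the logistic index is rescaled so that both distributions have unit variance; that rescaling is an assumption of this sketch, not something the slides specify.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The standard logistic distribution has variance pi^2 / 3, so the index is
# rescaled to compare both CDFs at unit variance.
scale = math.pi / math.sqrt(3.0)

for z in [2.0, 3.0, 4.0]:
    upper_logit = 1.0 - logistic_cdf(scale * z)
    upper_probit = 1.0 - normal_cdf(z)
    print(f"z = {z}: 1 - P (logit) = {upper_logit:.5f}, 1 - P (probit) = {upper_probit:.5f}")
```

At each point the logit probability is further from 1 than the probit probability, which reflects the heavier tails of the logistic distribution.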

LOGIT OR PROBIT? COMPARING THE COEFFICIENTS

LOGIT OR PROBIT? COMPARING THE PREDICTED PROBABILITIES