Econometrics Assignment HW4
Group B
Group Members:
Aman Bansal (PGP15006)
M. Vikas (PGP15027)
Nikhil Sharma (PGP15094)
Sumit Agarwal (PGP15052)
Aman Chandila (PGP15066)
Vikas Srivastava (PGP15119)
Q1. We will work with the w.data dataset in this assignment. Bring this
data into R.
Ans. R script
Console:
Console:
Q3. We want to know whether the probability that a person will visit a
doctor can be predicted by some of the demographic characteristics.
Create a binary variable docvisit which takes the value of 1 only if a
person has a non-zero number of visits to doctor. Note: We are NOT
going to use the panel characteristics of this data.
Ans. R script
Console:
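A minimal sketch of how such a binary variable might be built. The data frame here is a toy stand-in, and the column name docvis for the number of doctor visits is an assumption, since the actual column name in w.data is not shown in this report.

```r
# Toy stand-in for the real data; `docvis` is an assumed column name
# for the number of doctor visits.
w <- data.frame(docvis = c(0, 3, 1, 0, 7))

# docvisit = 1 if the person has at least one doctor visit, 0 otherwise
w$docvisit <- as.integer(w$docvis > 0)

print(w)
print(mean(w$docvisit))  # share of people with at least one visit
```

The mean of a 0/1 variable is the sample share of ones, which is a useful sanity check before fitting any model.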
Q4. Run a linear probability model where the docvisit variable depends
on the log of income, age, good health, male, and household size
variables. Interpret your results.
Ans. R script
From the above results, we can conclude that a person is more likely to visit a
doctor if:
We also used the describe function to compute summary statistics of the fitted
values, which show that the predicted probability of a person visiting a doctor
lies between 0.36 and 0.97, with a mean of 0.65.
A linear probability model is generally problematic with a binary dependent
variable because its fitted values can fall outside the interval from 0 to 1.
In this case, however, all fitted values lie within that interval, so the model
does not violate the rule.
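The range check described above can be sketched as follows. The data are simulated, so the numbers will not match the report; only the variable names (loginc, age, goodh, male, hsize, docvisit) follow the assignment, and the simulation parameters are arbitrary assumptions.

```r
set.seed(1)
n <- 500

# Simulated stand-in data; all distributions here are arbitrary assumptions
d <- data.frame(
  loginc = rnorm(n, 7.5, 0.5),
  age    = sample(25:64, n, replace = TRUE),
  goodh  = rbinom(n, 1, 0.6),
  male   = rbinom(n, 1, 0.5),
  hsize  = sample(1:6, n, replace = TRUE)
)
p <- plogis(-1 + 0.2 * d$loginc + 0.01 * d$age - 0.8 * d$goodh -
              0.5 * d$male - 0.05 * d$hsize)
d$docvisit <- rbinom(n, 1, p)

# Linear probability model: OLS with a 0/1 dependent variable
lpm <- lm(docvisit ~ loginc + age + goodh + male + hsize, data = d)
print(coef(summary(lpm)))

# Check whether any fitted "probabilities" fall outside [0, 1]
fit_range <- range(fitted(lpm))
outside   <- sum(fitted(lpm) < 0 | fitted(lpm) > 1)
print(fit_range)
print(outside)
```

Because the model includes an intercept, the mean of the fitted values always equals the sample share of docvisit = 1, so a mean fitted value such as 0.65 simply reproduces the share of people who visited a doctor.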
Q5. Now add some employment variables to your linear probability
model: whether this person receives welfare payments or not, whether
this person is unemployed or not, whether this person has full-time
work or not. Run the model, obtain heteroscedasticity-robust standard
errors, and interpret your results.
Ans. R script
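As an illustration of what heteroscedasticity-robust standard errors are, the sketch below computes the HC1 sandwich estimator by hand in base R on simulated data. This is not necessarily the routine used in our script; the data, the regressor x1 and the object names are assumptions made for the example.

```r
set.seed(2)
n <- 400

# Simulated stand-in data (assumed, not the assignment data)
d <- data.frame(x1 = rnorm(n), unemp = rbinom(n, 1, 0.1))
d$y <- rbinom(n, 1, plogis(0.5 + 0.4 * d$x1 - 0.5 * d$unemp))

lpm <- lm(y ~ x1 + unemp, data = d)

X <- model.matrix(lpm)   # n x k design matrix
e <- residuals(lpm)      # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^(-1)
meat  <- crossprod(X * e)      # X' diag(e^2) X
vc_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vc_hc1))
print(cbind(estimate  = coef(lpm),
            ols_se    = sqrt(diag(vcov(lpm))),
            robust_se = robust_se))
```

The errors of a linear probability model are heteroscedastic by construction, since the variance of a 0/1 outcome depends on its mean, which is why the robust column is the one to report.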
Q6. Now test whether the employment related variables should belong
in your model or not.
Ans. R script
The employment-related variables (unemployment, full-time work and welfare
payments) are stored in a variable q0, and a linear hypothesis test is run to
check whether they belong in our model. The p-value is very low, so we reject
the null hypothesis that their coefficients are jointly zero at the 5% level.
Thus, the employment-related variables are jointly significant and should be
kept in our model.
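The logic of a joint test can be sketched by comparing a restricted and an unrestricted model. The example below uses the classical F-test from base R's anova on simulated data; it is a simplified stand-in for the linear hypothesis test in our script and does not use robust standard errors. The data and coefficients are assumptions.

```r
set.seed(3)
n <- 600

# Simulated stand-in data (assumed)
d <- data.frame(
  loginc = rnorm(n, 7.5, 0.5),
  unemp  = rbinom(n, 1, 0.1),
  sozh   = rbinom(n, 1, 0.05),
  ft     = rbinom(n, 1, 0.6)
)
d$docvisit <- rbinom(n, 1, plogis(-1 + 0.2 * d$loginc -
                                    0.4 * d$unemp - 0.5 * d$ft))

# H0: the coefficients on unemp, sozh and ft are all zero
restricted <- lm(docvisit ~ loginc, data = d)
full       <- lm(docvisit ~ loginc + unemp + sozh + ft, data = d)

jt <- anova(restricted, full)   # F-test of the three restrictions
print(jt)

p_value <- jt[["Pr(>F)"]][2]
print(p_value)
```

A small p-value rejects the hypothesis that all three coefficients are zero at once; it does not by itself say that each variable is individually significant.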
Q7. Suppose we have two male individuals, both with good health, not
unemployed and the employment being full time, not receiving welfare
payments, with following other characteristics: log of income 5.02 and
10.03 respectively, age 20 and 60 respectively, and household size of 4.
Find out the likelihood of these two individuals visiting a doctor. What
do you think is going on here?
Ans. R script
The predicted probabilities of the two individuals visiting a doctor are 32.41%
and 61.65% respectively. With all other characteristics held equal, the
individual with the higher age and income has the higher predicted probability
of visiting a doctor.
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.166536   0.246044  -0.677    0.498
loginc       0.241696   0.031736   7.616 2.62e-14 ***
age          0.005205   0.001144   4.551 5.35e-06 ***
goodh       -0.973606   0.026848 -36.263  < 2e-16 ***
male        -0.610408   0.027252 -22.399  < 2e-16 ***
hsize       -0.052709   0.009240  -5.705 1.17e-08 ***
unemp       -0.188675   0.047145  -4.002 6.28e-05 ***
sozh         0.044193   0.067146   0.658    0.510
ft          -0.235469   0.030354  -7.757 8.67e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 42494  on 32836  degrees of freedom
Residual deviance: 39669  on 32828  degrees of freedom
AIC: 39687

Number of Fisher Scoring iterations: 4
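Output of this kind is what summary prints for a logit model fitted with glm. The sketch below shows such a fit on simulated data; the variable names follow the output above, but the data and the coefficients used to generate them are assumptions, so the estimates will differ from ours.

```r
set.seed(4)
n <- 1000

# Simulated stand-in data (assumed); names follow the output above
d <- data.frame(
  loginc = rnorm(n, 7.5, 0.5),
  age    = sample(25:64, n, replace = TRUE),
  goodh  = rbinom(n, 1, 0.6),
  male   = rbinom(n, 1, 0.5),
  hsize  = sample(1:6, n, replace = TRUE),
  unemp  = rbinom(n, 1, 0.1),
  sozh   = rbinom(n, 1, 0.05),
  ft     = rbinom(n, 1, 0.6)
)
index <- with(d, -0.2 + 0.25 * loginc + 0.005 * age - goodh -
                0.6 * male - 0.05 * hsize - 0.2 * unemp - 0.25 * ft)
d$docvisit <- rbinom(n, 1, plogis(index))

# Logit model: binomial family with the logit link
logit <- glm(docvisit ~ loginc + age + goodh + male + hsize +
               unemp + sozh + ft,
             family = binomial(link = "logit"), data = d)
print(summary(logit))
```

The coefficients are on the log-odds scale, so their signs give the direction of an effect on the probability but their magnitudes are not changes in probability.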
Predicted values for four illustrative cases, using the logit coefficients
above (Y* is the linear index and Probability is its logistic transform):

Case   loginc   age   Y*          Probability
1      10       30     0.292891   0.572704
2      10       60     0.449041   0.610411
3      10       30    -0.02336    0.49416
4      10       30     2.256856   0.90524

The values of goodh, male, hsize, unemp, sozh and ft used for each case are not
recoverable here.
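The link between Y* and the probability in the values above is the logistic function, p = 1 / (1 + exp(-Y*)). A short check in R using the four Y* values listed above; the third value is entered as negative, which is an assumption based on its reported probability being below 0.5.

```r
# Linear index values Y* from the listing above.
# The third is taken as negative (assumption: its probability is below 0.5).
ystar <- c(0.292891, 0.449041, -0.02336, 2.256856)

# plogis() is the logistic CDF: 1 / (1 + exp(-x))
prob <- plogis(ystar)
print(round(prob, 5))
```

A positive Y* always maps to a probability above 0.5 and a negative Y* to one below 0.5.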
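These are some predicted values from the model. The signs of the coefficients
indicate that the probability of a person visiting a doctor increases with
income and age, and is lower for people in good health, males, members of larger
households, the unemployed and full-time workers. The coefficient on welfare
payments (sozh) is positive but not statistically significant.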
The lrtest function performs a likelihood ratio test comparing the goodness of
fit of two nested models.
Null Hypothesis: The smaller (restricted) model fits the data as well as the
larger model.
From the above results, the p-value is very low, so the null hypothesis is
rejected and we conclude that the larger model fits significantly better.
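The likelihood ratio statistic can also be computed by hand, which makes the hypothesis explicit. The sketch below uses base R on simulated data; the data and model choice are assumptions and the numbers will not match our output.

```r
set.seed(5)
n <- 800

# Simulated stand-in data (assumed)
d <- data.frame(
  loginc = rnorm(n, 7.5, 0.5),
  unemp  = rbinom(n, 1, 0.1),
  ft     = rbinom(n, 1, 0.6)
)
d$docvisit <- rbinom(n, 1, plogis(-1 + 0.2 * d$loginc -
                                    0.5 * d$unemp - 0.6 * d$ft))

small <- glm(docvisit ~ loginc, family = binomial, data = d)
large <- glm(docvisit ~ loginc + unemp + ft, family = binomial, data = d)

# LR statistic: twice the gain in log-likelihood from the extra regressors
lr_stat <- as.numeric(2 * (logLik(large) - logLik(small)))
df      <- attr(logLik(large), "df") - attr(logLik(small), "df")

# Under H0 (extra coefficients are zero) the statistic is chi-squared with df
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = df, p_value = p_value))
```

The degrees of freedom equal the number of coefficients restricted to zero in the smaller model, here two.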
Q11. Get the predicted likelihood for the two individuals from part 7.
Ans. R script
predict(logit,indvalues,type="response")
        1         2
0.2932925 0.6317290
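These two predictions can be reproduced by hand from the rounded logit coefficients reported earlier and the characteristics given in Q7 (both male, good health, not unemployed, full-time work, no welfare payments, household size 4). The mapping of sozh to welfare payments is our reading of the variable names; small differences from the predict output come from rounding of the coefficients.

```r
# Logit coefficients as reported in the model summary (rounded)
b <- c(intercept = -0.166536, loginc = 0.241696, age = 0.005205,
       goodh = -0.973606, male = -0.610408, hsize = -0.052709,
       unemp = -0.188675, sozh = 0.044193, ft = -0.235469)

# Columns: 1, loginc, age, goodh, male, hsize, unemp, sozh, ft
x <- rbind(
  c(1,  5.02, 20, 1, 1, 4, 0, 0, 1),   # individual 1
  c(1, 10.03, 60, 1, 1, 4, 0, 0, 1)    # individual 2
)

ystar <- as.vector(x %*% b)   # linear index
prob  <- plogis(ystar)        # logistic transform
print(round(ystar, 4))
print(round(prob, 4))
```

The first individual has a negative index and the second a positive one, which is why their probabilities fall on opposite sides of 0.5.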
From the above result, we see that the p-value for model 2 is very small, so the
null hypothesis can be rejected even at the 1% significance level. We therefore
conclude that the employment-related variables should belong in our model.