Econometrics Assignment HW4

This document contains an econometrics assignment with 13 questions. It involves analyzing a dataset using R to predict the likelihood of visiting a doctor based on demographic characteristics. Linear and logit regression models are used. Key results show income, age, gender, health, employment status, and household size impact the probability of visiting a doctor. The employment variables and overall models are found to be statistically significant.

Uploaded by

Nikhil Sharma

Econometrics Assignment- HW4

Group- B

Group Members:
Aman Bansal (PGP15006)
M. Vikas (PGP15027)
Nikhil Sharma (PGP15094)
Sumit Agarwal (PGP15052)
Aman Chandila (PGP15066)
Vikas Srivastava (PGP15119)
Q1. We will work with the w.data dataset in this assignment. Bring this
data into R.
Ans. R script

Console:

Q2. Provide a table with summary statistics (number of observations,
mean, standard deviation, minimum, maximum) of all the variables in
the dataset.
Ans. R script

Console:
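The script and console output for this step did not survive in this copy. As a minimal sketch of one way to build such a table in base R, assuming a small made-up data frame (toy, with illustrative columns age and hsize) in place of the real dataset:

```r
# Toy stand-in for the real data; values are made up for illustration
toy <- data.frame(age = c(25, 40, 61, 33), hsize = c(1, 4, 2, 3))

# One row per variable: observations, mean, standard deviation, min, max
stats <- t(sapply(toy, function(v) {
  c(n = length(v), mean = mean(v), sd = sd(v), min = min(v), max = max(v))
}))
print(stats)
```

On the real data the same call would be applied to the full data frame, provided all columns are numeric.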

Q3. We want to know whether the probability that a person will visit a
doctor can be predicted by some of the demographic characteristics.
Create a binary variable docvisit which takes the value of 1 only if a
person has a non-zero number of visits to doctor. Note: We are NOT
going to use the panel characteristics of this data.
Ans. R script

Console:
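The script for this step is not reproduced in this copy. A minimal sketch, assuming the raw number of visits is stored in a column called docvis (a hypothetical name; only docvisit and mydata appear in the output shown later) and using a made-up data frame:

```r
# Toy stand-in for the real dataset; 'docvis' is an assumed column name
mydata <- data.frame(docvis = c(0, 3, 1, 0, 7))

# docvisit = 1 if the person has at least one doctor visit, 0 otherwise
mydata$docvisit <- as.integer(mydata$docvis > 0)

print(mydata)
```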

Q4. Run a linear probability model where the docvisit variable depends
on the log of income, age, good health, male, and household size
variables. Interpret your results.
Ans. R script

From the above results, we can conclude that a person is more likely to visit a
doctor if:

His/her income is higher
He/she is older
The person is female
Their household size is smaller
The person is in bad health

We also used the describe function to summarise the fitted values of the model.
The predicted probability of a person visiting a doctor lies between 0.36 and
0.97, with a mean of 0.65.
A linear probability model is generally problematic with a binary dependent
variable because its fitted values can fall outside the interval from 0 to 1.
In this case, however, all fitted values lie within that range.
Q5. Now add some employment variables to your linear probability
model: whether this person receives welfare payments or not, whether
this person is unemployed or not, whether this person has a full time
work or not. Run the model, obtain heteroscedasticity-robust standard
errors, and interpret your results.
Ans. R script

studentized Breusch-Pagan test

data: reg2
BP = 1733.6, df = 8, p-value < 2.2e-16

As the p-value is far below any conventional significance level, we reject
the null hypothesis of homoscedasticity, i.e. that the coefficients in the
auxiliary regression of the squared residuals on the regressors are all
zero. Hence heteroscedasticity is present. The coefficient estimates are
unaffected, but the standard errors need to be adjusted, so we pass
vcov=hccm to coeftest.
The results are
coeftest(reg2)
t test of coefficients:
              Estimate Std. Error  t value  Pr(>|t|)
(Intercept)  0.4777448  0.0509612   9.3747 < 2.2e-16 ***
loginc       0.0503848  0.0065586   7.6823 1.606e-14 ***
age          0.0010003  0.0002361   4.2368 2.274e-05 ***
goodh       -0.2035693  0.0054209 -37.5524 < 2.2e-16 ***
male        -0.1308615  0.0057018 -22.9510 < 2.2e-16 ***
hsize       -0.0111345  0.0019470  -5.7187 1.083e-08 ***
unemp       -0.0382156  0.0097844  -3.9058 9.412e-05 ***
sozh         0.0092916  0.0139168   0.6677    0.5044
ft          -0.0476117  0.0062582  -7.6079 2.861e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

coeftest(reg2,vcov=hccm)
t test of coefficients:
               Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  0.47774478  0.05077345   9.4093 < 2.2e-16 ***
loginc       0.05038479  0.00653149   7.7141 1.252e-14 ***
age          0.00100032  0.00023317   4.2900 1.792e-05 ***
goodh       -0.20356928  0.00536717 -37.9286 < 2.2e-16 ***
male        -0.13086147  0.00569630 -22.9730 < 2.2e-16 ***
hsize       -0.01113445  0.00198496  -5.6094 2.046e-08 ***
unemp       -0.03821556  0.00970430  -3.9380 8.233e-05 ***
sozh         0.00929160  0.01373041   0.6767    0.4986
ft          -0.04761173  0.00614553  -7.7474 9.651e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

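As we can see, the robust standard errors, and hence the t values and
p-values, change only slightly. The coefficient estimates are unchanged
and can be interpreted as before. Among the new variables, being
unemployed lowers the probability of a doctor visit by about 3.8
percentage points and working full time lowers it by about 4.8
percentage points, while receiving welfare payments (sozh) has no
statistically significant effect.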
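To make the robust standard errors less of a black box, the sketch below computes a heteroscedasticity-consistent covariance matrix by hand on simulated data. It uses the HC3 variant, which to our understanding is the default type in car::hccm. The variables x, d and y are simulated for illustration and are not the assignment data.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
d <- rbinom(n, 1, 0.5)
# Binary outcome, so the error variance depends on the regressors
y <- rbinom(n, 1, plogis(0.3 + 0.8 * x - 0.5 * d))

lpm <- lm(y ~ x + d)

X <- model.matrix(lpm)
e <- residuals(lpm)
h <- hatvalues(lpm)                     # leverage of each observation

bread <- solve(crossprod(X))            # (X'X)^-1
meat  <- crossprod(X * (e / (1 - h)))   # X' diag(e^2 / (1 - h)^2) X
vcov_hc3 <- bread %*% meat %*% bread    # sandwich estimator

se_classical <- sqrt(diag(vcov(lpm)))
se_robust    <- sqrt(diag(vcov_hc3))
print(cbind(estimate = coef(lpm), se_classical, se_robust))
```

The point estimates come from the same lm fit in both columns; only the standard errors differ, which mirrors the two coeftest tables above.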

Q6. Now test whether the employment related variables should belong
in your model or not.
Ans. R script

The employment-related variables (unemployment, full-time work, welfare
payments) are stored in a variable q0, and a linear hypothesis test is used to
check whether they belong in our model. The p-value of the test is very low, so
we can reject the null hypothesis that their coefficients are jointly zero at
the 5% level. Thus, the employment-related variables are jointly significant
and should belong to our model.
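As an illustration of the idea behind the joint test, the sketch below compares a restricted and an unrestricted linear probability model with a classical F test in base R. This is an assumption-laden stand-in: it uses simulated data, only two employment dummies, and non-robust standard errors, so it is not the exact test run in the assignment.

```r
set.seed(7)
n <- 1000
loginc <- rnorm(n, 7.5, 0.5)
unemp  <- rbinom(n, 1, 0.1)
ft     <- rbinom(n, 1, 0.6)
docvisit <- rbinom(n, 1, plogis(-1 + 0.25 * loginc - 0.4 * unemp - 0.4 * ft))

unrestricted <- lm(docvisit ~ loginc + unemp + ft)
restricted   <- lm(docvisit ~ loginc)

# F test of H0: the coefficients on unemp and ft are both zero
f_test <- anova(restricted, unrestricted)
print(f_test)
```

A small p-value in the second row of the table leads to rejecting the restricted model.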

Q7. Suppose we have two male individuals, both with good health, not
unemployed and the employment being full time, not receiving welfare
payments, with following other characteristics: log of income 5.02 and
10.03 respectively, age 20 and 60 respectively, and household size of 4.
Find out the likelihood of these two individuals visiting a doctor. What
do you think is going on here?
Ans. R script

The predicted probabilities of the two individuals visiting a doctor are 32.41%
and 61.65% respectively. Holding everything else constant, the higher age and
income of the second individual raise the predicted probability of visiting a
doctor.

Q8. Run a logit model instead of a linear probability model in part 5.
What can we infer from the results?
Ans. R Script
glm(formula = docvisit ~ loginc + age + goodh + male + hsize +
    unemp + sozh + ft, family = binomial(link = logit), data = mydata)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1022  -1.1511   0.6393   0.9268   1.4913

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.166536   0.246044  -0.677    0.498
loginc       0.241696   0.031736   7.616 2.62e-14 ***
age          0.005205   0.001144   4.551 5.35e-06 ***
goodh       -0.973606   0.026848 -36.263  < 2e-16 ***
male        -0.610408   0.027252 -22.399  < 2e-16 ***
hsize       -0.052709   0.009240  -5.705 1.17e-08 ***
unemp       -0.188675   0.047145  -4.002 6.28e-05 ***
sozh         0.044193   0.067146   0.658    0.510
ft          -0.235469   0.030354  -7.757 8.67e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 42494  on 32836  degrees of freedom
Residual deviance: 39669  on 32828  degrees of freedom
AIC: 39687

Number of Fisher Scoring iterations: 4

Variable     Coefficient
Intercept    -0.166536
loginc        0.241696
age           0.005205
goodh        -0.973606
male         -0.610408
hsize        -0.052709
unemp        -0.188675
sozh          0.044193
ft           -0.235469

Y*           Probability
 0.292891    0.572704
 0.449041    0.610411
-0.02336     0.49416
 2.256856    0.90524

These are some predicted values from the model. The signs of the coefficients
indicate that the probability of a person visiting a doctor increases if:

The person's income is higher
He/she is older
The person is in bad health
The person is female
The household size is smaller
The person is not unemployed
The person does not work full-time

The coefficient on welfare payments (sozh) is positive but not statistically
significant.
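In the table of predicted values, the Probability column is the logistic transformation of the index Y*. A short check in R using the listed Y* values follows; the third value is entered as negative on the assumption that its sign was lost in this copy, since its probability is below 0.5.

```r
# Y* values from the table; the sign of the third is an assumption
y_star <- c(0.292891, 0.449041, -0.02336, 2.256856)

# Logistic CDF: exp(y) / (1 + exp(y))
prob <- plogis(y_star)
print(round(prob, 6))
```

Because the transformation is nonlinear, a logit coefficient gives the change in the log-odds, not the change in probability, which is why only the signs are interpreted directly here.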

Q9. Calculate McFadden's pseudo R-squared.

Ans. R Script
> logLik(logit)
'log Lik.' -19834.49 (df=9)
> 1 - logit$deviance/logit$null.deviance
[1] 0.06648934
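The same figure can be reproduced, up to rounding, from the two deviances printed in the glm summary. The sketch below hard-codes those rounded values, so it only matches the reported number to about four decimal places.

```r
null_dev  <- 42494   # null deviance from the glm summary
resid_dev <- 39669   # residual deviance from the glm summary

# For a logit on 0/1 data the deviance equals -2 * logLik, so this ratio
# is the same as 1 - logLik(full model) / logLik(intercept-only model)
pseudo_r2 <- 1 - resid_dev / null_dev
print(round(pseudo_r2, 4))   # 0.0665
```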

Q10. Is the overall model significant?

Ans. R Script
lrtest(logit)
Likelihood ratio test

Model 1: docvisit ~ loginc + age + goodh + male + hsize + unemp + sozh + ft
Model 2: docvisit ~ 1
  #Df LogLik Df  Chisq Pr(>Chisq)
1   9 -19835
2   1 -21247 -8 2825.4  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The likelihood ratio test compares the fit of two nested models.
Null Hypothesis: All slope coefficients are zero, i.e. the intercept-only model
fits the data as well as the full model.
From the above results, the p-value is very low, so the null hypothesis is
rejected and we conclude that our model is significant.
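The test statistic can also be recovered from the glm summary, since it equals the null deviance minus the residual deviance. A minimal sketch using the rounded values printed there:

```r
lr_stat <- 42494 - 39669    # null deviance minus residual deviance
df      <- 32836 - 32828    # difference in residual degrees of freedom

# Upper-tail probability of a chi-squared distribution with df degrees of freedom
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = df, p_value = p_value))
```

The statistic of 2825 on 8 degrees of freedom agrees with the 2825.4 reported by lrtest up to rounding of the deviances.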

Q11. Get the predicted likelihood for the two individuals from part 7.
Ans. R script
predict(logit,indvalues,type="response")
        1         2
0.2932925 0.6317290

The predicted likelihood is 29.33% and 63.17%.
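As a cross-check that does not need the fitted object, the sketch below rebuilds the two predictions from the logit coefficients in the glm summary and the characteristics given in part 7. The object names b_logit, x and p_logit are our own.

```r
# Coefficients copied from the glm summary, in the order:
# constant, loginc, age, goodh, male, hsize, unemp, sozh, ft
b_logit <- c(-0.166536, 0.241696, 0.005205, -0.973606, -0.610408,
             -0.052709, -0.188675, 0.044193, -0.235469)

x <- rbind(
  c(1,  5.02, 20, 1, 1, 4, 0, 0, 1),
  c(1, 10.03, 60, 1, 1, 4, 0, 0, 1)
)

index   <- as.vector(x %*% b_logit)   # linear index Y*
p_logit <- plogis(index)              # logistic transformation to a probability
print(round(p_logit, 4))              # 0.2933 0.6317
```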

Q12. Test whether the employment variables should be in the model or
not.
Ans. R Script
lrtest(logit,logit1)
Likelihood ratio test

Model 1: docvisit ~ loginc + age + goodh + male + hsize + unemp + sozh + ft
Model 2: docvisit ~ loginc + age + goodh + male + hsize
  #Df LogLik Df  Chisq Pr(>Chisq)
1   9 -19835
2   6 -19866 -3 63.196  1.219e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the above result, we see that the p-value of the test is very small
(1.219e-13). The null hypothesis that the coefficients on the employment-related
variables are jointly zero is therefore rejected at the 1% significance level,
and we conclude that the employment-related variables should belong to our model.
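The statistic is twice the difference between the two log-likelihoods. The sketch below uses the rounded log-likelihoods from the table, which give 62 instead of the reported 63.196, and then recomputes the p-value from the reported statistic.

```r
ll_full       <- -19835   # model with employment variables (rounded)
ll_restricted <- -19866   # model without them (rounded)

lr_rounded <- 2 * (ll_full - ll_restricted)   # 62 with the rounded values

# p-value for the statistic reported by lrtest, with 3 restrictions
p_reported <- pchisq(63.196, df = 3, lower.tail = FALSE)
print(c(lr_rounded = lr_rounded, p_reported = p_reported))
```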

Q13. Calculate the average partial effects. Interpret your results.

Ans. R Script

Average partial effects measure how much, on average, the predicted probability
changes for a one-unit change in an independent variable.
From the results, we can infer that the goodh variable plays an important role
in the probability of a person visiting a doctor. On average, a person with bad
health is 20.39 percentage points more likely to visit a doctor than a person
with good health. Gender also plays an important role: a female is 12.9
percentage points more likely to visit a doctor than a male. Unemployment and
income also play a role, with average partial effects of about 4 and 5
percentage points in magnitude respectively.
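Since the script for this step is not reproduced in this copy, here is a minimal sketch of how average partial effects can be computed by hand for a logit model. It runs on simulated data with only two regressors, so the variable names echo the assignment but the numbers do not reproduce its results.

```r
set.seed(42)
n <- 2000
loginc <- rnorm(n, mean = 7.5, sd = 0.5)
male   <- rbinom(n, 1, 0.5)
docvisit <- rbinom(n, 1, plogis(-1 + 0.25 * loginc - 0.6 * male))
sim <- data.frame(docvisit, loginc, male)

fit <- glm(docvisit ~ loginc + male,
           family = binomial(link = "logit"), data = sim)

# Continuous regressor: average logistic density at each observation's
# index, multiplied by the coefficient
xb <- predict(fit, type = "link")
ape_loginc <- mean(dlogis(xb)) * coef(fit)["loginc"]

# Dummy regressor: average change in predicted probability when male
# is switched from 0 to 1 for every observation
p1 <- predict(fit, newdata = transform(sim, male = 1), type = "response")
p0 <- predict(fit, newdata = transform(sim, male = 0), type = "response")
ape_male <- mean(p1 - p0)

print(c(ape_loginc = unname(ape_loginc), ape_male = ape_male))
```

Because the logistic density never exceeds 0.25, the average partial effect of a continuous variable is at most a quarter of its logit coefficient in absolute value.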
