Econometrics Assignment
HAWASSA UNIVERSITY
DEPARTMENT OF ECONOMICS
1. Nature of Multicollinearity
3. Consequences of Multicollinearity
Significance Testing Problems: With large standard errors, the t-statistics for
individual predictors shrink, which can lead to a failure to reject the null
hypothesis that a coefficient equals zero (i.e., failing to find predictors
significant even when they genuinely matter).
4. Remedies for Multicollinearity
Combine Correlated Variables: For example, if "income" and "education level"
are highly correlated, they might be combined into a single socioeconomic
status score.
Use Variance Inflation Factor (VIF): VIF is a diagnostic tool that quantifies
how much the variance of a regression coefficient is inflated due to
collinearity with other predictors. A high VIF (typically greater than 10)
indicates significant multicollinearity. Variables with high VIFs can be
considered for removal or transformation.
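As an illustration of the VIF formula, the sketch below computes a VIF for two hypothetical predictors (the data and variable names are invented, not taken from any assignment file). With only two regressors, the auxiliary R-squared equals the squared correlation between them, so VIF = 1 / (1 - r^2).

```python
import math

# Hypothetical predictors (invented data): income and years of education
income    = [20, 25, 31, 40, 46, 52, 60, 71]
education = [8, 10, 12, 14, 15, 16, 18, 20]

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(income, education)
vif = 1 / (1 - r ** 2)   # with two regressors, auxiliary R^2 = r^2
print(f"r = {r:.3f}, VIF = {vif:.1f}")   # VIF > 10 flags severe collinearity
```

Because the two invented series move almost in lockstep, the VIF here comes out far above 10, exactly the situation the rule of thumb is meant to catch.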
5. Diagnosing Multicollinearity
Variance Inflation Factor (VIF): VIF measures how much the variance of a
regression coefficient is inflated because of collinearity with other variables. A
VIF value greater than 10 is typically considered indicative of problematic
multicollinearity.
1. Nature of Heteroscedasticity
2. Causes of Heteroscedasticity
3. Consequences of Heteroscedasticity
Invalid Inference: Since the standard errors are incorrect, hypothesis tests
(e.g., testing if a coefficient is zero) may lead to false positives (Type I errors)
or false negatives (Type II errors). For example, you might incorrectly reject a
null hypothesis (finding a variable significant when it isn’t) or fail to reject a
false null hypothesis (failing to detect a significant relationship).
Model Fit and Predictive Power: The presence of heteroscedasticity does not
necessarily affect the fit of the model (i.e., R-squared remains valid), but it
affects the reliability of statistical tests on the coefficients, which can
undermine the usefulness of the model for making predictions or drawing
inferences.
4. Remedies for Heteroscedasticity
There are several ways to deal with heteroscedasticity, depending on the severity of
the problem and the context of the data:
5. Diagnosing Heteroscedasticity
Residual Plot: Plotting the residuals (errors) versus the fitted values is one of
the most common ways to detect heteroscedasticity. A pattern where residuals
fan out or contract as the fitted values change indicates heteroscedasticity.
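The "fanning out" pattern can also be checked numerically rather than visually. The sketch below uses invented data in which the error spread grows with the regressor, fits a line by OLS, and compares the residual spread between the low and high halves of the fitted values (a Goldfeld-Quandt-style comparison, not the assignment's actual data or test).

```python
# Invented data where the error spread grows with x, so the residuals
# fan out by construction (the pattern a residual plot would reveal).
x = list(range(1, 21))
y = [2 * xi + 0.8 * xi * (-1) ** i for i, xi in enumerate(x)]

# Fit a straight line by ordinary least squares
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Compare residual spread in the lower vs upper half of the fitted values
half = n // 2
var_low = sum(e ** 2 for e in resid[:half]) / half
var_high = sum(e ** 2 for e in resid[half:]) / (n - half)
print(f"residual spread: low fitted = {var_low:.2f}, high fitted = {var_high:.2f}")
```

A much larger spread in one half, as produced here by construction, is the numeric counterpart of the fan shape in a residual-versus-fitted plot.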
1. Use the data file wage in STATA and answer the following questions.
. describe
. mean ERSP
. mean DEP
. mean RACE
D. Conduct model specification tests using the linktest and ovtest commands of
STATA, and interpret the results.
. linktest
. ovtest
. vif
. hettest
chi2(1) = 5.25
Prob > chi2 = 0.0220
Since the p-value (0.0220) is less than 0.05, the Breusch-Pagan test rejects the null
hypothesis of constant error variance, indicating heteroscedasticity.
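The reported p-value can be cross-checked by hand: for a chi-square statistic with one degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)). A short sketch:

```python
import math

chi2_stat = 5.25   # Breusch-Pagan statistic from the hettest output
# Upper-tail probability of a chi-square(1) variable
p = math.erfc(math.sqrt(chi2_stat / 2))
print(f"Prob > chi2 = {p:.4f}")   # close to the reported 0.0220
```

The tiny difference from STATA's 0.0220 is just rounding of the displayed chi-square statistic.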
These findings suggest that ERNO and NEIN are key predictors of HRS, with ERNO
decreasing its value and NEIN increasing it.
5. Use the data file EARNINGS and, using STATA for analysis, carry out the
following tasks.
a. Perform a regression of EARNINGS on S, where EARNINGS represents
current hourly earnings in $ and S represents education (highest grade
completed) in years of schooling of the respondent. Interpret the
regression results.
. regress EARNINGS S
EARNINGS = -12.922 + 2.45(S): The intercept (-12.922) means that predicted earnings are
-12.922 when schooling is zero; since no respondent has zero schooling, this is an
extrapolation with no direct economic meaning. The slope (2.45) shows that each additional
year of schooling raises hourly earnings by $2.45 on average.
R² = 0.1725 indicates that 17.25% of the total variation in earnings is explained by schooling.
The remaining 82.75% of the variation is due to factors not included in the model and
captured by the error term.
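The reported coefficients come from the usual OLS formulas: slope = cov(S, EARNINGS) / var(S) and intercept = mean(EARNINGS) - slope x mean(S). The sketch below illustrates the computation on an invented mini-sample (NOT the actual EARNINGS file, so the numbers differ from those above).

```python
# Invented mini-sample (NOT the actual EARNINGS data file)
S = [10, 12, 12, 14, 16, 18]                     # years of schooling
EARNINGS = [8.0, 11.5, 14.0, 19.0, 24.5, 31.0]   # hourly earnings in $

n = len(S)
mS, mE = sum(S) / n, sum(EARNINGS) / n
# slope = cov(S, EARNINGS) / var(S); intercept = mean(E) - slope * mean(S)
slope = (sum((s - mS) * (e - mE) for s, e in zip(S, EARNINGS))
         / sum((s - mS) ** 2 for s in S))
intercept = mE - slope * mS
print(f"EARNINGS_hat = {intercept:.2f} + {slope:.2f} * S")
```

As in the STATA output, the intercept is negative because the earnings line extrapolated back to S = 0 lies below zero, while the positive slope gives the average earnings gain per extra year of schooling.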
To test the significance of regression coefficients, we use methods like the standard error
test, t-test, or confidence intervals.
t-Test: Compare the t-calculated (10.59) to the t-critical (2.05). Since 10.59 > 2.05, we
reject H₀: β₁ = 0, indicating the slope coefficient is statistically significant.
Confidence Interval: At 95% confidence, the slope coefficient lies within [1.999, 2.91].
Thus, both the intercept and slope coefficients are statistically significant.
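These figures fit together arithmetically: the implied standard error is the slope divided by its t-statistic, and the 95% interval is slope ± t_c × se. The sketch below uses the numbers reported above with the large-sample critical value 1.96; small rounding differences from the reported [1.999, 2.91] are expected.

```python
# Figures reported in the text above
slope = 2.45
t_stat = 10.59
t_crit = 1.96          # large-sample 5% critical value

se = slope / t_stat    # implied standard error of the slope
lo = slope - t_crit * se
hi = slope + t_crit * se
print(f"se = {se:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```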
D. Perform an F-test for the goodness of fit and comment on the result.
The F-test evaluates the overall significance of the model. The null hypothesis states that all
slope coefficients are equal to zero (H₀: β₁ = 0). Since the F-statistic (112.15) is greater than
the F-critical value (about 4) and the p-value (0.000) is less than 0.05, we reject the null
hypothesis. This indicates that the coefficients are jointly significant, and the model is valid.
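The F-statistic can be recovered from R² alone via F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below assumes n = 540, the sample size implied by the reported F and R²; that figure does not appear in the text and is an assumption here.

```python
r2 = 0.1725   # reported R-squared
k = 1         # number of regressors (S only)
n = 540       # assumed sample size, implied by the reported F and R-squared

f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
print(f"F(1, {n - k - 1}) = {f_stat:.2f}")   # reproduces the reported 112.15
```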
. regress S ASVABC SM
. regress S ASVABC SF
. regress S ASVABC SM SF
The regression results show that the coefficient for SM (mother's education) is statistically
insignificant, meaning that, in this sample, a mother's education does not have a significant
effect on the respondent's schooling (S). On this evidence, we reject the view that "if you
educate a female, you educate a nation." This implies that the mother's education does not
play a large role in shaping the individual's outcome in this case.
We conduct a hypothesis test for the father's education, where:
Null Hypothesis (H₀): "If you educate a male, you don't educate an individual."
Alternative Hypothesis (H₁): "If you educate a male, you educate an individual."
The p-value for SF is 0.001, which is less than the 0.05 significance level. This means we
reject the null hypothesis in favour of the alternative, indicating that educating a male has a
significant positive impact on the individual's outcome.
In conclusion, the results support the idea that father's education plays a more significant
role in shaping an individual's outcome, and the impact of mother's education appears to be
less significant in this context. Thus, the hypothesis "if you educate a male, you educate an
individual" holds, but the idea "if you educate a female, you educate a nation" does not.
The analysis involves testing the significance of the intercept (β0) and two slope coefficients
(β1 and β2) using t-tests:
1. Intercept (β0):
t* = 6.20, which is greater than the critical t-value (tc) of 1.96, so we reject H₀ and conclude
that β0 is statistically significant.
Interpretation: The intercept suggests that when all explanatory variables are zero, earnings
are expected to be -26.48.
2. Slope (β1):
t* = 11.46, which is greater than the critical t-value (tc) of 1.96, so we reject H₀ and conclude
that β1 is statistically significant.
Interpretation: For every unit increase in S, earnings (Y) increase by 2.678 units on average,
holding other factors constant.
3. Slope (β2):
t* = 4.38, which is greater than the critical t-value (tc) of 1.96, so we reject H₀ and conclude
that β2 is statistically significant.
Interpretation: For every unit increase in DEP, earnings (Y) increase by 0.562 units on
average, holding other factors constant.
All coefficients are statistically significant, indicating that the explanatory variables (S and
DEP) have meaningful relationships with earnings (Y).
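All three tests apply the same decision rule: t* = coefficient / se, rejecting H₀ at the 5% level when |t*| > 1.96. The sketch below applies that rule to the t-values reported above:

```python
# Reported t-statistics for the three coefficients
t_crit = 1.96
reported = {"b0 (intercept)": 6.20, "b1 (S)": 11.46, "b2 (DEP)": 4.38}

for name, t_star in reported.items():
    decision = "reject H0" if abs(t_star) > t_crit else "fail to reject H0"
    print(f"{name}: |t*| = {t_star:.2f} -> {decision}")
```

Since every reported |t*| exceeds 1.96, the rule rejects H₀ for the intercept and both slopes, matching the conclusions above.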