
ECON 2P91: Assignment #1



1. Think of the situation of rolling two dice and let M denote the sum of the number of
dots on the two dice. (So M is a number between 2 and 12.)

a. In a table, list all the possible outcomes for the random variable M together with
its probability distribution and cumulative probability distribution. Sketch both
distributions
b. Calculate the expected value and the standard deviation for M

c. Looking at the sketch of the probability distribution, you notice that it resembles a
normal distribution. Should you be able to use the standard normal distribution
to calculate probabilities of events? Why or why not?

Even though the graph resembles a normal curve, we should not use the standard normal
distribution to calculate these probabilities. The normal distribution is continuous,
whereas the sum of two dice is a discrete random variable that takes only the integer
values 2 through 12. The normal curve would therefore give only rough approximations,
while the exact probabilities can be read directly from the table in part (a).
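
The table and the moments asked for in question 1 can be cross-checked with a short script. This is a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative and not part of the assignment.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Count how many of the 36 equally likely (die1, die2) pairs give each sum M.
counts = {}
for d1, d2 in product(range(1, 7), repeat=2):
    counts[d1 + d2] = counts.get(d1 + d2, 0) + 1

# Probability distribution and cumulative distribution as exact fractions.
pmf = {m: Fraction(counts[m], 36) for m in sorted(counts)}
cdf = {}
running = Fraction(0)
for m, p in pmf.items():
    running += p
    cdf[m] = running

# Expected value, variance, and standard deviation of M.
mean = sum(m * p for m, p in pmf.items())
variance = sum((m - mean) ** 2 * p for m, p in pmf.items())
std_dev = sqrt(variance)

print(" M   Pr(M=m)   Pr(M<=m)")
for m in pmf:
    print(f"{m:2d}   {float(pmf[m]):7.4f}   {float(cdf[m]):8.4f}")
print(f"E(M) = {float(mean):.4f}, sd(M) = {std_dev:.4f}")
```

Working in exact fractions avoids rounding drift: the expected value comes out as exactly 7 and the variance as 35/6, so the standard deviation is about 2.415.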

2. By looking at the results of activity #1 above, what is the probability (in percentage) of
the following outcomes?

a. Pr(M = 7)
 ANS: 16.67%
b. Pr(M = 2 or M = 10)
 ANS: 2.78% + 8.33% = 11.11%
c. Pr(M = 4 or M ≠ 4)
 ANS: 100% (the two events together cover every possible outcome)
d. Pr(M < 8)
 ANS: 2.78% + 5.56% + 8.33% + 11.11% + 13.89% + 16.67% = 58.33%
e. Pr(M = 6 or M > 10)
 ANS: 13.89% + 5.56% + 2.78% = 22.22%
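
The answers to question 2 can be verified by counting outcomes directly. The sketch below assumes two fair dice; the helper `prob` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair dice.
sums = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]

def prob(event):
    """Probability that the sum M satisfies the predicate `event`."""
    return Fraction(sum(1 for m in sums if event(m)), len(sums))

answers = {
    "a. Pr(M = 7)": prob(lambda m: m == 7),
    "b. Pr(M = 2 or M = 10)": prob(lambda m: m in (2, 10)),
    "c. Pr(M = 4 or M != 4)": prob(lambda m: m == 4 or m != 4),
    "d. Pr(M < 8)": prob(lambda m: m < 8),
    "e. Pr(M = 6 or M > 10)": prob(lambda m: m == 6 or m > 10),
}

for label, p in answers.items():
    print(f"{label}: {p} = {float(p):.2%}")
```

Because the events in parts (b) and (e) are mutually exclusive, counting outcomes gives the same result as adding the individual probabilities.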
3.
4. By looking at a large Statistics Canada data set (accessed at https://www.statcan.gc.ca/)
with over 60,000 observations for the year 2018, you find that the average number of
years of education is approximately 15.6. However, a surprisingly large number of
individuals (approximately 800) have quite a low value for this variable, namely 6 years
or less. You decide to drop these observations, since none of your relatives or friends
have that few years of education. In addition, you are concerned that if these
individuals cannot report the years of education correctly, then the observations on
other variables, such as their average hourly earnings, can also not be trusted. As a
matter of fact, you have found several of these to be below minimum wages in Ontario
in 2018. Discuss if dropping these odd observations is reasonable.

Looking at a 2018 Statistics Canada data set with over 60,000 observations, we find that the
average number of years of education is approximately 15.6. However, approximately 800
individuals report only 6 years of education or less. In addition, several of these
individuals report average hourly earnings below the 2018 Ontario minimum wage, which
suggests that their answers may have been recorded incorrectly. In my opinion, dropping
these odd observations is reasonable. These observations are outliers, which can skew the
data, and skewness pulls the mean away from the typical value. Dropping them should
therefore give a more accurate estimate of the average.

5. IQs of individuals are normally distributed with a mean of 100 and a standard
deviation of 16. If you sample Brock University students and assume, as the null
hypothesis, that they have the same IQ as the population, then find the following
probabilities on given samples:

a. n = 25, find Pr(Ȳ < 105). (3 Marks)

\[
\Pr(\bar{Y} < 105) = \Pr\left(Z < \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}}\right) = \Pr\left(Z < \frac{105 - 100}{16/\sqrt{25}}\right) = \Pr(Z < 1.56) = 0.9406
\]

b. n = 100, find Pr(Ȳ > 97). (3 Marks)

\[
\Pr(\bar{Y} > 97) = \Pr\left(Z > \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}}\right) = \Pr\left(Z > \frac{97 - 100}{16/\sqrt{100}}\right) = \Pr(Z > -1.88) = 1 - 0.0301 = 0.9699
\]

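c. n = 144, find Pr(101 < Ȳ < 103). (4 Marks)

\[
\Pr(101 < \bar{Y} < 103) = \Pr\left(\frac{\bar{Y}_1 - \mu}{\sigma/\sqrt{n}} < Z < \frac{\bar{Y}_2 - \mu}{\sigma/\sqrt{n}}\right) = \Pr\left(\frac{101 - 100}{16/\sqrt{144}} < Z < \frac{103 - 100}{16/\sqrt{144}}\right) = \Pr(0.75 < Z < 2.25) = 0.9878 - 0.7734 = 0.2144
\]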
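
The three probabilities in question 5 can be cross-checked without a z-table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper `z_score` and the variable names are assumptions made for illustration. It uses unrounded z-values, so the results can differ from table-based answers in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 100, 16   # population mean and standard deviation from the question
Z = NormalDist()      # standard normal distribution (mean 0, sd 1)

def z_score(y_bar, n):
    """Standardize a sample mean: (y_bar - mu) / (sigma / sqrt(n))."""
    return (y_bar - MU) * sqrt(n) / SIGMA

# a. n = 25, Pr(Y-bar < 105)
pr_a = Z.cdf(z_score(105, 25))

# b. n = 100, Pr(Y-bar > 97)
pr_b = 1 - Z.cdf(z_score(97, 100))

# c. n = 144, Pr(101 < Y-bar < 103)
pr_c = Z.cdf(z_score(103, 144)) - Z.cdf(z_score(101, 144))

print(f"a. {pr_a:.4f}")
print(f"b. {pr_b:.4f}")
print(f"c. {pr_c:.4f}")
```

The exact z-values are 1.5625, -1.875, 0.75 and 2.25, so parts (a) and (b) land within about 0.001 of the answers obtained by rounding z to two decimals first.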

6. Review the excel file cps04-assign01.xls for this assignment. Using GRETL, show the
summary statistics for the variables AHE & AGE respectively.

7. By reviewing the results of activity #6 above, comment on the coefficients of Kurtosis &
skewness for variables AHE & AGE, respectively.

For AHE Variable:


Skewness Coefficient: 1.2963
• A positive skewness coefficient indicates that the data is skewed to the right
• Since the coefficient is greater than 1 (it is 1.2963), the data is highly skewed
• Because the data is positively skewed, we expect mean > median > mode

Kurtosis Coefficient: 1.8921
• A positive kurtosis coefficient indicates that the distribution has heavier tails than
the normal distribution
• Since the kurtosis coefficient is positive, the curve is also more peaked than a normal
distribution (excess kurtosis of a normal distribution = 0)

For AGE Variable:


Skewness Coefficient: -0.12766
• A negative skewness coefficient indicates that the data is skewed to the left, although
only slightly here, since the coefficient is close to 0
• Because the data is negatively skewed, we expect mean < median < mode

Kurtosis Coefficient: -1.1940
• A negative kurtosis coefficient indicates that the distribution has lighter tails than
the normal distribution
• Since the kurtosis coefficient is negative, the curve is also flatter than a normal
distribution, which has an excess kurtosis of 0
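
To make the two coefficients concrete, the sketch below computes moment-based skewness and excess kurtosis for a small sample. The formulas (third and fourth central moments scaled by powers of the variance) are the textbook definitions and are assumed here; GRETL may apply a different small-sample adjustment. The data values are hypothetical and are not taken from cps04-assign01.xls.

```python
from statistics import fmean

def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n   # variance
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3              # 0 for a normal distribution
    return skew, excess_kurtosis

# Hypothetical right-skewed sample (a few large values, like hourly earnings).
earnings_like = [8.0, 9.5, 10.0, 11.0, 12.5, 14.0, 16.0, 21.0, 35.0]
# Hypothetical symmetric, evenly spread sample.
evenly_spread = [1, 2, 3, 4, 5]

print(skewness_and_excess_kurtosis(earnings_like))
print(skewness_and_excess_kurtosis(evenly_spread))
```

The evenly spread sample gives zero skewness and a negative excess kurtosis, while the sample with a long right tail gives a positive skewness. These are the same sign patterns as those reported for AGE and AHE.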

8. Given the outcome in activity #6 above, graph the distribution of the two variables AGE
& AHE, respectively. Interpret the shape of the tails of the two variables AGE & AHE.

Looking at the shape of the tails of the two variables, AGE and AHE, the two graphs above
match the results from questions 6 and 7. As noted under question 7, AGE has a negative
skewness and a negative kurtosis, indicating a distribution with lighter tails. This is
what the graph on the left shows. The graph of AHE shows a positive skewness and a
positive kurtosis, indicating a distribution with heavy tails, as shown in the graph on
the right.
9. What is the estimated slope of the regression line? Comment
about this slope

Linear Regression Equation: AHE = −4.66024 + 0.823615 AGE + μi

From the equation above, the estimated slope of the regression line is 0.823615. This means
that AHE (average hourly earnings) is estimated to increase by 0.823615 units, on average,
for each additional year of AGE. In simple terms, older workers in this sample tend to have
higher hourly earnings.
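
The interpretation of the slope can be illustrated by plugging ages into the fitted line. This is a minimal sketch using only the two reported coefficients; the function name `predicted_ahe` and the example ages are hypothetical.

```python
INTERCEPT = -4.66024   # estimated intercept reported above
SLOPE = 0.823615       # estimated coefficient on AGE reported above

def predicted_ahe(age):
    """Fitted value of AHE for a given AGE (error term omitted)."""
    return INTERCEPT + SLOPE * age

for age in (25, 26, 30):
    print(f"AGE = {age}: predicted AHE = {predicted_ahe(age):.5f}")

# Moving from one age to the next changes the prediction by exactly the slope.
one_year_change = predicted_ahe(26) - predicted_ahe(25)
print(f"Change for one extra year of AGE: {one_year_change:.6f}")
```

The negative intercept has no practical meaning, since it is the fitted value at AGE = 0, which is far outside the ages of workers in the sample.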

10. Test the null hypothesis of H0: β1 = 0, against the alternative of Ha: β1 ≠ 0. Deploy (i)
the pre-specified level of significance approach and (ii) the p-value approach. Make
sure you interpret your findings. Consider 5% level of significance on both methods of
your testing procedure.

H0: β1 = 0 (Null Hypothesis)

Ha: β1 ≠ 0 (Alternative Hypothesis)

First: use the t-test to test the null hypothesis:

• Calculated t-statistic = 5.148
• Since the sample is large (around 500 observations), we can use the standard normal (z)
distribution. Therefore, the critical value for a two-tailed test at the 5% (0.05) level
of significance is 1.96.
• Since the calculated t-statistic exceeds the critical value (5.148 > 1.96), we can reject
the null hypothesis

Second: use the p-value to test the null hypothesis:

• Calculated p-value is < 0.0001
• CRITERIA:
o p-value > level of significance (we do not reject the null hypothesis)
o p-value < level of significance (we reject the null hypothesis)
• Since the p-value is < 0.0001, which is less than the level of significance (0.05), we
can reject the null hypothesis

In conclusion, both approaches reject the null hypothesis at the 5% level of significance,
so the slope coefficient on AGE is statistically different from zero.
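
Both decision rules can be reproduced from the reported t-statistic alone. The sketch below assumes the large-sample normal approximation used in the answer and relies on `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

T_STAT = 5.148   # t-statistic reported above
ALPHA = 0.05     # 5% level of significance
Z = NormalDist() # standard normal, used as the large-sample approximation

# (i) Pre-specified significance level: compare |t| with the two-tailed critical value.
critical_value = Z.inv_cdf(1 - ALPHA / 2)
reject_by_critical_value = abs(T_STAT) > critical_value

# (ii) p-value approach: two-tailed probability of a statistic at least this extreme.
p_value = 2 * (1 - Z.cdf(abs(T_STAT)))
reject_by_p_value = p_value < ALPHA

print(f"critical value = {critical_value:.2f}, reject H0: {reject_by_critical_value}")
print(f"p-value = {p_value:.2e}, reject H0: {reject_by_p_value}")
```

The two approaches always agree at a given significance level, because the p-value falls below 0.05 exactly when the absolute t-statistic exceeds the two-tailed critical value.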
