
ProblemSet1 Correction

This document provides corrections and explanations for problems in an econometrics problem set covering descriptive and correlation analysis. It includes: 1) An econometric model to study the relationship between class size and student performance, and reasons we might expect a negative correlation. 2) Issues with a substitute teacher sampling the top 10 students to assess class performance. 3) Identifying time series, cross-sectional, and panel data. 4) Proving the sample variance is a biased estimator of the population variance and how bias decreases with larger samples. 5) Comparing the efficiency and sufficiency of two estimators of population mean using four observations. 6) Calculating correlation between unemployment and inflation rates

Uploaded by

Marcelo Miranda

ECONOMETRICS: PS1

PROBLEM SET 1: DESCRIPTIVE AND CORRELATION ANALYSIS
CORRECTION

1.

Suppose you are asked to conduct a study to determine whether small class sizes lead to
increased student performance.
a) Postulate an econometric model that allows you to conduct this study.
The econometric model that allows us to conduct this study reads as follows:
$$\text{performance}_i = \beta_0 + \beta_1 \, \text{classsize}_i + u_i$$
where $i = 1, \dots, n$ and $n$ is the sample size (in this case, the number of students),
$\beta_0$ is the intercept of the model and $\beta_1$ is the coefficient of class size.
Variables:
$\text{performance}_i$ could be measured with the GPA or the grade in a given test for
each student, depending on data availability.
$\text{classsize}_i$ is usually measured with the number of students belonging to the class.
The error term, $u_i$, could include other relevant variables that affect students'
performance, such as study hours, teacher performance, personal problems,
ability, family background, etcetera.
Note: $\text{classsize}_i$ takes the same value for all students belonging to the same class.
b) Why might you expect a negative correlation between class size and student
performance?
We would expect $\beta_1 < 0$: on average, students in larger classes tend to perform
worse, ceteris paribus, and vice versa.
Why? Larger classes tend to be noisier, and it is more likely that some students disrupt
the rhythm of the class and disturb their classmates. It is also easier for the teacher to
follow each student personally when the class is small. All these factors can
affect students' performance.

c) Would a negative correlation necessarily show that smaller class sizes cause better
performance? Explain.
Correlation does not necessarily imply causality: we would need to extend the
model by adding more explanatory factors that help us approximate the ceteris
paribus effect of class size on student performance.
Literature: The effects of class size on academic achievement have been studied for
decades. Although the results of small-scale randomized experiments and large-scale
econometric studies point to positive effects of small classes, some scholars
have seen the evidence as ambiguous. There could be factors unobserved by the
econometrician that make this effect just a pure correlation.
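
The argument in c) can be illustrated with a small simulation. This is only a sketch under invented assumptions: an unobserved factor (called `resources` here, a hypothetical name) lowers class size and raises scores, while class size has no direct effect on scores at all. All coefficients are made up for illustration and are not estimates from any data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_classes=2000, seed=0):
    """Correlation between class size and score when only a confounder links them."""
    rng = random.Random(seed)
    sizes, scores = [], []
    for _ in range(n_classes):
        resources = rng.gauss(0, 1)                         # unobserved by the econometrician
        size = 25 - 5 * resources + rng.gauss(0, 2)         # richer schools: smaller classes
        score = 60 + 8 * resources + rng.gauss(0, 5)        # score does NOT depend on size
        sizes.append(size)
        scores.append(score)
    return pearson(sizes, scores)


print(round(simulate(), 2))
```

In this artificial setting the correlation comes out clearly negative even though, by construction, changing the class size would not change any score.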

2.

A substitute teacher wants to know how students in the class did on their last test. The
teacher asks the 10 students sitting in the front row to state their latest test score. He
concludes from their report that the class did extremely well.
a) What is the sample? What is the population?
The population is the whole class (we do not know the total number of students in that
class), and the sample is the 10 students seated in the front row.
b) Could you identify any problems with choosing the sample in this way?
The substitute teacher chooses a non-random sample of the population. The main
problem in this particular example is that the teacher probably overestimates
the average performance of the class by asking only the students seated in the front
row. Perhaps those students are more interested in the course, work harder and
exhibit better performance.
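
A minimal sketch of this selection problem, under assumptions that are ours and not part of the exercise: each student has a `motivation` level, more motivated students both score higher and sit closer to the front, and the class has 40 students. The numbers are purely illustrative.

```python
import random


def front_row_vs_class(class_size=40, front_row=10, seed=1):
    """Return (mean score of the front row, mean score of the whole class)."""
    rng = random.Random(seed)
    students = []
    for _ in range(class_size):
        motivation = rng.gauss(0, 1)                       # hypothetical trait
        score = 65 + 10 * motivation + rng.gauss(0, 5)     # motivated students score higher
        students.append((motivation, score))
    # Assumption: the most motivated students occupy the front row.
    students.sort(key=lambda s: s[0], reverse=True)
    front = [score for _, score in students[:front_row]]
    everyone = [score for _, score in students]
    return sum(front) / len(front), sum(everyone) / len(everyone)


front_mean, class_mean = front_row_vs_class()
print(round(front_mean, 1), round(class_mean, 1))
```

Under these assumptions the front-row average exceeds the class average, which is the overestimation described in b). A random sample of 10 students would not suffer from this problem on average.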

3.

Determine whether the following data correspond to time series, cross-sectional or
panel data.
a) Italian monthly salaries in the time period 1980-2014: TIME SERIES
b) Inflation rate in each of the OECD countries in 2010: CROSS SECTION
c) R&D expenditure in each of the European Union member states in 2013: CROSS SECTION
d) Automobile production in France, Italy and Spain in the time period 1980-2010: PANEL DATA
e) Monthly water consumption during the 20th century in the city of Madrid: TIME SERIES

4.

Answer the following questions:


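a) Analytically show that the sample variance
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
is a BIASED estimator of the population variance, $\sigma^2$. (Hint: look at the proof we
did in class).

Denote the sum of squared deviations by $(*)$, and add and subtract the population mean $\mu$:
$$(*) = \sum_{i=1}^{n} (X_i - \bar{X})^2$$
$$(*) = \sum_{i=1}^{n} \big( (X_i - \mu) - (\bar{X} - \mu) \big)^2$$
$$(*) = \sum_{i=1}^{n} \big( (X_i - \mu)^2 + (\bar{X} - \mu)^2 - 2 (X_i - \mu)(\bar{X} - \mu) \big)$$
$$(*) = \sum_{i=1}^{n} (X_i - \mu)^2 + n (\bar{X} - \mu)^2 - 2 (\bar{X} - \mu) \sum_{i=1}^{n} (X_i - \mu)$$
Since $\sum_{i=1}^{n} (X_i - \mu) = n (\bar{X} - \mu)$:
$$(*) = \sum_{i=1}^{n} (X_i - \mu)^2 + n (\bar{X} - \mu)^2 - 2 n (\bar{X} - \mu)^2$$
$$(*) = \sum_{i=1}^{n} (X_i - \mu)^2 - n (\bar{X} - \mu)^2$$
And now we take the expected value of the asterisk, using $E[(X_i - \mu)^2] = \sigma^2$ and $E[(\bar{X} - \mu)^2] = Var(\bar{X}) = \sigma^2 / n$:
$$E(*) = E\Big[ \sum_{i=1}^{n} (X_i - \mu)^2 - n (\bar{X} - \mu)^2 \Big]$$
$$E(*) = n \sigma^2 - n \frac{\sigma^2}{n}$$
$$E(*) = (n - 1) \sigma^2$$
Finally,
$$E(\hat{\sigma}^2) = \frac{1}{n} E(*) = \frac{n - 1}{n} \sigma^2$$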
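b) Find the BIAS of this estimator and explain the effect on the bias as we increase
the sample size.
$$Bias(\hat{\sigma}^2) = E(\hat{\sigma}^2) - \sigma^2 = \frac{n - 1}{n} \sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}$$
As we increase the sample size, $n$ tends to infinity and the bias tends to zero. In
mathematical terms:
$$\lim_{n \to \infty} \left( -\frac{\sigma^2}{n} \right) = 0$$
Therefore, the estimator is asymptotically unbiased. Since its variance also shrinks to
zero as $n$ grows (assuming a finite fourth moment), it is consistent.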
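
The results in a) and b) can be checked exactly on a toy population. The sketch below is an illustration of ours, not part of the original correction: it assumes the population is a fair six-sided die (so $\sigma^2 = 35/12$) and enumerates every equally likely sample of size $n$ drawn with replacement, using exact fractions instead of simulation.

```python
from fractions import Fraction
from itertools import product

DIE = [1, 2, 3, 4, 5, 6]  # assumed toy population: a fair die


def population_variance(values):
    """Variance of a population whose values are all equally likely."""
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** 2 for v in values) / len(values)


def expected_biased_variance(values, n):
    """Exact E[(1/n) * sum (x_i - xbar)^2] over all i.i.d. samples of size n."""
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        total += sum((x - xbar) ** 2 for x in sample) / n
        count += 1
    return total / count


sigma2 = population_variance(DIE)
for n in range(1, 5):
    bias = expected_biased_variance(DIE, n) - sigma2
    # Each bias should equal -sigma2 / n, shrinking towards zero as n grows.
    print(n, bias, bias == -sigma2 / n)
```

For this toy population the enumerated bias matches $-\sigma^2/n$ for every $n$ tried, and its absolute value falls as the sample size increases.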

5.

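Let $X$ be a random variable with $E(X) = \mu$ and $Var(X) = \sigma^2$.

We have information about four independent observations: $X_1, X_2, X_3, X_4$.

Let $\hat{\mu}_1 = \frac{1}{4} (X_1 + X_2 + X_3 + X_4)$ be an estimator for the population mean.

a) What are the expected value and variance of $\hat{\mu}_1$ in terms of $\mu$ and $\sigma^2$?
$$E(\hat{\mu}_1) = E\Big[ \frac{1}{4} (X_1 + X_2 + X_3 + X_4) \Big]$$
$$E(\hat{\mu}_1) = \frac{1}{4} [E(X_1) + E(X_2) + E(X_3) + E(X_4)]$$
$$E(\hat{\mu}_1) = \frac{1}{4} (\mu + \mu + \mu + \mu) = \mu$$

$$Var(\hat{\mu}_1) = \frac{1}{16} Var(X_1 + X_2 + X_3 + X_4)$$
Because the observations are independent, the variance of the sum is the sum of the variances:
$$Var(\hat{\mu}_1) = \frac{1}{16} [Var(X_1) + Var(X_2) + Var(X_3) + Var(X_4)]$$
$$Var(\hat{\mu}_1) = \frac{1}{16} (\sigma^2 + \sigma^2 + \sigma^2 + \sigma^2) = \frac{1}{4} \sigma^2$$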

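b) Now consider a second estimator for the population mean, $\hat{\mu}_2$, defined as:
$$\hat{\mu}_2 = \frac{1}{8} X_1 + \frac{1}{8} X_2 + \frac{1}{4} X_3 + \frac{1}{2} X_4$$
Show that this second estimator is also an unbiased estimator for the population
mean. Find its variance.
$$E(\hat{\mu}_2) = E\Big( \frac{1}{8} X_1 + \frac{1}{8} X_2 + \frac{1}{4} X_3 + \frac{1}{2} X_4 \Big)$$
$$E(\hat{\mu}_2) = \frac{1}{8} E(X_1) + \frac{1}{8} E(X_2) + \frac{1}{4} E(X_3) + \frac{1}{2} E(X_4)$$
$$E(\hat{\mu}_2) = \frac{1}{8} \mu + \frac{1}{8} \mu + \frac{1}{4} \mu + \frac{1}{2} \mu = \mu$$
$\hat{\mu}_2$ is also an unbiased estimator of the population mean since $E(\hat{\mu}_2) = \mu$.

$$Var(\hat{\mu}_2) = Var\Big( \frac{1}{8} X_1 + \frac{1}{8} X_2 + \frac{1}{4} X_3 + \frac{1}{2} X_4 \Big)$$
$$Var(\hat{\mu}_2) = \frac{1}{64} Var(X_1) + \frac{1}{64} Var(X_2) + \frac{1}{16} Var(X_3) + \frac{1}{4} Var(X_4)$$
$$Var(\hat{\mu}_2) = \frac{1}{64} (\sigma^2 + \sigma^2 + 4 \sigma^2 + 16 \sigma^2) = \frac{11}{32} \sigma^2$$
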
c) Discuss both the efficiency and sufficiency of both estimators.

Efficiency: The second estimator shows higher variance than the first one:
$$\frac{11}{32} \sigma^2 > \frac{1}{4} \sigma^2$$
In terms of their relative efficiency we can compute the ratio of their variances:
$$\frac{Var(\hat{\mu}_2)}{Var(\hat{\mu}_1)} = \frac{11/32}{1/4} = 1.375$$
Sufficiency: both of them are sufficient in the sense that they use all the
information available, $X_1, X_2, X_3, X_4$.
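
The calculations in a) to c) all follow the same pattern: for a linear estimator $\sum w_i X_i$ of independent observations with common mean $\mu$ and variance $\sigma^2$, the expected value is $(\sum w_i) \mu$ and the variance is $(\sum w_i^2) \sigma^2$. A short sketch with exact fractions (the function name `linear_estimator_moments` is ours, for illustration only):

```python
from fractions import Fraction


def linear_estimator_moments(weights):
    """Return (coefficient on mu in E, coefficient on sigma^2 in Var).

    Assumes independent observations with a common mean and variance.
    """
    mean_coef = sum(weights)
    var_coef = sum(w * w for w in weights)
    return mean_coef, var_coef


w1 = [Fraction(1, 4)] * 4
w2 = [Fraction(1, 8), Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)]

m1, v1 = linear_estimator_moments(w1)
m2, v2 = linear_estimator_moments(w2)

print(m1, v1)    # 1 1/4   -> unbiased, variance sigma^2 / 4
print(m2, v2)    # 1 11/32 -> unbiased, variance 11 sigma^2 / 32
print(v2 / v1)   # 11/8, i.e. 1.375
```

Any set of weights adding up to one gives an unbiased estimator; the equal weights of the first estimator are the ones with the smallest sum of squares.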

d) Based on all your previous answers, which estimator do you prefer?


I will choose the first estimator given that among the unbiased estimators of this
example, it is the one with the lowest variance (or the most efficient).

6.

In the table below, U denotes the unemployment rate and I the inflation rate in six
American countries in 2012.

Country      U (%)    I (%)
Mexico        5.2      3.4
Argentina     7.2      9.5
Brazil        6.0      6.6
Chile         6.6      3.3
Colombia     10.8      3.4
Venezuela     8.2     26.1

a) Determine whether the data is a time series, a cross-sectional data or panel data.
Cross-sectional data
b) Draw a graph with the data of the table.
[Figure: scatter plot of the inflation rate (%) against the unemployment rate (%), one labelled point per country (ME, AR, BR, CH, CO, VE).]

c) Calculate the correlation coefficient (U, I) and interpret your result.

                      U        I
Mean                 7.33     8.72
Variance             3.93    78.63
Standard deviation   1.98     8.87

Covariance (U, I) = 2.93
Correlation coefficient (U, I) = 0.167
The correlation seems to be positive but weak. From a macro perspective we would
have expected a negative correlation (Phillips curve). Venezuela is distorting the
relationship: at that time it was going through a period of very high inflation, so
it is different from the rest of the countries. We should treat it as an outlier and
drop it from the analysis.
d) Calculate a new correlation coefficient eliminating the Venezuelan observation and
interpret your result.
[Figure: scatter plot of the inflation rate (%) against the unemployment rate (%) without Venezuela, one labelled point per country (ME, AR, BR, CH, CO).]

Covariance (U, I) = -0.85
Correlation coefficient (U, I) = -0.143
After dropping Venezuela we get the expected sign in the correlation coefficient,
even though the relationship is still weak; this may be due to the small sample size
of the example.
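
The figures in c) and d) can be reproduced from the table with a few lines of code. This is a sketch that assumes, as the reported variances suggest, that the sample formulas with divisor $n - 1$ were used; the helper names are ours.

```python
unemployment = [5.2, 7.2, 6.0, 6.6, 10.8, 8.2]   # ME, AR, BR, CH, CO, VE
inflation = [3.4, 9.5, 6.6, 3.3, 3.4, 26.1]


def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_corr(xs, ys):
    """Correlation coefficient: covariance over the product of standard deviations."""
    return sample_cov(xs, ys) / (sample_cov(xs, xs) * sample_cov(ys, ys)) ** 0.5


# All six countries.
cov_all = sample_cov(unemployment, inflation)
corr_all = sample_corr(unemployment, inflation)

# Dropping Venezuela, the last observation in both lists.
cov_no_ve = sample_cov(unemployment[:-1], inflation[:-1])
corr_no_ve = sample_corr(unemployment[:-1], inflation[:-1])

print(round(cov_all, 2), round(corr_all, 3))
print(round(cov_no_ve, 2), round(corr_no_ve, 3))
```

With all six countries the correlation is about 0.17; without Venezuela it turns to about -0.14, so a single observation is enough to flip the sign in a sample this small.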
