ProblemSet1 Correction
1.
Suppose you are asked to conduct a study to determine whether small class sizes lead to
increased student performance.
a) Postulate an econometric model that allows you to conduct this study.
An econometric model that allows us to conduct this study is:
$$\text{performance}_i = \beta_0 + \beta_1\,\text{classsize}_i + u_i$$
where $i = 1, \dots, n$, and $n$ is the sample size (in this case, the number of students).
$\beta_0$ is the intercept of the model, $\beta_1$ is the coefficient of class size, and $u_i$ is the error term.
Variables:
$\text{performance}_i$ could be measured with the GPA or the grade in a given test for
ECONOMETRICS: PS1
c) Would a negative correlation necessarily show that smaller class sizes cause better
performance? Explain.
Correlation does not necessarily imply causality: we would need to extend the
model by adding more explanatory factors that help us approximate the ceteris
paribus effect of class size on student performance.
Literature: The effects of class size on academic achievement have been studied for
decades. Although the results of small-scale randomized experiments and large-scale
econometric studies point to positive effects of small classes, some scholars
have seen the evidence as ambiguous. There could be factors unobserved by the
econometrician that make this effect a pure correlation.
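The point about unobserved factors can be illustrated with a small simulation. The sketch below is not part of the original solution: it assumes a hypothetical data-generating process in which class size has no causal effect on scores, while an invented variable `funding` lowers class size and raises scores. The control step uses the Frisch-Waugh residual regression, which gives the same class-size coefficient as a multiple regression that includes `funding`.

```python
import random

def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(y, x):
    """Residuals from a simple regression of y on x (with intercept)."""
    b = ols_slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

random.seed(0)
n = 5000
# Hypothetical process (an assumption): class size does NOT affect scores,
# but better-funded schools have smaller classes and higher scores.
funding = [random.gauss(0, 1) for _ in range(n)]
class_size = [25 - 3 * f + random.gauss(0, 2) for f in funding]
score = [70 + 5 * f + random.gauss(0, 5) for f in funding]

# Simple regression of score on class size: picks up the funding effect.
naive = ols_slope(class_size, score)

# Controlling for funding: regress residualized score on residualized class size.
controlled = ols_slope(residuals(class_size, funding), residuals(score, funding))

print("slope without control:", round(naive, 3))
print("slope with control:   ", round(controlled, 3))
```

Under these assumptions the simple regression shows a clearly negative slope even though the causal effect is zero by construction, and the slope moves close to zero once `funding` is controlled for.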
2.
A substitute teacher wants to know how students in the class did on their last test. The
teacher asks the 10 students sitting in the front row to state their latest test score. He
concludes from their report that the class did extremely well.
a) What is the sample? What is the population?
The population is the whole class (we don't know the total number of students in that
class), and the selected sample is the 10 students seated in the front row.
b) Could you identify any problems with choosing the sample in this way?
The substitute teacher chooses a non-random sample of the population. The main
problem that can arise in this particular example is that he probably overestimates
the average performance of the class by asking only the students seated in the front
row. Perhaps those students are more interested in the course, work harder and
perform better.
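A minimal sketch of this selection problem, using invented scores for a hypothetical class of 40 students. It makes the deliberately extreme assumption that the 10 best-performing students are exactly the ones in the front row, and compares that sample with a simple random sample of the same size.

```python
import random

random.seed(1)

# Hypothetical class of 40 students with invented test scores.
class_scores = [random.gauss(65, 12) for _ in range(40)]

# Extreme assumption for illustration: the front row holds the 10 best students.
front_row = sorted(class_scores, reverse=True)[:10]

# A simple random sample of the same size, for comparison.
random_sample = random.sample(class_scores, 10)

class_mean = sum(class_scores) / len(class_scores)
front_mean = sum(front_row) / len(front_row)
random_mean = sum(random_sample) / len(random_sample)

print("class mean:         ", round(class_mean, 1))
print("front-row mean:     ", round(front_mean, 1))
print("random-sample mean: ", round(random_mean, 1))
```

The front-row mean is above the class mean by construction here; how large the gap is in a real class depends on how strongly seating is related to performance.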
3.
Determine whether the following data correspond to time series, cross sectional or
panel data.
a) Italian monthly salaries in the time period 1980-2014: TIME SERIES
b) Inflation rate in each of the OECD countries in 2010: CROSS SECTION
c) R&D expenditure in each of the European Union member states in 2013: CROSS SECTION
d) Automobile production in France, Italy and Spain in the time period 1980-2010: PANEL DATA
e) Monthly water consumption during the 20th century in the city of Madrid: TIME SERIES
4.
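$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)^2$$
$$E(n\hat\sigma^2) = E\sum_{i=1}^{n}(X_i - \bar X)^2$$
$$E(n\hat\sigma^2) = E\sum_{i=1}^{n}\big((X_i - \mu) - (\bar X - \mu)\big)^2$$
$$E(n\hat\sigma^2) = E\sum_{i=1}^{n}\big((X_i - \mu)^2 + (\bar X - \mu)^2 - 2(X_i - \mu)(\bar X - \mu)\big)$$
$$E(n\hat\sigma^2) = E\Big[\sum_{i=1}^{n}(X_i - \mu)^2 + n(\bar X - \mu)^2 - 2(\bar X - \mu)\sum_{i=1}^{n}(X_i - \mu)\Big]$$
Since $\sum_{i=1}^{n}(X_i - \mu) = n(\bar X - \mu)$:
$$E(n\hat\sigma^2) = E\Big[\sum_{i=1}^{n}(X_i - \mu)^2 + n(\bar X - \mu)^2 - 2n(\bar X - \mu)^2\Big]$$
$$E(n\hat\sigma^2) = E\Big[\sum_{i=1}^{n}(X_i - \mu)^2 - n(\bar X - \mu)^2\Big]$$
$$E(n\hat\sigma^2) = \sum_{i=1}^{n}\big[E(X_i - \mu)^2 - E(\bar X - \mu)^2\big]$$
For an i.i.d. sample, $E(X_i - \mu)^2 = \sigma^2$ and $E(\bar X - \mu)^2 = Var(\bar X) = \sigma^2/n$, so:
$$E(n\hat\sigma^2) = n\sigma^2 - \sigma^2$$
$$E(n\hat\sigma^2) = (n - 1)\sigma^2$$
Finally,
$$E(\hat\sigma^2) = \frac{n-1}{n}\sigma^2$$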
b) Find the BIAS of this estimator and explain the effect on the bias as we increase
the sample size.
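$$Bias(\hat\sigma^2) = E(\hat\sigma^2) - \sigma^2 = \frac{n-1}{n}\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}$$
As we increase the sample size, $n$ tends to infinity and the bias tends to zero. In
mathematical terms:
$$\lim_{n \to \infty}\left(-\frac{\sigma^2}{n}\right) = 0$$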
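The bias formula can be checked exactly on a toy case. The sketch below assumes a hypothetical population that is uniform on {0, 1, 2} (not taken from the problem set) and enumerates every possible i.i.d. sample of size n, so the expectation of the estimator with the 1/n factor is computed without simulation error. The function name `expected_variance_estimator` is illustrative.

```python
from fractions import Fraction
from itertools import product

def expected_variance_estimator(values, n):
    """Exact E[(1/n) * sum (X_i - Xbar)^2] for i.i.d. draws uniform on `values`."""
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):  # every equally likely sample
        xbar = Fraction(sum(sample), n)
        total += sum((x - xbar) ** 2 for x in sample) / n
        count += 1
    return total / count

# Hypothetical population (an assumption): uniform on {0, 1, 2}.
values = [0, 1, 2]
mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

for n in (2, 3, 4, 5):
    bias = expected_variance_estimator(values, n) - sigma2
    # Columns: n, exact bias from enumeration, theoretical bias -sigma^2 / n
    print(n, bias, -sigma2 / n)
```

The two bias columns agree for each n, and the bias shrinks in absolute value as n grows.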
5.
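$$Var(\hat\mu_1) = Var\Big(\frac{1}{4}(X_1 + X_2 + X_3 + X_4)\Big) = \frac{1}{16}Var(X_1 + X_2 + X_3 + X_4)$$
$$Var(\hat\mu_1) = \frac{1}{16}\big[Var(X_1) + Var(X_2) + Var(X_3) + Var(X_4)\big]$$
$$Var(\hat\mu_1) = \frac{1}{16}(\sigma^2 + \sigma^2 + \sigma^2 + \sigma^2) = \frac{1}{4}\sigma^2$$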
b) Now consider a second estimator for the population mean, $\hat\mu_2$, defined as:
$$\hat\mu_2 = \frac{1}{8}X_1 + \frac{1}{8}X_2 + \frac{1}{4}X_3 + \frac{1}{2}X_4$$
Show that this second estimator is also an unbiased estimator for the population
mean. Find its variance.
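$$E(\hat\mu_2) = E\Big(\frac{1}{8}X_1 + \frac{1}{8}X_2 + \frac{1}{4}X_3 + \frac{1}{2}X_4\Big)$$
$$E(\hat\mu_2) = \frac{1}{8}E(X_1) + \frac{1}{8}E(X_2) + \frac{1}{4}E(X_3) + \frac{1}{2}E(X_4)$$
$$E(\hat\mu_2) = \frac{1}{8}\mu + \frac{1}{8}\mu + \frac{1}{4}\mu + \frac{1}{2}\mu = \mu$$
$\hat\mu_2$ is also an unbiased estimator of the population mean since $E(\hat\mu_2) = \mu$.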
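$$Var(\hat\mu_2) = Var\Big(\frac{1}{8}X_1 + \frac{1}{8}X_2 + \frac{1}{4}X_3 + \frac{1}{2}X_4\Big)$$
$$Var(\hat\mu_2) = \frac{1}{64}Var(X_1) + \frac{1}{64}Var(X_2) + \frac{1}{16}Var(X_3) + \frac{1}{4}Var(X_4)$$
$$Var(\hat\mu_2) = \frac{1}{64}(\sigma^2 + \sigma^2 + 4\sigma^2 + 16\sigma^2) = \frac{11}{32}\sigma^2$$
$$Var(\hat\mu_2) = \frac{11}{32}\sigma^2 > Var(\hat\mu_1) = \frac{\sigma^2}{4}$$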
In terms of their relative efficiency we can compute the ratio of their variances.
$$\frac{Var(\hat\mu_2)}{Var(\hat\mu_1)} = \frac{11/32}{1/4} = 1.375$$
Sufficiency: both of them are sufficient in the sense that they use all the
information available, $X_1, X_2, X_3, X_4$.
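Both estimators are weighted sums of the four observations, so the checks above reduce to arithmetic on the weights: for i.i.d. observations, the weights must add to 1 for unbiasedness, and the variance is $\sigma^2$ times the sum of squared weights. The sketch below does that arithmetic with exact fractions; the helper name `weight_sum_and_variance_factor` is illustrative and not from the problem set.

```python
from fractions import Fraction as F

def weight_sum_and_variance_factor(weights):
    """For mu_hat = sum(w_i * X_i) with i.i.d. X_i:
    E(mu_hat) = mu * sum(w_i) and Var(mu_hat) = sigma^2 * sum(w_i^2)."""
    return sum(weights), sum(w * w for w in weights)

w1 = [F(1, 4)] * 4                          # first estimator: equal weights
w2 = [F(1, 8), F(1, 8), F(1, 4), F(1, 2)]   # second estimator

s1, v1 = weight_sum_and_variance_factor(w1)
s2, v2 = weight_sum_and_variance_factor(w2)
ratio = v2 / v1

print(s1, v1)               # 1 1/4
print(s2, v2)               # 1 11/32
print(ratio, float(ratio))  # 11/8 1.375
```

Both weight sums equal 1, so both estimators are unbiased, and the equally weighted estimator has the smaller variance factor.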
6.
In the table below, U denotes the unemployment rate and I the inflation rate in six
American countries in 2012.

Country      U      I
Mexico       5.2    3.4
Argentina    7.2    9.5
Brazil       6.0    6.6
Chile        6.6    3.3
Colombia    10.8    3.4
Venezuela    8.2   26.1

a) Determine whether the data is a time series, a cross-sectional data or panel data.
Cross-sectional data
b) Draw a graph with the data of the table.
[Figure: scatter plot of inflation (I) against unemployment (U) for the six countries, labelled ME, AR, BR, CH, CO and VE.]
       mean    variance   standard deviation
U      7.33      3.93       1.98
I      8.72     78.63       8.87
Covariance (U, I) = 2.93
Correlation coefficient (U, I) = 0.167
The correlation is positive but weak. From a macro perspective we would
have expected a negative correlation (Phillips curve). Venezuela distorts the
relationship: at that time it was going through a period of very high inflation, so
it differs from the rest of the countries and we should drop it from the analysis
as an outlier.
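The summary statistics can be recomputed from the table with a few lines of code. The sketch below uses the sample formulas with an n - 1 denominator, which is the convention that matches the variances reported above. The `summary` helper and the dictionary layout are illustrative choices. It also repeats the calculation without Venezuela, so the effect of dropping that observation can be inspected directly.

```python
from math import sqrt

# (U, I) pairs copied from the table: unemployment and inflation rates, 2012.
data = {
    "Mexico": (5.2, 3.4),
    "Argentina": (7.2, 9.5),
    "Brazil": (6.0, 6.6),
    "Chile": (6.6, 3.3),
    "Colombia": (10.8, 3.4),
    "Venezuela": (8.2, 26.1),
}

def summary(rows):
    """Sample means, variances, covariance and correlation (n - 1 denominators)."""
    n = len(rows)
    u = [r[0] for r in rows]
    i = [r[1] for r in rows]
    mean_u, mean_i = sum(u) / n, sum(i) / n
    var_u = sum((x - mean_u) ** 2 for x in u) / (n - 1)
    var_i = sum((y - mean_i) ** 2 for y in i) / (n - 1)
    cov = sum((x - mean_u) * (y - mean_i) for x, y in zip(u, i)) / (n - 1)
    return {
        "mean_u": mean_u, "mean_i": mean_i,
        "var_u": var_u, "var_i": var_i,
        "sd_u": sqrt(var_u), "sd_i": sqrt(var_i),
        "cov": cov, "corr": cov / sqrt(var_u * var_i),
    }

full = summary(list(data.values()))
no_ve = summary([v for k, v in data.items() if k != "Venezuela"])

for name, stats in (("all six countries", full), ("without Venezuela", no_ve)):
    print(name)
    for key, value in stats.items():
        print(f"  {key}: {value:.3f}")
```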
d) Calculate a new correlation coefficient eliminating the Venezuelan observation and
interpret your result.
[Figure: scatter plot of inflation (I) against unemployment (U) for the five remaining countries, labelled ME, AR, BR, CH and CO.]