Week 2
2022-09-17
Biased Estimator
Bias occurs when the estimates $\hat{\beta}$ are systematically different from the true parameter $\beta$. Mathematically, $E(\hat{\beta}) \neq \beta$. In the figure, the biased estimator is drawn as the black line and the unbiased estimator as the red line. To understand this concept, we must treat $\hat{\beta}$ as a random variable: if we repeat the sampling and estimation many times, the resulting $\hat{\beta}$'s will vary.
[Figure: sampling densities of the estimated beta, biased estimator in black and unbiased estimator in red. x-axis: Estimated Beta; y-axis: density.]
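The repeated-sampling idea can be sketched with a small simulation. This is a minimal sketch and not the code behind the figure: the data-generating process, the omitted regressor `z`, and the names `simulate_once`, `betas`, and `true_beta` are all assumptions made for illustration.

```r
# Minimal sketch (hypothetical data-generating process):
#   y = 1 + 2 * x + 1.5 * z + u, where z is correlated with x.
# Omitting z from the regression is one way to obtain a biased slope.
set.seed(123)
true_beta <- 2
n_rep <- 1000   # number of repeated samples
n_obs <- 100    # observations per sample

simulate_once <- function() {
  x <- rnorm(n_obs)
  z <- 0.5 * x + rnorm(n_obs)                      # z depends on x
  y <- 1 + true_beta * x + 1.5 * z + rnorm(n_obs)
  c(
    unbiased = unname(coef(lm(y ~ x + z))["x"]),   # correctly specified model
    biased   = unname(coef(lm(y ~ x))["x"])        # z omitted
  )
}

# One row per repeated sample, one column per estimator
betas <- t(replicate(n_rep, simulate_once()))

# Average estimate across samples, to compare with true_beta
colMeans(betas)
```

Under these assumptions the `unbiased` column should average close to 2, while the `biased` column centers near 2.75, because omitting `z` shifts the slope by 1.5 × 0.5.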
Efficient vs Inefficient Estimator
An estimator might be unbiased but still inefficient. An inefficient estimator has a higher standard error, which makes statistical decisions more difficult: confidence intervals are wider and it is harder to reject a false null hypothesis.
[Figure: density of the estimated beta. x-axis: Estimated Beta; y-axis: density.]
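To make the efficiency comparison concrete, the sketch below contrasts two unbiased slope estimators on the same simulated data: OLS and a crude "endpoint" estimator that uses only the first and last observations. The model, the endpoint estimator, and every name in the code are assumptions chosen for illustration.

```r
# Minimal sketch (hypothetical model): y = 1 + 2 * x + u, with x held fixed.
set.seed(42)
true_beta <- 2
x <- seq(1, 10, length.out = 50)   # fixed regressor values

one_draw <- function() {
  y <- 1 + true_beta * x + rnorm(length(x))
  n <- length(x)
  c(
    ols      = unname(coef(lm(y ~ x))["x"]),     # uses all observations
    endpoint = (y[n] - y[1]) / (x[n] - x[1])     # uses only two observations
  )
}

# One row per simulated sample, one column per estimator
draws <- t(replicate(2000, one_draw()))

colMeans(draws)       # both should be close to true_beta (unbiased)
apply(draws, 2, sd)   # the endpoint estimator should be more spread out
```

Both estimators are centered on the true slope, but the endpoint estimator discards most of the sample and therefore has the larger sampling standard deviation.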
[Figure: histogram of the estimated betas. x-axis: betas; y-axis: count.]
We have a BLUE estimator.
[Figure: histogram of the estimated betas. x-axis: betas; y-axis: count.]
We have a biased estimator.
Goodness of Fit
• R-squared measures the goodness of fit of the model: how much of the variation in the dependent variable is explained by the model?
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
where
• SST: total sum of squares, $\sum_i (y_i - \bar{y})^2$
• SSR: regression (explained) sum of squares, $\sum_i (\hat{y}_i - \bar{y})^2$
• SSE: error (residual) sum of squares, $\sum_i (y_i - \hat{y}_i)^2$
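The three sums of squares can be computed by hand and checked against what `summary()` reports. The sketch below reuses the M and Y values listed in Exercise Two; the object names (`money`, `fit`, `sst`, `ssr`, `sse`) are illustrative choices, and SSR and SSE follow the definitions in the bullets above (regression and error, respectively).

```r
# Data taken from Exercise Two (money demand M, income Y)
money <- data.frame(
  M = c(21, 24, 26, 27, 28, 29, 30, 33, 35, 37, 39),
  Y = c(81, 95, 103, 110, 114, 117, 121, 134, 139, 150, 156)
)
fit <- lm(M ~ Y, data = money)

y     <- money$M
y_hat <- fitted(fit)

sst <- sum((y - mean(y))^2)       # total variation
ssr <- sum((y_hat - mean(y))^2)   # variation explained by the regression
sse <- sum((y - y_hat)^2)         # unexplained (residual) variation

# The three numbers below should agree
c(
  ssr_over_sst           = ssr / sst,
  one_minus_sse_over_sst = 1 - sse / sst,
  from_summary           = summary(fit)$r.squared
)
```

All three values should match the Multiple R-squared of 0.9957 shown in the Exercise Two output.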
Confidence interval: $\hat{\beta} \pm t_{\alpha/2} \cdot SE(\hat{\beta})$
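As a worked illustration of this interval, take the slope from the Exercise Two regression output (estimate 0.240907, standard error 0.005289, 9 degrees of freedom). The critical value $t_{0.025,\,9} \approx 2.262$ is the standard tabulated value for a 95% interval and is not stated in the notes themselves.

```latex
\begin{aligned}
\hat{\beta}_1 \pm t_{0.025,\,9} \cdot SE(\hat{\beta}_1)
  &= 0.240907 \pm 2.262 \times 0.005289 \\
  &\approx 0.240907 \pm 0.01196 \\
  &\Rightarrow\ [\,0.2289,\ 0.2529\,]
\end{aligned}
```

The interval excludes zero, which agrees with the very small p-value reported for Y in that output.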
Exercises
One
(Potential) Answer
a. The data are cross-sectional: each unit is observed only once.
b. $\hat{\beta}_0$ is the intercept, so its interpretation may or may not be meaningful. Here, $\hat{\beta}_0$ is the TPTKW (Y) when the percentage of the female population (X) is zero. For $\hat{\beta}_1$: a 1 percentage point increase in the percentage of the female population raises the TPTKW by 0.656 percentage points.
c. The test can be carried out with a t-test of $H_0: \beta_1 = 1$: $t = \frac{\hat{\beta}_1 - 1}{SE(\hat{\beta}_1)} = \frac{0.656 - 1}{0.1961} = -1.75$. Hence $\hat{\beta}_1$ is not statistically different from 1 at the 5% significance level (95% confidence level). Required assumption: normality of the errors.
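The arithmetic in part (c) can be reproduced in a few lines. The sample size is not given in this exercise, so the sketch below compares the statistic with the large-sample (standard normal) two-sided 5% critical value; that choice, and the variable names, are assumptions. With a finite number of degrees of freedom the t critical value is larger than 1.96, so the conclusion would not change.

```r
# Values reported in the exercise
beta1_hat  <- 0.656
se_beta1   <- 0.1961
beta1_null <- 1        # hypothesized value under H0

t_stat <- (beta1_hat - beta1_null) / se_beta1
t_stat

# Assumption: large-sample two-sided 5% critical value (about 1.96)
crit <- qnorm(0.975)
reject_h0 <- abs(t_stat) > crit
reject_h0
```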
Two
(Potential) Answer
• a. OLS Estimation
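library(tibble)  # provides tibble()
# Money demand (M) and income (Y)
data <- tibble(M = c(21, 24, 26, 27, 28, 29, 30, 33, 35, 37, 39),
               Y = c(81, 95, 103, 110, 114, 117, 121, 134, 139, 150, 156))
# Regress M on Y; the object name lm shadows the function, but lm() calls still work
lm <- lm(M ~ Y, data)
summary(lm)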
##
## Call:
## lm(formula = M ~ Y, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.5000 -0.2341 -0.1363 0.3023 0.5137
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000205 0.645326 1.55 0.156
## Y 0.240907 0.005289 45.55 5.93e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3863 on 9 degrees of freedom
## Multiple R-squared: 0.9957, Adjusted R-squared: 0.9952
## F-statistic: 2074 on 1 and 9 DF, p-value: 5.932e-12
• b. Hypothesis testing
The t statistic:
coefficients(lm)[2] / coef(summary(lm))[, "Std. Error"][2]
## Y
## 45.54533
Income Y has a positive and statistically significant effect on money demand M.
• c. Rescaling effect: the variable M is converted to billions of rupiah (so it is multiplied by 1000). This changes the coefficients $\beta_0$ and $\beta_1$, which are also multiplied by 1000. The standard errors scale by the same factor, so the t statistics and the conclusion about statistical significance do not change.
# Same data with M rescaled by a factor of 1000
data2 <- tibble(Y = data$Y, M = data$M * 1000)
lm2 <- lm(M ~ Y, data2)
summary(lm2)
##
## Call:
## lm(formula = M ~ Y, data = data2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -500.0 -234.1 -136.3 302.3 513.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1000.205 645.326 1.55 0.156
## Y 240.907 5.289 45.55 5.93e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 386.3 on 9 degrees of freedom
## Multiple R-squared: 0.9957, Adjusted R-squared: 0.9952
## F-statistic: 2074 on 1 and 9 DF, p-value: 5.932e-12
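The pattern in the two outputs follows directly from the OLS formulas. As a short derivation, let $M^* = cM$ with $c = 1000$, keeping Y as the regressor; the starred notation is introduced here for illustration only.

```latex
\begin{aligned}
\hat{\beta}_1^* &= \frac{\sum_i (Y_i - \bar{Y})(cM_i - c\bar{M})}{\sum_i (Y_i - \bar{Y})^2}
                 = c\,\hat{\beta}_1 \\
\hat{\beta}_0^* &= c\bar{M} - c\,\hat{\beta}_1 \bar{Y} = c\,\hat{\beta}_0 \\
\hat{u}_i^*     &= cM_i - c\,\hat{\beta}_0 - c\,\hat{\beta}_1 Y_i = c\,\hat{u}_i
                 \quad\Rightarrow\quad SE(\hat{\beta}_j^*) = c\,SE(\hat{\beta}_j) \\
t^*             &= \frac{c\,\hat{\beta}_j}{c\,SE(\hat{\beta}_j)} = t
\end{aligned}
```

This matches the output: estimates, standard errors, and the residual standard error are all 1000 times larger, while the t values, p-values, R-squared, and F-statistic are identical.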
Stata Exercises