Stat 700 HW3 Solutions, 10/9/09
E(X1 X2) = .5 (1)^2 + .5 (1/2)^2 = 5/8, Cov(X1, X2) = 1/16, Corr(X1, X2) = 1/11.
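These moments can be sanity-checked exactly. The sketch below assumes (as in the problem) that X1, X2 are i.i.d. Exp(ϑ) given ϑ, with ϑ = 1 or 2 equally likely, and uses the Exp(ϑ) moments E X = 1/ϑ, E X² = 2/ϑ²:

```python
from fractions import Fraction as F

# Mixture moments: theta = 1 or 2 with prob 1/2; X1, X2 iid Exp(theta) given theta.
half = F(1, 2)
EX    = half * F(1, 1) + half * F(1, 2)   # E(X_i)   = 3/4
EX2   = half * F(2, 1) + half * F(2, 4)   # E(X_i^2) = 5/4
EX1X2 = half * F(1, 1) + half * F(1, 4)   # E(X1 X2) = 5/8

cov  = EX1X2 - EX * EX                    # 5/8 - 9/16 = 1/16
var  = EX2 - EX * EX                      # 5/4 - 9/16 = 11/16
corr = cov / var                          # (1/16)/(11/16) = 1/11
print(EX1X2, cov, corr)                   # -> 5/8 1/16 1/11
```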
(b) Let S = X1 + · · · + X10 , s = x1 + · · · + x10 . Then the joint mixed-type pdf for (X1 , . . . , X10 , ϑ) is .5 e^{−s} I[ϑ=1] + .5 · 2^{10} e^{−2s} I[ϑ=2] , so that at s = 13,
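A quick numerical sketch of the posterior this joint density implies at s = 13 (not necessarily the exact quantity the problem asks for):

```python
import math

# Posterior odds of theta given S = s implied by the mixed joint density
# .5 e^{-s} 1[theta=1] + .5 * 2^10 e^{-2s} 1[theta=2], evaluated at s = 13.
s = 13.0
w1 = 0.5 * math.exp(-s)              # joint weight on theta = 1
w2 = 0.5 * 2**10 * math.exp(-2 * s)  # joint weight on theta = 2
post2 = w2 / (w1 + w2)               # P(theta = 2 | S = 13)
print(post2)                         # small: the data favor theta = 1
```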
∏_{i=1}^n (ϑ/2π)^{1/2} e^{−ϑ(x_i − µ0)²/2} ∝ ϑ^{n/2} exp(−ϑ s/2), where s = ∑_{i=1}^n (x_i − µ0)².
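The right-hand side is a Gamma kernel in ϑ: under a flat prior it normalizes to Gamma(n/2 + 1, rate s/2) (with a Gamma prior the shape and rate just shift). A numerical check of the normalizing constant, with illustrative n and s not taken from the problem:

```python
import math

# Check: integral over theta > 0 of theta^(n/2) * exp(-theta*s/2) equals
# Gamma(n/2 + 1) / (s/2)^(n/2 + 1), i.e. the kernel is Gamma(n/2+1, rate s/2).
n, s = 10, 7.3                                  # illustrative values
kernel = lambda t: t ** (n / 2) * math.exp(-t * s / 2)

# crude Riemann sum over a range wide enough that the tail is negligible
h = 0.001
num = sum(kernel(i * h) for i in range(1, 40000)) * h
closed_form = math.gamma(n / 2 + 1) / (s / 2) ** (n / 2 + 1)
print(num, closed_form)
```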
Bickel-Doksum, # 1.2.14. (a) The joint (ϑ, X1 , . . . , Xn ) density is C exp(−(ϑ − ϑ0)²/(2τ0²) − (1/(2σ0²)) ∑_{i=1}^n (x_i − ϑ)²). The conditional density of
Xn+1 given (ϑ, X1 , . . . , Xn ) is N(ϑ, σ0²), and this ‘predictive’ density does not change with n. The posterior predictive density (of Xn+1 given X1 , . . . , Xn ) is a
random variable depending on X̄n , which can be found either by a direct use
of the definitions involving integrals or through the following argument. First
we know (ϑ, X̄n) is bivariate normal with X̄n − ϑ ∼ N(0, σ0²/n) independent of ϑ. Next, with γ ≡ τ0²/(τ0² + σ0²/n), we have ϑ − γ X̄n independent of X̄n , and therefore its variance added to the variance of γ X̄n is equal to the variance τ0² of ϑ: thus conditionally given X̄n , ϑ ∼ N(γ X̄n + (1 − γ)ϑ0 , γσ0²/n).
Since Xn+1 = (Xn+1 − ϑ) + ϑ is a sum of two conditionally normal and independent variables given Xn , we conclude that it is conditionally N(γ X̄n + (1 − γ)ϑ0 , σ0² + γσ0²/n). Now when n gets large, X̄n → ϑ (either in probability or almost surely) by the Law of Large Numbers, and we find Xn+1 conditionally given Xn has distribution converging to N(ϑ, σ0²), the same as the frequentist predictive density.
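The projection argument's answer can be cross-checked against the standard precision-weighted conjugate formulas; the sketch below uses illustrative numeric values not taken from the problem:

```python
# Check that the projection argument's posterior
#   theta | Xbar_n ~ N(gamma*xbar + (1-gamma)*theta0, gamma*sigma0^2/n)
# agrees with the standard conjugate normal-normal formulas.
n, sigma0, tau0, theta0, xbar = 25, 2.0, 1.5, 0.7, 1.9   # illustrative

gamma = tau0**2 / (tau0**2 + sigma0**2 / n)
mean_proj = gamma * xbar + (1 - gamma) * theta0
var_proj  = gamma * sigma0**2 / n

prec     = 1 / tau0**2 + n / sigma0**2        # posterior precision
var_std  = 1 / prec
mean_std = var_std * (theta0 / tau0**2 + n * xbar / sigma0**2)
print(mean_proj, mean_std, var_proj, var_std)
```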
(b) In both (i), (ii), the risk curve is decreasing for negative ϑ and increasing for positive ϑ (with equal limits at 0 for (i) but not for (ii)), but the curves have a discontinuity at 0, with an isolated risk value at ϑ = 0 lower than either the left- or right-limits there. The risk for (i) is 2Φ(−1)I[ϑ=0] + (Φ(1−|ϑ|) + Φ(−1−|ϑ|)) I[ϑ≠0], and that for (ii) is (Φ(ϑ + 1) + Φ(ϑ − 2))I[ϑ<0] + (Φ(−1) + Φ(−2))I[ϑ=0] + (Φ(2 − ϑ) + Φ(−1 − ϑ))I[ϑ>0]. The risk is smaller under (ii) than under (i) if and only if ϑ ≤ 0.
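The risk comparison can be verified numerically on a grid of ϑ values, using an erf-based standard normal CDF:

```python
import math

# Evaluate the two risk functions and confirm risk(ii) <= risk(i) exactly
# when theta <= 0, matching the comparison stated above.
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))  # standard normal CDF

def risk_i(t):
    if t == 0:
        return 2 * Phi(-1)
    return Phi(1 - abs(t)) + Phi(-1 - abs(t))

def risk_ii(t):
    if t < 0:
        return Phi(t + 1) + Phi(t - 2)
    if t == 0:
        return Phi(-1) + Phi(-2)
    return Phi(2 - t) + Phi(-1 - t)

for t in (-3, -1, -0.5, 0, 0.5, 1, 3):
    print(t, risk_i(t), risk_ii(t), risk_ii(t) <= risk_i(t))
```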
at ϑ iff Eϑ(a(X)) = ϑ (it is definitely not minimized there when Eϑ(a(X)) ≠ ϑ). For this to hold for all ϑ is to say that the estimator a(X) is unbiased in the usual sense.
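Under squared-error loss this follows from E_ϑ(a(X) − t)² = Var_ϑ(a(X)) + (E_ϑ a(X) − t)², which is minimized over t at t = E_ϑ a(X). A toy Monte Carlo illustration (the biased estimator and all numbers below are hypothetical, chosen only to exhibit the effect):

```python
import random

# Risk-unbiasedness under squared-error loss: E_theta (a(X) - t)^2 is
# minimized over t at t = E_theta a(X).  Toy check with a deliberately
# biased estimator a(X) = 0.9 * Xbar, X_i ~ N(theta, 1), theta = 2.
random.seed(0)
theta, n = 2.0, 50
a = lambda xs: 0.9 * sum(xs) / len(xs)

draws = [a([random.gauss(theta, 1) for _ in range(n)]) for _ in range(20000)]
risk = lambda t: sum((d - t) ** 2 for d in draws) / len(draws)

grid = [i * 0.01 for i in range(150, 251)]   # candidate t in [1.5, 2.5]
t_star = min(grid, key=risk)
# minimizer sits near E a(X) = 0.9 * theta = 1.8, not near theta = 2
print(t_star, sum(draws) / len(draws))
```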
(b) Note: this problem part is stated incorrectly: it should have said that the ‘unbiased’ definition implies that the power is always at least as large over the alternative parameter region as over the null-hypothesis parameter region, but the converse does not generally hold! Here
which occurs at the point p (R(1, δ3), R(2, δ3)) + (1 − p) (R(1, δ2), R(2, δ2)) at p = 5/14. So the minimax among all randomized procedures is the one which chooses δ3 with probability p = 5/14 and δ2 with the remaining probability 9/14.
(c) The four rπ (δj ) values for j = 1, . . . , 4 are: 2.7, 1.1, 2.02, 1.78. So
with this prior, δ2 is the Bayes optimal rule, with Bayes risk 1.1.
F_W(w) = ∫_0^w (1/(b√(2π))) e^{−(z−a)²/(2b²)} dz / ∫_0^∞ (1/(b√(2π))) e^{−(z−a)²/(2b²)} dz = (Φ((w − a)/b) − Φ(−a/b)) / (1 − Φ(−a/b)).
In part (d), substitute for the posterior median with a = (Z0 − µ − σλ)/β
and b = σ/β. In part (e), substitute for the posterior expectation with
a = z − λ and b = 1.
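This truncated-normal CDF formula can be checked against rejection sampling: W is N(a, b²) conditioned on W > 0. The values of a and b below are illustrative, not the substitutions from parts (d) and (e):

```python
import math, random

# Truncated-normal CDF F_W(w) vs. an empirical CDF from rejection sampling.
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

def F_W(w, a, b):
    # CDF of N(a, b^2) truncated to (0, infinity)
    return (Phi((w - a) / b) - Phi(-a / b)) / (1 - Phi(-a / b))

random.seed(1)
a, b = 0.4, 1.3                                        # illustrative
draws = [z for z in (random.gauss(a, b) for _ in range(200000)) if z > 0]
for w in (0.5, 1.0, 2.0):
    emp = sum(d <= w for d in draws) / len(draws)
    print(w, F_W(w, a, b), emp)                        # formula vs. empirical
```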