
6.434J/16.391J Statistics for Engineers and Scientists                  March 9
MIT, Spring 2006                                                     Handout #7

Solution 3
Problem 1: Let X1 , X2 , . . . , Xn be a random sample from the uniform p.d.f.
f (x|θ) = 1/θ, for 0 < x < θ and for some unknown parameter θ > 0.
(a) Find a maximum likelihood estimator of θ, say Tn .

(b) Find the bias of Tn .

(c) Based on (b), derive an unbiased estimator of θ, say Wn .

(d) [Extra Credit] Compare variances of Tn and Wn .

(e) [Extra Credit] Show that {Tn } is a consistent sequence of estimators.


Solution
(a) The likelihood function is

L(θ) = 1/θ^n   if 0 < X1 < θ, 0 < X2 < θ, . . . , and 0 < Xn < θ,   and 0 otherwise
     = 1/θ^n   if θ > max{X1 , X2 , . . . , Xn },   and 0 otherwise.

Since L(·) attains its maximum at θ = max{X1 , X2 , . . . , Xn }, an MLE for the
unknown θ > 0 is

Tn = max{X1 , X2 , . . . , Xn }.

See Fig. 1 for an illustration. (More precisely, Tn satisfies L(Tn ) =
sup{L(θ) | θ > 0}. Hence it is an MLE.)

(b) Note that Tn is a random variable, since it is a function of the random variables
Xi . To obtain the moments of Tn , we derive the cdf and pdf of Tn .
Let any real number x be given. Then the cdf of Tn is

P {Tn ≤ x} = P {X1 ≤ x and X2 ≤ x and · · · and Xn ≤ x}
           = P {X1 ≤ x} P {X2 ≤ x} · · · P {Xn ≤ x}
           = 0  if x ≤ 0,   (x/θ)^n  if 0 < x < θ,   and 1 otherwise.        (1)

Figure 1: The likelihood function for Problem 1(a) is non-zero when θ >
max{X1 , X2 , . . . , Xn }. (The figure plots L(θ) versus θ.)

Differentiating both sides of the equations (for each case of the range for
x) yields the pdf

fTn (x) = n x^(n−1) / θ^n   for 0 < x < θ,   and 0 otherwise.

Therefore, the mean of Tn is

E {Tn } = ∫_0^θ x · (n x^(n−1) / θ^n ) dx = nθ/(n + 1),

and the bias of Tn is

bn (θ) ≜ E {Tn } − θ = −θ/(n + 1).
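The short Monte Carlo sketch below (Python/NumPy; not part of the original handout) checks the cdf in equation (1) empirically. The values of θ, n, and the x grid are arbitrary choices for illustration.

```python
import numpy as np

# Monte Carlo check of the derived cdf P{Tn <= x} = (x/theta)^n, 0 < x < theta.
# theta, n, and the x grid below are arbitrary illustrative choices.
rng = np.random.default_rng(0)
theta, n, trials = 2.0, 5, 200_000

samples = rng.uniform(0.0, theta, size=(trials, n))
Tn = samples.max(axis=1)                      # Tn = max{X1, ..., Xn}

for x in (0.5, 1.0, 1.5, 1.9):
    empirical = np.mean(Tn <= x)              # relative frequency of {Tn <= x}
    analytic = (x / theta) ** n               # cdf from equation (1)
    print(f"x = {x:.1f}:  empirical {empirical:.4f}   analytic {analytic:.4f}")
```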

(c) The expression for E {Tn } in part (b) implies that

Wn ≜ ((n + 1)/n) Tn = ((n + 1)/n) max{X1 , X2 , . . . , Xn }

is an unbiased estimator of θ.
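As a quick numerical sanity check (not part of the original handout), the sketch below estimates E {Tn } and E {Wn } by simulation; θ and n are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of the bias results: E{Tn} = n*theta/(n+1) while E{Wn} = theta.
# theta and n are arbitrary illustrative choices.
rng = np.random.default_rng(1)
theta, n, trials = 2.0, 5, 200_000

samples = rng.uniform(0.0, theta, size=(trials, n))
Tn = samples.max(axis=1)                      # MLE
Wn = (n + 1) / n * Tn                         # bias-corrected estimator

print("E{Tn} ~", Tn.mean(), "  vs  n*theta/(n+1) =", n * theta / (n + 1))
print("E{Wn} ~", Wn.mean(), "  vs  theta         =", theta)
```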

(d) [Extra Credit] By construction and by a property of the variance, we
derive the relation

Var {Wn } = Var {((n + 1)/n) Tn } = ((n + 1)/n)^2 Var {Tn } > Var {Tn } .

The inequality is strict because Var {Tn } ≠ 0: Tn is a non-constant
random variable, so its variance must be strictly positive.
Note This problem does not require us to compute the variances, Var {Tn }
and Var {Wn }. To derive those variances, we use the pdf of Tn , given in
part (b). Notice that the second moment of Tn is

E {Tn^2} = ∫_0^θ x^2 · (n x^(n−1) / θ^n ) dx = nθ^2/(n + 2).

Then, the variance of Tn is

Var {Tn } ≜ E {Tn^2} − E^2 {Tn } = nθ^2/(n + 2) − (nθ/(n + 1))^2
          = nθ^2 / ((n + 1)^2 (n + 2)).

A relationship between Wn and Tn implies that

Var {Wn } = ((n + 1)/n)^2 Var {Tn } = θ^2 / (n(n + 2)).
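A small simulation (not in the original handout) can be used to check both closed forms; θ and n below are arbitrary illustrative choices.

```python
import numpy as np

# Compare empirical variances of Tn and Wn with the closed forms
#   Var{Tn} = n*theta^2 / ((n+1)^2 * (n+2))   and   Var{Wn} = theta^2 / (n*(n+2)).
# theta and n are arbitrary illustrative choices.
rng = np.random.default_rng(2)
theta, n, trials = 2.0, 5, 500_000

Tn = rng.uniform(0.0, theta, size=(trials, n)).max(axis=1)
Wn = (n + 1) / n * Tn

print("Var{Tn} ~", Tn.var(), "  vs ", n * theta**2 / ((n + 1) ** 2 * (n + 2)))
print("Var{Wn} ~", Wn.var(), "  vs ", theta**2 / (n * (n + 2)))
```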

(e) We need to show that the sequence of estimators, T1 , T2 , T3 , . . . , converges
to θ in probability. That is, for any ε > 0, lim_{n→∞} P { |Tn − θ| > ε } = 0.
Since Xi < θ for any integer i = 1, 2, 3, . . . , the MLE, defined to be
Tn ≜ max{X1 , X2 , . . . , Xn }, satisfies Tn < θ. That is, |Tn − θ| = θ − Tn .

Let any ε > 0 be given. Then,

P { |Tn − θ| > ε } = P { θ − Tn > ε }
                   = P { Tn < θ − ε }
                   = ((θ − ε)/θ)^n   if 0 < ε < θ,   and 0 otherwise.

The last equality follows from the cdf of Tn in equation (1). Taking the
limit on both sides of the equation (for each case of the range of ε), we have

lim_{n→∞} P { |Tn − θ| > ε } = 0,

for any ε > 0. Therefore, the MLE Tn is a consistent estimator.
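The closed form also makes the rate of convergence explicit; the short sketch below (not in the original handout) evaluates it for arbitrary illustrative values of θ and ε.

```python
# Illustrate consistency: P{|Tn - theta| > eps} = ((theta - eps)/theta)^n -> 0.
# theta and eps are arbitrary illustrative choices.
theta, eps = 2.0, 0.1
for n in (1, 10, 50, 100, 500):
    tail = ((theta - eps) / theta) ** n if eps < theta else 0.0
    print(f"n = {n:4d}:  P{{|Tn - theta| > {eps}}} = {tail:.3e}")
```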

Problem 2: Suppose that X1 , X2 , . . . , Xn are independent random variables,
each N (µ, σ^2 ), with both µ and σ^2 unknown:

(µ, σ^2 ) ∈ Θ ≜ {(x, y) | −∞ < x < +∞, y > 0}.

(a) Find a maximum likelihood estimator of (µ, σ 2 ).

(b) Suppose that n = 131 and that X1 , X2 , . . . , X131 are the average annual
temperatures observed at Boston, MA, since 1872, as given in Table 1.
Use MATLAB to find the numerical values of the estimator of
(µ, σ^2 ) for this data set.

Solution

(a) The likelihood function is

L(µ, σ) = (1/(σ √(2π))^n ) exp( −Σ_{i=1}^n (Xi − µ)^2 / (2σ^2 ) ),

for a real number −∞ < µ < ∞ and a positive number σ > 0. We want to
find µ and σ that maximize the likelihood function, or equivalently, that
maximize the log-likelihood function,

ln L(µ, σ) = −(n/2) ln(2π) − n ln σ − Σ_{i=1}^n (Xi − µ)^2 / (2σ^2 ).

Taking the partial derivatives of the log-likelihood function with respect to
µ and to σ, and setting the partial derivatives to zero, yield two equations
with two unknowns, µ and σ:

0 = ∂/∂µ ln L(µ, σ) = Σ_{i=1}^n (Xi − µ) / σ^2 ,                      (2)
0 = ∂/∂σ ln L(µ, σ) = −n/σ + Σ_{i=1}^n (Xi − µ)^2 / σ^3 .             (3)

Solving for the µ∗ and σ∗ that satisfy those two equations yields the solution

µ∗ = (1/n) Σ_{i=1}^n Xi ≜ X̄n ,
σ∗ = √( Σ_{i=1}^n (Xi − X̄n )^2 / n ).

(Equation (2) gives the expression µ∗ = X̄n . Then, substitute µ in (3)
with X̄n , and solve for σ.) Therefore, the MLEs for the unknown parameters µ
and σ^2 are

µ̂ = X̄n ,
σ̂^2 = Σ_{i=1}^n (Xi − X̄n )^2 / n ,

where the second equality follows from the invariance property of the MLE.

Note It is not hard to verify that µ∗ and σ∗ maximize the likelihood
function. For a two-variable function, say f (x, y), we need to verify that

∂^2 f/∂x^2 < 0,   ∂^2 f/∂y^2 < 0,   and
(∂^2 f/∂x^2)(∂^2 f/∂y^2) − (∂^2 f/∂x∂y)^2 > 0.                        (4)

Note that

∂^2/∂µ^2 ln L(µ, σ) = −n/σ^2 ,
∂^2/∂σ^2 ln L(µ, σ) = n/σ^2 − 3 Σ_{i=1}^n (Xi − µ)^2 / σ^4 ,
∂^2/∂µ∂σ ln L(µ, σ) = −2 Σ_{i=1}^n (Xi − µ) / σ^3 .

Substituting µ∗ and σ∗ into the above partial derivatives yields the relations

∂^2/∂µ^2 ln L(µ, σ) |_{µ=µ∗, σ=σ∗} = −n/(σ∗)^2 < 0,
∂^2/∂σ^2 ln L(µ, σ) |_{µ=µ∗, σ=σ∗} = −2n/(σ∗)^2 < 0,
∂^2/∂µ∂σ ln L(µ, σ) |_{µ=µ∗, σ=σ∗} = 0   (since Σ_{i=1}^n (Xi − µ∗) = 0),

so that the product condition in (4) becomes 2n^2/(σ∗)^4 > 0.
Therefore, µ∗ and σ∗ maximize the likelihood function.
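For a concrete check of these second-order conditions, one can approximate the Hessian of the log-likelihood numerically at (µ∗, σ∗). The sketch below (not part of the original handout) does this with central finite differences on a synthetic sample; the sample parameters and the step size h are arbitrary choices.

```python
import numpy as np

# Finite-difference check of the second-order conditions at (mu*, sigma*):
# the diagonal entries of the Hessian of ln L should be negative and its
# determinant positive. The synthetic sample and the step h are arbitrary.
rng = np.random.default_rng(3)
X = rng.normal(5.0, 2.0, size=500)
n = len(X)

def loglik(mu, sigma):
    return (-n / 2 * np.log(2 * np.pi) - n * np.log(sigma)
            - np.sum((X - mu) ** 2) / (2 * sigma**2))

mu_s = X.mean()
sig_s = np.sqrt(np.mean((X - mu_s) ** 2))
h = 1e-3

d2_mu = (loglik(mu_s + h, sig_s) - 2 * loglik(mu_s, sig_s) + loglik(mu_s - h, sig_s)) / h**2
d2_sig = (loglik(mu_s, sig_s + h) - 2 * loglik(mu_s, sig_s) + loglik(mu_s, sig_s - h)) / h**2
d2_cross = (loglik(mu_s + h, sig_s + h) - loglik(mu_s + h, sig_s - h)
            - loglik(mu_s - h, sig_s + h) + loglik(mu_s - h, sig_s - h)) / (4 * h**2)

print("d2/dmu2     :", d2_mu, "   (expected -n/(sigma*)^2  =", -n / sig_s**2, ")")
print("d2/dsigma2  :", d2_sig, "   (expected -2n/(sigma*)^2 =", -2 * n / sig_s**2, ")")
print("cross term  :", d2_cross, "   (expected approximately 0)")
print("determinant :", d2_mu * d2_sig - d2_cross**2, "  (expected > 0)")
```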
(b) The MATLAB code to compute the estimators µ̂ and σ̂^2 for this data set
yields

µ̂ = 50.64   and   σ̂^2 = 2.34.
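The handout refers to MATLAB; an equivalent sketch in Python/NumPy is shown below. It assumes the 131 values from Table 1 have been saved, one per line, in a file named boston_temps.txt (a hypothetical filename).

```python
import numpy as np

# Compute the MLEs of mu and sigma^2 for the Boston temperature data.
# Assumes the 131 values from Table 1 are stored one per line in
# "boston_temps.txt" (hypothetical filename, not part of the handout).
temps = np.loadtxt("boston_temps.txt")

mu_hat = temps.mean()                          # MLE of mu: sample mean
sigma2_hat = np.mean((temps - mu_hat) ** 2)    # MLE of sigma^2: divide by n, not n - 1

print(f"mu_hat     = {mu_hat:.2f}")            # handout reports 50.64
print(f"sigma2_hat = {sigma2_hat:.2f}")        # handout reports 2.34
```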

[Figure 2: histogram (normalized frequency) of the average temperatures (F) in
Boston, MA, from 1872−2002, overlaid with the pdf of N(50.64, 2.34);
horizontal axis: Temperature (F).]

Figure 2: In problem 2(b), average temperatures at Boston are modelled to be
normal random variables.

Problem 3: Let X1 , X2 , . . . , Xn be independent random variables, each
N (µ, 1). Find an unbiased estimator of µ^2 that is a function of X̄n .
[Hint: Consider the bias of (X̄n )^2 .]
Solution In class, we showed that X̄n ∼ N (µ, 1/n). Hence, the second moment
of this normal random variable is given by

E {(X̄n )^2} = Var {X̄n } + E^2 {X̄n } = 1/n + µ^2 .

By inspection, (X̄n )^2 − 1/n is an unbiased estimator of µ^2 .
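A brief simulation (not in the original handout) confirms the unbiasedness; µ and n below are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check that (Xbar_n)^2 - 1/n is unbiased for mu^2 when Xi ~ N(mu, 1).
# mu and n are arbitrary illustrative choices.
rng = np.random.default_rng(4)
mu, n, trials = 1.7, 10, 500_000

X = rng.normal(mu, 1.0, size=(trials, n))
Xbar = X.mean(axis=1)
estimates = Xbar**2 - 1.0 / n

print("mean of estimator ~", estimates.mean(), "  vs  mu^2 =", mu**2)
```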

Problem 4: A one-bit analog-to-digital (A/D) converter is defined by a
single threshold α and has two outputs, 0 and 1. Suppose a random variable X
with probability density function fX (.) is input to this A/D and the output is
defined as Y .

(a) Find the MMSE estimator of x based on the observation y. (The question
does not require the estimator to be linear.)

(b) The output of the A/D is input to a binary symmetric channel character-
ized by a single parameter 0 ≤ p ≤ 1. Let Z be the output of the channel,
with the following conditional pmf:

pZ|Y (z|y) = p  if z ≠ y,   and 1 − p  if z = y.

Find the optimum (MMSE) estimator of x based on the observation z.

(c) For the special cases of p = 0, p = 1 and p = 0.5, interpret the results of
part (b).

Solution

(a) The MMSE estimator x̂(y) directly follows from the following:

x̂(y) = ∫_{−∞}^{∞} x fX|Y (x|y) dx
     = ∫_{−∞}^{∞} x ( pY|X (y|x) fX (x) / pY (y) ) dx
     = (1/pY (y)) ∫_{−∞}^{∞} x pY|X (y|x) fX (x) dx .                   (5)

From the description of A/D, we can write the conditional pmf pY |X as


well as pmf of Y , pY (y). Although X may be continuous or discrete, Y is
discrete, with 0 and 1 as its possible values.

pY (y = 0) = P {X ≤ α} = ∫_{−∞}^{α} fX (x) dx ,
pY (y = 1) = P {X > α} = ∫_{α}^{∞} fX (x) dx ,

and pY|X is simply:

pY|X (y = 0|x) = 1  for x ≤ α,   and 0  for x > α;
pY|X (y = 1|x) = 0  for x ≤ α,   and 1  for x > α.

Hence we get the MMSE estimate x̂(y) by substituting the corresponding
values for the cases y = 0 and y = 1 into (5):

x̂(y = 0) = (1/pY (0)) ∫_{−∞}^{∞} x pY|X (y = 0|x) fX (x) dx
          = ∫_{−∞}^{α} x fX (x) dx  /  ∫_{−∞}^{α} fX (x) dx ,

x̂(y = 1) = (1/pY (1)) ∫_{−∞}^{∞} x pY|X (y = 1|x) fX (x) dx
          = ∫_{α}^{∞} x fX (x) dx  /  ∫_{α}^{∞} fX (x) dx .
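These are just the conditional means of X on either side of the threshold. The sketch below (not part of the original handout) estimates them by simulation for an assumed standard normal X and an arbitrary threshold α.

```python
import numpy as np

# Monte Carlo illustration of part (a): x_hat(y) is the conditional mean of X on
# the corresponding side of the threshold. X ~ N(0, 1) and alpha are assumptions.
rng = np.random.default_rng(5)
alpha, trials = 0.3, 1_000_000

x = rng.normal(0.0, 1.0, size=trials)
y = (x > alpha).astype(int)               # one-bit A/D output

print("x_hat(y=0) ~", x[y == 0].mean())   # E{X | X <= alpha}
print("x_hat(y=1) ~", x[y == 1].mean())   # E{X | X >  alpha}
```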

(b) We are given pZ|Y , which can be stated as:

pZ|Y (z|y = 0) = p  for z = 1,   and 1 − p  for z = 0;
pZ|Y (z|y = 1) = p  for z = 0,   and 1 − p  for z = 1.

The pmf for Z is:

pZ (z = 0) = (1 − p) P {Y = 0} + p P {Y = 1}
           = (1 − p) ∫_{−∞}^{α} fX (x) dx + p ∫_{α}^{∞} fX (x) dx ,
pZ (z = 1) = p P {Y = 0} + (1 − p) P {Y = 1}
           = p ∫_{−∞}^{α} fX (x) dx + (1 − p) ∫_{α}^{∞} fX (x) dx .

Also we see that y = 0 =⇒ x ≤ α and y = 1 =⇒ x > α. Therefore,

pZ|X (z = 0|x) = 1 − p  for x ≤ α,   and p  for x > α,

and similarly,

pZ|X (z = 1|x) = p  for x ≤ α,   and 1 − p  for x > α.

As in the previous case, the MMSE estimate of x given z is:

x̂(z) = ∫_{−∞}^{∞} x fX|Z (x|z) dx
     = ∫_{−∞}^{∞} x ( pZ|X (z|x) fX (x) / pZ (z) ) dx
     = (1/pZ (z)) ∫_{−∞}^{∞} x pZ|X (z|x) fX (x) dx .

Using pZ|X in this equation, we get the result. For z = 0,

x̂(z = 0) = (1/pZ (0)) ∫_{−∞}^{∞} x pZ|X (z = 0|x) fX (x) dx
          = (1/pZ (0)) [ (1 − p) ∫_{−∞}^{α} x fX (x) dx + p ∫_{α}^{∞} x fX (x) dx ]
          = [ (1 − p) ∫_{−∞}^{α} x fX (x) dx + p ∫_{α}^{∞} x fX (x) dx ]
            /  [ (1 − p) ∫_{−∞}^{α} fX (x) dx + p ∫_{α}^{∞} fX (x) dx ] .

Similarly, for z = 1,

x̂(z = 1) = (1/pZ (1)) ∫_{−∞}^{∞} x pZ|X (z = 1|x) fX (x) dx
          = [ p ∫_{−∞}^{α} x fX (x) dx + (1 − p) ∫_{α}^{∞} x fX (x) dx ]
            /  [ p ∫_{−∞}^{α} fX (x) dx + (1 − p) ∫_{α}^{∞} fX (x) dx ] .
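The same Monte Carlo approach extends to part (b): flipping Y with probability p and averaging X over each value of Z reproduces these ratios. X ~ N(0, 1), α, and p below are arbitrary illustrative assumptions.

```python
import numpy as np

# Monte Carlo illustration of part (b): estimate X from the BSC output Z by the
# conditional mean. X ~ N(0, 1), alpha, and p are illustrative assumptions.
rng = np.random.default_rng(6)
alpha, p, trials = 0.3, 0.2, 1_000_000

x = rng.normal(0.0, 1.0, size=trials)
y = (x > alpha).astype(int)               # one-bit A/D output
flip = rng.random(trials) < p             # crossover events of the BSC
z = np.where(flip, 1 - y, y)              # channel output

print("x_hat(z=0) ~", x[z == 0].mean())   # empirical E{X | Z = 0}
print("x_hat(z=1) ~", x[z == 1].mean())   # empirical E{X | Z = 1}
```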

(c) p = 0 corresponds to x̂(z) = x̂(y), as expected, since in this case Z = Y
with probability 1. For p = 1, the binary value of Y is inverted with
probability 1. Therefore, x̂(z) and x̂(y) are the same but at the opposite
values of the observations: x̂(z = 0) = x̂(y = 1) and vice versa.
For p = 0.5, the observation z carries no information about y, and the
estimate reduces to the prior mean for either value of z:
x̂(z = 0) = x̂(z = 1) = E {X}, which can also be written as the weighted
average P {Y = 0} x̂(y = 0) + P {Y = 1} x̂(y = 1).

Problem 5: Suppose X and Y are jointly Gaussian, and Z = F X + g, where
F and g are known. Show that the MMSE estimator of z given y is
ẑ = F x̂ + g, where x̂ is the MMSE estimator of x given y. Find an expression
for the mean-square estimation error of z given y.
Solution X and Y are jointly Gaussian. From a result proved in class (lectures
6 and 7), the MMSE estimator of x given y is

x̂MMSE (y) = µX + ΣXY Σ_Y^{−1} (y − µY ),

where ΣXY ≜ E {(X − µX )(Y − µY )^T} and ΣY is the covariance of Y .

Z = F X + g is also Gaussian (because F and g are known constants which only
scale and shift the distribution of X). Hence,

ẑMMSE (y) = µZ + ΣZY Σ_Y^{−1} (y − µY ).

(We write them here as x̂ and ẑ for brevity.) Let's find µZ :

µZ = E {Z} = E {F X + g} = F E {X} + g = F µX + g.

The cross-covariance is:

ΣZY = E {(Z − µZ )(Y − µY )^T}
    = E {(F X + g − F µX − g)(Y − µY )^T}
    = F E {(X − µX )(Y − µY )^T}
    = F ΣXY .                                                           (6)

Substituting µZ and ΣZY into ẑMMSE (y), we get ẑ = F x̂ + g.

The variance of the estimation error is available from the derivation of the MMSE
estimator. Recall that the distribution of X|Y in the case of jointly Gaussian X and
Y is normal; its mean is the MMSE estimate x̂(y) and its variance is the variance of
the estimation error, i.e., Var {x − x̂}. Therefore, for ẑ, the variance of the
estimation error is ΣZ − ΣZY Σ_Y^{−1} Σ_ZY^T , where

ΣZ = F ΣX F^T .                                                         (7)

Therefore, using (6) and (7), the variance of the estimation error of ẑ is

F ΣX F^T − F ΣXY Σ_Y^{−1} Σ_XY^T F^T = F (ΣX − ΣXY Σ_Y^{−1} Σ_XY^T) F^T .

Since the mean of the estimation error is zero (E {Z − ẑ} = µZ − µZ = 0), the mean-
square estimation error is the same as the variance of the estimation error.
Note that this is exactly Σ_{z−ẑ} = F Σ_{x−x̂} F^T .
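A small simulation (not part of the original handout) can verify both claims in the scalar case; the means, covariances, F, and g below are arbitrary illustrative choices.

```python
import numpy as np

# Sanity check of z_hat = F*x_hat + g and of the error variance
#   Sigma_Z - Sigma_ZY * Sigma_Y^{-1} * Sigma_ZY^T = F * Sigma_{x - x_hat} * F^T.
# X and Y are scalars here, so all matrices reduce to numbers; the parameters
# below are arbitrary illustrative choices.
rng = np.random.default_rng(7)
trials = 500_000

mu_x, mu_y = 1.0, -0.5
sig_x, sig_xy, sig_y = 2.0, 0.8, 1.0      # Var(X), Cov(X, Y), Var(Y)
F, g = 3.0, 2.0

xy = rng.multivariate_normal([mu_x, mu_y],
                             [[sig_x, sig_xy], [sig_xy, sig_y]], size=trials)
x, y = xy[:, 0], xy[:, 1]
z = F * x + g

x_hat = mu_x + sig_xy / sig_y * (y - mu_y)     # MMSE estimate of X from Y
z_hat = F * x_hat + g                          # claimed MMSE estimate of Z

print("empirical Var(Z - Z_hat)             ~", np.var(z - z_hat))
print("F^2 * (Sigma_X - Sigma_XY^2/Sigma_Y) =", F**2 * (sig_x - sig_xy**2 / sig_y))
```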

Table 1: Average temperatures (in F ) at Boston, MA. (source: National
Weather Service Eastern Region)

Year Average Year Average Year Average Year Average


1872 53.0 1899 50.1 1926 49.0 1953 53.6
1873 48.2 1900 50.9 1927 51.8 1954 51.4
1874 48.6 1901 49.0 1928 51.3 1955 51.4
1875 46.5 1902 49.7 1929 51.4 1956 50.6
1876 47.9 1903 49.5 1930 52.3 1957 52.5
1877 50.1 1904 47.1 1931 53.0 1958 50.0
1878 50.2 1905 49.1 1932 52.4 1959 51.8
1879 48.4 1906 50.0 1933 50.9 1960 51.4
1880 50.2 1907 48.7 1934 48.8 1961 51.0
1881 49.6 1908 51.1 1935 48.9 1962 49.8
1882 48.9 1909 50.5 1936 49.8 1963 51.0
1883 47.9 1910 50.8 1937 51.2 1964 50.1
1884 49.0 1911 50.9 1938 51.1 1965 49.6
1885 47.6 1912 50.6 1939 49.8 1966 51.4
1886 48.3 1913 52.3 1940 48.5 1967 49.5
1887 48.5 1914 49.7 1941 51.2 1968 50.4
1888 47.3 1915 51.2 1942 50.7 1969 51.2
1889 50.6 1916 49.7 1943 49.9 1970 50.9
1890 49.1 1917 47.9 1944 50.9 1971 51.2
1891 50.4 1918 49.8 1945 51.0 1972 50.4
1892 49.4 1919 51.1 1946 51.7 1973 53.0
1893 47.9 1920 50.0 1947 51.2 1974 50.9
1894 50.3 1921 52.4 1948 50.7 1975 52.8
1895 49.8 1922 51.3 1949 53.6 1976 52.2
1896 49.2 1923 50.3 1950 51.2 1977 52.5
1897 49.9 1924 50.4 1951 52.2 1978 50.3
1898 50.8 1925 51.7 1952 52.5 1979 52.1

Table 1: (Continued)

Year Average Year Average Year Average Year Average


1980 50.6 1986 50.8 1992 50.2 1998 53.0
1981 51.5 1987 50.4 1993 51.6 1999 52.7
1982 51.0 1988 51.3 1994 52.2 2000 50.6
1983 53.2 1989 50.3 1995 51.4 2001 52.5
1984 51.6 1990 53.2 1996 50.9 2002 56.2
1985 51.0 1991 53.4 1997 50.9
