EE/Ma 126b Information Theory - Homework Set #4
4.1 One-bit quantization of a single Gaussian random variable.∗ Let the boundary be x. We use one symbol to represent all t ≤ x and another symbol for t > x. Since the squared-error measure is used, the two conditional expectations should be the reproduction points.¹ That is,

    x₀ = ∫_{−∞}^{x} [f(t)/(1/2 + A(x))] t dt  for t ≤ x,    x₁ = ∫_{x}^{∞} [f(t)/(1/2 − A(x))] t dt  for t > x,

where f(t) = e^{−t²/(2σ²)}/(√(2π) σ) is the probability density function and A(x) = ∫₀^{x} f(t) dt. Thus the distortion D is the weighted sum of the two conditional variances:
    D(x) = (1/2 + A(x)) [ ∫_{−∞}^{x} (f(t)/(1/2 + A(x))) t² dt − ( ∫_{−∞}^{x} (f(t)/(1/2 + A(x))) t dt )² ]
         + (1/2 − A(x)) [ ∫_{x}^{∞} (f(t)/(1/2 − A(x))) t² dt − ( ∫_{x}^{∞} (f(t)/(1/2 − A(x))) t dt )² ]
         = ∫_{−∞}^{∞} f(t) t² dt − σ⁴f²(x)/(1/2 + A(x)) − σ⁴f²(x)/(1/2 − A(x))
         = σ² − 4σ⁴f²(x)/(1 − 4A²(x)) = σ² − 2σ²e^{−x²/σ²}/(π(1 − 4A²(x))).        (1)
Now we want to prove e^{−x²/σ²} ≤ 1 − 4A²(x). Without loss of generality, assume x ≥ 0.
    4A²(x) = ( ∫_{−x}^{x} f(t) dt )² = ∫_{−x}^{x} ∫_{−x}^{x} f(u)f(v) du dv
           ≤ (1/(2πσ²)) ∫_{−√2 x}^{√2 x} ( ∫_{−√(2x²−u²)}^{√(2x²−u²)} e^{−(u²+v²)/(2σ²)} dv ) du        (2)
           = (1/(2πσ²)) ∫₀^{2π} dθ ∫₀^{√2 x} e^{−r²/(2σ²)} r dr = 1 − e^{−x²/σ²}.

Inequality (2) holds because the square [−x, x]² is contained in the disk of radius √2 x centered at the origin; the last line follows by changing to polar coordinates.
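As a sanity check on Eq. (1), the sketch below (plain Python with σ = 1; the function names are my own) compares the closed form against a direct midpoint-rule evaluation of the weighted conditional variances. It uses A(x) = (1/2) erf(x/(√2 σ)) for the Gaussian integral.

```python
import math

def closed_form_D(x, sigma=1.0):
    # Eq. (1): D(x) = sigma^2 - 2 sigma^2 exp(-x^2/sigma^2) / (pi (1 - 4 A(x)^2)),
    # with A(x) = integral of the N(0, sigma^2) pdf from 0 to x = (1/2) erf(x/(sqrt(2) sigma)).
    A = 0.5 * math.erf(x / (math.sqrt(2.0) * sigma))
    return sigma**2 - 2.0 * sigma**2 * math.exp(-x**2 / sigma**2) / (math.pi * (1.0 - 4.0 * A * A))

def numeric_D(x, sigma=1.0, n=100_000, lim=10.0):
    # Direct evaluation of the weighted sum of the two conditional variances.
    f = lambda t: math.exp(-t * t / (2.0 * sigma**2)) / (math.sqrt(2.0 * math.pi) * sigma)
    dt = 2.0 * lim / n
    ts = [-lim + (i + 0.5) * dt for i in range(n)]
    total = 0.0
    for cell in ([t for t in ts if t <= x], [t for t in ts if t > x]):
        w = sum(f(t) for t in cell) * dt               # probability mass of the cell
        m = sum(t * f(t) for t in cell) * dt / w       # reproduction point = conditional mean
        total += sum((t - m) ** 2 * f(t) for t in cell) * dt
    return total

# The closed form matches the direct computation; at x = 0 it gives sigma^2 (1 - 2/pi).
for x in (0.0, 0.5, 1.3):
    assert abs(closed_form_D(x) - numeric_D(x)) < 1e-3
assert abs(closed_form_D(0.0) - (1.0 - 2.0 / math.pi)) < 1e-9
```

Setting x = 0 (the symmetric choice) recovers the familiar one-bit Gaussian quantizer distortion σ²(1 − 2/π).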
4.2 Rate distortion for a uniform source with Hamming distortion. The distortion under the Hamming measure is D̄ = E d(X, X̂) = Pr{d(X, X̂) = 1} = Pr{X ≠ X̂}. By Fano's inequality, H(X|X̂) ≤ H(D̄) + D̄ log(m − 1), so

    I(X; X̂) = H(X) − H(X|X̂) ≥ log m − H(D̄) − D̄ log(m − 1).

Notice that

    d[log m − H(D̄) − D̄ log(m − 1)]/dD̄ = log (D̄/(1 − D̄)) − log(m − 1),

which is less than 0 when 0 < D̄ < 1 − 1/m, so the bound is decreasing in D̄ there. Thus when D̄ ≤ D ≤ 1 − 1/m, we have

    I(X; X̂) ≥ log m − H(D) − D log(m − 1).
We can design distributions p(x|x̂) and p(x̂) to achieve the minimum of I(X; X̂). Let p(x̂) be the uniform distribution. For 0 ≤ D ≤ 1 − 1/m, set

    p(x|x̂) = 1 − D        if x = x̂,
              D/(m − 1)    if x ≠ x̂.
Thus

    p(X = x) = Σ_{x̂} p(X = x | X̂ = x̂) p(X̂ = x̂)
             = (1/m) [ p(X = x | X̂ = x) + Σ_{x̂ ≠ x} p(X = x | X̂ = x̂) ]
             = (1/m) [ 1 − D + (m − 1) · D/(m − 1) ] = 1/m,
and the distortion is

    Pr{X ≠ X̂} = 1 − Pr{X = X̂} = 1 − Σ_{x̂} p(X = x̂ | X̂ = x̂) p(X̂ = x̂) = 1 − (1 − D) = D.
So this distribution meets the requirements on both the distribution of X and the distortion, and now

    I(X; X̂) = H(X) − H(X|X̂)
             = H(X) − H(1 − D, D/(m − 1), …, D/(m − 1))
             = log m + (1 − D) log(1 − D) + (m − 1) · (D/(m − 1)) log (D/(m − 1))
             = log m − H(D) − D log(m − 1).
Thus R(D) = log m − H(D) − D log(m − 1) for 0 ≤ D ≤ 1 − 1/m. When D > 1 − 1/m, we can send nothing and simply choose X̂ at random; the distortion is then Pr{X ≠ X̂} = (m − 1)/m < D. So obviously R(D) = 0 when D > 1 − 1/m.
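To double-check the achievability calculation, a small script (base-2 logarithms; the function names are my own) builds the joint distribution above and verifies that its mutual information equals log m − H(D) − D log(m − 1):

```python
import math

def rate_distortion_hamming(m, D):
    # R(D) = log m - H(D) - D log(m - 1), in bits, for 0 <= D <= 1 - 1/m.
    H = 0.0 if D in (0.0, 1.0) else -D * math.log2(D) - (1 - D) * math.log2(1 - D)
    return math.log2(m) - H - D * math.log2(m - 1)

def mutual_information_of_test_channel(m, D):
    # I(X; Xhat) for p(xhat) uniform and p(x|xhat) = 1 - D if x == xhat, else D/(m-1).
    # The marginal p(x) is uniform (shown above), so p(x) = p(xhat) = 1/m.
    I = 0.0
    for xh in range(m):
        for x in range(m):
            p_joint = (1.0 / m) * ((1 - D) if x == xh else D / (m - 1))
            if p_joint > 0:
                I += p_joint * math.log2(p_joint / ((1.0 / m) * (1.0 / m)))
    return I

assert abs(mutual_information_of_test_channel(4, 0.2) - rate_distortion_hamming(4, 0.2)) < 1e-9
```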
4.3 Erasure distortion. Let {0, E, 1} denote the set X̂, where 'E' stands for erasure. Since d(0, 1) = d(1, 0) = ∞, we must have p(0, 1) = p(1, 0) = 0 for a finite distortion, so X is determined whenever X̂ ∈ {0, 1}; moreover d(x, E) = 1 gives Pr{X̂ = E} = E d(X, X̂) ≤ D. Thus

    H(X|X̂) = Pr{X̂ = E} H(X|X̂ = E) ≤ Pr{X̂ = E} ≤ D,

and

    I(X; X̂) = H(X) − H(X|X̂) ≥ 1 − D.        (3)

When p(X|X̂ = E) = 1/2 and D ≤ 1, we can set p_X̂(0) = p_X̂(1) = (1 − D)/2 and p_X̂(E) = D. Then p_X(x) = p_X̂(x) + (1/2) p_X̂(E) = 1/2, meeting the requirement X ∼ Bernoulli(1/2), and equality holds in (3). Thus R(D) = 1 − D when 0 ≤ D ≤ 1. When D > 1, obviously R(D) = 0.
A simple strategy to achieve R(D) is to erase X at random with probability 1 − R(D) = D. Since X is uniformly distributed, this strategy gives p(0, E) = p(1, E) = (1/2)(1 − R(D)) and p(0, 1) = p(1, 0) = 0. Thus, by the discussion above, the rate is 1 − (1 − R(D)) = R(D).
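The equality in (3) can also be checked numerically. The sketch below (my own function name; base-2 logs) builds the test-channel joint distribution above and confirms that the rate is exactly 1 − D bits:

```python
import math

def erasure_test_channel_rate(D):
    # Joint p(x, xhat) from p_Xhat(0) = p_Xhat(1) = (1 - D)/2, p_Xhat(E) = D,
    # with X = Xhat when Xhat != E, and X ~ Bernoulli(1/2) given Xhat = E.
    joint = {(0, 0): (1 - D) / 2, (1, 1): (1 - D) / 2,
             (0, 'E'): D / 2, (1, 'E'): D / 2}
    px = {0: 0.5, 1: 0.5}                                  # X is Bernoulli(1/2), as required
    pxh = {0: (1 - D) / 2, 1: (1 - D) / 2, 'E': D}
    return sum(p * math.log2(p / (px[x] * pxh[xh]))
               for (x, xh), p in joint.items() if p > 0)

# Equality in (3): I(X; Xhat) = 1 - D over the whole range 0 <= D <= 1.
for D in (0.0, 0.25, 0.6, 1.0):
    assert abs(erasure_test_channel_rate(D) - (1 - D)) < 1e-12
```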
4.4 Bounds on the rate distortion function for squared-error distortion. With D as the upper bound on the distortion, we have the Shannon lower bound

    R(D) ≥ h(X) − (1/2) log(2πeD).

For Z ∼ N(0, Dσ²/(σ² − D)) independent of X, and X̂ = ((σ² − D)/σ²)(X + Z), the distortion is

    E(X − X̂)² = E[ (D/σ²) X − ((σ² − D)/σ²) Z ]²
              = (D/σ²)² E X² + ((σ² − D)/σ²)² E Z²
              = (D/σ²)² σ² + ((σ² − D)/σ²)² · Dσ²/(σ² − D)
              = D²/σ² + D(σ² − D)/σ² = D.
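A quick Monte Carlo sketch (my own variable names; σ² = 4 and D = 1 chosen arbitrarily with D < σ²) confirms that this test channel attains distortion D:

```python
import math
import random

random.seed(0)
sigma2, D = 4.0, 1.0                  # arbitrary example with D < sigma^2
n = 100_000
var_z = D * sigma2 / (sigma2 - D)     # Var(Z) = D sigma^2 / (sigma^2 - D)
sq_err = 0.0
for _ in range(n):
    x = random.gauss(0.0, math.sqrt(sigma2))
    z = random.gauss(0.0, math.sqrt(var_z))       # Z independent of X
    xhat = (sigma2 - D) / sigma2 * (x + z)        # the test channel above
    sq_err += (x - xhat) ** 2
mse = sq_err / n
assert abs(mse - D) < 0.05            # sample mean of (X - Xhat)^2 is close to D
```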
And it is surprising to find out that for a constant a ≠ 0, I(X; aY) = I(X; Y); you can find the proof in the footnote.† Thus I(X; X̂) = I(X; X + Z), where

    Var(X + Z) = Var(X) + Var(Z) = σ² + Dσ²/(σ² − D) = σ⁴/(σ² − D).

Since the Gaussian distribution maximizes differential entropy for a given variance,

    I(X; X + Z) = h(X + Z) − h(Z) ≤ (1/2) log(2πe · σ⁴/(σ² − D)) − (1/2) log(2πe · Dσ²/(σ² − D)),

and hence

    R(D) ≤ (1/2) log (σ²/D).
Since a Gaussian random variable with variance σ² attains R(D) = (1/2) log(σ²/D), which is the maximum of R(D), it is harder to describe the Gaussian random variable than any other random variable with the same variance.
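As a consistency check (assuming X itself is Gaussian, so the upper bound is tight), the familiar additive-Gaussian-noise formula I(X; X + Z) = (1/2) log(1 + Var(X)/Var(Z)) reproduces (1/2) log(σ²/D):

```python
import math

sigma2, D = 4.0, 1.0                  # arbitrary example with D < sigma^2
var_z = D * sigma2 / (sigma2 - D)
# For Gaussian X with variance sigma^2 and independent Gaussian Z,
# I(X; X + Z) = (1/2) log(1 + Var(X)/Var(Z)); here 1 + sigma^2/var_z = sigma^2/D.
I = 0.5 * math.log2(1 + sigma2 / var_z)
assert abs(I - 0.5 * math.log2(sigma2 / D)) < 1e-12
```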
4.5 Properties of optimal rate distortion code. The conditions of equality are listed at the right side:

† Let Z = aY. Then f_Z(z) = (1/|a|) f_Y(z/a) and f_{X,Z}(x, z) = (1/|a|) f_{X,Y}(x, z/a). Thus, with a change of variables,

    h(X|aY) = − ∫_S f_{X,Z}(x, z) log [ f_{X,Z}(x, z)/f_Z(z) ] dx dz
            = − ∫_S (1/|a|) f_{X,Y}(x, z/a) log [ f_{X,Y}(x, z/a)/f_Y(z/a) ] dx dz
            = − ∫_S f_{X,Y}(x, y) log [ f_{X,Y}(x, y)/f_Y(y) ] dx dy
            = h(X|Y),

so I(X; aY) = h(X) − h(X|aY) = h(X) − h(X|Y) = I(X; Y). Alternatively,

    I(X; aY) = h(aY) − h(aY|X) = h(Y) + log|a| − h(Y|X) − log|a| = I(X; Y).