1 Sufficient Statistics
A statistic $T$ is sufficient for $\theta$ when the conditional distribution of the sample given $T = t$,
\[
f(x_1, x_2, \ldots, x_n \mid T = t; \theta) = \frac{f(x_1, x_2, \ldots, x_n, t; \theta)}{g(t; \theta)},
\]
is free of $\theta$, where $g(t; \theta)$ denotes the distribution of $T$.
For a random sample $X_1, \ldots, X_n$ from the Bernoulli distribution with success probability $p$, the joint distribution is
\[
\prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum_{i=1}^{n} x_i}(1-p)^{\,n-\sum_{i=1}^{n} x_i},
\]
while $T = \sum_{i=1}^{n} X_i$ has the binomial distribution
\[
g(t; p) = \binom{n}{t} p^{t}(1-p)^{n-t}, \qquad \text{where } \sum_{i=1}^{n} x_i = t, \quad \text{for } t = 0, 1, \ldots, n.
\]
\[
f(x_1, x_2, \ldots, x_n \mid T = t; p) = \frac{f(x_1, x_2, \ldots, x_n, t; p)}{g(t; p)}
= \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t} p^{t}(1-p)^{n-t}} = \frac{1}{\binom{n}{t}}.
\]
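As a quick numerical check (a sketch, not part of the original notes; the helper name `conditional_prob` is ours), the ratio above can be verified for small $n$ by direct enumeration: the conditional probability of every 0/1 sequence given its sum $t$ equals $1/\binom{n}{t}$, no matter what $p$ is.

```python
from itertools import product
from math import comb

# Illustration, not from the notes: for a Bernoulli(p) sample, the conditional
# probability of any particular 0/1 sequence given its sum t is 1 / C(n, t),
# the same for every value of p.
def conditional_prob(seq, p):
    n, t = len(seq), sum(seq)
    joint = p**t * (1 - p)**(n - t)                   # f(x1,...,xn; p)
    marginal = comb(n, t) * p**t * (1 - p)**(n - t)   # g(t; p)
    return joint / marginal

for p in (0.2, 0.5, 0.9):
    for seq in product((0, 1), repeat=4):
        assert abs(conditional_prob(seq, p) - 1 / comb(4, sum(seq))) < 1e-12
print("conditional distribution is free of p")
```

The assertion passing for several values of $p$ illustrates exactly the sufficiency property: once $t$ is fixed, $p$ cancels from the conditional distribution.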
For a random sample $X_1, X_2$ from the Poisson distribution with parameter $\lambda$, the joint distribution is
\[
\prod_{i=1}^{2} f(x_i; \lambda) = \frac{\lambda^{x_1 + x_2} e^{-2\lambda}}{x_1! \, x_2!}, \qquad \text{where } x_1 + x_2 = t,
\]
and is zero elsewhere. Further, we know that the sum $T = X_1 + X_2$ has the Poisson distribution with parameter $2\lambda$:
\[
g(t; \lambda) = \frac{(2\lambda)^{t}}{t!} e^{-2\lambda}.
\]
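The claim that $T$ is Poisson$(2\lambda)$ can be confirmed numerically by convolving two Poisson$(\lambda)$ masses (a sketch of ours, not part of the notes):

```python
from math import exp, factorial

# Illustration, not from the notes: the convolution of two independent
# Poisson(lam) masses matches the Poisson(2*lam) mass for T = X1 + X2.
def pois(k, lam):
    return lam**k * exp(-lam) / factorial(k)

lam = 1.3
for t in range(10):
    conv = sum(pois(x1, lam) * pois(t - x1, lam) for x1 in range(t + 1))
    assert abs(conv - pois(t, 2 * lam)) < 1e-12
print("T = X1 + X2 is Poisson(2*lam)")
```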
Consequently, the conditional distribution of the sample, given $T = t$, is
\[
f(x_1, x_2 \mid T = t; \lambda) = \frac{f(x_1, x_2, t; \lambda)}{g(t; \lambda)}
= \frac{\lambda^{t} e^{-2\lambda}}{x_1! \, x_2!} \bigg/ \frac{(2\lambda)^{t} e^{-2\lambda}}{t!}
= \binom{t}{x_1} \Big(\frac{1}{2}\Big)^{t}.
\]
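The algebra above (cancel $e^{-2\lambda}$, then $\lambda^t/(2\lambda)^t = (1/2)^t$) can be checked directly; this is our sketch, not part of the notes:

```python
from math import comb, exp, factorial

# Illustration, not from the notes: the conditional mass f(x1, x2 | T = t)
# equals C(t, x1) * (1/2)^t, with no trace of lambda remaining.
lam, t = 1.7, 6
for x1 in range(t + 1):
    x2 = t - x1
    joint = lam**t * exp(-2 * lam) / (factorial(x1) * factorial(x2))  # f(x1, x2, t; lam)
    marginal = (2 * lam)**t * exp(-2 * lam) / factorial(t)            # g(t; lam)
    assert abs(joint / marginal - comb(t, x1) * 0.5**t) < 1e-12
print("conditional distribution given T = t is Binomial(t, 1/2)")
```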
An Interpretation of Sufficiency

In what sense does a sufficient statistic $T(X_1, \ldots, X_n)$ carry all of the sample information concerning the parameter $\theta$? Consider the following two-stage procedure for obtaining a random sample from $f(x; \theta)$.

Step 1. Observe the sufficient statistic $T$, which is distributed as $g(t; \theta)$. Note that the distribution of $T$ depends on the value of $\theta$ that prevails.

Step 2. Given the value $t$ observed in Step 1, generate $(X_1, \ldots, X_n)$ from the conditional distribution $f(x_1, x_2, \ldots, x_n \mid T = t)$, which by the definition of sufficiency is free of $\theta$. That is, at this second step, it is not necessary to know the value of $\theta$.
In the Poisson example, suppose we observe $T = X_1 + X_2 = 5$. The five events are assigned to the two components of the random sample $(X_1, X_2)$ according to the binomial fair-coin model for five flips. The sample $(x_1, 5 - x_1)$ is generated when $x_1$ is the value of the binomial variable, where
\[
f(x_1, x_2 \mid T = 5) = \binom{5}{x_1} \Big(\frac{1}{2}\Big)^{5}.
\]
This last experiment could be conducted by flipping a fair coin five times and letting $x_1$ be the number of heads. Clearly $T = X_1 + X_2$ has the information on $\lambda$, because the second step is conducted with random numbers unrelated to the parameter.
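The two-stage procedure can be simulated (our sketch, not part of the notes; `draw_poisson` is a standard multiplicative method, not something the notes specify): Step 1 draws $T$ from Poisson$(2\lambda)$, Step 2 splits the $t$ events with fair-coin flips, and the resulting $X_1$ should be distributed as Poisson$(\lambda)$.

```python
import random
from math import exp, factorial

# Illustration, not from the notes: simulate the two-stage procedure for the
# Poisson example.  Step 1 uses lambda; Step 2 is pure fair-coin flipping and
# never touches lambda.
random.seed(0)
lam, n_trials = 2.0, 200_000

def draw_poisson(mu):
    # Multiplicative (Knuth-style) Poisson draw; fine for small mu.
    L, k, p = exp(-mu), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

counts = {}
for _ in range(n_trials):
    t = draw_poisson(2 * lam)                          # Step 1: observe T
    x1 = sum(random.random() < 0.5 for _ in range(t))  # Step 2: fair-coin split
    counts[x1] = counts.get(x1, 0) + 1

for k in range(5):
    empirical = counts.get(k, 0) / n_trials
    exact = lam**k * exp(-lam) / factorial(k)          # X1 should be Poisson(lam)
    print(k, round(empirical, 3), round(exact, 3))
```

The printed empirical frequencies track the Poisson$(\lambda)$ mass, even though Step 2 never used $\lambda$.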
Fortunately, there is an easier way to obtain sufficient statistics. J. Neyman and R. A. Fisher produced equivalent conditions for a sufficient statistic to exist. The criterion connects sufficiency to a factorization of the joint distribution or density, $f(x_1, x_2, \ldots, x_n; \theta)$, of $X_1, X_2, \ldots, X_n$.
Neyman-Fisher Factorization Criterion A statistic $T(X_1, \ldots, X_n)$ is sufficient for $\theta$ if and only if the joint distribution or density can be factored as
\[
f(x_1, x_2, \ldots, x_n; \theta) = g(t, \theta)\, h(x_1, \ldots, x_n),
\]
where the function $g(t, \theta)$ depends only on $t = t(x_1, x_2, \ldots, x_n)$ and $\theta$, while $h(x_1, x_2, \ldots, x_n)$ does not depend on $\theta$.
We illustrate the use of the factorization criterion with the following examples.
Example Let $X$ have the binomial distribution with parameters $n$ and $p$.
Show that $X$ is a sufficient statistic for $p$.
Solution The distribution of $X$ is
\[
\binom{n}{x} p^{x}(1-p)^{n-x},
\]
which is already factored with $t(x) = x$, $g(x, p) = \binom{n}{x} p^{x}(1-p)^{n-x}$, and $h(x) = 1$, so $X$ is sufficient for $p$.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with unknown mean $\mu$ and known variance $\sigma_0^2$.
Obtain a sufficient statistic for $\mu$.
Solution The joint density is
\[
\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-(x_i - \mu)^2/2\sigma_0^2}
= \Big(\frac{1}{\sqrt{2\pi}\,\sigma_0}\Big)^{n} e^{-(n\mu^2 - 2\mu \sum_{i=1}^{n} x_i)/2\sigma_0^2}\, e^{-(\sum_{i=1}^{n} x_i^2)/2\sigma_0^2}.
\]
Taking the first term to be the function of $t$ and $\mu$, $g(t; \mu)$, and $h(x_1, x_2, \ldots, x_n) = e^{-(\sum_{i=1}^{n} x_i^2)/2\sigma_0^2}$, we obtain a factorization that shows that $T = \sum_{i=1}^{n} X_i$ is sufficient.
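One consequence of the factorization, easy to check numerically (our sketch, not part of the notes), is that two samples with the same value of $T = \sum x_i$ have a likelihood ratio that is free of $\mu$: the $g(t; \mu)$ factors cancel, leaving only the ratio of the $h$ terms.

```python
from math import exp, pi, sqrt

# Illustration, not from the notes: with sigma0 known, two samples having the
# same sum give a likelihood ratio that does not involve mu.
sigma0 = 1.5

def likelihood(xs, mu):
    out = 1.0
    for x in xs:
        out *= exp(-(x - mu)**2 / (2 * sigma0**2)) / (sqrt(2 * pi) * sigma0)
    return out

a = [0.0, 1.0, 2.0]   # sum = 3
b = [1.5, 0.5, 1.0]   # sum = 3 as well, but a different sample
ratios = [likelihood(a, mu) / likelihood(b, mu) for mu in (-2.0, 0.0, 3.5)]
assert max(ratios) - min(ratios) < 1e-9
print("likelihood ratio is free of mu:", ratios[0])
```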
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from an exponential family with probability distribution or density
\[
f(x_1; \theta) = e^{A(\theta) + B(\theta) t(x_1)}\, k(x_1).
\]
Then the joint distribution or density
\[
\prod_{i=1}^{n} f(x_i; \theta) = e^{nA(\theta) + B(\theta) \sum_{i=1}^{n} t(x_i)} \prod_{i=1}^{n} k(x_i)
\]
factors with $h(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} k(x_i)$, and the other term is $g(t; \theta)$, where $t = \sum_{i=1}^{n} t(x_i)$.
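For instance (our illustration, not from the notes), the Poisson$(\lambda)$ mass $\lambda^x e^{-\lambda}/x!$ fits this form with $A(\lambda) = -\lambda$, $B(\lambda) = \log\lambda$, $t(x) = x$, and $k(x) = 1/x!$:

```python
from math import exp, log, factorial

# Illustration, not from the notes: the Poisson(lam) mass written in the
# exponential-family form f(x; lam) = exp(A + B*t(x)) * k(x).
lam = 2.7
A = -lam
B = log(lam)
t = lambda x: x
k = lambda x: 1 / factorial(x)

for x in range(8):
    exp_family = exp(A + B * t(x)) * k(x)
    direct = lam**x * exp(-lam) / factorial(x)
    assert abs(exp_family - direct) < 1e-12
print("Poisson fits the exponential-family form")
```

The factorization then immediately gives $T = \sum_{i=1}^{n} X_i$ as a sufficient statistic, matching the earlier Poisson example.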
This last example applies to the normal distributions with known variance, the normal distributions with known mean, and the Poisson distributions, among others. Somewhat surprisingly, there is a one-dimensional sufficient statistic whatever the sample size.
The factorization criterion applies even when there is more than one parameter.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$ both unknown.
Obtain the sufficient statistics.
Solution
We first write
\[
(x_i - \mu)^2 = (x_i - \bar{x} + \bar{x} - \mu)^2 = (x_i - \bar{x})^2 + (\bar{x} - \mu)^2 + 2(x_i - \bar{x})(\bar{x} - \mu).
\]
Summing, we see that the last term vanishes since $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$, so
\[
\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2.
\]
The joint density is then
\[
\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x_i - \mu)^2/2\sigma^2}
= \Big(\frac{1}{\sqrt{2\pi}\,\sigma}\Big)^{n} e^{-\left(\sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2\right)/2\sigma^2},
\]
which is a function only of $\bar{x}$ and $\sum_{i=1}^{n} (x_i - \bar{x})^2$. Together, $\bar{x}$ and $\sum_{i=1}^{n} (x_i - \bar{x})^2$ are sufficient for $\mu$ and $\sigma^2$.
Note that, in the last example, we could also factor the joint density as a function of $\bar{x}$ and $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1)$. In fact, any one-to-one transformation of a sufficient statistic is still sufficient.
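The joint sufficiency of $(\bar{x}, \sum (x_i - \bar{x})^2)$ can be seen numerically as well (our sketch, not part of the notes): two genuinely different samples that share these two statistics produce identical normal likelihoods at every $(\mu, \sigma)$.

```python
from math import exp, pi, sqrt

# Illustration, not from the notes: samples sharing xbar and the sum of
# squared deviations give identical normal likelihoods for every (mu, sigma).
def likelihood(xs, mu, sigma):
    out = 1.0
    for x in xs:
        out *= exp(-(x - mu)**2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)
    return out

a = [-1.0, -1.0, 1.0, 1.0]         # xbar = 0, sum of squared deviations = 4
b = [-sqrt(2), 0.0, 0.0, sqrt(2)]  # xbar = 0, sum of squared deviations = 4

for mu, sigma in [(-1.0, 0.5), (0.0, 1.0), (2.5, 3.0)]:
    assert abs(likelihood(a, mu, sigma) - likelihood(b, mu, sigma)) < 1e-12
print("likelihood depends on the data only through xbar and the deviations")
```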