The Order Statistics and The Uniform Distribution

Posted on February 21, 2010

In this post, we show that the order statistics of the uniform distribution on the unit
interval follow beta distributions. This leads to a discussion of how to estimate
percentiles using order statistics. We also present an example of using order
statistics to construct confidence intervals for population percentiles. For a
discussion of the distributions of order statistics of random samples drawn from a
continuous distribution, see the previous post "The distributions of the order statistics".

Suppose that we have a random sample $X_1, X_2, \ldots, X_n$ of size $n$ from a continuous
distribution with common distribution function $F(x)$ and common density
function $f(x)$. The order statistics $Y_1 < Y_2 < \cdots < Y_n$ are obtained by
ordering the sample in ascending order. In other words, $Y_1$ is the
smallest item in the sample, $Y_2$ is the second smallest item in the sample, and so
on. Since this is random sampling from a continuous distribution, we assume that the
probability of a tie between two order statistics is zero. In the previous post "The
distributions of the order statistics", we derive the probability density function of the
$i^{th}$ order statistic $Y_i$:

$$f_{Y_i}(y) = \frac{n!}{(i-1)! \, (n-i)!} \, F(y)^{i-1} \, [1-F(y)]^{n-i} \, f(y)$$

The Order Statistics of the Uniform Distribution


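Suppose that the random sample $X_1, X_2, \ldots, X_n$ is drawn from the uniform distribution $U(0,1)$. Since the
distribution function of $U(0,1)$ is $F(y) = y$ where $0 < y < 1$, the probability density
function of the $i^{th}$ order statistic $Y_i$ is:

$$f_{Y_i}(y) = \frac{n!}{(i-1)! \, (n-i)!} \, y^{i-1} \, (1-y)^{n-i}$$

where $0 < y < 1$.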

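The above density function is from the family of beta distributions. In general, the pdf
of a beta distribution with parameters $a > 0$ and $b > 0$, and its mean and variance, are:

$$f_W(w) = \frac{\Gamma(a+b)}{\Gamma(a) \, \Gamma(b)} \, w^{a-1} \, (1-w)^{b-1}$$

$$E[W] = \frac{a}{a+b} \qquad Var[W] = \frac{ab}{(a+b)^2 \, (a+b+1)}$$

where $0 < w < 1$ and where $\Gamma(\cdot)$ is the gamma function.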

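Taking $a = i$ and $b = n-i+1$, the following shows the pdf of the $i^{th}$ order statistic of the uniform distribution
on the unit interval and its mean and variance:

$$f_{Y_i}(y) = \frac{\Gamma(n+1)}{\Gamma(i) \, \Gamma(n-i+1)} \, y^{i-1} \, (1-y)^{n-i}$$

$$E[Y_i] = \frac{i}{n+1} \qquad Var[Y_i] = \frac{i \, (n-i+1)}{(n+1)^2 \, (n+2)}$$

where $0 < y < 1$.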
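As a rough numerical check, the Python sketch below integrates the density of the $i^{th}$ uniform order statistic with a midpoint rule and compares the result with the beta mean $i/(n+1)$ and variance $i(n-i+1)/((n+1)^2(n+2))$. The function names and the choice $i = 3$, $n = 11$ are illustrative assumptions, not part of the original post.

```python
from math import factorial

def order_stat_pdf(y, i, n):
    """Density at y of the i-th smallest of n independent U(0,1) draws."""
    coeff = factorial(n) / (factorial(i - 1) * factorial(n - i))
    return coeff * y ** (i - 1) * (1 - y) ** (n - i)

def moments_by_midpoint(i, n, steps=20000):
    """Approximate the mean and variance of Y_i with a midpoint Riemann sum."""
    h = 1.0 / steps
    mean = second = 0.0
    for k in range(steps):
        y = (k + 0.5) * h                  # midpoint of the k-th subinterval
        w = order_stat_pdf(y, i, n) * h    # probability mass of that slice
        mean += y * w
        second += y * y * w
    return mean, second - mean ** 2

def moments_closed_form(i, n):
    """Mean and variance of the Beta(i, n - i + 1) distribution."""
    mean = i / (n + 1)
    var = i * (n - i + 1) / ((n + 1) ** 2 * (n + 2))
    return mean, var

if __name__ == "__main__":
    # Illustrative values only: third order statistic in a sample of 11.
    print(moments_by_midpoint(3, 11))
    print(moments_closed_form(3, 11))
```

The two printed pairs should agree to several decimal places; any remaining difference is discretization error from the Riemann sum.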

Estimation of Percentiles
In descriptive statistics, we define the sample percentiles using the order statistics
(even though the term order statistics may not be used in a non-calculus based
introductory statistics course). For example, if the sample size is an odd integer
$n = 2m+1$, then the sample median is the $(m+1)^{th}$ order statistic $Y_{m+1}$. The preceding
discussion on the order statistics of the uniform distribution shows that this
approach is a sound one.

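Suppose we have a random sample $X_1, X_2, \ldots, X_n$ of size $n$ from an arbitrary continuous distribution
with distribution function $F$ and density function $f$.
The order statistics listed in ascending order are:

$$Y_1 < Y_2 < \cdots < Y_n$$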
For each $i \le n$, consider $W_i = F(Y_i)$. Since the distribution function $F$ is a
non-decreasing function, the $W_i$ are also in ascending order:

$$W_1 < W_2 < \cdots < W_n$$
It can be shown that if $F$ is the distribution function of a continuous random variable
$X$, then the transformation $F(X)$ follows the uniform distribution $U(0,1)$. Then the
following transformed random sample:

$$F(X_1), F(X_2), \ldots, F(X_n)$$

is drawn from the uniform distribution $U(0,1)$. Furthermore, $W_1 < W_2 < \cdots < W_n$, where $W_i = F(Y_i)$, are the order
statistics for this random sample. By the preceding discussion,
$E[W_i] = E[F(Y_i)] = \frac{i}{n+1}$. Note that $F(Y_i)$ is the area under the density function $f$
and to the left of $Y_i$. Thus $F(Y_i)$ is a random area and $E[F(Y_i)] = \frac{i}{n+1}$ is the
expected area under the density curve to the left of $Y_i$. Recall that $f$ is the
common density function of the original sample $X_1, X_2, \ldots, X_n$.
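The claim that the expected area to the left of the $i^{th}$ order statistic is $i/(n+1)$ can be illustrated by simulation. The sketch below assumes, purely for illustration, that the underlying distribution is exponential with rate 1, so that $F(y) = 1 - e^{-y}$; the sample size, number of trials and seed are likewise arbitrary choices.

```python
import math
import random

def mean_transformed_order_stats(n, trials, seed=0):
    """Average F(Y_i), i = 1..n, over many samples of size n from Exp(1)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    totals = [0.0] * n
    for _ in range(trials):
        sample = sorted(rng.expovariate(1.0) for _ in range(n))
        for i, y in enumerate(sample):
            totals[i] += 1.0 - math.exp(-y)  # F(y) for the Exp(1) distribution
    return [t / trials for t in totals]

if __name__ == "__main__":
    n = 5  # illustrative sample size
    averages = mean_transformed_order_stats(n, trials=20000)
    for i, avg in enumerate(averages, start=1):
        # simulated average area versus the theoretical value i / (n + 1)
        print(i, round(avg, 3), round(i / (n + 1), 3))
```

Any other continuous distribution could be substituted, provided its own distribution function replaces `1 - exp(-y)`; the averages should still settle near $i/(n+1)$.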

For example, suppose the sample size $n$ is an odd integer where $n = 2m+1$. Then
the sample median is $Y_{m+1}$. Note that $E[F(Y_{m+1})] = \frac{m+1}{n+1} = \frac{m+1}{2m+2} = \frac{1}{2}$. Thus if we choose
$Y_{m+1}$ as a point estimate for the population median, $Y_{m+1}$ is expected to be above the
bottom 50% of the population and is expected to be below the upper 50% of the
population.

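Furthermore, $E[F(Y_i) - F(Y_{i-1})]$ is the expected area under the density curve and between
$Y_{i-1}$ and $Y_i$. This expected area is:

$$E[F(Y_i) - F(Y_{i-1})] = \frac{i}{n+1} - \frac{i-1}{n+1} = \frac{1}{n+1}$$

The expected area under the density curve and above the maximum order statistic
$Y_n$ is:

$$E[1 - F(Y_n)] = 1 - \frac{n}{n+1} = \frac{1}{n+1}$$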

Consequently, here is an interesting observation about the order statistics
$Y_1 < Y_2 < \cdots < Y_n$. The order statistics divide the area under the density
curve and above the x-axis into $n+1$ areas. On average, each of these areas is $\frac{1}{n+1}$.
As a result, it makes sense to use order statistics as estimators of percentiles. For
example, we can use $Y_i$ as the $(100p)^{th}$ percentile of the sample where $p = \frac{i}{n+1}$.
Then $Y_i$ is an estimator of the population percentile $\tau_p$ where the area under the
density curve and to the left of $\tau_p$ is $p$. In the case that $(n+1)p$ is not an integer,
we interpolate between two order statistics: if $k < (n+1)p < k+1$ for some integer $k$, then
we interpolate between $Y_k$ and $Y_{k+1}$.
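The estimation rule above can be sketched in Python as follows. The function name `percentile_estimate`, the use of linear interpolation, and the 11-point data set are illustrative assumptions; the rank $(n+1)p$ is rounded slightly to suppress floating-point noise.

```python
import math

def percentile_estimate(sample, p):
    """Estimate the 100p-th percentile from the order statistics of a sample.

    Uses the rank r = (n + 1) * p. When r is an integer, the estimate is the
    r-th order statistic; otherwise it is a linear interpolation between the
    two neighbouring order statistics.
    """
    ys = sorted(sample)
    n = len(ys)
    r = round((n + 1) * p, 9)  # rounding guards against tiny float errors
    if r < 1 or r > n:
        raise ValueError("rank (n + 1) * p must lie between 1 and n")
    k = math.floor(r)
    frac = r - k
    if k == n:  # r equals n exactly, so there is no upper neighbour
        return ys[-1]
    return ys[k - 1] + frac * (ys[k] - ys[k - 1])

if __name__ == "__main__":
    data = list(range(1, 12))  # hypothetical sample: 1, 2, ..., 11
    print(percentile_estimate(data, 0.25))
    print(percentile_estimate(data, 0.50))
    print(percentile_estimate(data, 0.75))
```

Percentiles whose rank falls below 1 or above $n$ are rejected rather than extrapolated, which is a design choice of this sketch. Python's standard `statistics.quantiles` with `method='exclusive'` is documented as using the same $i/(n+1)$ convention.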

Example
Suppose we have a random sample of size $n = 11$ drawn from a continuous
distribution. Find estimators for the median, first quartile and third quartile. Find an
estimate for the $85^{th}$ percentile. Construct an 87% confidence interval for the
$40^{th}$ percentile.

With $n + 1 = 12$, the estimator for the median is the sixth order statistic $Y_6$. The estimator for the first quartile ($25^{th}$
percentile) is the third order statistic $Y_3$. The estimator for the third quartile ($75^{th}$
percentile) is the ninth order statistic $Y_9$. Based on the preceding discussion, the
expected areas under the density curve to the left of $Y_3$, $Y_6$ and $Y_9$ are 0.25, 0.5 and
0.75, respectively.

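To find the $85^{th}$ percentile, note that $(n+1)p = 12 \times 0.85 = 10.2$. Thus we interpolate
between $Y_{10}$ and $Y_{11}$. In our example, we use linear interpolation, though taking the arithmetic
average of $Y_{10}$ and $Y_{11}$ is also a valid approach. The following is an estimate of the
$85^{th}$ percentile:

$$\hat{\tau}_{0.85} = Y_{10} + 0.2 \, (Y_{11} - Y_{10}) = 0.8 \, Y_{10} + 0.2 \, Y_{11}$$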

To find the confidence interval, consider the probability $P(Y_2 < \tau_{0.4} < Y_7)$ where
$\tau_{0.4}$ is the $40^{th}$ percentile. Consider the event $X \le \tau_{0.4}$ as a success with probability of
success $p = 0.4$. For $Y_2 < \tau_{0.4} < Y_7$ to happen, there must be at least 2 successes and
fewer than 7 successes in the binomial distribution with $n = 11$ and $p = 0.4$. Thus we
have:

$$P(Y_2 < \tau_{0.4} < Y_7) = \sum_{k=2}^{6} \binom{11}{k} \, (0.4)^k \, (0.6)^{11-k} = 0.8704$$

Thus the interval $(Y_2, Y_7)$ can be taken as the 87% confidence interval for $\tau_{0.4}$. This is
an example of a distribution-free confidence interval because nothing is assumed
about the underlying distribution in the construction of the confidence interval.
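The binomial argument can be checked with a few lines of Python. The sketch below assumes the values $n = 11$, $p = 0.4$ and the interval $(Y_2, Y_7)$, which are consistent with the 87% confidence level stated in the example; the function name is hypothetical.

```python
from math import comb

def order_stat_interval_coverage(n, p, lower, upper):
    """Probability that the 100p-th percentile lies between Y_lower and Y_upper.

    The event occurs when the number of observations at or below the
    percentile is at least `lower` and at most `upper - 1`; that count
    follows a Binomial(n, p) distribution.
    """
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(lower, upper)
    )

if __name__ == "__main__":
    # Assumed example values: n = 11, p = 0.4, interval (Y_2, Y_7).
    print(round(order_stat_interval_coverage(11, 0.4, 2, 7), 4))  # 0.8704
```

Because only the binomial count enters the calculation, the same function gives the coverage of any interval between two order statistics, whatever the underlying continuous distribution.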
