Hypergeometric Distribution
The probability distribution of a hypergeometric random variable is called
a hypergeometric distribution. This topic describes how hypergeometric random
variables, hypergeometric experiments, hypergeometric probability, and the
hypergeometric distribution are all related.
Notation
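The following notation is helpful when we talk about hypergeometric distributions and
hypergeometric probability.
N: The number of items in the population.
k: The number of items in the population that are classified as successes.
n: The number of items in the sample.
x: The number of items in the sample that are classified as successes.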
Hypergeometric Distribution
A hypergeometric random variable is the number of successes that result from a
hypergeometric experiment. The probability distribution of a hypergeometric random
variable is called a hypergeometric distribution.
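Given x, N, n, and k, we can compute the hypergeometric probability based on the following
formula:
$$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$$
Here $\binom{a}{b}$ is the number of combinations of a things taken b at a time.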
The variance is $\frac{n k (N - k)(N - n)}{N^2 (N - 1)}$.
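As a quick numerical check of the probability formula and the variance expression, here is a minimal Python sketch. The function names `hypergeom_pmf` and `hypergeom_variance` and the test parameters are illustrative assumptions, not part of any library or of the original text; only `math.comb` from the standard library is used.

```python
from math import comb


def hypergeom_pmf(x, N, n, k):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items, k of which are successes."""
    # Outcomes outside the feasible range have probability zero.
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def hypergeom_variance(N, n, k):
    """Closed-form variance: n k (N - k)(N - n) / [N^2 (N - 1)]."""
    return n * k * (N - k) * (N - n) / (N**2 * (N - 1))


# Illustrative parameters (assumed for this check, not from the text).
N, n, k = 20, 6, 8
probs = [hypergeom_pmf(x, N, n, k) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))

print(sum(probs))                        # should be very close to 1
print(var, hypergeom_variance(N, n, k))  # the two values should agree
```

Summing the probabilities over every possible x and comparing the variance computed from that sum with the closed form is a cheap way to catch a transcription error in either formula.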
Example 1
Suppose we randomly select 5 cards without replacement from an ordinary deck of playing
cards. What is the probability of getting exactly 2 red cards (i.e., hearts or diamonds)?
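Solution: This is a hypergeometric experiment in which we know the following:
N = 52, since there are 52 cards in the deck.
k = 26, since there are 26 red cards in the deck.
n = 5, since we randomly select 5 cards.
x = 2, since we want exactly 2 red cards among those selected.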
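Reading the parameters off the problem statement gives N = 52 cards in the deck, k = 26 red cards, n = 5 cards drawn, and x = 2 red cards wanted. The following minimal Python sketch plugs them into the hypergeometric formula; the variable names are illustrative, and only `math.comb` from the standard library is used.

```python
from math import comb

# Values read from the problem statement.
N = 52  # cards in the deck
k = 26  # red cards in the deck (hearts and diamonds)
n = 5   # cards drawn without replacement
x = 2   # red cards wanted in the sample

ways_red = comb(k, x)            # 325 ways to choose the 2 red cards
ways_black = comb(N - k, n - x)  # 2600 ways to choose the 3 black cards
ways_total = comb(N, n)          # 2598960 possible 5-card hands

probability = ways_red * ways_black / ways_total
print(round(probability, 5))  # 0.32513
```

So the probability of getting exactly 2 red cards is about 0.325.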
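As a second illustration, consider an urn containing green and red marbles. Define drawing a green
marble as a success and drawing a red marble as a failure (analogous to the binomial distribution).
If the variable N describes the number of all marbles in the urn (see contingency table below) and
K describes the number of green marbles, then N - K corresponds to the number of red marbles. In
this example, X is the random variable whose outcome is k, the number of green marbles actually
drawn in the experiment. Note that the symbols differ from those used above: here K is the number
of successes in the population and k is the number of successes in the sample.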
This situation is illustrated by the following contingency table:
                 drawn    not drawn        total
green marbles    k        K - k            K
red marbles      n - k    N + k - n - K    N - K
total            n        N - n            N
Now, assume (for example) that there are 5 green and 45 red marbles in the urn. Standing next to the
urn, you close your eyes and draw 10 marbles without replacement. What is the probability that exactly 4
of the 10 are green? Note that although we are looking at success/failure, the data are not accurately
modeled by the binomial distribution, because the probability of success on each trial is not the same, as
the size of the remaining population changes as we remove each marble.
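To see how much the changing success probability matters here, the sketch below compares the exact without-replacement probability with what a binomial model (10 independent draws, each with success probability 5/50) would give. This is only an illustration; the variable names are assumptions introduced for the example.

```python
from math import comb

N, K, n, k = 50, 5, 10, 4  # urn size, green marbles, draws, greens wanted

# Chance that the next draw is green depends on what was already removed.
p_first = K / N                    # 5/50 before any draw
p_after_green = (K - 1) / (N - 1)  # 4/49 if the first draw was green
p_after_red = K / (N - 1)          # 5/49 if the first draw was red

# Exact probability when drawing without replacement.
without_replacement = comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Binomial model: pretends every draw has the same chance K / N.
p = K / N
binomial = comb(n, k) * p**k * (1 - p) ** (n - k)

print(p_first, p_after_green, p_after_red)
print(without_replacement)  # about 0.00396
print(binomial)             # about 0.0112
```

With so few green marbles in the urn, the binomial model overstates the probability of 4 greens by nearly a factor of three.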
This problem is summarized by the following contingency table:
                 drawn        not drawn             total
green marbles    k = 4        K - k = 1             K = 5
red marbles      n - k = 6    N + k - n - K = 39    N - K = 45
total            n = 10       N - n = 40            N = 50
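The probability of drawing exactly k green marbles can be calculated by the formula
$$P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}.$$
Hence, in this example,
$$P(X = 4) = \frac{\binom{5}{4}\binom{45}{6}}{\binom{50}{10}} = \frac{5 \cdot 8145060}{10272278170} \approx 0.003965.$$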
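Intuitively we would expect it to be even more unlikely for all 5 marbles to be green:
$$P(X = 5) = \frac{\binom{5}{5}\binom{45}{5}}{\binom{50}{10}} = \frac{1 \cdot 1221759}{10272278170} \approx 0.000119.$$
As expected, drawing 5 green marbles is roughly 33 times less likely than drawing 4
(the ratio of the two probabilities is exactly 100/3).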
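The two probabilities and their ratio can be checked with exact rational arithmetic. The following is a minimal sketch; the helper name `prob_green` is a hypothetical choice for this example, and only the standard-library `fractions` and `math` modules are used.

```python
from fractions import Fraction
from math import comb

N, K, n = 50, 5, 10  # urn size, green marbles, marbles drawn


def prob_green(k):
    """Exact P(X = k) as a fraction, for k green marbles among the n drawn."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))


p4 = prob_green(4)
p5 = prob_green(5)

print(float(p4))       # about 0.003965
print(float(p5))       # about 0.000119
print(float(p4 / p5))  # about 33.3
```

Using `Fraction` keeps the ratio exact, so the comparison between the two cases does not depend on floating-point rounding.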