Probability Theory: 1.1. Space of Elementary Events, Random Events
Probability theory
1.1. Space of elementary events, random events
Probability theory is the branch of mathematics concerned with the analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events.
A random experiment is an experiment that produces random outcomes. For
example, throwing a die is a random experiment in which each trial produces a
random outcome from six possible outcomes, i.e., faces with one through six spots.
2 definition. A random event in probability theory is any fact that may or may not occur as a result of an experiment with a random outcome.
A random event is a higher-level outcome that may depend on multiple experiments and multiple outcomes of those experiments. For example, consider a game consisting of two random experiments: throwing a die and throwing a coin. A player is to throw the die twice and the coin once. A player who gets the face with one spot in both die-throwings and a head in the coin-throwing wins the grand prize. In this game, the random event of interest is winning the grand prize. This event would occur if the trials produce the following outcomes: one spot in both of the die-throwings and a head in the coin-throwing. In this example, the event depends on multiple experiments and multiple outcomes.
The simplest possible result of an experiment is called an elementary event (for instance, the appearance of heads or tails when throwing a coin).
5 definition. A certain event is an event that must happen as a result of the experiment without fail.
Example. When we throw a die, the certain event is "the die falls on one of its faces".
Example. When we throw a die, the impossible event is "the number 7 falls".
When events are not mutually exclusive (inclusive events, i.e., non-mutually exclusive events), the word "or" allows for the possibility of both events happening.
Events are collectively exhaustive if all the possibilities for outcomes are exhausted, and at least one of those outcomes must occur. For example, there are theoretically only two possibilities for flipping a coin. Flipping a head and flipping a tail are collectively exhaustive events. Events can be both mutually exclusive and collectively exhaustive. In the case of flipping a coin, flipping a head and flipping a tail are also mutually exclusive events. Both outcomes cannot occur for a single trial (i.e., when a coin is flipped only once).
If several events can happen as a result of an experiment, and none of them is more possible than the others according to objective conditions, then such events are called equally likely events. Examples of equally likely events are: the appearance of a two, an ace, or a knave when a card is drawn from a pack; the appearance of any number from 1 to 6 when a die is thrown; etc.
1) The sample space Ω, which is the set of all possible outcomes of an experiment.
1) P(Ω) = 1;
2) P(A) ≥ 0;
1.3. Statistical definition of probability
Intuitively, the probability of an event is supposed to measure the long-term
relative frequency of the event. Specifically, suppose that we repeat the experiment
indefinitely. (Note that this actually creates a new, compound experiment.) For an
event A in the basic experiment, let n(A) denote the number of times A occurred
(the frequency of A) in the first N runs. Thus,
P_N(A) = n(A)/N    (1)
is the relative frequency of A in the first N runs. If we have chosen the correct
probability measure for the experiment, then in some sense we expect that the
relative frequency of each event should converge to the probability of the event:
It follows that if we have the data from N runs of the experiment, the observed relative frequency P_N(A) can be used as an approximation for P(A). This approximation is called the statistical definition of probability.
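As a quick illustration of the statistical definition, the following sketch (in Python; the helper name `relative_frequency` is ours, not the text's) estimates P(A) for the event A = "a fair die shows a six" by its relative frequency in N simulated runs:

```python
import random

random.seed(0)  # fixed seed so the runs are reproducible

def relative_frequency(num_runs):
    # n(A) / N: count how often the simulated die shows a six
    hits = sum(1 for _ in range(num_runs) if random.randint(1, 6) == 6)
    return hits / num_runs

for n in (100, 10_000, 1_000_000):
    # the estimates approach P(A) = 1/6 ≈ 0.1667 as N grows
    print(n, relative_frequency(n))
```

Larger N brings the relative frequency closer to the true probability 1/6, which is exactly the convergence the statistical definition appeals to.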
1) P(∅) = 0;
1.6. Elements of combinatorics
In English we use the word "combination" loosely, without thinking whether the order of things is important. In other words:
1) "My fruit salad is a combination of apples, grapes and bananas." We don't care what order the fruits are in; they could also be "bananas, grapes and apples" or "grapes, apples and bananas"; it's the same fruit salad.
2) "The combination to the safe is 472." Now we do care about the order. "724" would not work, nor would "247". It has to be exactly 4-7-2.
1.6.1. Permutation
P(n_1, n_2, ..., n_k) = n! / (n_1! n_2! ... n_k!),    n_1 + n_2 + ... + n_k = n.
2. Permutation without Repetition. A permutation without repetitions of n different elements is denoted by P_n, where P_n = n!.
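Both permutation counts can be checked with a short standard-library sketch (the function names below are illustrative, not from the text):

```python
from math import factorial

def permutations_no_repetition(n):
    # P_n = n!
    return factorial(n)

def permutations_with_repetition(*group_sizes):
    # P(n_1, ..., n_k) = n! / (n_1! n_2! ... n_k!), where n = n_1 + ... + n_k
    n = sum(group_sizes)
    denom = 1
    for size in group_sizes:
        denom *= factorial(size)
    return factorial(n) // denom

print(permutations_no_repetition(4))       # 24 orderings of 4 distinct items
print(permutations_with_repetition(2, 1))  # "AAB": 3 distinct arrangements
```

When every group has size 1, the two formulas agree, since dividing by 1! k times changes nothing.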
1.6.2. Arrangement
* No Repetition: for example the first three people in a running race. You can’t
be first and second.
Example. In the lock above, there are 10 numbers to choose from (0, 1, ..., 9) and you choose 3 of them:
So, the formula is simply n^r, where n is the number of things to choose from, and you choose r of them (Repetition allowed, order matters).
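For the lock example, the n^r count is easy to confirm by brute-force enumeration (a Python sketch; the 10 digits and 3 positions come from the example above):

```python
from itertools import product

n, r = 10, 3  # 10 digits, choose 3, repetition allowed, order matters
codes = list(product(range(n), repeat=r))

# n^r = 10^3 = 1000 possible lock codes
assert len(codes) == n ** r
print(len(codes))
```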
2. Arrangements without Repetition. In this case, you have to reduce
the number of available choices each time. For example, what order could 16 pool
balls be in? After choosing, say, number ”14” you can’t choose it again. So, your
first choice would have 16 possibilities, and your next choice would then have 15
possibilities, then 14, 13, etc. And the total permutations would be:
But maybe you don’t want to choose them all, just 3 of them, so that would be
only:
16 × 15 × 14 = 3, 360
In other words, there are 3,360 different ways that 3 pool balls could be selected
out of 16 balls.
The formula is written:
A_n^r = n! / (n − r)!
where n is the number of things to choose from, and you choose r of them (No
repetition, order matters).
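A quick cross-check of this formula against explicit enumeration, using the pool-ball numbers from the text:

```python
from math import factorial
from itertools import permutations

def arrangements(n, r):
    # A_n^r = n! / (n - r)!
    return factorial(n) // factorial(n - r)

# 3 balls out of 16: matches 16 * 15 * 14 = 3360 from the text
assert arrangements(16, 3) == 3360

# cross-check against direct enumeration of ordered selections
assert arrangements(5, 2) == len(list(permutations(range(5), 2)))
```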
1.6.3. Combinations
There are also two types of combinations (remember the order does not matter now):
C_n^r = C_n^{n−r}
In other words choosing 3 balls out of 16, or choosing 13 balls out of 16 have the
same number of combinations.
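This symmetry is easy to verify numerically with Python's `math.comb`:

```python
from math import comb

# Choosing 3 balls out of 16 leaves 13 behind, so the counts coincide.
assert comb(16, 3) == comb(16, 13) == 560

# The symmetry C_n^r = C_n^{n-r} holds for every r
for r in range(17):
    assert comb(16, r) == comb(16, 16 - r)
```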
2. Combinations with Repetition. Let us say there are five flavors of ice
cream: banana, chocolate, lemon, strawberry and vanilla. You can have three
scoops. How many variations will there be?
Let’s use letters for the flavors: {b, c, l, s, v}. Example selections would be
(And just to be clear: There are n = 5 things to choose from, and you choose
r = 3 of them. Order does not matter, and you can repeat!)
Think about the ice cream being in boxes. You could say "move past the first box, then take 3 scoops, then move along 3 more boxes to the end" and you will have 3 scoops of chocolate! OK, so instead of worrying about different flavors, we have a simpler problem to solve: "how many different ways can you arrange arrows and circles?" Notice that there are always 3 circles (3 scoops of ice cream) and 4 arrows (you need to move 4 times to go from the 1st to the 5th container). So (being general here) there are r + (n − 1) positions, and we want to choose r of them to have circles. This is like saying "we have r + (n − 1) pool balls and want to choose r of them". In other words, it is now like the pool balls problem, but with slightly changed numbers. And you would write it like this:
C̄_n^r = (r + (n − 1))! / (r! (n − 1)!)
where n is the number of things to choose from, and you choose r of them (Repetition
allowed, order doesn’t matter).
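The ice-cream count can be confirmed directly: `combinations_with_replacement` from Python's itertools enumerates exactly these unordered selections with repetition:

```python
from math import comb
from itertools import combinations_with_replacement

flavors = ["b", "c", "l", "s", "v"]  # n = 5 flavors
n, r = len(flavors), 3               # r = 3 scoops

selections = list(combinations_with_replacement(flavors, r))

# Stars-and-bars: (r + n - 1)! / (r! (n - 1)!) = C(r + n - 1, r)
assert len(selections) == comb(r + n - 1, r) == 35
print(len(selections))
```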
2 and 3 for instance. We must subtract 16 multiples of 6, 10 multiples of 10 and 6
multiples of 15. It seems as if 50 + 33 + 20 − 16 − 10 − 6 = 71 is the final answer, but
it is not! The multiples of 30 were counted 3 times and eliminated 3 times. They
are not accounted for. We have to add 3 multiples of 30 to get the correct answer:
50 + 33 + 20 − 16 − 10 − 6 + 3 = 74.
The inclusion-exclusion principle tells us how to keep track of what to add and
what to subtract in problems like the above:
Let S be a finite set, and suppose there is a list of r properties that every element of
S may or may not have. We call S1 the subset of elements of S that have property
1; S1,2 the subset of elements in S that have properties 1 and 2, etc. Notice that
∪Si is the subset of elements of S that have at least one of the r properties.
|∪ S_i| = Σ_{i=1}^{r} |S_i| − Σ_{1≤i<j≤r} |S_{i,j}| + Σ_{1≤i<j<k≤r} |S_{i,j,k}| − ... + (−1)^{r−1} |S_{1,2,...,r}|.
* ...
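The worked count above (multiples of 2, 3 or 5 among 1..100) can be replayed in a short sketch; the helper name `multiples` is ours, not the text's:

```python
# Count integers in 1..100 divisible by 2, 3, or 5: add the single
# counts, subtract the pairwise overlaps, add back the triple overlap.
N = 100

def multiples(d):
    # how many multiples of d lie in 1..N
    return N // d

total = (multiples(2) + multiples(3) + multiples(5)      # 50 + 33 + 20
         - multiples(6) - multiples(10) - multiples(15)  # - 16 - 10 - 6
         + multiples(30))                                # + 3
assert total == 74

# brute-force check of the same count
assert total == sum(1 for x in range(1, N + 1)
                    if x % 2 == 0 or x % 3 == 0 or x % 5 == 0)
```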
9 definition. The probability of occurrence of an event A under the condition that an event B takes place is called the conditional probability and is calculated by the formula:

P(A|B) = P(A ∩ B) / P(B),    (6)

where A, B ⊂ Ω, P(B) ≠ 0.
Example. We have the same example as above, but now we calculate the probability using the formula:

P(A|B) = P(A ∩ B) / P(B) = (1/6) / (4/6) = 1/4.
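Formula (6) can also be applied to a simple single-die example of our own choosing (not necessarily the one in the text): take A = "a two appears" and B = "an even number appears", so P(A|B) = (1/6)/(1/2) = 1/3:

```python
from fractions import Fraction

# Illustrative single-die example (our choice, not from the text):
# A = "a two appears", B = "an even number appears".
omega = range(1, 7)
B = [w for w in omega if w % 2 == 0]
A_and_B = [w for w in B if w == 2]

p_B = Fraction(len(B), 6)              # P(B) = 3/6 = 1/2
p_A_and_B = Fraction(len(A_and_B), 6)  # P(A ∩ B) = 1/6
p_A_given_B = p_A_and_B / p_B          # formula (6)
assert p_A_given_B == Fraction(1, 3)
```

Knowing the roll is even leaves three equally likely faces, of which one is the two, which is why the answer is 1/3 rather than 1/6.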
H1 ∪ H2 ∪ ... ∪ Hn = Ω (7)
then for any event A the probability P(A) can be calculated using the total probability formula:

P(A) = P(A|H_1)P(H_1) + P(A|H_2)P(H_2) + ... + P(A|H_n)P(H_n) = Σ_{i=1}^{n} P(A|H_i)P(H_i).    (8)
P(H_i|A) = P(H_i ∩ A) / P(A).    (9)
We can calculate the numerator from our given information by
Since one and only one of the events H1 , H2 , . . . , Hm can occur, we can write the
probability of A as
Although this is a very famous formula, we will rarely use it. If the number of
hypotheses is small, a simple tree measure calculation is easily carried out.
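The total probability formula (8) and Bayes' formula (9) can be illustrated with a small hypothetical two-urn example (the urn contents below are our assumption, not from the text):

```python
from fractions import Fraction

# Hypothetical setup: two urns chosen with equal probability;
# urn 1 holds 3 white balls of 4, urn 2 holds 1 white ball of 4.
# H_i = "urn i is chosen", A = "a white ball is drawn".
p_H = {1: Fraction(1, 2), 2: Fraction(1, 2)}
p_A_given_H = {1: Fraction(3, 4), 2: Fraction(1, 4)}

# Total probability (8): P(A) = sum_i P(A|H_i) P(H_i)
p_A = sum(p_A_given_H[i] * p_H[i] for i in p_H)

# Bayes (9): P(H_1|A) = P(A|H_1) P(H_1) / P(A)
p_H1_given_A = p_A_given_H[1] * p_H[1] / p_A

print(p_A, p_H1_given_A)  # 1/2 3/4
```

Drawing a white ball makes urn 1 three times as likely as urn 2, exactly the "updating of hypotheses" that Bayes' formula captures.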
11 definition. Two events A and B are independent if both A and B have positive probability and if

P(A|B) = P(A)

and

P(B|A) = P(B).

As noted above, if both P(A) and P(B) are positive, then each of the above equations implies the other, so to see whether two events are independent, only one of these equations must be checked.
The following theorem provides another way to check for independence.
1 proposition. If P(A) > 0 and P(B) > 0, then A and B are independent if and
only if
P(A ∩ B) = P(A)P(B) (13)
Proof. Assume first that P(A ∩ B) = P(A)P(B). Then

P(A|B) = P(A ∩ B) / P(B) = P(A)P(B) / P(B) = P(A).

Also,

P(B|A) = P(A ∩ B) / P(A) = P(A)P(B) / P(A) = P(B).

Therefore, A and B are independent. Conversely, if A and B are independent, then P(A|B) = P(A), and so P(A ∩ B) = P(A|B)P(B) = P(A)P(B).
1) in each trial one of two events can occur: the event A, or the opposite event Ā;
2) the trials are independent; intuitively, the outcome of one trial has no influence on the outcome of another trial;
3) the probability of the event A is the same in all trials and equal to p = P(A) (the probability of the opposite event Ā in all trials is equal to q = P(Ā)).
that outcomes on any one trial do not affect those on another, we assign the same
probabilities at each level of the tree. An outcome w for the entire experiment
will be a path through the tree. For example, w3 represents the outcomes SFS.
Our frequency interpretation of probability would lead us to expect a fraction p of
successes on the first experiment; of these, a fraction q of failures on the second; and,
of these, a fraction p of successes on the third experiment. This suggests assigning
probability pqp to the outcome w3 . Thus, the probability that the three events S
on the first trial, F on the second trial, and S on the third trial occur is the product
of the probabilities for the individual events.
b(3, p, 0) = q 3
b(3, p, 1) = 3pq 2
b(3, p, 2) = 3p2 q
b(3, p, 3) = p3
We can, in the same manner, carry out a tree measure for n experiments and determine b(n, p, k) for the general case of n Bernoulli trials.
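The n = 3 table above follows from the general Bernoulli formula; a brief sketch (the value p = 0.5 is chosen arbitrarily for the check):

```python
from math import comb

def b(n, p, k):
    # Bernoulli formula: b(n, p, k) = C(n, k) p^k q^(n-k), with q = 1 - p
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

p = 0.5
# reproduces the n = 3 table: q^3, 3pq^2, 3p^2q, p^3
for k in range(4):
    print(k, b(3, p, k))

# the four probabilities cover all outcomes, so they sum to 1
assert abs(sum(b(3, p, k) for k in range(4)) - 1.0) < 1e-12
```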
2 proposition (Bernoulli Formula). Suppose we are doing n independent trials, where each trial has only two possible outcomes, A (success) and Ā (failure), with p = P(A) and q = P(Ā) = 1 − p, where 0 < p < 1. The probability b(n, p, k) of the event which corresponds to k 'successes' in n Bernoulli trials is

b(n, p, k) = C_n^k p^k q^{n−k}.    (14)
1) the probability that after n independent Bernoulli trials the event A occurs at least k_1 and at most k_2 times is equal to

P{k_1 ≤ k ≤ k_2} = Σ_{k=k_1}^{k_2} C_n^k p^k q^{n−k};    (15)
2) the probability that after n independent Bernoulli trials the event A occurs one or more times is equal to

P{1 ≤ k ≤ n} = 1 − q^n.    (16)
Proof.

P{1 ≤ k ≤ n} = Σ_{k=1}^{n} C_n^k p^k q^{n−k} = Σ_{k=0}^{n} C_n^k p^k q^{n−k} − C_n^0 p^0 q^{n−0} = 1 − q^n.
np − q ≤ k* ≤ np + p.    (17)
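Formula (16) and the bound (17) on the most probable number of successes k* can be verified numerically; a sketch with arbitrarily chosen n = 10 and p = 0.3:

```python
from math import comb

def b(n, p, k):
    # b(n, p, k) = C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
q = 1 - p

# (16): P{1 <= k <= n} = 1 - q^n
assert abs(sum(b(n, p, k) for k in range(1, n + 1)) - (1 - q**n)) < 1e-12

# (17): the most probable number of successes k* satisfies
# np - q <= k* <= np + p; here 2.3 <= k* <= 3.3, so k* = 3
k_star = max(range(n + 1), key=lambda k: b(n, p, k))
assert n * p - q <= k_star <= n * p + p
```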