Probability
CHAPTER FOUR
PROBABILITY AND PROBABILITY DISTRIBUTIONS
4.1. Probability Theory
Basic Concepts
Probability is a numerical measure of the likelihood, or chance, that an uncertain event will occur. It can
assume any value between 0 and 1, inclusive. A probability near 0 indicates that the outcome is very unlikely
to occur, while a probability near 1 indicates that the event is almost certain to happen. At the extremes, a
probability of 0 means the event cannot occur, and a probability of 1 means the event will always happen.
Thus, probabilities are non-negative fractions or decimal values no greater than 1. Probability is the basis
for inferential statistics.
Experiment
An experiment is any well-defined situation or procedure that results in one or more possible outcomes. Put
simply, it is any process that generates well-defined outcomes. For instance, tossing a coin, rolling a die,
and playing a football match can all be taken as experiments.
Outcome
An outcome is a particular result of an experiment. For example, getting a head or a tail is a possible
outcome of the coin-tossing experiment. Winning, losing, or a tie/draw are the possible outcomes of the
football experiment, and getting 1, 2, 3, 4, 5, or 6 are the possible outcomes of the die-rolling experiment.
Events
An event is a specific collection of basic outcomes, that is, a set containing one or more of the basic
outcomes from the sample space. An event identifies one or more outcomes of an experiment. For
example, in the die-rolling experiment, any collection of one or more of the six possible outcomes, such as
getting an even number, can be taken as an event.
Sample Space
A sample space is a complete roster or listing of all possible outcomes of an experiment. The sample space
of an experiment is usually illustrated either by a list or by some type of diagram, such as a Venn diagram
or a tree diagram.
Exercise
Identify the experiment, outcomes, events and sample space for the following questions.
1. Sitting for an exam ………………………………. Experiment
Scoring A, B, C, D, F ……………………………... Possible outcomes
[A, B, C, D, F] ……………………………………... Sample Space
Scoring B and above ……………………………… Event
C and above ……………………………… Event
D or below ……………………………….. Event
Events
1. Independent events
Two or more events are independent if the occurrence or nonoccurrence of one of the events does not
affect the occurrence or nonoccurrence of the others. Certain experiments, such as rolling dice, yield
independent events; each die is independent of the other. Whether a 6 is rolled on the first die has no
influence on whether a 6 is rolled on the second die. Coin tosses are always independent of each other. The
possibility of getting a head on the first toss of a coin is independent of getting a head on the second toss.
The impact of independent events on the probability is that, if two events are independent, the probability
of attaining the second event is the same regardless of the outcome of the first event. The probability of
tossing a head is always ½ regardless of what was tossed previously. Thus, if someone tosses a coin six
times and gets six heads, the probability of tossing a head on the seventh time is ½, because coin tosses are
independent. In terms of symbolic notation, if X and Y are independent: P(X/Y) = P(X) and P(Y/X) = P(Y),
where P(X/Y) denotes the probability of X occurring given that Y has occurred, and P(Y/X) denotes the
probability of Y occurring given that X has occurred.
2. Dependent Events
Two or more events are dependent if the occurrence or nonoccurrence of one of the events affects the
occurrence or nonoccurrence of the others. Certain experiments, such as a single roll of a die, yield
dependent events: the occurrence of any one of the six outcomes rules out the occurrence of the other five.
The impact of dependent events on the probability is that, if two events are dependent, the probability of
attaining the second event changes with the outcome of the first event. In terms of symbolic
notation, if X and Y are dependent: P(X/Y) ≠ P(X) and P(Y/X) ≠ P(Y), where P(X/Y) denotes the probability
of X occurring given that Y has occurred, and P(Y/X) denotes the probability of Y occurring given that X has
occurred.
3. Mutually Exclusive Events
Two or more events are mutually exclusive if the occurrence of one precludes the occurrence of the others.
In the toss of a single coin, the events of heads and tails are mutually exclusive. The person tossing the coin
gets either a head or a tail but never both. The probability of two mutually exclusive events occurring at the
same time is zero. In terms of set notation, if events X and Y are mutually exclusive, P(X ∩ Y) = 0, that is,
the probability of X intersecting Y is zero.
Relating the above three types of events, mutually exclusive events must be dependent, but dependent
events need not be mutually exclusive. Events that are independent cannot be mutually exclusive.
Therefore, mutually exclusive implies dependence and independence implies not mutually exclusive, but no
other simple implications among these conditions hold true.
5. Complementary events
The complement of an event A is denoted A′ (read "not A"). All elementary events of an experiment not in A
comprise its complement. For example, if in rolling one die, event A is getting an even number, the
complement of A is getting an odd number. If event A is getting a 5 on the roll of a die, the complement of A
is getting a 1, 2, 3, 4, or 6. The complement of event A contains whatever portion of the sample space event A
does not contain.
Using the complement of an event can sometimes be helpful in solving for probabilities because of the rule:
P(A′) = 1 - P(A).
Principles of counting
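Counting the number of ways in which events may occur in an experiment plays a major role in probability.
Some rules for counting are presented in this section. The first of these is called the fundamental
principle of counting: if one event can occur in m ways and a second event can occur in n ways, the two
events together can occur in m × n ways.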
Permutations
Other important counting rules pertain to the arrangement of items with regard to their order. In
this case we use permutations.
Permutations are groups of items where both the composition of the groups and the order within a group
are important.
The number of permutations of n distinct items arranged x at a time is nPx = n!/(n - x)!, where n!, read n
factorial, is
n! = n(n - 1)(n - 2) … (1).
By definition, 0! = 1.
Combinations
Permutations concern arrangements in which both order and composition are important. In combinations, what
matters is the composition of the group, not the order of the items. The number of combinations of n distinct
items taken x at a time is nCx = n!/[x!(n - x)!].
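The two counting rules can be compared side by side with a short sketch. It assumes the standard formulas n!/(n - x)! for permutations and n!/[x!(n - x)!] for combinations; the function names and the five-letter illustration are hypothetical and are not part of the original notes.

```python
import math


def permutation_count(n: int, x: int) -> int:
    """Ordered arrangements of x items drawn from n distinct items: n!/(n-x)!."""
    return math.factorial(n) // math.factorial(n - x)


def combination_count(n: int, x: int) -> int:
    """Unordered selections of x items drawn from n distinct items: n!/(x!(n-x)!)."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))


# Hypothetical illustration: picking 2 letters out of A, B, C, D, E.
ordered = permutation_count(5, 2)    # AB and BA are counted separately
unordered = combination_count(5, 2)  # AB and BA are counted once
print(ordered, unordered)
```

Each unordered pair can be arranged in 2! ways, which is why the permutation count is twice the combination count here.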
Classical Method
The classical method of assigning probabilities is based on the assumption that each outcome is equally
likely to occur. Classical probability utilizes rules and laws. It involves an experiment and an event. The
definition assumes that all N possible outcomes have the same chance of occurring. In this method
probability values are assigned as follows:
P(e) = ne/N
where N is the total number of possible outcomes of the experiment and ne is the number of outcomes that
possess attribute e.
As ne can never be greater than N (no more than N outcomes in the population could possibly possess
attribute e), the highest value of any probability is 1. If the probability of an outcome occurring is 1, the
event is certain to occur. The smallest possible probability is zero. If none of the outcomes of N possibilities
possesses the desired characteristic, e, the probability is 0/N = 0, and the event is certain not to occur. The
range of possibilities for probabilities is:
0 ≤ P(e) ≤ 1
Thus the probabilities are non-negative proper fractions or non-negative decimal values less than or equal
to 1.
Probability by relative frequency of occurrence = (number of times an event has occurred) / (total number
of opportunities for the event to occur)
Relative frequency of occurrence is not based on rules or laws but on what has occurred in the past.
Subjective method
The subjective method of assigning probability is based on the feelings or insights of a person determining
the probability. Subjective probability comes from the person’s intuition or reasoning. Although not a
scientific approach to probability, the subjective method often is based on the accumulation of knowledge,
understanding, and experience stored and processed in the human mind. At times it is merely a guess. At
other times, subjective probability can potentially yield accurate probabilities.
Subjective probability can be a potentially useful way of tapping a person’s experience, knowledge, and
insight and using them to forecast the occurrence of some event. E.g. Weather forecast
Types of probabilities
There are four types of probabilities. These are:
Simple probability
Joint probability
Marginal probability
Conditional probability
Simple Probability
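Simple probabilities are relatively straightforward. They are obtained using the formula P(A) = n(A)/n, where
n(A) is the number of outcomes in which A occurs and n is the total number of outcomes, that is, by the
relative frequency method.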
Marginal probability
Marginal probability is denoted by P (E), where E is some event. A marginal probability is usually computed
by dividing some subtotal by the whole. An example of marginal probability is the probability that a student
is infected by HIV/AIDS. This probability is computed by dividing the number of students infected by
HIV/AIDS by the total number of students. The probability of a person wearing glasses is also a marginal
probability. This probability is computed by dividing the number of people wearing glasses by the total
number of people. A marginal probability is found in the margin of any joint probability table. It is the sum
of the joint probabilities for a single category of one attribute over all possible categories of another
attribute.
Example:
ABC Company manufactures window air conditioners in both a deluxe model (D) and a standard model (S).
An auditor engaged in a compliance audit of the firm is validating the sales account for the month April. She
has collected 200 invoices for the month, some of which were sent to wholesalers (W) and the remainders
to retailers (R). Of the 140 retail invoices, 28 are for the standard model. Only 24 of the wholesale invoices
are for the standard model. If the auditor selects one invoice at random, find the following probabilities.
a) The invoice selected is for the deluxe model.
b) The invoice selected is for the standard model.
c) The invoice selected is a wholesale invoice.
d) The invoice selected is a retail invoice.
Solution
               Wholesale, W      Retail, R         Total
Deluxe, D      36    0.18*       112   0.56*       148   0.74**
Standard, S    24    0.12*       28    0.14*       52    0.26**
Total          60    0.30**      140   0.70**      200
(* joint probability, ** marginal probability)
a) P(D) = 148/200 = 0.74        c) P(W) = 60/200 = 0.30
b) P(S) = 52/200 = 0.26         d) P(R) = 140/200 = 0.70
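The marginal probabilities in this example are subtotals divided by the grand total, which can be sketched as follows. The counts are the ones from the ABC Company invoices; the dictionary layout and the helper name `marginal_probability` are assumptions made only for illustration.

```python
# Invoice counts from the ABC Company example, keyed by (model, customer type).
invoice_counts = {
    ("D", "W"): 36, ("D", "R"): 112,
    ("S", "W"): 24, ("S", "R"): 28,
}


def marginal_probability(counts, position, label):
    """Sum the counts whose key has `label` at `position`, then divide by the grand total."""
    subtotal = sum(c for key, c in counts.items() if key[position] == label)
    return subtotal / sum(counts.values())


p_deluxe = marginal_probability(invoice_counts, 0, "D")
p_standard = marginal_probability(invoice_counts, 0, "S")
p_wholesale = marginal_probability(invoice_counts, 1, "W")
p_retail = marginal_probability(invoice_counts, 1, "R")
print(p_deluxe, p_standard, p_wholesale, p_retail)
```

Position 0 selects the model (D or S) and position 1 selects the customer type (W or R), mirroring the row and column margins of the table.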
Union probability
A second type of probability is the union of two events. Union probability is denoted by P(E1 ∪ E2), where
E1 and E2 are two events. P(E1 ∪ E2) is the probability that E1 will occur, or that E2 will occur, or that
both E1 and E2 will occur. An example of union probability is the probability that a person has HIV/AIDS
or cancer. To qualify for the union, the person has to have at least one of the diseases. Another
example is the probability that a person wears eyeglasses or is a soldier. All people wearing eyeglasses are
included in the union, along with all people who are soldiers, including soldiers who wear eyeglasses.
Joint probability
A third type of probability is the intersection of two events, or joint probability. A joint probability shows
the probability that an observation will possess two (or more) characteristics simultaneously. That is, it
measures the probability of two or more events occurring together. The joint probability of events E1 and
E2 occurring is denoted P(E1 ∩ E2). Sometimes P(E1 ∩ E2) is read as the probability of E1 and E2. To qualify
for the intersection, both events must occur. Joint probability ranges from 0 to 1, inclusive. The sum
of all joint probabilities in a joint probability table must equal 1.0. An example of joint probability is the
probability that a person has both HIV/AIDS and cancer. Having only one of the diseases is not sufficient.
A second example of joint probability is the probability that a person is a soldier and also wears
eyeglasses.
Conditional probability
The fourth type is conditional probability. Conditional probability is denoted by P(E1/E2). This expression
is read as: the probability that E1 will occur given that E2 is known to have occurred. The conditional
probability of an event E1, given event E2, is the ratio of the joint probability of the two events to the
marginal probability of E2:
P(E1/E2) = P(E1 ∩ E2)/P(E2)
Example:
Blue Nile University recently conducted a survey of undergraduate students in order to gather information
about the usage of the library. The population for this study included all 4000 undergraduate students
enrolled in the university. The library officers are interested in increasing usage, particularly among
females (F) and seniors (S) at the university. Of the 4000 students, 800 students are seniors, 1800 students
are females and 450 of the 1800 females are seniors.
Required:
1. What is the probability that a student selected at random is a senior given that the selected student is
female?
2. What is the probability that a student selected at random is female given that the selected student is
senior?
Solution:
               Senior, S         Non-Senior, N      Total
Female, F      450   0.1125      1350  0.3375       1800  0.45
Male, M        350   0.0875      1850  0.4625       2200  0.55
Total          800   0.20        3200  0.80         4000
1. P(S/F) = P(S ∩ F)/P(F) = 0.1125/0.45 = 0.25
2. P(F/S) = P(S ∩ F)/P(S) = 0.1125/0.20 = 0.5625
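The two required probabilities follow from dividing the joint probability by the marginal probability of the conditioning event. The sketch below uses the survey counts from the example; the function name `conditional_probability` and the zero-probability guard are assumptions added for illustration.

```python
def conditional_probability(joint: float, given: float) -> float:
    """P(E1/E2) = P(E1 and E2) / P(E2); undefined when P(E2) is zero."""
    if given == 0:
        raise ValueError("the conditioning event has probability zero")
    return joint / given


students = 4000
p_female = 1800 / students
p_senior = 800 / students
p_female_and_senior = 450 / students

p_senior_given_female = conditional_probability(p_female_and_senior, p_female)
p_female_given_senior = conditional_probability(p_female_and_senior, p_senior)
print(p_senior_given_female, p_female_given_senior)
```

The same joint probability appears in both calls; only the denominator changes, which is why P(S/F) and P(F/S) differ.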
The Bayes’ Rule
An extension of the conditional law of probabilities is Bayes' rule, which was developed by and named for
the English clergyman Thomas Bayes (1702-1761). Bayes' rule is a formula that extends the use of the law
of conditional probabilities to allow revision of original probabilities when new information becomes
available. The two core ideas in Bayes' rule are the prior probability and the posterior (revised) probability.
Prior probability - the initial probability, determined before new information is obtained. It is the
starting point for Bayes' theorem.
Posterior probability - a probability that has been revised based on new information; it is calculated
after the new information is obtained.
Bayes' theorem simplifies the computation of P(X/Y) when P(X ∩ Y) and P(Y) are not given directly.
Example:
1. A company has three machines, A, B and C, which all produce the same two parts, X and Y. Of all the parts
produced, machine A produces 60%, machine B produces 30%, and machine C produces the rest. 40% of
the parts made by machine A are part X, 50% of the parts made by machine B are part X, and 70% of the
parts made by machine C are part X. A part produced by this company is randomly sampled and is
determined to be an X part. With the knowledge that it is an X part, find the probabilities that the part came
from machine A, B or C.
Solution:
P(A) = 0.6     P(X/A) = 0.4     P(A/X) = ?
P(B) = 0.3     P(X/B) = 0.5     P(B/X) = ?
P(C) = 0.1     P(X/C) = 0.7     P(C/X) = ?
Method 1
Method 4 – Tree Diagram
Machine    P(machine)    Part    P(part/machine)    Joint probability
A          0.60          X       0.40               0.24
                         Y       0.60               0.36
B          0.30          X       0.50               0.15
                         Y       0.50               0.15
C          0.10          X       0.70               0.07
                         Y       0.30               0.03
                                 Total              1.00
P(X) = 0.24 + 0.15 + 0.07 = 0.46
P(A/X) = 0.24/0.46 = 0.52
P(B/X) = 0.15/0.46 = 0.33
P(C/X) = 0.07/0.46 = 0.15
Find: P(Y) = 0.36 + 0.15 + 0.03 = 0.54
P(A/Y) = 0.36/0.54 = 0.667
P(B/Y) = 0.15/0.54 = 0.278
P(C/Y) = 0.03/0.54 = 0.056
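The revision of the prior probabilities in the machine example can be sketched in a few lines. The sketch multiplies each prior by the matching conditional probability to get the joint probabilities, sums them to get P(X), and divides each joint probability by that sum. The function name `bayes_posteriors` and the dictionary layout are assumptions made for illustration, not the notes' own method.

```python
def bayes_posteriors(priors, likelihoods):
    """Return (posteriors, evidence) from P(source) and P(observation/source)."""
    joints = {s: priors[s] * likelihoods[s] for s in priors}
    evidence = sum(joints.values())  # P(observation), summed over all sources
    posteriors = {s: joint / evidence for s, joint in joints.items()}
    return posteriors, evidence


priors = {"A": 0.60, "B": 0.30, "C": 0.10}
p_x_given_machine = {"A": 0.40, "B": 0.50, "C": 0.70}
# Every part is either X or Y, so P(Y/machine) = 1 - P(X/machine).
p_y_given_machine = {m: 1 - p for m, p in p_x_given_machine.items()}

posteriors_x, p_x = bayes_posteriors(priors, p_x_given_machine)
posteriors_y, p_y = bayes_posteriors(priors, p_y_given_machine)
print(p_x, posteriors_x)
print(p_y, posteriors_y)
```

The same function applies to the order-filling exercise by using the three workers' shares of orders as priors and their error rates (or one minus the error rates) as the conditional probabilities.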
2. Bruk, Alemayehu and Yohannes fill orders in a fast food restaurant. Bruk incorrectly fills 20% of the orders
he takes, Alemayehu incorrectly fills 12% of the orders he takes, and Yohannes incorrectly fills 5% of the
orders he takes. Bruk fills 30% of all orders, Alemayehu fills 45% of all orders, and Yohannes fills 25% of all
orders. An order has just been filled.
a) What is the probability that Alemayehu filled the order? 0.45
b) If the order was filled by Yohannes, what is the probability that it was filled correctly? 0.95
c) Who filled the order is unknown, but the order was filled correctly. What are the revised probabilities
that Bruk, Alemayehu or Yohannes filled the order? 0.2748, 0.4533 and 0.2719
d) Who filled the order is unknown, but the order was filled incorrectly. What are the revised
probabilities that Bruk, Alemayehu or Yohannes filled the order? 0.4743, 0.4269 and 0.0988
3. A major league baseball team has four starting pitchers: Girma, Robel, Solomon, and Asrat. Each pitcher
starts every fourth game. The team wins 60% of all games that Girma starts, 45% of all games that Robel
starts, 35% of all games that Solomon starts, and 40% of all games that Asrat starts. An avid fan has just
returned from a three-week vacation in the wilderness and found out that the team played yesterday.
Laws of Probability
Additive Law
The general law of addition is used to find the probability of the union of two events, P(E1 ∪ E2). The
general law of addition is presented as follows:
P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
Special Rule of Addition
If two events are mutually exclusive, the probability of the union of the two events is the probability of the
first event plus the probability of the second event. Because mutually exclusive events do not intersect,
nothing has to be subtracted out. The formula is shown below.
P(E1 ∪ E2) = P(E1) + P(E2)
Example:
1. A husband and a wife, each 20 years old, are debating whether to set up a retirement program for themselves.
Benefits are paid to the man or woman at the age of 70. If both have died before reaching age 70, no
benefits are paid. Assume that the probability that a man aged 20 lives up to age 70 is approximately 0.6,
and that the probability that a woman aged 20 lives up to age 70 is approximately 0.7. If
the husband and wife join the program, what is the probability that either the man or the woman will
collect benefits? Assume that the chances of the man or woman dying are independent of each other.
Solution:
Let M = man lives up to age 70, W = woman lives up to age 70.
P(M) = 0.60     P(W) = 0.70
P(W ∪ M) = P(W) + P(M) - P(W ∩ M)
Since the two events are independent, the joint probability that both the man and the woman live up to
age 70 is equal to the product of the individual marginal probabilities:
P(W ∩ M) = P(M) × P(W) = 0.60 × 0.70 = 0.42
Therefore,
P(W ∪ M) = 0.70 + 0.60 - 0.42 = 0.88
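The retirement example combines the general law of addition with the product rule for independent events. A minimal sketch, assuming the survival probabilities 0.60 and 0.70 used in the solution; the function and variable names are illustrative only.

```python
def union_probability(p_a: float, p_b: float, p_a_and_b: float) -> float:
    """General law of addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


p_man = 0.60    # P(M): the man lives to age 70
p_woman = 0.70  # P(W): the woman lives to age 70

# Independence is assumed in the problem, so the joint probability is the product.
p_both = p_man * p_woman
p_either = union_probability(p_man, p_woman, p_both)

# Cross-check through the complement: benefits are unpaid only if both die.
p_neither = (1 - p_man) * (1 - p_woman)
print(p_both, p_either, 1 - p_neither)
```

The complement route gives the same answer because "at least one survives" is the complement of "neither survives".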
2. According to a recent study conducted by businessmen, 76% of all shareholders have some college
education. Suppose that 37% of all adults have some college education and that 22% of all adults are
shareholders. For a randomly selected adult (let A = owns shares of stock and B = has some college
education, so that P(A ∩ B) = 0.22 × 0.76 = 0.1672; A′ denotes the complement of A):
a) What is the probability that the person does not own shares of stock? 0.78
b) What is the probability that the person owns shares of stock or has some college education?
0.4228
c) What is the probability that the person has neither some college education nor owns shares of
stock? P(A′ ∩ B′) = 1 - P(A ∪ B) = 0.5772
d) What is the probability that the person does not own shares of stock or has no college education?
P(A′ ∪ B′) = 1 - P(A ∩ B) = 0.8328
e) What is the probability that the person owns shares of stock or has some college education
but not both? P(A ∪ B) - P(A ∩ B) = 0.4228 - 0.1672 = 0.2556
3. A 1999 survey of 20,000 sales professionals conducted by Ethiopian Telecommunication Corporation (ETC)
found that 15% of all sales professionals use home fax machines and 35% use mobile telephones. Suppose
that 1% of all sales professionals have both fax machines and use mobile telephones.
a) What is the probability that a randomly selected sales professional has a home fax machine or uses
a mobile telephone?
b) What is the probability that a randomly selected sales professional neither has a home fax machine
nor uses a mobile telephone?
c) Suppose that no sales professional has both a home fax machine and uses a mobile telephone. What
is the probability that a randomly selected sales professional has a home fax machine or uses a
mobile telephone?
Multiplicative law
The probability of the intersection of two events, P(E1 ∩ E2), is called the joint probability. The general
law of multiplication is used to find the probability of the intersection of two events, or joint probability.
The general law of multiplication is stated as follows:
P(E1 ∩ E2) = P(E1) × P(E2/E1) = P(E2) × P(E1/E2)
If E1 and E2 are independent, this reduces to P(E1 ∩ E2) = P(E1) × P(E2).
Example:
1. Test the matrix for the 200 executive responses to determine whether industry type is independent of
geographic location.
                                   Geographic Location
Industry type        North East, D    South East, E    Mid West, F    West, G    Total
Finance, A           24               10               8              14         56
Manufacturing, B     30               6                22             12         70
Communication, C     28               18               12             16         74
Total                82               34               42             42         200
Solution:
Select one industry type and one geographic location (say A, Finance, and G, West). Does P(A/G) = P(A)?
P(A/G) = P(A ∩ G)/P(G) = 0.07/0.21 = 0.33
P(A) = 56/200 = 0.28
Since P(A/G) ≠ P(A), industry type and geographic location are not independent.
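The independence check compares a conditional probability with the matching marginal probability. The sketch below does this for one row and one column of the executive survey table; the nested-dictionary layout, the function name `compare_conditional_to_marginal`, and the tolerance are assumptions for illustration.

```python
survey = {
    "Finance":       {"North East": 24, "South East": 10, "Mid West": 8,  "West": 14},
    "Manufacturing": {"North East": 30, "South East": 6,  "Mid West": 22, "West": 12},
    "Communication": {"North East": 28, "South East": 18, "Mid West": 12, "West": 16},
}


def compare_conditional_to_marginal(table, row, column, tolerance=1e-9):
    """Return (equal?, P(row), P(row/column)) for one row/column pair."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    column_total = sum(cells[column] for cells in table.values())
    row_total = sum(table[row].values())
    p_row = row_total / grand_total
    p_row_given_column = table[row][column] / column_total
    return abs(p_row_given_column - p_row) <= tolerance, p_row, p_row_given_column


equal, p_finance, p_finance_given_west = compare_conditional_to_marginal(
    survey, "Finance", "West"
)
print(equal, p_finance, p_finance_given_west)
```

A single unequal pair is enough to conclude that the two variables are not independent; concluding independence would require the equality to hold for every row and column pair.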
2. Considering the above problem, if a respondent is randomly selected from these data:
a) What is the probability that this executive is from the mid west? 0.21
b) What is the probability that a respondent is from the communication industry or from north east?
0.64
c) What is the probability that a respondent is from the south east or from the finance industry?
(34 + 56 - 10)/200 = 0.40
d) What is the probability that this executive is from the south east or the west? 0.38
3. The results of a survey asking, “Do you have a calculator and/or a computer in your home?” are as follows:
                   Calculator
                   Yes     No
Computer    Yes    46      3
            No     11      15
Is the variable calculator independent of the variable computer? Why or why not? No. P(calculator/computer)
= 46/49 = 0.94, which differs from P(calculator) = 57/75 = 0.76.
The conditional probability of E1 occurring given that E2 is known or has occurred is expressed as P(E1/E2).
Probability distribution: a listing of the possible values that a random variable can assume along with
their probabilities. It is any representation of the values of a random variable and the associated
probabilities. Depending on the type of random variable we deal with, there are two types
of probability distributions: discrete probability distributions and continuous probability distributions.
A discrete probability distribution is any representation of the values of a discrete random variable and the
associated probabilities. The most commonly used discrete probability distributions include the binomial,
hypergeometric and Poisson distributions.
To compute the probability of a given number of occurrences in a binomial distribution we use the binomial
formula. It is stated as follows:
P(x) = nCx × p^x × q^(n - x)
where n is the number of trials, x is the number of successes, p is the probability of success on a single
trial, q = 1 - p, and nCx = n!/[x!(n - x)!].
Example:
1. If we toss a coin three times, what is the probability of getting exactly two heads?
Solution:
In a single toss, the probability of getting a head or a tail is 0.5. In tossing the coin three times, the following
are the possible outcomes.
HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Exactly two heads occur in three of these outcomes (HHT, HTH and THH). The probability of getting exactly
two heads is, therefore, computed as
= (0.5*0.5*0.5) + (0.5*0.5*0.5) + (0.5*0.5*0.5)
= 0.125 * 3 = 0.375
Using the binomial formula
p = 0.50     q = 1 - 0.50 = 0.50     n = 3     x = 2
P(x=2) = nCx × p^x × q^(n - x)
= 3C2 × 0.5^2 × 0.5^1
= 3(0.25 × 0.5) = 3(0.125) = 0.375
There are three ways of choosing exactly two heads from a total of three trials.
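The binomial calculation can be sketched as a small function, assuming the standard binomial formula P(x) = nCx × p^x × q^(n - x) with q = 1 - p. The function name `binomial_pmf` is illustrative; `math.comb` supplies nCx.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


# Exactly two heads in three tosses of a fair coin.
p_two_heads = binomial_pmf(2, 3, 0.50)

# Six left-handed students in a sample of forty when p = 0.10.
p_six_left_handed = binomial_pmf(6, 40, 0.10)

# Exactly two of five wells strike oil when p = 0.30.
p_two_wells = binomial_pmf(2, 5, 0.30)

print(p_two_heads, p_six_left_handed, p_two_wells)
```

Summing `binomial_pmf` over every x from 0 to n should give 1, which is a quick sanity check on any set of inputs.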
2. A researcher wants to test the claim that 10% of all people are left-handed by randomly selecting forty
students at a university. What is the probability of getting six left handed students among forty?
Solution:
P = 0.10 q = 1 – 0.10 = 0.90 n = 40 x=6
P(x=6) = 0.1068
If 10% of the population is left-handed, about 10.68% of the time the researcher would get six who are left
handed in a sample of forty.
3. Based on past data, approximately 30% of the oil wells drilled in areas having a certain favorable geological
formation have struck oil. A company has identified 5 locations that possess this formation. Assuming
that the chance of striking oil at any location is independent of the others, calculate the probability that
exactly 2 of the 5 wells strike oil.
Solution:
P = 0.30     q = 1 - 0.30 = 0.70     n = 5     x = 2
P(x = 2) = 0.3087
If the probability of striking oil in areas having this favorable geological formation is 0.3, then about 31%
of the time exactly 2 of 5 wells drilled will strike oil.
4. The quality control department of a manufacturer tested the most recent batch of 1000 catalytic converters
produced and found 50 of them defective. Subsequently, an employee unwittingly/unintentionally mixed
the defective converters with the non-defective ones. If a sample of three converters is randomly selected
from the mixed batch, what is the probability that the employee may get one defective item?
Solution:
Before we try to solve this problem we have to check whether the assumptions of the binomial
distribution are satisfied. When sampling is done without replacement, one of the conditions is that the
sample size, n, must be less than five percent of the population size, N. In our case, the sample size is less
than 5% of the population size [3/1000 = 0.003 < 0.05], so we can use the binomial distribution to solve this
specific exercise.
N = 1000     n = 3     x = 1
p = R/N = 50/1000 = 0.05, where R is the number of successes (defective converters) in the population, N
q = 1 - 0.05 = 0.95
P(x=1) = 0.1354
If 5% of the converters are defective, then 13.54% of the time the quality control department
would get exactly 1 defective item in a sample of three converters.
5. A town has three ambulances for emergency transportation to a hospital. The probability that any one of
these will be available at a given time is 0.75. If a person calls for an ambulance, what is the probability that
an ambulance will be available?
Solution:
n = 3     p = 0.75     q = 0.25
The probability of getting at least one ambulance is calculated as one minus the probability of getting no
ambulance.
P(ambulance) = 1 - P(0 ambulances)
= 1 - (3C0 × 0.75^0 × 0.25^3)
= 1 - 0.0156
= 0.9844
Some binomial tables only show values of p up to 0.5. Thus, it would appear that these tables cannot be used
when the probability of success exceeds p = 0.5. However, such tables can be used by noting that the
probability of n - x failures is also the probability of x successes. That is, finding the probability of x
successes is equal to finding the probability of n - x failures. nCx and nC(n - x) are always equal.
Example:
Suppose that 70% of all cola drinkers select non diet colas. If 10 cola drinkers are randomly selected, what
is the probability that 4 of them will be diet cola drinkers?
Solution:
Finding the probability of 4 diet cola drinkers is equivalent to finding the probability of 6 non diet cola
drinkers.
n = 10 p= 0.7 q= 0.3 x= 6
P(x=6) = 0.2001
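The equivalence used in the cola example can be checked numerically. The sketch assumes the standard binomial formula; `binomial_pmf` is an illustrative name, and the two calls describe the same event from the "non-diet" and the "diet" point of view.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


n = 10
p_non_diet = 0.70
p_diet = 1 - p_non_diet

# Six non-diet drinkers out of ten is the same event as four diet drinkers out of ten.
p_six_non_diet = binomial_pmf(6, n, p_non_diet)
p_four_diet = binomial_pmf(4, n, p_diet)
print(p_six_non_diet, p_four_diet)
```

Because the second call uses p = 0.3, it corresponds to a value that a table limited to p ≤ 0.5 would contain.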
Finding the Probabilities that the Number of Successes X Lie In a Given Interval (Cumulative
Probabilities)
Cumulative probabilities are sums of individual probability values. The binomial formula
gives us the probability of exactly x successes in n trials (sample size n). To find
cumulative probabilities such as P(x ≥ 3), P(x ≤ 2), P(x > 10) or P(x1 ≤ X ≤ x2), e.g. P(10 ≤ X ≤ 20), we
should add the respective exact/individual probability values.
Example:
1. A project manager has determined that a subcontractor fails to deliver standard orders 20% of the time. The
project manager has six orders that his subcontractor has agreed to deliver. What is the probability that
a) The subcontractor will deliver all of the orders? 0.2621
b) The subcontractor will deliver at least four of the orders? 0.9011
c) The subcontractor will deliver exactly five orders? 0.3932
d) The subcontractor will fail to deliver at most two of the orders? 0.9011
e) What do you conclude from your answers in parts (b) and (d)? Finding the probability of x
successes is equal to finding the probability of n-x failures.
2. About 20% of all pro football players are injured during a given season. A team has four star players. What is
the probability that at least one of the star players gets injured?
Solution:
n=4 p= 0.2 q= 0.8 x≥ 1
P(x ≥ 1) = 1 - P(x ≤ 0) = 1 - P(x = 0)
= 1 - 0.4096 = 0.5904
3. A lawyer estimates that 40% of the cases in which she represented the defendant were won. If the lawyer is
presently representing 10 defendants in different cases, what is the probability that at least 5 of the cases
will be won? What are you assuming here?
Solution:
The assumption we are making here is that the cases the lawyer is handling are independent of one another.
With this assumption:
n = 10 p= 0.4 q= 0.6 x≥ 5
P (x≥5) = P(x=5) + P(x=6) + P(x=7) + P(x=8) + P(x=9) +P(x=10)
= 0.2007 + 0.1115 + 0.0425 + 0.0106 + 0.0016 + 0.0001
= 0.3670
Using Cumulative Binomial Probability Table
If a cumulative probability table is given, one must subtract the cumulative probability of X-1 from the
cumulative probability of X to get the exact/individual probability value of X. That is, for a random
variable X that takes whole-number values:
P (X=a) = P (X≤a) – P (X≤a-1)
E.g. P (X=3) = P (X≤3) – P (X≤2)
P (X≥a) = 1- P (X≤a-1)
E.g. P (X≥3) = 1- P (X≤2)
P (X>a) = 1- P (X≤a)
E.g. P (X>3) = 1- P (X≤3)
P (a1≤X≤a2) = P (X≤a2) - P (X≤a1-1)
E.g. P (10≤X≤20) = P (X≤20) - P (X≤9)
P (a1<X<a2) = P (X≤a2-1) - P (X≤a1)
E.g. P (10<X<20) = P (X≤19) - P (X≤10)
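When no table is at hand, cumulative binomial probabilities can be computed directly. The sketch below assumes the standard binomial formula and a whole-number random variable, so that P(X ≥ a) = 1 - P(X ≤ a - 1) and P(a1 ≤ X ≤ a2) = P(X ≤ a2) - P(X ≤ a1 - 1). All function names are illustrative.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(a: int, n: int, p: float) -> float:
    """P(X <= a); an empty sum (zero) when a is negative."""
    return sum(binomial_pmf(x, n, p) for x in range(0, min(a, n) + 1))


def at_least(a: int, n: int, p: float) -> float:
    """P(X >= a) = 1 - P(X <= a - 1)."""
    return 1 - binomial_cdf(a - 1, n, p)


def between_inclusive(a1: int, a2: int, n: int, p: float) -> float:
    """P(a1 <= X <= a2) = P(X <= a2) - P(X <= a1 - 1)."""
    return binomial_cdf(a2, n, p) - binomial_cdf(a1 - 1, n, p)


p_lawyer_wins_5_or_more = at_least(5, 10, 0.40)
p_deliver_4_or_more = at_least(4, 6, 0.80)
print(p_lawyer_wins_5_or_more, p_deliver_4_or_more)
```

For the lawyer example the unrounded sum is about 0.3669; adding individual terms that were already rounded to four decimals gives 0.3670 instead.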
Example:
1. According to a study conducted approximately 55% of all hospitals in a given town contained 100 or more
beds. A researcher draws a sample of 15 hospitals by randomly selecting names from a directory of
hospitals.
a) What is the probability of selecting 10 or more hospitals that have 100 or more beds?
b) What is the probability of selecting less than five hospitals that have 100 or more beds?
c) What is the probability of selecting from six to ten hospitals, inclusive, that have 100 or more beds?
2. A manufacturing company produces 10,000 plastic parts per week. This company supplies plastic parts to
another company, which packages the plastic parts as part of picnic sets. The second company randomly
samples 10 plastic parts sent from the supplier. If two or fewer of the sampled plastic parts are defective,
the second company accepts the lot. What is the probability that the lot will be accepted if the parts the
manufacturing company actually produces are 10% defective? 20% defective? 30% defective? 40%
defective?
The mean, or expected value, of a discrete random variable is:
µ = E(X) = ∑[X × P(X)]
Where: X = an outcome
µ = mean
P(X) = the probability of that outcome
And the standard deviation of a discrete random variable is calculated simply by taking the square root of
the variance: σ = √(∑[(X - µ)^2 × P(X)]).
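The mean and standard deviation formulas can be applied to any discrete distribution given as outcome-probability pairs. As an assumed illustration, the sketch uses the number of heads in three tosses of a fair coin, with probabilities taken from the eight equally likely outcomes; the function name `mean_and_std` is hypothetical.

```python
import math


def mean_and_std(distribution):
    """Return (mean, standard deviation) for a dict of {outcome: probability}."""
    mean = sum(x * p for x, p in distribution.items())
    variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
    return mean, math.sqrt(variance)


# Assumed illustration: number of heads in three tosses of a fair coin.
heads_in_three_tosses = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
mu, sigma = mean_and_std(heads_in_three_tosses)
print(mu, sigma)
```

Here the mean is 1.5 heads and the variance is 0.75, so the standard deviation is the square root of 0.75.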
Hypergeometric Distribution
The binomial distribution assumes that the probability of success (p) and failure (q = 1 - p) are the same
throughout the experiment. This is because
– events are independent
– sampling is done with replacement
– n < 0.05N
– population is infinite
However, in cases where sampling is without replacement and the sample size exceeds 5% of the
population size, it is necessary to use the hypergeometric distribution to determine correct probability.
The hypergeometric distribution has the following characteristics.
- It is a discrete distribution.
- Each outcome consists of either a success or a failure.
- Sampling is done without replacement.
- The population size is finite and known.
- It is described by three parameters: N, r and n. Because of the multitude of possible combinations of
these three parameters, creating tables for the hypergeometric distribution is practically
impossible.
- The number of successes in the population, r, is known.
- The sample size is ≥ 5% of the population.
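Under the above conditions, we can use the hypergeometric distribution for determining the correct
probability, with the following formula:
P(x) = [rCx × (N - r)C(n - x)] / NCn
where N is the population size, r is the number of successes in the population, n is the sample size and x is
the number of successes in the sample.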
= 120x56/42,504 = 0.1581
2. A shipment of 10 items has two defective and eight non-defective units. In the inspection of the shipment, a
sample of units will be selected and tested. If a defective unit is found, the shipment of 10 units will be
rejected.
a) If a sample of three items is selected, what is the probability that the shipment will be rejected?
b) If management would like a 0.90 probability of rejecting a shipment with two defective and eight
non-defective units, how large a sample would you recommend?
3. Suppose that there are 18 major insurance companies in Ethiopia and that 12 are located in Addis. If three
insurance companies are randomly selected from the entire list, what is the probability that one or more of
the selected companies are located in Addis?
Solution:
N = 18 n=3 r = 12 x≥1
P(x ≥ 1) = P(x = 1) + P(x = 2) + P(x = 3)
= 0.2206 + 0.4853 + 0.2696
= 0.9755
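The hypergeometric probabilities in these examples can be reproduced with a short sketch. It assumes the standard hypergeometric formula P(x) = [rCx × (N - r)C(n - x)] / NCn, where N is the population size, r the number of successes in the population, and n the sample size; the function name is illustrative.

```python
import math


def hypergeometric_pmf(x: int, N: int, r: int, n: int) -> float:
    """Probability of x successes in a sample of n drawn without replacement."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)


# Insurance companies: N = 18, r = 12 located in Addis, sample of n = 3.
p_exact = [hypergeometric_pmf(x, 18, 12, 3) for x in range(4)]
p_one_or_more = sum(p_exact[1:])

# The complement gives the same result with a single term.
p_one_or_more_via_complement = 1 - hypergeometric_pmf(0, 18, 12, 3)
print(p_exact, p_one_or_more, p_one_or_more_via_complement)
```

Using the complement is usually quicker for "one or more" questions, since only P(x = 0) has to be evaluated.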
4. A company produces and ships 16 personal computers knowing that 4 of them have defective wiring. The
company that has purchased the computers is going to test thoroughly 3 of the computers. The purchasing
company can detect the defective wiring when it is there. What is the probability that the purchasing
company will find:
a) No defective computer? 0.3929
b) Exactly three defective computers? 0.0071
c) Two or more defective computers? 0.1357
d) One or fewer defective computers? 0.8643
5. Of a group of 300 men, 240 are physically fit. If five men are randomly selected, what is the probability that
exactly three of them are physically fit?
Solution:
Using hypergeometric distribution
N = 300 n=5 r = 240 x=3
P(x=3) = 0.2057
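The 5% guideline stated at the start of this discussion can be illustrated numerically. The sketch below compares the hypergeometric probability with the binomial probability that would result if sampling were treated as being with replacement. The helper names are hypothetical, and the two cases are taken from the examples above.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """Hypergeometric probability of x successes in a sample of n."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Fitness example: the sample is 5 out of 300 (under 5% of the population).
small_exact = hypergeom_pmf(3, 300, 240, 5)
small_binom = binomial_pmf(3, 5, 240 / 300)

# Insurance example: the sample is 3 out of 18 (well over 5% of the population).
large_exact = 1 - hypergeom_pmf(0, 18, 12, 3)
large_binom = 1 - binomial_pmf(0, 3, 12 / 18)

print(f"small sample fraction: {small_exact:.4f} vs {small_binom:.4f}")
print(f"large sample fraction: {large_exact:.4f} vs {large_binom:.4f}")
```

When the sample is a small fraction of the population the two distributions nearly agree. When the fraction is large, the binomial value drifts away from the hypergeometric one.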
While a binomial random variable counts the number of successes that occur in a fixed number of trials, a
Poisson random variable counts the number of rare events (successes) that occur in a specified continuous
time interval or specified region. The Poisson distribution is based on the following assumptions:
1. The probability of an occurrence is the same throughout the time interval or space per unit.
2. The number of occurrences in one interval is independent of the number of occurrences in another
interval.
3. The probability of two or more occurrences in a subinterval is small enough to be ignored.
4. It must be possible to divide the time interval of interest into many subintervals.
5. The expected number of occurrences in an interval is proportional to the size of the interval.
Example:
1. Assume that a bank knows from past experience that between 10 and 11 a.m. of each day, the mean arrival
rate is 60 customers per hour. Suppose that the bank wants to determine the probability that exactly two
customers will arrive in a given one-minute interval between 10 and 11 a.m. The arrival rate is assumed
to be constant over the time interval. Calculate the probability.
Solution:
λ = 60 customers/hr t = 1 minute x = 2 customers
µ = λι = (60 customers/60 minutes) × 1 minute = 1 customer
P(x=2) = (µ^x · e^(−µ))/x! = (1^2 · e^(−1))/2! = 0.1839
The probability of getting 2 customers during the next one minute in a bank is 0.1839. Or there is 18.39%
chance that exactly 2 customers will arrive in one minute at a bank.
2. Suppose that bank customers arrive randomly on weekday afternoons at an average rate of 3.2 customers
every four minutes. What is the probability of getting 10 customers during an eight minute interval?
λ = 3.2 customers/4 minutes t = 8 minutes x = 10 customers
µ = λι = (3.2 customers/4 minutes) × 8 minutes = 6.4 customers
P(x=10) = (µ^x · e^(−µ))/x! = (6.4^10 · e^(−6.4))/10! = 0.0528
The probability of getting 10 customers during the next eight minutes in a bank is 0.0528. Or there is 5.28%
chance that exactly 10 customers will arrive in eight minutes at a bank.
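The two bank examples can be reproduced with a short Python sketch of the Poisson formula P(x) = µ^x · e^(−µ)/x!. The function name poisson_pmf is an assumption made for illustration. The key step in both examples is rescaling the rate λ to the length of the interval of interest before applying the formula.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

# Example 1: 60 customers per 60 minutes, interval of 1 minute -> mu = 1.
mu_1 = 60 / 60 * 1
p1 = poisson_pmf(2, mu_1)

# Example 2: 3.2 customers per 4 minutes, interval of 8 minutes -> mu = 6.4.
mu_2 = 3.2 / 4 * 8
p2 = poisson_pmf(10, mu_2)

print(round(p1, 4), round(p2, 4))
```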
3. If a real estate office sells 1.6 houses on an average weekday and sales of houses on weekdays are Poisson
distributed, what is the probability of selling:
a) Four houses in a day? 0.0551
b) No house in a day? 0.2019
c) More than five houses in a day? 0.0060
d) Ten or more houses in a day? ≈ 0.0000
e) Four houses in two days? 0.1781
4. A secretary types 75 words per minute and averages six errors per hour of typing. Assuming error
occurrences are a Poisson process, what is the probability that a 225-word letter will be typed without
error? 0.7408
5. A pen company averages 1.2 defective pens per carton produced (200 pens). The number of defects per
carton is Poisson distributed.
a) What is the probability of selecting a carton and finding no defective pen? 0.3012
b) What is the probability of finding eight or more defective pens in a carton? 0.0000
c) Suppose that a purchaser of these pens will quit buying from the company if a carton contains
more than three defectives. What is the probability that the purchaser will quit buying from this
company? 0.0338
6. A certain manufacturer sells a machine that has numerous moving parts. A quality control inspector counts
the number of moving parts that are misaligned as the number of nonconformities for a particular machine.
It is believed that the number of nonconformities per machine follows a Poisson distribution, with an
average of three nonconformities per machine.
a) Determine the probability that the quality control inspector finds no more than one
nonconformity on a particular machine selected at random. 0.1991
b) What is the probability that three or more nonconformities may be obtained by the
quality control inspector on three machines? 0.9938
7. The number of paint blisters produced by an automated painting process at Associated Industries is Poisson
distributed with a rate of 0.06 blisters per square foot. The process is about to be used to paint an item that
measures 9 by 15 feet.
a) What is the probability that the finished surface will have no blister in it? 0.0003
b) What is the probability that the finished surface will have between 5 and 8 blisters, inclusive? 0.4846
c) What is the probability that the finished surface will have more than 2 blisters? 0.9873
8. The defects in an automated weaving process at Sharp Industries are Poisson distributed at a mean rate of
0.00025 per square foot. The process is to be used to weave a piece of material that is 5 by 16 yards.
a) What is the probability that this piece will have no defects?
b) What is the probability that it will have one defect?
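Several of the exercises above require two steps: converting the stated rate into the mean µ for the interval or area in question, and then adding up Poisson probabilities over a range of values. The sketch below shows both steps for the typing exercise and the paint-blister exercise. The helper names are illustrative assumptions. Because the code does not round intermediate values, its results may differ from table-based answers in the last decimal place.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

def poisson_cdf(x, mu):
    """P(X <= x), obtained by summing the individual probabilities."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Typing exercise: 6 errors per 60 minutes; a 225-word letter at 75 words
# per minute takes 3 minutes, so mu = 0.3 errors.
mu_typing = 6 / 60 * (225 / 75)
p_no_errors = poisson_pmf(0, mu_typing)

# Blister exercise: 0.06 blisters per square foot on a 9 ft by 15 ft surface.
mu_blisters = 0.06 * 9 * 15
p_none = poisson_pmf(0, mu_blisters)
p_5_to_8 = poisson_cdf(8, mu_blisters) - poisson_cdf(4, mu_blisters)
p_more_than_2 = 1 - poisson_cdf(2, mu_blisters)

print(round(p_no_errors, 4))
print(round(p_none, 4), round(p_5_to_8, 4), round(p_more_than_2, 4))
```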
When n is large and p is small, binomial probabilities can be approximated with the Poisson distribution.
Simply set µ = np and use the Poisson tables. As a rule of thumb, the approximation is good whenever
p ≤ 0.05 and n ≥ 20; it is also reasonably accurate if n > 20 and np ≤ 5.
Binomial tables are often not available for large values of n, so in these cases the approximation can be
useful. Where p ≤ 0.05 and n ≥ 20, substitute the mean of the binomial distribution (µ = np) in
place of the mean of the Poisson distribution (µ = λι), so that the formula becomes
P(X) = ((np)^X · e^(−np))/X!
In general, the larger n is and the smaller p is, the better the approximation will be.
Why approximation?
- The Poisson formula is easier to use than the binomial formula.
- It can be tabulated more efficiently than binomial probabilities because the Poisson distribution
has only one parameter, µ (λι), whereas the binomial distribution has two parameters, n and p.
Example:
n = 500 p= 0.02 µ = np = 500*0.02 = 10
n = 1000 p= 0.01 µ = np = 1000*0.01 = 10
If we want to calculate P(X) for both cases, a single Poisson column (µ = 10) serves both. Had we used the
binomial distribution, the two cases would have required two separate columns.
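To see how well a single Poisson column stands in for the two binomial cases above, the sketch below prints the three probabilities side by side for a few values of X. The function names are illustrative assumptions. The comparison is only a spot check at selected values, not a full table.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

rows = []
for x in (5, 10, 15):
    rows.append((
        x,
        binomial_pmf(x, 500, 0.02),    # n = 500,  p = 0.02, np = 10
        binomial_pmf(x, 1000, 0.01),   # n = 1000, p = 0.01, np = 10
        poisson_pmf(x, 10),            # single Poisson column, mu = 10
    ))

for x, b500, b1000, pois in rows:
    print(f"X={x:2d}  binomial(500)={b500:.4f}  binomial(1000)={b1000:.4f}  Poisson={pois:.4f}")
```

The case with the larger n and smaller p should sit closer to the Poisson value, in line with the statement that the approximation improves as n grows and p shrinks.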
1. A company sells insurance policies to a random sample of 1000 men who are 35 years of age. The
probability that a 35-year-old man dies within a year is approximately 0.002. What is the probability that
the insurance company will have to pay claims on 2 or more policies next year?
Solution:
Steps: 1. Make sure P≤0.05 and n≥20
P = 0.002 n = 1000…………. Both requirements satisfied
2. Calculate µ = np = 1000*0.002 = 2
3. Calculate P(X)
P (X≥2) = 1 – [P(X=0) + P(X=1)]
= 1 − [(2^0 · e^(−2))/0! + (2^1 · e^(−2))/1!]
= 1 – (0.1353 + 0.2707)
= 0.5940
2. Suppose that the probability of a bank making a mistake processing a deposit is 0.0003. If 10,000 deposits
are audited, what is the probability that exactly six mistakes were made in processing deposits?
Solution:
Steps: 1. Make sure P≤0.05 and n≥20
P = 0.0003 n = 10,000…………. Both requirements satisfied
2. Calculate µ = np = 10,000*0.0003 = 3
3. Calculate P(X)
P (X=6) = (3^6 · e^(−3))/6!
= 0.0504
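As a check on the two worked examples, the following sketch computes each probability twice: once with the exact binomial formula and once with the Poisson approximation using µ = np. The function names are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

# Example 1: insurance claims, n = 1000, p = 0.002, P(X >= 2).
n, p = 1000, 0.002
exact_1 = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)
approx_1 = 1 - poisson_pmf(0, n * p) - poisson_pmf(1, n * p)

# Example 2: deposit mistakes, n = 10,000, p = 0.0003, P(X = 6).
n, p = 10_000, 0.0003
exact_2 = binomial_pmf(6, n, p)
approx_2 = poisson_pmf(6, n * p)

print(f"Example 1: exact {exact_1:.4f}, Poisson {approx_1:.4f}")
print(f"Example 2: exact {exact_2:.4f}, Poisson {approx_2:.4f}")
```

In both cases the approximation and the exact value agree to about three decimal places, which is what the rule of thumb (p ≤ 0.05 and n ≥ 20) leads us to expect.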
Continuous Probability Distribution
Up to this point, we have focused our attention on discrete distributions of random variables that have
either a finite number of possible values (e.g. 0, 1, 2, 3, …, n) or a countably infinite number of values
(e.g. 0, 1, 2, 3, …). We can list all of the possible values of a discrete random variable, and it is
meaningful to consider the probability that a particular individual value will be assumed. In contrast, a
continuous random variable has an uncountably infinite number of possible values and can assume any value in the
interval between two points a and b (a < x < b). As a result, the only meaningful probability to compute is
the probability that the variable will fall within a specified region; the probability that a continuous
random variable X will assume any particular value is zero.
A continuous probability distribution is any representation of the values of a continuous random variable
and the associated probabilities. Continuous probability distributions include the normal distribution and
the exponential distribution.
Each combination of µ and δ specifies a unique normal distribution, so there is an infinite family of
normal distributions. The problem of dealing with an infinite family of distributions can be solved
by transforming all normal distributions to the standard normal distribution, which has a mean equal to 0
and a standard deviation equal to 1. The standard normal variable is denoted by Z.
Any normal distribution can be converted to the standard normal distribution by standardizing each of its
observations in terms of Z- values. The Z- value measures the distance in standard deviations between the
mean of the normal curve and the X- value of interest. Any random variable can be transformed to a
standard random variable by subtracting the mean and dividing by the standard deviation.
If a random variable X has mean µ and standard deviation δ, the standardized variable Z is defined as:
Z = (X − µ)/δ
A Z-score is the number of standard deviations that a value, X, is away from the mean. If the value of X is
less than the mean, the Z-score is negative; if the value of X is greater than the mean, the Z-score is positive.
The Z-score is also known as the z-value. It is a standardized score whose distribution has a mean of zero
and a standard deviation of 1, and it is used to locate values on the standard normal distribution.
The probability calculations in normal distribution are made by computing areas under the graph. Thus, to
find the probability that a random variable lies within any specific interval we must compute the area
under the normal curve over that interval.
Probabilities for some commonly used intervals are:
a) 68.26% of the time, a normal random variable assumes a value within ±1δ of its mean.
b) 95.44% of the time, a normal random variable assumes a value within ±2δ of its mean.
c) 99.73% of the time, a normal random variable assumes a value within ±3δ of its mean.
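These three percentages are areas under the standard normal curve and can be recomputed directly. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8); the helper name prob_within is an illustrative assumption. Values computed this way may differ from table-based figures in the second decimal place, because printed tables round each half-area before it is doubled.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {100 * prob_within(k):.2f}%")
```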
Example:
1. The Graduate Management Admission Test (GMAT) is widely used by graduate schools of business as an
entrance requirement. In one particular year, the mean score for the GMAT was 485, with a standard
deviation of 105. Assuming that GMAT scores are normally distributed, what is the probability that a
randomly selected score from this administration of the GMAT:
a) Falls between 600 and the mean, inclusive?
b) Is greater than 650?
c) Is less than 300?
d) Falls between 350 and 550, inclusive?
e) Is less than 700?
f) Is exactly 500?
g) If 500 applicants take the test, how many would you expect to score 590 or below?
Solution:
Steps to find the probability value of a random variable which lies over an interval:
- Calculate the appropriate z values
- Find the areas (probabilities) in the table
- Interpret your results
a) P (485≤X≤600) =?
1. First convert the X values into Z-scores using Z = (X − µ)/δ
Z485 = (485 − 485)/105 = 0
Z600 = (600 − 485)/105 = +1.10
2. P(485≤X≤600) = P(0≤Z≤+1.10)
= P (0 to +1.10)
= 0.36433
b) P (X>650) =?
1. First convert the X value into a Z-score using Z = (X − µ)/δ
Z650 = (650 − 485)/105 = +1.57
2. P(X>650) = P(Z>+1.57)
= 0.5 − P (0 to +1.57)
= 0.5 − 0.44179
= 0.05821
c) P (X<300) =?
1. First convert the X value into a Z-score using Z = (X − µ)/δ
Z300 = (300 − 485)/105 = −1.76
2. P(X<300) = P(Z<−1.76)
= 0.5 − P (0 to −1.76)
= 0.5 − 0.46080
= 0.03920
d) P (350≤X≤550) =?
1. First convert the X values into Z-scores using Z = (X − µ)/δ
Z350 = (350 − 485)/105 = −1.29
Z550 = (550 − 485)/105 = +0.62
2. P(350≤X≤550) = P (−1.29≤Z≤+0.62)
= P (0 to −1.29) + P (0 to +0.62)
= 0.40147 + 0.23237
= 0.63384
e) P (X<700) =?
1. First convert the X value into a Z-score using Z = (X − µ)/δ
Z700 = (700 − 485)/105 = +2.05
2. P(X<700) = P(Z<+2.05)
= P (X<485) + P (485≤X<700)
= 0.5 + P (0 to +2.05)
= 0.5 + 0.47982
= 0.97982
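f) P (X = 500) = 0. Because GMAT scores are treated as a continuous random variable, the probability of
any single exact value is zero.
g) To find the expected number of applicants who score 590 or below, we first find P (X≤590) and
multiply it by the number of applicants.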
P (X≤590) =?
1. First convert the X value into a Z-score using Z = (X − µ)/δ
Z590 = (590 − 485)/105 = +1.00
2.P(X≤590) = P(Z≤+1.00)
= P (X<485) + P (485≤X<590)
= 0.5+ P (0 to +1.00)
= 0.5 + 0.34134
= 0.84134
If 500 applicants take the test, the number of applicants expected to score 590 or below is 500(0.84134) =
420.67, or about 421 applicants.
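The GMAT answers can be cross-checked without a table. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later); the variable names are illustrative assumptions. Because the code does not round Z to two decimals before looking up the area, its results differ slightly from the table-based answers above, typically in the third or fourth decimal place.

```python
from statistics import NormalDist

gmat = NormalDist(mu=485, sigma=105)  # mean and standard deviation from the example

p_a = gmat.cdf(600) - gmat.cdf(485)   # a) between the mean and 600
p_b = 1 - gmat.cdf(650)               # b) greater than 650
p_c = gmat.cdf(300)                   # c) less than 300
p_d = gmat.cdf(550) - gmat.cdf(350)   # d) between 350 and 550
p_e = gmat.cdf(700)                   # e) less than 700
expected_g = 500 * gmat.cdf(590)      # g) expected count out of 500 applicants

for label, value in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(f"{label}) {value:.4f}")
print(f"g) {expected_g:.2f} applicants")
```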
2. The result of an exam score for a given class is normally distributed. If the mean score is 85 points and the
standard deviation is equal to 20 points, find the cutoff passing grade such that 83.4% of those taking the
test will pass.
Solution:
µ = 85 probability of passing = 83.4%
δ = 20 cutoff point = ?
Since 83.4% is greater than 50%, the cutoff point must be less than the mean, and hence the Z-value is
negative. The area between the cutoff point and the mean is 0.834 − 0.5 = 0.334, which calls for the
inverse use of the standard normal table.
(Z/P=0.334) = −0.97
−0.97 = (X − 85)/20
−19.4 = X − 85
X = 65.6 points, the minimum score needed to pass the test.
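The inverse use of the table corresponds to the inverse cumulative distribution function. As a minimal sketch, assuming Python 3.8 or later, the cutoff can be found with NormalDist.inv_cdf from the standard statistics module. If 83.4% of scores must lie above the cutoff, then 16.6% lie below it.

```python
from statistics import NormalDist

scores = NormalDist(mu=85, sigma=20)  # exam scores from the example

share_passing = 0.834
# inv_cdf takes the area to the LEFT of the point, i.e. the share that fails.
cutoff = scores.inv_cdf(1 - share_passing)

print(round(cutoff, 1))
```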
3. Data accumulated by the National Climatic Data Center shows that the average wind speed in miles per hour
for Addis is 9.7mph. Suppose that wind speed measurements are normally distributed for a given
geographical location. If 22.45% of the time the wind speed measurements are more than 11.6mph, what is
the standard deviation of wind speed in Addis?
Solution:
µ = 9.7mph δ = ? X > 11.6
P(X > 11.6) = 22.45%
The area between the mean and 11.6 is 0.5 − 0.2245 = 0.2755.
(Z/P = 0.2755) = +0.76
+0.76 = (11.6 − 9.7)/δ
0.76δ = 1.9
δ = 2.5mph
4. A cylinder-making machine has δ = 0.5mm and µ = 25mm. Within what interval of values centered at the
mean will the diameters of 80% of the cylinders lie?
Solution:
µ = 25mm δ =0.5mm
From the statement it is clear that the interval is centered at the mean; i.e., half of the 80% (40%) lies
below the mean and the other half (40%) lies above the mean.
(Z/P=0.4) = ±1.28
X1 = µ − Zδ X2 = µ + Zδ
−1.28 = (X1 − 25)/0.5 +1.28 = (X2 − 25)/0.5
X1 = 25 − 1.28(0.5) = 24.36 X2 = 25 + 1.28(0.5) = 25.64
80% of the cylinder diameters lie between 24.36mm and 25.64mm.
5. The lives of light bulbs follow a normal distribution. If 90% of the bulbs have lives exceeding 2000 hrs and
3% have lives exceeding 6000 hrs, what are the mean and standard deviation of the lives of light bulbs?
Solution:
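P(X>2000) = 0.90 P(X>6000) = 0.03
µ = ? δ = ?
(Z/P=0.4) = −1.28 (Z/P=0.47) = +1.88
−1.28 = (2000 − µ)/δ +1.88 = (6000 − µ)/δ
−1.28δ = 2000 − µ +1.88δ = 6000 − µ
µ = 2000 + 1.28δ µ = 6000 − 1.88δ
Equating the two expressions for µ:
2000 + 1.28δ = 6000 − 1.88δ
3.16δ = 4000
δ = 1265.82 hrs
µ = 2000 + 1.28δ
= 2000 + 1.28(1265.82)
= 3620.25 hrs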
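The light-bulb problem amounts to two linear equations in µ and δ, one for each known percentile. The sketch below solves them in Python, assuming Python 3.8 or later for statistics.NormalDist. The variable names are illustrative. Because the code uses unrounded Z-values rather than −1.28 and +1.88, its answers differ from the hand calculation by roughly one hour.

```python
from statistics import NormalDist

z_of = NormalDist().inv_cdf   # Z-value that has a given area to its left

x1, x2 = 2000, 6000
z1 = z_of(1 - 0.90)   # 90% exceed 2000 hrs, so 10% lie below: about -1.28
z2 = z_of(1 - 0.03)   # 3% exceed 6000 hrs, so 97% lie below: about +1.88

# From z1 = (x1 - mean)/sd and z2 = (x2 - mean)/sd:
sd = (x2 - x1) / (z2 - z1)
mean = x1 - z1 * sd

print(round(z1, 2), round(z2, 2))
print(round(sd, 2), round(mean, 2))
```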
6. On a civil service exam, the grades are normally distributed with µ = 70 points and δ = 10 points. The police
department hires the applicants whose grades are among the top 10% of the population. What is the
minimum grade required to be hired?
Solution:
µ = 70points δ =10points
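The top 10% lies in the upper tail, so the area between the mean and the cutoff grade is 0.5 − 0.1 = 0.4.
(Z/P=0.4) = +1.28
+1.28 = (X − µ)/δ
+1.28 = (X − 70)/10
12.8 = X − 70
X = 82.8, the minimum grade required to be hired.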
7. A bakery shop sells loaves of freshly made bread. Any unsold loaves at the end of the day are either
discarded or sold elsewhere at a loss. The demand for this bread has followed a normal distribution with µ
= 35 loaves and δ = 8 loaves. How many loaves should the bakery make each day so that they can meet the
demand 90% of the time?
Solution:
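µ = 35 loaves δ = 8 loaves
Meeting demand 90% of the time leaves 10% in the upper tail, so the area between the mean and X is 0.4.
(Z/P=0.4) = +1.28
+1.28 = (X − µ)/δ
+1.28 = (X − 35)/8
10.24 = X − 35
X = 45.24 ≈ 46. By baking 46 loaves of bread each day, the bakery will meet the demand for this product
90% of the time.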
Normal Approximation to Binomial Probability
When a binomial problem involves an n-value larger than 20, the binomial tables may not be usable. If, in
addition, p is not small enough for the Poisson approximation to be appropriate, another method of solving
the problem must be found: the normal distribution.
The normal distribution is bell-shaped and symmetrical with mean µ and standard deviation δ. However,
the binomial distribution is symmetrical only if p = 0.5. Hence, if n is large and p is close to 0.5, the normal
distribution provides a good approximation to the binomial distribution. The approximation is quite
good when np and nq are both greater than 5.
In binomial distribution
- When p is small (e.g. 0.1), the distribution is skewed to the right. Mode<Median<Mean.
- As p increases (e.g. 0.3), the skewness is less noticeable.
- When p= 0.5 the distribution is symmetrical. Mode = Median = Mean.
- When p > 0.5, the distribution is skewed to the left. Mean<Median<Mode.
When we use a normal probability value as an approximation of a binomial probability, we are substituting
a continuous probability distribution for a discrete probability distribution. Such a substitution requires a
CONTINUITY CORRECTION FACTOR (addition and subtraction of 0.5 to the discrete value of x), i.e., a
correction of ± 0.5 depending on the problem is required.
Without the continuity correction, the normal distribution will generally underestimate binomial
probabilities especially if n is small.
To approximate a binomial distribution by a normal distribution, a test must be made to determine whether
the interval µ ± 3δ lies between 0 and n, which are the lower and upper limits, respectively, of a binomial
distribution. This is because the empirical rule states that approximately 99.73%, or almost all, of the
values under a normal curve lie within three standard deviations of the mean. If µ ± 3δ does not lie between 0
and n, do not use the normal distribution to work a binomial problem, because the approximation is not
good enough.
In short, to use a normal distribution as an approximation to binomial we have the following steps.
1. Check that n is large (n≥50) and p is close to 0.5 as well as np and nq > 5.
2. Calculate µ (np) and δ (√npq).
3. Check that µ ± 3δ lies between 0 and n.
4. Use the continuity correction factor and determine the appropriate interval.
5. Calculate the probability value, by calculating the area which is covered by the interval.
6. Interpret the results.
Example:
1. According to a recent study conducted by Addis Ababa University, 87% of all evening college
students also work. If this figure still holds and if 120 evening college students are randomly selected,
what is the probability that fewer than 100 also work? Use the normal distribution to approximate the binomial.
Solution:
1. n = 120 is large. p = 0.87 is not close to 0.5, but np = 0.87*120 = 104.40 and nq = 0.13*120 = 15.6
are both greater than 5.
2. µ = np = 120*0.87 = 104.40 and δ = √npq = √(120*0.87*0.13) = 3.684
3. µ ± 3δ = 104.40 ± 3(3.684) = 104.40 ± 11.052, i.e. 93.348 ≤ µ ± 3δ ≤ 115.452. Hence, the interval
(93.35 to 115.45) is between 0 and n (120).
4. P(X< 100) of binomial is changed in to P(X < 99.5) of normal by applying the continuity correction
factor.
5. P(X< 99.5) =?
Z99.5 = (99.5 − 104.40)/3.684 = −1.33
P(X< 99.5) = P (Z<-1.33)
= 0.5- P (0 to -1.33)
= 0.5-0.40824
= 0.09176
6. If 87% of all evening college students work, then about 9.18% of the time a random sample of 120
evening college students would contain fewer than 100 who work.
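The six steps can be gathered into one small routine and compared with the exact binomial probability. This is a minimal sketch under stated assumptions: the function names are illustrative, the checks simply mirror the np, nq > 5 and µ ± 3δ conditions given above, and NormalDist requires Python 3.8 or later.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) for a binomial X using the normal curve."""
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq should both exceed 5")
    mu = n * p
    sd = sqrt(n * p * q)
    if mu - 3 * sd < 0 or mu + 3 * sd > n:
        raise ValueError("mu ± 3 standard deviations should lie between 0 and n")
    # Continuity correction: the discrete value k becomes k + 0.5.
    return NormalDist(mu, sd).cdf(k + 0.5)

# Evening-student example: P(X < 100) = P(X <= 99) with n = 120, p = 0.87.
approx = normal_approx_cdf(99, 120, 0.87)
exact = binomial_cdf(99, 120, 0.87)

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Because p = 0.87 is far from 0.5, the binomial distribution here is noticeably skewed, so some gap between the two values is to be expected even though the np and nq conditions are met.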
2. In a travel study, the Ethiopian Tourism Commission reported that during the Ethio-Eritrean war, 29% of
the tourists who came to Ethiopia said that the crisis would affect their vacation plans.
a) At the end of the war if the figure is still 29%, in a random sample of 150 travelers, what is
the probability that 20 or fewer responded yes that the Ethio-Eritrean crisis would affect their
vacation plans? 0.0000
b) However, a study at the end of the war indicated that only 7% of the travelers felt at that
time that the Ethio-Eritrean crisis would affect their vacation plans. What is the probability a random
sample of 150 travelers would result in 20 or fewer travelers saying yes that the Ethio-Eritrean crisis
would affect their vacation plans? 0.99934
3. A true-false test containing 100 questions is given to a student who is totally ignorant of the subject matter.
What is the probability that the student gets exactly 65 correct? 0.00090
Exactly as for the normal approximation of binomial probabilities, a correction factor for continuity should
be used in conjunction with the normal approximation of Poisson probabilities.
Example:
1. Suppose we wish to determine the probability that 15 or more maintenance calls will be required on a
randomly selected day, given a Poisson random variable with λ = 10 calls per day, using the normal
distribution as an approximation.
Solution:
Poisson:
λ = 10 calls per day
ι = 1 day
P (X≥ 15) =?
P (X≥ 15) = 1 − P (X≤ 14)
= 1 − 0.9165
= 0.0835
Normal:
µ = λι = 10
δ = √(λι) = √10 = 3.16
P (X≥ 14.5) =?
Z14.5 = (14.5 − 10)/3.16 = +1.42
P (X≥ 14.5) = P (Z≥ +1.42)
= 0.5 − P (0 to +1.42)
= 0.5 − 0.42220
= 0.0778
As we can see from the above result, the difference is only 0.0057 (0.0835 − 0.0778). As µ increases, the
difference decreases.
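The comparison above can be repeated in a few lines of Python. This is a sketch only: the helper name poisson_cdf is an assumption, and NormalDist requires Python 3.8 or later. The normal figure computed here is slightly smaller than 0.0778 because the code does not round Z to 1.42 before finding the area.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_cdf(x, mu):
    """P(X <= x) for a Poisson random variable with mean mu."""
    return sum(mu ** k * exp(-mu) / factorial(k) for k in range(x + 1))

mu = 10  # maintenance calls per day

exact = 1 - poisson_cdf(14, mu)                    # P(X >= 15), Poisson
approx = 1 - NormalDist(mu, sqrt(mu)).cdf(14.5)    # continuity-corrected normal

print(f"Poisson: {exact:.4f}")
print(f"Normal:  {approx:.4f}")
```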