Statistics Module I
Prepared By:
Bedru Babulo
Yesuf Mohammednur
Department of Economics
Faculty of Business and Economics
Mekelle University
2005
Mekelle
1.0 INTRODUCTION
Welcome to the world of statistics!
Statistics is one of the most important and useful subjects taught in
business and economics schools. The overall objective of this chapter
is to acquaint you with the introductory concepts of statistics and to
introduce you to its different aspects.
Learning Objectives
When you have completed this chapter, you will be able to:
All the aforementioned statements are some of the statistics (numerical
facts) we encounter. Now let’s turn our discussion towards defining
statistics and the different aspects (branches) of statistics.
The word statistik comes from the Italian word statista (meaning
“statesman”). The term was first used by Gottfried Achenwall (1719 –1772), a
professor at Marburg and Göttingen. Dr E.A.W. Zimmerman introduced
the word statistics into England, and Sir John Sinclair popularized its use
in his work Statistical Account of Scotland (1791 –1799). Long before the
18th century, however, people had been recording and using data.
However, not all numerical data qualify as statistics. Numerical data
must possess the following characteristics in order to be called
statistics.
i. Statistics are numerically expressed. However, not every numerical
figure is a statistic; it should be well defined.
ii. Statistics are aggregates of facts. They should be general and not
concern specific individuals.
iii. Statistics are affected to a marked extent by multiplicity of causes.
iv. Statistics are enumerated or estimated according to reasonable
standards of accuracy.
v. Statistics are collected in a systematic manner.
vi. Statistics are collected for predetermined purpose.
vii. Statistics should be placed in relation to each other in time and
space. E.g. the economy grows by 1.9% (Where and when?)
The Singular Sense: Modern statistics derives its scientific nature from its
singular sense. Statistics is a branch of mathematics or applied research
concerned with the development and application of methods and
techniques for collecting, organizing, analyzing and interpreting quantitative
data in order to reach sound decisions. Alternatively, statistics as a field of
study has been defined as the art and science of collecting, organizing,
analyzing and interpreting data. According to this definition there are four
steps in a numerical investigation.
iii. Data Analysis – It is the process of extracting relevant information
from summarized or organized data without drawing conclusions
about the facts.
iv. Data Interpretation – It is the process of generalizing and
drawing valid conclusions from data analysis.
Nowadays, there are various statistical software packages that make data
organization and analysis much easier (e.g. SPSS, SAS, STATA, etc.).
For example, at regular intervals the Central Statistics Authority
(CSA) or the Ethiopian Economic Association (EEA) gathers basic data
concerning the number, age distribution, and occupational and educational
composition of the Ethiopian people. Since the amount of raw data
gathered by CSA is immense, it is necessary to condense and interpret
this information to make it useful. So the data will be summarized and
may be presented using tables, graphs or charts.
Table 1.1: Total Population of Ethiopia by Sex, Region, Urban & Rural: July 1, 2001 (in thousands)
                                                  Urban                Rural                Total
Region                                       Male Female  Total   Male Female  Total   Male Female  Total
Tigray                                        321    330    651   1547   1599   3146   1868   1929   3797
Affar                                          58     45    103    638    502   1140    696    547   1243
Amara                                         884    875   1759   7496   7493  14989   8380   8368  16748
Oromiya                                      1391   1391   2782  10101  10140  20241  11492  11531  23023
Somali (1)                                    317    269    586   1735   1476   3211   2052   1745   3797
Benishangul - Gumuz                            25     25     50    253    248    501    278    273    551
Southern Nations/Nationalities & Peoples      501    507   1008   5911   5984  11895   6412   6491  12903
[Figure: Total Population of Ethiopia. Bar chart of population size (vertical axis, 0 to 70000) by region (horizontal axis), with the series Urban Male, Urban Female, Urban Total, Rural Male, Rural Female, Rural Total, Total Male, Total Female and Total Total.]
income of a household (or individual) in Ethiopia. In this case, EEA would
need to collect data on the income level of all households in Ethiopia. This
is, however, too costly and time consuming. Hence, EEA may collect data
from a representative sample of households and, based on this, estimate
(or make inferences about) the annual income of all households.
For instance, an economist may be asked to forecast the inflation rate for
some future period of time. In such a situation, the economist uses
statistical information on indicators like the producer price index (PPI), the
unemployment rate, and manufacturing capacity utilization. Often
these statistical indicators are entered into computerized forecasting
models that predict inflation rates.
Economic problems almost always involve quantities that can be
expressed numerically, for example wages, prices and outputs (of
manufacturing, mining, agriculture), etc. These numerical magnitudes are
the outcomes of a multiplicity of causes and are subject to variation
from time to time, between places or among particular cases.
Accordingly, the study of economic problems is especially suited to
statistical treatment. A statistical approach to an economic problem not only
leads to its correct description but also indicates lines along which it is to
be directed. Generally, statistics is indispensable for economic policy
formulation, planning and forecasting.
Apart from this, the development of economic theory has also been facilitated
by the use of statistics. The complexity of modern economic organizations
has rendered deductive reasoning inadequate and difficult. Statistics is
now being used increasingly not only to develop new economic concepts
but also to test the old ones.
1.4 Some Basic Concepts (Terminologies) in Statistics
In this section some of the key concepts that underlie the theory of
statistics are discussed:
Data- are the facts and figures that are collected, analyzed, and
summarized for presentation and interpretation.
Data set – all the data collected in a particular study.
Discrete Data – refers to data obtained by counting. It always takes
whole-number values.
Continuous Data – refers to data gathered by measuring and can
include decimal numbers.
Qualitative Data – is data that provide labels or names for a
characteristic of an element. Qualitative data may be numeric or non-
numeric.
Quantitative Data- is data that indicate how much or how many of
something. Quantitative data are always numeric.
Elements – are the entities on which data are collected or an individual
member in the data (or population).
Variable – refers to a characteristic of interest for the elements.
Qualitative variable – is a variable with qualitative data.
Quantitative Variable – is a variable with quantitative data.
Cross-sectional Data – is data collected at the same or approximately
the same point in time.
Time Series Data – refers to data collected at several successive periods
of time.
Observation (cases) – is the set of measurements obtained for a single
element.
Population – is the totality of elements of interest in a particular study.
Sample – is a subset of the population.
• Sampling- is the process of selecting a small number of items
or parts of a larger population to make conclusions about the
population.
• Census/complete Enumeration– is an investigation of all the
individual elements making up the population.
Population Elements – refers to an individual member of the population.
Target population – is the specific complete group relevant to the study
or research project.
Population Parameters – are measured characteristics of the population.
E.g. population mean (µ), population standard deviation (σ), etc. They are
represented (symbolized) by Greek letters.
Probability Distribution - is a description of how the probabilities are
distributed over the values the random variable can assume.
With this we end our discussion of the first chapter, an introduction to
statistics. In the coming chapter we will see the introductory concepts of
probability theory.
Review Exercises
Learning Objectives
When you have completed this chapter, you will be able to:
Define probability
Describe the classical, the empirical, and the subjective approaches to
probability.
Understand the terms employed in the concept of probability
Calculate probabilities applying the rules of addition and the rules of
multiplication under conditions of statistical dependence and independence
Calculate a probability using Bayes’ theorem
2.1 Introduction
In such situations we use the concept of probability in our daily life
without detailed knowledge of the concept; in other words, we use it
intuitively.
As project analyst:
The subject matter most useful in effectively dealing with such
uncertainties is enclosed under the heading probability.
Probability can be thought of as a numerical measure of the chance or
likelihood that a particular event will occur. Before we treat the
definition of probability in detail, let’s become familiar with some of the
basic concepts (terms) in probability.
E.g. rolling a die: S = {1,2,3,4,5,6}
Event A = {1,3,5}
Complement event of A, A’ = {2,4,6}.
Impossible Event – is a subset of sample space that contains none of the
points.
E.g. Rolling a die: S = {1,2,3,4,5,6}
E = {7} or E = {0}
Independent Events -Two events are said to be independent when the
happening of one event doesn’t affect the happening of the other.
E.g. rolling a die
Dependent Events- Two events are said to be dependent when the
occurrence or non-occurrence of one event affects the occurrence of
the other. E.g.
Mutually Exclusive Events- Events are said to be mutually exclusive if one
and only one of them can take place at a time.
Collectively Exhaustive Events/Lists- When a set of events for an
experiment includes every possible outcome the set is said to be
collectively exhaustive event/list.
E.g. flipping a fair coin twice: S = {HH, HT, TH, TT}
Having looked at the basic concepts, we now give formal definitions of
probability.
In fact, experts disagree about the concept of probability, since there are
various conceptual approaches in defining probability. The most common
are discussed below:
1. Classical Approach 2. Relative Frequency Approach
3. Subjective Approach 4. Axiomatic Approach
ii. Relative Frequency Approach
Exercise: Suppose that 400 of 50,000 fire-insured houses have had a fire. A
fire insurance company would like to know the probability of fire for
fire-insured houses. Calculate this probability.
Solution: $P(f) = \frac{400}{50{,}000} = 0.008$
Sometimes this approach is referred to as objective probability, since
experiments must be conducted or recorded data must be available in order
to compute the probability.
This approach is used when outcomes are not mutually exclusive and
there is no objective data.
Generally, although there are three approaches to probability, we can use
any of them, depending on the problem under consideration.
Diagrammatically:
[Diagrams: Venn diagrams of the events A and B; one panel is labelled "Independent events".]
2.4 Probabilities Under conditions of Statistical Independence
When two events happen, the outcome of the first event may or may not
have an effect on the outcome of the second event. That is, the events
may be either dependent or independent. In this section, we examine
events that are statistically independent.
Definition: - Statistical independence is the case when the occurrence
of an event has no effect on the probability of the occurrence of any other
event.
Example: In a fair coin toss, P (H) = 0.5; that is, the probability of heads
equals 0.5, and the probability of tails equals 0.5. This is true for every toss,
no matter how many tosses have been made or what their outcomes have
been. Every toss stands alone and is in no way connected with any other
toss. Thus, the outcome of each toss of a fair coin is an event that is
statistically independent of the outcomes of every other toss of the coin.
2) Joint Probabilities Under Statistical Independence
The probability of two or more independent events occurring together or
in succession is the product of their marginal probabilities. Mathematically,
this is stated as:
P (A and B) = P (A ∩ B) = P (A). P (B)
Where: P (A ∩ B) = probability of events A and B occurring together or in
succession; this is known as the joint probability.
P (A) – Marginal probability of event A occurring
P (B) - Marginal probability of event B occurring
Solution: "At least one tail" means one, two or three tails. There is only
one case in which no tails occur, namely H1H2H3. Therefore, we can simply
subtract its probability from 1 to get the answer.
P (at least one tail in 3 tosses) = 1 – P (all heads)
= 1 – P (H1H2H3)
= 1 – 0.125 = 0.875
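The complement rule used here can be checked by listing the outcomes. The following is a minimal sketch, assuming a fair coin and three independent tosses; the variable names are illustrative only.

```python
from itertools import product

# All 2**3 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

# Direct count: share of outcomes that contain at least one tail.
p_direct = sum("T" in outcome for outcome in outcomes) / len(outcomes)

# Complement rule: 1 - P(all heads), where P(all heads) = 0.5 ** 3.
p_complement = 1 - 0.5 ** 3

print(p_direct, p_complement)
```

Both routes should give the same value, which is why subtracting the single "all heads" case is the quicker method.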
Examples: What is the probability that the second toss of a fair coin will
result in heads, given that heads resulted on the first toss?
Solution: In this case the two events are independent.
Symbolically: the question is written as: P (H2/H1)
*Using conditional probability under statistically independent
situation, P (H2/H1) = P (H2)
P (H2/H1) = 0.5
Check Yourself
1. What is the probability that a couple’s second child will be
a) A boy, given that their first child was a girl?
b) A girl, given that their first child was a girl?
Solution:
a) P (b/g) = P (b) = 0.5, since the events are statistically
independent
b) P (g/g) = P (g) = 0.5, since the events are statistically
independent
2. The four floodgates of a small hydroelectric dam fail and are
repaired independently of each other. From experience, it is known
that each floodgate is out of order 4 percent of the time.
a) If floodgate 1 is out of order what is the probability that
floodgates 2 and 3 are out of order?
b) During a tour of the dam, you are told that the chances of
all four floodgates being out of order are less than 1 in
5,000,000. Is this statement true?
Solution:
a) P (2 and 3) = P (2 ∩ 3) = P (2). P (3)
= 0.04 x 0.04 = 0.0016
b) P (1 ∩ 2 ∩ 3 ∩ 4) = P (1). P (2). P (3). P (4)
= 0.04 x 0.04 x 0.04 x 0.04
= 0.00000256
A chance of 1 in 5,000,000 is 0.0000002. Since 0.00000256 is greater than
0.0000002, the statement is false.
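The floodgate arithmetic can be reproduced in a few lines. This is only a sketch under the exercise's own assumption that the gates fail independently; the variable names are hypothetical.

```python
p_out = 0.04  # probability that any one floodgate is out of order (from the exercise)

# (a) Under independence, the state of gate 1 does not affect gates 2 and 3.
p_2_and_3 = p_out * p_out

# (b) All four gates out of order at the same time.
p_all_four = p_out ** 4
claimed_bound = 1 / 5_000_000

print(p_2_and_3, p_all_four, claimed_bound)
print("claim holds:", p_all_four < claimed_bound)
```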
Like the statistically independent events, there are three probability types
for statistically dependent events.
Symbolically:
$$P(A/B) = \frac{P(A \cap B)}{P(B)} \qquad\qquad P(B/A) = \frac{P(B \cap A)}{P(A)}$$
Example:
Example:
Check yourself
Solution:
Given: P (A) = 0.39, P (B) = 0.21, P (A or B) = P (A ∪ B) = 0.47
i. P (A or B)’ = 1 – P (A or B) = 1 – 0.47 = 0.53
ii. P (A ∩ B) = P (A). P (B/A) = P (B). P (A/B) = [P (A) + P (B)] – P (A ∪ B)
= [0.39 + 0.21] – 0.47 = 0.60 – 0.47 = 0.13
iii. $P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{0.13}{0.39} = 0.333$
iv. $P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{0.13}{0.21} = 0.62$
Example: An application Of Bayes’ Theorem
The quality of the purchased parts varies with the source of supply.
Historical data suggest that the quality ratings of the two suppliers are as
shown in the table below.
If we let G denote the event that a part is good, and B denote the event
that a part is bad, the information in table 2.1 provides the following
conditional probability values.
Based on the above information we can compute the joint probabilities of
a part being good and supplied by A1, good and supplied by A2, bad and
supplied by A1, and bad and supplied by A2.
Suppose now that the parts from the two suppliers are used in the firm’s
manufacturing process and that a machine breaks down because it
attempts to process a bad part. Given the information that the part is
bad, what is the probability that it came from supplier 1 and what is the
probability that it came from supplier 2?
With the prior probabilities and the joint probabilities, Bayes’ theorem
can be used to answer these questions.
- Letting B denote the event that the part is bad, we are looking
for the posterior probabilities P (A1/B) and P (A2/B). From the
law of conditional probability and marginal probability, we know
that:
$$P(A_1/B) = \frac{P(A_1 \cap B)}{P(B)}$$
Substituting the above equations, we obtain Bayes’ theorem for the case
of two events.
$$P(A_1/B) = \frac{P(A_1 \cap B)}{P(A_1 \cap B) + P(A_2 \cap B)} = \frac{P(A_1)\,P(B/A_1)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)}$$
$$P(A_2/B) = \frac{P(A_2 \cap B)}{P(A_1 \cap B) + P(A_2 \cap B)} = \frac{P(A_2)\,P(B/A_2)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2)}$$
Note that in this application we started with a probability of .65 that a part
selected at random was from supplier 1. However, given information that
the part is bad, the probability that the part is from supplier 1 drops to
.4262. In fact, if the part is bad, there is a better than 50-50 chance that
the part came from supplier 2; that is, P (A2/B) = .5738.
Bayes’ theorem is applicable when the events for which we want to
compute posterior probabilities are mutually exclusive and their union is
the entire sample space. Bayes’ theorem can be extended to the case
where there are n mutually exclusive events A1, A2,…, An whose union is
the entire sample space. In such case, Bayes’ theorem for computing
posterior probability P (Ai/B) can be written symbolically as:
$$P(A_i/B) = \frac{P(A_i)\,P(B/A_i)}{P(A_1)\,P(B/A_1) + P(A_2)\,P(B/A_2) + \dots + P(A_n)\,P(B/A_n)}$$
Check yourself
Once in the night, a speeding taxi struck a man as he crossed the street.
An eyewitness has testified that she thought the taxi (which did not stop)
was blue. The man sued the Blue cab company for his medical expenses.
The city where the accident occurred has only two taxi companies: Blue
cab and Green cab. Green cab has 85 percent of the taxis in the city. At
the trial, the man’s lawyer shows that the eyewitness is 80 percent reliable
in identifying the color of taxis. That is, she was able to identify correctly
the color of taxis 80 percent of the time, under conditions like those of the
night accident. The lawyer concludes that it is extremely likely that a Blue
Cab hit the man. Do you agree? Why or why not?
Solution:
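Given: B = the taxi was Blue, G = the taxi was Green,
E = the eyewitness thought that the taxi was blue.
P (E/B) = 0.8, P (E/G) = 0.2, P (B) = 0.15, P (G) = 0.85
Required: P (B/E) = ?
$$P(B/E) = \frac{P(E/B)\,P(B)}{P(E/B)\,P(B) + P(E/G)\,P(G)} = \frac{0.8 \times 0.15}{(0.8 \times 0.15) + (0.2 \times 0.85)} = \frac{0.12}{0.29} = 0.41$$
So the lawyer's conclusion does not follow: given the testimony, the
probability that the taxi was blue is only about 0.41.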
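Bayes' theorem for n mutually exclusive, collectively exhaustive events is easy to put into a small function. The sketch below is one possible way to write it; the function name `posterior` and its argument names are hypothetical, and it is applied to the taxi figures from the exercise.

```python
def posterior(priors, likelihoods):
    """Return P(A_i / B) for each i, given the priors P(A_i) and the
    conditional probabilities P(B / A_i), listed in the same order."""
    joints = [prior * like for prior, like in zip(priors, likelihoods)]
    total = sum(joints)  # P(B): the sum of the joint probabilities
    return [joint / total for joint in joints]

# Taxi exercise: priors for Blue and Green, and the chance that the
# witness says "blue" for each colour.
p_blue, p_green = posterior([0.15, 0.85], [0.8, 0.2])
print(round(p_blue, 2), round(p_green, 2))
```

Because the posteriors share one denominator, they always sum to 1, which is a quick check on hand calculations.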
Review Exercises
3.0 THEORETICAL PROBABILITY DISTRIBUTION
Learning Objectives
When you have completed this chapter, you will be able to:
Define the terms probability distribution and random variable
Distinguish between a discrete and continuous probability distribution
Calculate the mean, variance, and standard deviation of a discrete
probability distribution
Describe the characteristics and compute probabilities using the binomial
probability distribution
Describe the characteristics and compute probabilities using the Poisson
probability distribution
Describe the characteristics and compute probabilities using the
hypergeometric probability distribution
Describe the characteristics and compute probabilities using the uniform,
normal and exponential probability distributions
Describe how to approximate one probability distribution by another and
the conditions necessary for doing so.
sequence of whole numbers such as 0,1,2…is referred to as a
discrete random variable.
Examples:
Experiment                                    Random Variable                         Possible values for the random variable
Take a 20 multiple choice question examination  Number of questions answered correctly  0, 1, 2, 3, …, 20
Operate a restaurant for one day              Number of customers                     0, 1, 2, 3, …
Sell an automobile                            Gender of customer                      0 if female, 1 if male
Examples:
Experiment                                    Random Variable (X)                             Possible values for the random variable
Operate a bank                                Time between customer arrivals in minutes       X ≥ 0
Work on a project to construct new library    Percentage of project complete after six months  0% ≤ X ≤ 100%
X1 P (X1)
X2 P (X2)
. .
. .
Xn P (Xn)
Examples:
Suppose that a fair die is thrown once. The outcomes are numbers of dots.
Let X be a random variable that represents the number of dots shown by the
die and P (X) its probability:
Graphically:
[Graph: the probabilities P (Xi) plotted against Xi = 1, 2, …, 6; every value has the same height, 1/6.]
However, the values of a random variable need not be equally likely. When
the values of the random variable have different probabilities of being
observed, the distribution is referred to as a non-uniform discrete
probability distribution. A good example is tossing a coin two or more
times. If a coin is tossed three times, what is the probability of observing
0, 1, 2 or 3 heads among the 8 possible outcomes?
Sample space (S) = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Random variable Probability
Number of Heads (Xi) P (Xi)
0…………………….. 1/8
1…………………….. 3/8
2…………………….. 3/8
3…………………….. 1/8
$$P(0 \text{ heads}) = \frac{\text{number of outcomes where the event occurs}}{\text{total possible outcomes}} = \frac{1}{8}$$
$$P(1 \text{ head}) = \frac{\text{number of outcomes where the event occurs}}{\text{total possible outcomes}} = \frac{3}{8}$$
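The whole distribution of the number of heads can be built by counting outcomes in the sample space. This is a minimal sketch assuming a fair coin; the name `distribution` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for three tosses: HHH, HHT, ..., TTT (8 outcomes).
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical probability: favourable outcomes / total outcomes.
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, prob in distribution.items():
    print(x, prob)
```

Using `Fraction` keeps the probabilities as exact eighths rather than decimals.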
Random Probability
Variable P (Xk)
2 3/8
3 4/8
Cum. Prob. = $\sum P(X_k) = 7/8$
Example:
Consider a continuous random variable X that can assume values between
2 and 6 with equal probability. The probability density function f (X) = ¼.
What is the probability that X will be smaller than or equals to 5?
Solution
$$P(2 \le x \le 5) = \int_2^5 \frac{1}{4}\,dx = \left[\frac{1}{4}x\right]_2^5 = \frac{1}{4}(5) - \frac{1}{4}(2) = \frac{3}{4}$$
Solution: $P(0 \le X \le 2) = \int_0^2 K X^2\,dx = 1$
$$K \int_0^2 X^2\,dx = K\left[\tfrac{1}{3}X^3\right]_0^2 = K\left[\tfrac{1}{3}(2^3) - \tfrac{1}{3}(0^3)\right] = \tfrac{8}{3}K = 1 \quad\Rightarrow\quad K = \tfrac{3}{8}$$
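The constant K can also be checked numerically: K must be the reciprocal of the area under X² between 0 and 2. The sketch below uses a simple midpoint rule; the helper `midpoint_integral` is a hypothetical name written for this illustration, not a library function.

```python
def midpoint_integral(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Area under x**2 on [0, 2]; K scales this area to 1.
area_without_k = midpoint_integral(lambda x: x ** 2, 0, 2)
k = 1 / area_without_k

print(round(k, 4))
```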
Expected Value of a Random Variable: The expected value of a discrete
random variable x, denoted by E(x), is the weighted mean of the possible
values that the random variable can assume, where the weight attached to
each value is the probability that the random variable will assume this
value. In other words,
$$E(x) = \sum_{i=1}^{m} x_i\,P(x_i) \quad \text{for a discrete r.v.}$$
$$E(x) = \int_{-\infty}^{\infty} x\,f(x)\,dx \quad \text{for a continuous r.v.}$$
Examples:
1. A real-estate agent sells 0,1, or 2 houses each working week
with respective probabilities 0.5, 0.3, and 0.2. Compute the
expected value of the number of houses sold per week?
Solution: $$E(X) = \sum_{i=0}^{2} P_i \cdot X_i = (0.5 \times 0) + (0.3 \times 1) + (0.2 \times 2) = 0.7$$
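The expected value is a probability-weighted sum, which can be written directly. A minimal sketch using the real-estate figures; the variable names are illustrative.

```python
# Houses sold per week and their probabilities (from the example).
houses_sold = [0, 1, 2]
probabilities = [0.5, 0.3, 0.2]

# E(X) = sum of value * probability over all possible values.
expected_sales = sum(x * p for x, p in zip(houses_sold, probabilities))

print(round(expected_sales, 2))
```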
The variance measures how the individual values are spread, dispersed
or distributed around the mean or expected value.
$$\text{Standard Deviation} = \sigma_x = \sqrt{\text{var}(x)} = \sqrt{E(x_i - \mu_x)^2}$$
$$\sigma_x^2 = E\left([x_i - E(x)]^2\right) = \sum_{i=1}^{m} (x_i - E(x))^2\,P(x_i)$$
$$\text{Or} \quad \sigma_x^2 = E(x^2) - [E(x)]^2$$
Properties of Variance
1. The variance of a constant is zero
2. If X and Y are two independent random variables, then Var. (X + Y)
= Var. (X) + Var. (Y)
3. If b is constant, then Var. (x + b) = Var. (x)
4. If a is constant, then Var. (ax) = a². Var. (x)
5. If X and Y are independent random variables and a and b are constants,
then Var. (ax + by) = a² Var. (x) + b² Var. (y)
Example:
1. Mr. Tujar buys a stock whose return (including both dividends and
change in price of stock) depends on whether the nation’s GNP is
rising, constant, or falling. If the GNP is rising, the return is 20
percent (i.e., 20 cents per Birr); if it is constant, the return is 5
percent; and if it is falling, the return is -10 percent. If he believes
that it is equally likely that the GNP will rise, remain constant, or fall.
What is the expected value of the return from this stock? And what
are the variance and standard deviation of this stock’s return.
Solution: If it is equally likely that the GNP will rise, remain constant or fall,
the probability of each of these outcomes must equal 1/3. Thus, the
expected value is:
$$E(X) = \sum_{i=0}^{2} P_i \cdot X_i = 20(1/3) + 5(1/3) + (-10)(1/3) = 5 \text{ percent}$$
The variance is calculated as follows:
$$\sigma_x^2 = \sum_{i=1}^{3} (x_i - E(x))^2\,P(x_i)$$
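The variance formula can be evaluated for the stock example as follows. This is a sketch only, using the three equally likely returns stated in the problem; the variable names are illustrative, and the shortcut line uses the identity Var(x) = E(x²) − [E(x)]².

```python
from math import sqrt

# Returns in percent for rising, constant and falling GNP, each with probability 1/3.
returns = [20, 5, -10]
probabilities = [1 / 3, 1 / 3, 1 / 3]

mean = sum(p * x for x, p in zip(returns, probabilities))

# Definition: probability-weighted squared deviations from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in zip(returns, probabilities))

# Shortcut: E(x^2) - [E(x)]^2 should give the same value.
shortcut = sum(p * x ** 2 for x, p in zip(returns, probabilities)) - mean ** 2

std_dev = sqrt(variance)
print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

The variance is in squared percentage points, while the standard deviation is back in percent, the same unit as the return.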
Example: You are given the expected value and standard deviation of the
profits to be made from a particular business venture. The expected value
is 400,000 Birr and the standard deviation is 100,000 Birr. What is the
probability that the profits from this venture will be below zero or above
800,000 Birr? And what is the probability that the profit will be between
200,000 Birr and 600,000 Birr?
Solution:
If you knew the probability distribution of profits, you could work out
these probabilities exactly. Since it is not given, use Chebyshev’s
inequality to determine the maximum or minimum value the probability can
possibly take:
o The probability that the profits are below 0 or above 800,000 Birr is the
same as the probability that the profits assume a value more
than 4 standard deviations from their expected value. Thus,
this probability can be at most 1/K², which is equal
to 1/4² = 1/16.
o The probability that the profit will be between 200,000 and 600,000
Birr is the same as the probability that the profit will assume a value
within 2 standard deviations of its expected value.
Thus, the probability is at least 1 – 1/2² = 1 – ¼ = ¾.
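Both bounds come from the same expression, 1/K², where K is the distance from the expected value measured in standard deviations. A minimal sketch follows; the function name `chebyshev_tail_bound` is hypothetical.

```python
def chebyshev_tail_bound(k):
    """Upper bound on P(|X - mean| >= k standard deviations), for k > 0."""
    return 1 / k ** 2

mean, sd = 400_000, 100_000

k_outer = (800_000 - mean) / sd  # 800,000 Birr lies 4 standard deviations above the mean
k_inner = (600_000 - mean) / sd  # 600,000 Birr lies 2 standard deviations above the mean

max_prob_outside = chebyshev_tail_bound(k_outer)     # at most 1/16
min_prob_inside = 1 - chebyshev_tail_bound(k_inner)  # at least 3/4

print(max_prob_outside, min_prob_inside)
```

These are bounds, not exact probabilities; the exact values depend on the distribution, which is not given.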
Covariance (Cov.)
Let X and Y be two random variables with means E (X) and E (Y). The
covariance between the two variables is defined as:
$$\text{Cov.}(X, Y) = E[(x - \mu_x)(y - \mu_y)] = E(XY) - \mu_x \mu_y$$
For N paired observations it is computed as:
$$\text{Cov.}(X, Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N}$$
Unlike the variance, the covariance can assume negative values.
2. In each trial there are only two possible outcomes. We refer to
one outcome as ‘success’ and the other as ‘failure’.
3. The probability of a success on one trial is denoted by P and does
not change from one trial to another. The probability of a
failure, denoted by q and equal to 1-P, likewise does not change
from trial to trial. (Stationarity assumption)
4. Statistically, the trials are independent.
number of values is finite, X is a discrete random variable. The probability
distribution associated with this random variable is binomial probability
distribution.
$$P(r) = \binom{n}{r} p^r (1-p)^{n-r} = \binom{n}{r} p^r q^{n-r} = {}_nC_r\, p^r q^{n-r}$$
$$\text{Or} \quad P(r) = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$$
Examples:
Suppose a company produces toothpaste. Historically, eight-tenths of the
toothpaste tubes were correctly filled (successes). What is the probability
of getting exactly three of six tubes (half a carton) correctly filled?
Solution:
Given: p = 0.8, q = 0.2, r = 3, n = 6. Then, using the binomial formula:
Probability of r successes in n trials $= \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$
Probability of 3 correctly filled $= \frac{6!}{3!\,(6-3)!}\, (0.8)^3 (0.2)^{6-3}$
= 20 (0.512) (0.008)
= 0.08192
Interpretation- the probability of getting exactly 3 tubes out of six that are
correctly filled is 0.08192.
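The binomial formula translates directly into code. The sketch below is one way to write it with the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Toothpaste example: exactly 3 of 6 tubes correctly filled, p = 0.8.
p_three_of_six = binomial_pmf(3, 6, 0.8)
print(round(p_three_of_six, 5))
```

`comb(n, r)` is the number of combinations n!/(r!(n − r)!), so it plays the role of the first factor in the formula.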
Binomial tables are available: of course, we could have solved the above
problem by using binomial probability tables. In order to use the table,
take n, p and r values and look for the probability value that corresponds
to n, p and r.
To this point, we have dealt with the binomial distribution in terms of the
binomial formula and table, but the binomial, like any other distribution,
can be expressed graphically as well. You should understand that there is
not just one binomial distribution. Rather, there is a different distribution
for each different pair of n, p values.
[Figures: graphs of the binomial distribution (Probability against r). One set of panels has r running from 0 to 5 with P = 0.1 (Q = 0.9), P = 0.3 (q = 0.7), P = 0.5 (Q = 0.5), P = 0.7 (Q = 0.3) and P = 0.9 (q = 0.1). A second set has P = 0.4 with n = 5 and with n = 10, followed by a panel with r running from 0 to 30.]
Applications of Binomial Distribution
1 To calculate P for a population, use the formula P = R/N, where R is the number of successes in
the population and N is the size of the population. The binomial requirements may be relaxed when n is small
compared with N (if n < 0.05 N).
- Similarly, the probability of more than 5 successes would be:
P (6) + P (7) + P (8) + P (9) + P (10)
- Therefore: the cumulative (cum.) probability of r.
Cum. P(r) = [P (0) + P (1) + P (2) + …. + P(r)]
$$\text{Cum. } P(r) = \sum_{i=0}^{r} P(i) \qquad (1)$$
No. of Successes (r)    Cum. Prob. $\sum P(r)$
0. …………… 0.00605
1……………. 0.04636
2……………. 0.16729
3……………. 0.38228
4……………. 0.63310
5……………. 0.83376
6……………. 0.94524
7……………. 0.98771
8……………. 0.99833
9……………. 0.99990
10……………. 1.00000
1. What is the probability of fewer than five successes? Ans. 0.63310
2. What is the probability of more than five successes? Ans. 0.16624
3. What is the probability of at least four successes? Ans. 0.61772
4. What is the probability of at most three successes? Ans. 0.38228
5. What is the probability of exactly six successes? Ans. 0.11148
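Answers of this kind are read off a cumulative table by addition and subtraction. The sketch below rebuilds such a table; it assumes n = 10 and p = 0.4, which is an inference from the tabulated values (for example, 0.6¹⁰ ≈ 0.00605) rather than something stated alongside the table.

```python
from math import comb

n, p = 10, 0.4  # assumed parameters, inferred from the cumulative table

pmf = [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]

# Running totals: cumulative[r] = P(0) + P(1) + ... + P(r).
cumulative = []
running_total = 0.0
for prob in pmf:
    running_total += prob
    cumulative.append(running_total)

fewer_than_five = cumulative[4]
more_than_five = 1 - cumulative[5]
at_most_three = cumulative[3]
exactly_six = cumulative[6] - cumulative[5]

print(round(fewer_than_five, 5), round(more_than_five, 5))
print(round(at_most_three, 5), round(exactly_six, 5))
```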
The Poisson distribution is named for its originator, Siméon Denis Poisson
(1781 – 1840), a Frenchman who developed the distribution from studies
during the latter part of his lifetime.
It is useful when dealing with the number of occurrences of an event over
a specified interval of time or space.
an interval
e = a constant equal to 2.71828 …
2 In some texts $P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, where λ is the average number of occurrences in an
interval.
1) A certain restaurant has a reputation for good food. The restaurant
management boasts that on a Saturday night, groups of customers arrive
at a rate of 15 groups every half an hour, on average.
a) What is the probability that 5 minutes will pass with no groups of
customers arriving?
b) What is the probability that 8 groups of customers will arrive in 10
minutes?
Solutions:
a) Given µ = 15 groups in 30 minutes.
If 15 groups arrive in 30 minutes, then on average 15 × (5/30) = 2.5 groups
arrive in 5 minutes, so µ = 2.5.
$$P(x) = \frac{\mu^x e^{-\mu}}{x!}$$
$$P(0) = \frac{2.5^0\, e^{-2.5}}{0!} = \frac{1 \cdot e^{-2.5}}{1} = 0.0821$$ ♣
b) If 15 groups arrive in 30 minutes, then on average 15 × (10/30) = 5 groups
arrive in 10 minutes, so µ = 5.
$$P(x) = \frac{\mu^x e^{-\mu}}{x!}$$
$$P(8) = \frac{5^8\, e^{-5}}{8!} = 0.0653$$ ♣
♣ Fortunately, the answers obtained by hand calculation can also be found, without tedious work, by
looking them up in a table of Poisson probabilities.
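The two restaurant answers can also be computed directly instead of being looked up. A minimal sketch, where `poisson_pmf` is an illustrative function name and the rate is rescaled to the interval in question.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean per interval is mu."""
    return mu ** x * exp(-mu) / factorial(x)

rate_per_minute = 15 / 30  # 15 groups every half an hour, on average

# (a) no groups in 5 minutes: mean is 2.5 groups per 5 minutes
p_none_in_5 = poisson_pmf(0, rate_per_minute * 5)

# (b) exactly 8 groups in 10 minutes: mean is 5 groups per 10 minutes
p_eight_in_10 = poisson_pmf(8, rate_per_minute * 10)

print(round(p_none_in_5, 4), round(p_eight_in_10, 4))
```

The key step is matching the mean to the length of the interval before applying the formula.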
Just as with the binomial distribution, the Poisson distribution also
involves cumulative probabilities.
in place of the mean of the Poisson distribution ( µ ) so that the formula
becomes:
$$P(x) = \frac{(np)^x e^{-np}}{x!}$$
(Poisson formula for approximating the binomial formula)
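To see how close the approximation can be, the two formulas can be compared side by side. The values n = 100 and p = 0.02 below are hypothetical, chosen only for illustration (a large n with a small p); they do not come from the module.

```python
from math import comb, exp, factorial

n, p = 100, 0.02  # hypothetical illustration values: large n, small p

def binomial_pmf(x):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_approx(x):
    """Poisson probability with mean np, used in place of the binomial."""
    mean = n * p
    return mean ** x * exp(-mean) / factorial(x)

for x in range(5):
    print(x, round(binomial_pmf(x), 4), round(poisson_approx(x), 4))
```

With these values the two columns differ only in the third decimal place.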
1. The result of each draw can be classified in to two categories.
2. The probability of success in each draw changes
The hypergeometric probability formula:
$$P(r) = \frac{\binom{R}{r}\binom{N-R}{n-r}}{\binom{N}{n}}$$
2. Suppose that there are 15 identical tires in stock and 5 are slightly
damaged. What is the probability that a customer who buys 4 tires
will obtain 2 damaged tires?
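Solution: N = 15, R = 5, n = 4, r = 2
$$P(2) = \frac{\binom{5}{2}\binom{15-5}{4-2}}{\binom{15}{4}} = \frac{\binom{5}{2}\binom{10}{2}}{\binom{15}{4}} = \frac{10 \times 45}{1365} \approx 0.33$$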
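The tire example can be evaluated with the standard library's combination function. A minimal sketch; the function name `hypergeometric_pmf` and its argument names (matching N, R, n, r in the formula) are illustrative.

```python
from math import comb

def hypergeometric_pmf(r, N, R, n):
    """Probability of r successes in a sample of n drawn without replacement
    from a population of N items that contains R successes."""
    return comb(R, r) * comb(N - R, n - r) / comb(N, n)

# 15 tires in stock, 5 damaged, customer buys 4: probability of exactly 2 damaged.
p_two_damaged = hypergeometric_pmf(2, 15, 5, 4)
print(round(p_two_damaged, 4))
```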
However, the area under the graph of ƒ (x) corresponding to a given
interval gives the probability that the random variable will assume a
value in that interval.
[Figure 3. : graph of the uniform density ƒ (x), a rectangle of height 1/20; total area = (1/20) (20) = 1]
In general, the uniform probability density functions for a random variable
x which can take a value from a to b can be represented as follows:
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases}$$
The graph of the PDF provides the height, or value, of the function at any
particular value of x. Unlike a discrete probability function, the PDF of a
continuous random variable does not directly represent probability;
probabilities are given by areas under the graph.
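[Figure 3. : the uniform density ƒ (x) of height 1/20 over X from 60 to 80, with the area between 60 and 70 shaded]
The probability that the arrival of the plane is between 60 and 70 minutes
is equal to the shaded area (a rectangle; see footnote 3):
$$\text{Area} = \frac{1}{20}(10) = 0.5 = \text{probability}$$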
Once the PDF has been identified, the probability that x takes on a value
between some lower value (x1) and some higher value (x2) can be obtained
by computing the area under the graph of ƒ (x) over the interval from x1 to x2.
Solution: $P(68 \le x \le 76) = \frac{76 - 68}{20} = \frac{8}{20} = 0.4$
3 Area of a rectangle = Base x Height.
Expected Value (Mean) and Variance of the Uniform Probability
Distribution.
Examples:
1. The random variable X is assumed to be uniformly distributed between 10 and 20.
a) Find P(x ≤ 15).
b) Find P(12 ≤ x ≤ 18).
c) Compute E(x) and Var(x).
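Solutions:
1. (a) The general density is f(x) = 1/(b − a) for a ≤ x ≤ b and 0 elsewhere. With a = 10 and b = 20:
$$f(x) = \begin{cases} \dfrac{1}{20-10} & \text{for } 10 \le x \le 20 \\ 0 & \text{elsewhere} \end{cases}$$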
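All three parts of this example can be checked with a short calculation. The sketch below assumes the standard results for a uniform distribution, E(x) = (a + b)/2 and Var(x) = (b − a)²/12; these formulas do not appear in the surviving text above, so treat them as an assumption stated here rather than a quotation from the module.

```python
a, b = 10, 20            # X is uniform between 10 and 20
height = 1 / (b - a)     # value of f(x) inside the interval

p_a = (15 - a) * height      # a) P(x <= 15)
p_b = (18 - 12) * height     # b) P(12 <= x <= 18)
mean = (a + b) / 2           # c) E(x), standard uniform result
variance = (b - a) ** 2 / 12  # c) Var(x), standard uniform result

print(round(p_a, 4), round(p_b, 4), mean, round(variance, 4))
```

Under these assumptions the values are 0.5, 0.6, 15 and about 8.33 respectively.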
a) $\frac{a+b}{2} = 10 \Rightarrow a + b = 20$ ........ (1)
$1.8 = b - a \Rightarrow -a + b = 1.8$ ........ (2)
Solving equations 1 and 2 simultaneously, we find:
Adding (1) and (2): 2b = 21.8, so b = 10.9
Substituting into (1): a = 20 − b = 20 − 10.9 = 9.1
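b) $$f(x) = \begin{cases} \dfrac{1}{10.9-9.1} & \text{for } 9.1 \le x \le 10.9 \\ 0 & \text{elsewhere} \end{cases}$$
Since f(x) = 0 below 9.1, only the part of the interval from 9.1 to 10.5 contributes:
$$P(9 \le x \le 10.5) = \frac{10.5 - 9.1}{10.9 - 9.1} = \frac{1.4}{1.8} \approx 0.778$$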
Second, it comes close to fitting the actual observed frequency distributions of many phenomena: for instance, human characteristics (weights, heights and IQs), outputs from physical processes (such as dimensions and yields), and other measures of interest to economists and business professionals in both the public and private sectors.
E.g. per capita income in developing countries, air pollution in a community, etc.
[Figure: graph of the normal curve, f(x) against x, centred at µ = Mean = Mode = Median]
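$$f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
Where: x = the variable, µ = mean, σ = standard deviation, e = 2.718281…, π = 3.14…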
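To see how the formula behaves, it can be evaluated directly. The sketch below is a plain transcription of the density into Python; the mean of 10 and standard deviation of 3 used in the demonstration are hypothetical values.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


# Hypothetical parameters for illustration only.
mu, sigma = 10.0, 3.0
for x in (4.0, 7.0, 10.0, 13.0, 16.0):
    print(x, round(normal_pdf(x, mu, sigma), 5))
```

The printed heights are largest at x = µ and equal at points the same distance on either side of µ. Like any PDF, these values are heights of the curve, not probabilities.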
2. The highest point on the normal curve occurs at the mean, which is also the median and the mode of the distribution. The height of the curve declines as we move away from the mean in either direction.
5. Areas under the curve give probabilities for normally distributed variables. The area under the normal curve is distributed as follows:
i) µ ± σ = 68.27%, each side of the mean = 34.13%
ii) µ ± 1.96σ = 95%, each side of the mean = 47.5%
iii) µ ± 2σ = 95.45%
iv) µ ± 3σ = 99.73%
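These percentages can be verified without a table. The sketch below uses the error function `math.erf` from Python's standard library, relying on the identity that the area within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is our own.

```python
from math import erf, sqrt


def area_within(k):
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    return erf(k / sqrt(2))


for k in (1, 1.96, 2, 3):
    print(f"mu ± {k} sigma: {100 * area_within(k):.2f}%")
```

The result does not depend on µ or σ, which is why a single standard normal table serves every normal distribution.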
$$Z = \frac{X_i - \mu}{\sigma}$$
Graphically: [Figure: normal curve with the X-scale and the corresponding Z-scale shown beneath it]
The Z-value tells us how far away, and in what direction, X is from its mean, measured in units of standard deviation.
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-Z^{2}/2}$$
Examples:
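1. Find the area under the normal curve for Z = ±1.54.
Solution: P(0 ≤ Z ≤ 1.54) = 0.4382 (from the table)
P(−1.54 ≤ Z ≤ 0) = 0.4382
P(−1.54 ≤ Z ≤ 1.54) = 0.4382 + 0.4382 = 0.8764
[Figure: standard normal curve with Z-axis marks at −1.54, 0 and 1.54]
[Figures: standard normal curves with Z-axis marks at 0 and 0.25; at 0 and 1.96; and at 0, 0.6 and 1.8]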
General rule: If both Z values are on the same side of the mean, the area between them is obtained by subtracting the smaller table area from the larger one. If the two Z values are on opposite sides of the mean, the area between them is obtained by adding the two table areas.
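The general rule translates directly into code. In the sketch below, `area_0_to_z` plays the role of the standard normal table (area between 0 and Z) and is computed with `math.erf` instead of being looked up; both function names are hypothetical. The pair 0.6 and 1.8 is used only as an illustration of the same-side case.

```python
from math import erf, sqrt


def area_0_to_z(z):
    """Table value: area under the standard normal curve between 0 and |z|."""
    return 0.5 * erf(abs(z) / sqrt(2))


def area_between(z1, z2):
    """Area between two Z values, following the subtract-or-add rule."""
    a1, a2 = area_0_to_z(z1), area_0_to_z(z2)
    if z1 * z2 >= 0:
        # Same side of the mean: subtract the smaller area from the larger.
        return abs(a2 - a1)
    # Opposite sides of the mean: add the two areas.
    return a1 + a2


print(round(area_between(-1.54, 1.54), 4))  # opposite sides: 0.4382 + 0.4382
print(round(area_between(0.6, 1.8), 4))     # same side: subtract
```

The first line reproduces the 0.8764 obtained in the Z = ±1.54 example.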
Application Example:
[Figures: standard normal curves marked at Z = −1.64 and Z = 1.64, showing the areas 0.4495 and 0.5]
∴ P(income exceeding 832 Birr) = 0.5 − 0.4495 = 0.0505 ≈ 5%
Check yourself
c) What proportion of the scores is from 110 to 120? Answer: 15.98%
d) What percent of the scores exceed 183? Answer: 1.58%
X = zσ + µ
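The relation X = zσ + µ is simply the standardizing formula Z = (X − µ)/σ solved for X, so the two can be used as a round trip. The sketch below shows both directions; the mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
def z_from_x(x, mu, sigma):
    """Standardize: number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def x_from_z(z, mu, sigma):
    """Invert the standardization: X = z * sigma + mu."""
    return z * sigma + mu


# Hypothetical distribution parameters.
mu, sigma = 100.0, 15.0
x = x_from_z(1.64, mu, sigma)   # value that lies 1.64 standard deviations above the mean
print(x, z_from_x(x, mu, sigma))
```

This is the form needed when a probability is given and the corresponding X value is wanted: look up Z from the table first, then convert it to X.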
1. If the population variance ($\sigma_x^{2}$) is known.
Computing the Approximation:
Solution: In this case it is difficult, albeit not impossible, to use the binomial formula. Hence, we approximate it by another distribution. Since the conditions for approximating binomial probabilities by normal probabilities (np ≥ 5 and nq ≥ 5) are satisfied, we compute the probability of 12 successes in 100 trials using the normal approximation. The following steps are followed:
1. Find the limits for x by subtracting 0.5 from, and adding 0.5 to, the value of interest. The 0.5 we add and subtract is called a continuity correction factor. It is introduced because a continuous distribution is being used to approximate a discrete distribution.
For x1 = 11.5: $Z_1 = \frac{11.5 - 10}{3} = 0.50$
For x2 = 12.5: $Z_2 = \frac{12.5 - 10}{3} = 0.83$
Find the probabilities for these Z values from the standard normal table, and finally compute
P(11.5 ≤ x ≤ 12.5) = 0.2967 − 0.1915 = 0.1052
[Figure: normal curve with µ = 10 and σ = 3; the area between 11.5 and 12.5 equals 0.1052]
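The steps above can be sketched in code and compared with the exact binomial probability. The success probability p = 0.10 is an assumption: it is inferred from µ = np = 10 and σ = 3 with n = 100, since the problem statement itself is not shown here. The helper `phi` is the cumulative standard normal area, computed with `math.erf` rather than a table.

```python
from math import comb, erf, sqrt


def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p, x = 100, 0.10, 12            # p is inferred, see the note above
mu = n * p                         # 10
sigma = sqrt(n * p * (1 - p))      # 3

# Continuity correction: x = 12 becomes the interval 11.5 to 12.5.
z1 = (x - 0.5 - mu) / sigma
z2 = (x + 0.5 - mu) / sigma
approx = phi(z2) - phi(z1)

exact = comb(n, x) * p**x * (1 - p) ** (n - x)
print(round(approx, 4), round(exact, 4))
```

The approximation comes out near 0.106 rather than exactly 0.1052, because the code keeps Z2 = 0.8333 unrounded while the table lookup uses 0.83. The exact binomial value is a little below 0.10, so the normal approximation is close but not identical.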
$$f(x) = \frac{1}{\mu}\, e^{-x/\mu}$$
where µ = mean and e = 2.718.
Let λ = 1/µ; then
$$f(x) = \lambda e^{-\lambda x}$$
Example:
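[Figure: graph of f(x) against X, with the area A under the curve to the right of x = a]
$$\text{Area}(A) = \int_{a}^{\infty} \lambda e^{-\lambda x}\, dx$$
Let $u = -\lambda x$, so that $e^{u} = e^{-\lambda x}$ and $\frac{du}{dx} = -\lambda$.
$$\text{Area}(A) = \int_{a}^{\infty} \lambda e^{-\lambda x}\, dx = \lim_{b \to \infty} \int_{a}^{b} -e^{-\lambda x}(-\lambda)\, dx$$
$$= \lim_{b \to \infty} \left[ -e^{-\lambda x} \right]_{a}^{b} = \lim_{b \to \infty} \left( -e^{-\lambda b} - (-e^{-\lambda a}) \right)$$
$$= 0 - (-e^{-\lambda a})$$
$$\text{Area}(A) = e^{-\lambda a} = P(x \ge a)$$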
(a) What is the probability that it takes 6 minutes or less to get a taxi?
Solution: $$P(x \le 6) = 1 - e^{-\lambda a} = 1 - e^{-6/5} \approx 0.6988$$
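Both the tail formula P(x ≥ a) = e^(−λa) and the taxi answer can be checked numerically. The sketch below takes λ = 1/5 from the exponent −6/5 in the solution (that is, a mean waiting time of 5 minutes, which is an inference since the problem statement is not shown). The midpoint-rule integration and its `upper` and `steps` parameters are our own illustrative choices, not part of the module.

```python
from math import exp


def exp_tail(a, lam):
    """Closed form from the derivation: P(x >= a) = e^(-lam * a)."""
    return exp(-lam * a)


def numeric_tail(a, lam, upper=200.0, steps=200_000):
    """Midpoint-rule estimate of the area under lam * e^(-lam * x) from a to upper."""
    h = (upper - a) / steps
    total = sum(lam * exp(-lam * (a + (i + 0.5) * h)) for i in range(steps))
    return total * h


lam = 1 / 5                         # inferred: mean waiting time of 5 minutes
p_more_than_6 = exp_tail(6, lam)
p_6_or_less = 1 - p_more_than_6     # taxi example, part (a)

print(round(p_6_or_less, 4))                        # 0.6988
print(abs(numeric_tail(6, lam) - p_more_than_6) < 1e-6)
```

The numerical area agrees with the closed form, which supports the result of the integration by substitution.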
Review Exercises:
Behavioral Objectives
A. You should be able to define the following key concepts in this chapter:
experiment                               *combination
outcome                                  *permutation
event                                    conditional probability
frequency definition of probability      multiplication rule
subjective definition of probability     independent events
sample space                             Bayes' theorem
mutually exclusive events                prior probability
exhaustive events                        state of nature
addition rule
Experts in marketing have devoted considerable study during the past 20 years to the way in which the probability that a consumer will purchase a given brand of a product depends on what brands he or she has purchased in the past. As a very simple illustration, suppose that it has been determined that a consumer, if he or she purchases breakfast food, has a 20 percent chance of purchasing a particular brand of breakfast food if he or she has purchased this brand once before, and a 10 percent chance of purchasing it if he or she has never purchased it before.
(a) Suppose that this consumer had not tried this brand at the beginning of April, and that he or she purchased breakfast food once in April and once in May. What is the probability that he or she did not purchase this brand either time?
Multiple-Choice Questions
1. Suppose that P(A) = 0.5, P(B) = 0.2, and P(A and B) = 0.2. Which of the following is true?
5. The probability that the Jones Company will go bankrupt in 1984 is 0.1.
The probability that it will lose money in 1984 is 0.2. The probability that
it will both go bankrupt and lose money in 1984 is 0.1. The probability
that it will either go bankrupt or lose money (or both) in 1984 equals
8. If P(A) = 0.3 and P(B) = 0.6, what is P(not A and not B) if A and B are statistically independent?
A. You should be able to define the following key concepts in this
Value of X     Probability
0              .10
1              .20
2              .40
3              .20
4              .10
A. You should be able to define the following key concepts in this chapter:
Z value
Poisson distribution
Normal distribution
Standard normal distribution
(a) .4332.
(b) .0668.
(c) .3413.
(d) .1587.
(e) none of the above.
(a) 1 inches.
(b) .1 inches.
(c) 2 inches.
(d) .2 inches.
(e) none of the above.
3. The pipe manufacturer in the previous question wants to know what the
probability is that a diameter will exceed 1.2 inches. You are hired as a
consultant. Your answer should be
(a) .05.
(b) .10.
(c) .0228.
(d) .0793.
(e) none of the above.
4. The probability that the value of a standard normal variable is less than
1.0 equals
(a) .0228.
(b) .0287.
(c) .6915.
(d) .8413.
(e) .0919.
5. The probability that the value of a standard normal variable exceeds 1.9
equals
(a) .0228.
(b) .0287.
(c) .6915.
(d) .8413.
(e) .0919.
9. An insurance company finds that .003 percent of the population dies of a certain disease each year. The company has insured 100,000 people against death from this disease.
a) What is the probability that the firm must pay off in three or more cases
14. An experiment with three outcomes has been repeated 50 times and it
was learned that E1 occurred 20 times, E2 occurred 13 times, and E3
occurred 17 times. Assign probabilities to the outcomes. What method
did you use?
38. The survey of subscribers to Forbes showed that 45.8% rented a car
during the past 12 months for business reasons, 54% rented a car
during the past 12 months for personal reasons, and 30% rented a car
during the past 12 months for both business and personal reasons
(Forbes 1993 Subscriber Study).
43. Assume that we have two events, A and B, that are mutually exclusive. Assume further that we know P(A) = .30 and P(B) = .40.
variance for the number of sales for the week? What is the standard
deviation for the number of sales for the week?
44. Phone calls arrive at the rate of 48 per hour at the reservation desk
for Regional Airways.
demanded, find the following probabilities.
a) P(180 ≤ x ≤ 220)
b) P ( x ≥ 250)
c) P(x ≤ 100)
d) P(225 ≤ x ≤ 250)
29. The true unemployment rate is 7% (Business Week, November 7,
1994). Assume that 100 employable people are selected randomly.
a. What is the expected number who are unemployed?
b. What are the variance and standard deviation of the number who are unemployed?
c. What is the probability that exactly nine are unemployed?
d. What is the probability that at least five are unemployed?
35. The average life of a television set is 12 years (Money, April 1994). Product lifetimes often follow an exponential probability distribution. Assume that this is the case for the lifetime of a television set.
a. What is the probability that the lifetime will be six years or less?
b. What is the probability that the lifetime will be 15 years or more?
c. What is the probability that the lifetime will be between five and 10
years?