Statistics and Probability Notes Part 1

Session 6
Mathematics for Engineers IV
Outline
Chapter III: Probability and statistics
Unit 10: Probability and statistics

• Quick overview of Descriptive Statistics
• Elementary Probability Theory
• Hypothesis Testing
Chapter III: Elements of probability and statistics
Unit 10: Probability and statistics

10.1 Quick overview of Descriptive Statistics
10.1.1 Definition
Descriptive statistics describes the data. It is used to summarize a set of information that has been collected, without drawing conclusions beyond that data.
10.1.2 Measures of central tendency and of variation (dispersion)

Descriptive statistics are classified into two groups: measures of central tendency and measures of variation (dispersion).
Measures of central tendency

Measures of central tendency describe the center of the data. They include the mean, median and mode.
1. Arithmetic Mean
Let x1, x2, ..., xn be an array of n measurements of a variable X. The arithmetic mean is denoted by x̄ and given by
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$
Suppose that these n measurements can be arranged into k categories, and let fi, i ∈ {1, 2, ..., k}, represent the frequency of each of the k categories. Then the arithmetic mean of the measurements is given by
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i, \tag{2}$$
where $n = \sum_{i=1}^{k} f_i$ is the sample size and the xi are the distinct observed values.
2. Geometric Mean
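For a set of positive numbers x1, x2, ..., xn the geometric mean is the principal nth root of the product of the n numbers:
$$\bar{x}_G = \sqrt[n]{\prod_{i=1}^{n} x_i}, \tag{3}$$
and, for data grouped into k categories with frequencies fi,
$$\bar{x}_G = \sqrt[n]{\prod_{i=1}^{k} x_i^{f_i}}. \tag{4}$$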
3. Harmonic Mean
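The harmonic mean of a set of data x1, x2, ..., xn is the reciprocal of the arithmetic mean of the reciprocals of the data:
$$\bar{x}_H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}, \tag{5}$$
and, for data grouped into k categories with frequencies fi,
$$\bar{x}_H = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}}. \tag{6}$$
In summary, for positive data,
$$\bar{x}_H \le \bar{x}_G \le \bar{x}. \tag{7}$$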
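The three means can be compared numerically. The following is a minimal sketch using Python's standard `statistics` module; the data set is hypothetical and chosen only so that the results are easy to check by hand.

```python
import statistics

# Hypothetical sample of positive measurements (not taken from the notes).
data = [2.0, 4.0, 4.0, 8.0]

arithmetic = statistics.fmean(data)            # equation (1)
geometric = statistics.geometric_mean(data)    # equation (3)
harmonic = statistics.harmonic_mean(data)      # equation (5)

print(arithmetic, geometric, harmonic)
# Inequality (7): harmonic <= geometric <= arithmetic for positive data.
print(harmonic <= geometric <= arithmetic)
```

For this sample the arithmetic mean is 4.5, the geometric mean is 4 (the fourth root of 256) and the harmonic mean is 32/9, so the ordering in (7) holds.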
4. Median
The median is the middle number. It is found by putting the numbers in ascending order and taking the actual middle number if there is one, or the average of the two middle numbers if not. That is, with the data sorted,
$$\text{Sample median} = \begin{cases} x_{\frac{n+1}{2}}, & n \text{ odd}, \\ \frac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right), & n \text{ even}. \end{cases}$$
5. Mode
The mode is the value which occurs with the greatest frequency. Consequently, it only really makes sense to calculate or use it with discrete data, or for continuous data with small grouping intervals and large sample sizes. It follows from this definition that a distribution may have more than one mode.
Measures of variation
They describe the spread of the data values (variability of data). These measures include the range, variance and standard deviation.
1. Range
Range is the difference between the largest and smallest observations. Since it depends only on two observations, the lowest and the highest, it gives a misleading idea of dispersion if these values are outliers.
R = highest value − lowest value
2. Variance
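The variance is, roughly, the average of the squared distances of the values from the mean; for a sample the sum of squares is divided by n − 1 rather than n. Let x1, x2, ..., xn be an array of n measurements of a variable X. The sample variance is given by
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \tag{8}$$
or, equivalently,
$$S^2 = \frac{1}{n-1}\left\{\left(\sum_{i=1}^{n} x_i^2\right) - n\bar{x}^2\right\} \tag{9}$$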
When the table of measurements is given with frequencies

xi   x1   x2   ...   xk
fi   f1   f2   ...   fk
with n = f1 + f2 + ... + fk, the sample variance is denoted and calculated as follows:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \tag{10}$$
And the population variance is denoted and calculated as follows:
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \tag{11}$$
3. Standard Deviation
The sample standard deviation is the square root of the sample variance and is denoted by $S = \sqrt{S^2}$.
The population standard deviation is the square root of the population variance and is denoted by $\sigma = \sqrt{\sigma^2}$.
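The frequency-table formulas can be sketched in a few lines of Python. The table below is hypothetical (illustrative values and counts only), and the index runs over the distinct values, with n taken as the sum of the frequencies.

```python
import math

# Hypothetical frequency table (values and counts are illustrative only).
values = [40, 45, 50, 55]
freqs = [2, 3, 4, 1]

n = sum(freqs)                                             # sample size
mean = sum(f * x for x, f in zip(values, freqs)) / n       # equation (2)
ss = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))

sample_var = ss / (n - 1)          # equation (10)
population_var = ss / n            # equation (11)
sample_sd = math.sqrt(sample_var)
population_sd = math.sqrt(population_var)

print(mean, sample_var, population_var)
```

Here n = 10, the mean is 47 and the sum of squared deviations is 210, so the sample variance is 210/9 and the population variance is 21.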
Uses of variance and standard deviation
• The variance and standard deviation can be used to determine the spread of the data. If the variance or standard deviation is large, the data are more dispersed. This information is useful in comparing two (or more) data sets to determine which is more (most) variable. If the data all lie close to the mean, the standard deviation will be small, while if the data are spread out over a large range of values, the standard deviation will be large. In particular, outliers increase the standard deviation.
• The variance and standard deviation are used to determine the consistency of a variable.
• The variance and standard deviation are used to determine the number of data values that fall within a specified interval in a distribution.
• Finally, the variance and standard deviation are used quite often in inferential statistics.
4. Coefficient of variation (CV)
It is equal to the standard deviation divided by the mean, times 100%:
$$CV = \frac{S}{\bar{x}} \times 100\%. \tag{12}$$
The result is expressed as a percentage. This coefficient is used when you need to
compare standard deviations where the units are different.
Skewness and Kurtosis

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A
distribution, or data set, is symmetric if it looks the same to the left and right of
the center point. The skewness for a normal distribution is zero, and any symmetric
data should have a skewness near zero. Negative values for the skewness indicate
data that are skewed left and positive values for the skewness indicate data that
are skewed right. By skewed left, we mean that the left tail is long relative to the
right tail. Similarly, skewed right means that the right tail is long relative to the
left tail.
Kurtosis is a measure of peakedness or flatness (whether the data are peaked or flat) relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case. Kurtosis is defined here (as excess kurtosis) so that the normal distribution has a kurtosis of zero. Positive kurtosis indicates a peaked distribution and negative kurtosis indicates a flat distribution, relative to the normal distribution. A histogram is an effective graphical technique for showing both the skewness and kurtosis of a data set.
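The notes do not give formulas for skewness and kurtosis. A common moment-based choice, assumed here, is skewness = m3 / m2^(3/2) and excess kurtosis = m4 / m2^2 − 3, where mr is the r-th central moment computed with divisor n. Statistical packages may apply small-sample corrections, so their values can differ slightly. A minimal sketch with hypothetical data:

```python
def central_moment(data, r):
    """r-th central moment, dividing by n (population form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n

def skewness(data):
    """Moment-based skewness (assumed formula: m3 / m2**1.5)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based excess kurtosis (assumed formula: m4 / m2**2 - 3)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]        # hypothetical, symmetric about 3
right_skewed = [1, 1, 1, 2, 10]    # hypothetical, long right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

The symmetric sample has skewness 0 and a negative excess kurtosis (flat, like a uniform distribution), while the second sample has positive skewness because of its long right tail.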
Exercises
1. Find the mean, median, mode, range, standard deviation and variance for the following five weight measurements in kg: 40, 45, 50, 55, 60
2. Find the mean, median, mode, range, standard deviation and variance for the following ten weight measurements in kg: 60, 55, 40, 50, 45, 45, 50, 55, 60, 50
3. Select ten students randomly from your class and ask their age in completed years, their weight in kg or their height in cm, then
a. Describe how you selected the ten students from your class.
b. Present the information you collected using an appropriate graph and frequency distribution table.
c. Find the mean, median, mode, range, standard deviation and variance for your data.
d. State the most likely distribution for your data.
e. Find the skewness and kurtosis values for your data and interpret the result.
10.2 Elementary Probability Theory

10.2.1 Meaning of Probability
Probability as a general concept can be defined as the chance of an event occurring. Many people are familiar with probability from observing or playing games of chance, such as card games, slot machines, or lotteries. In addition to being used in games of chance, probability theory is used in the fields of insurance, investments, and weather forecasting, and in various other areas.
Assume that an experiment can be repeated many times, with each repetition called a trial, and assume that one or more outcomes can result from each trial. Then the probability of a given outcome is estimated by the number of times that outcome occurs divided by the total number of trials. If the outcome is sure to occur, it has a probability of 1; if an outcome cannot occur, its probability is 0. When all outcomes are equally likely, the probability is equal to the number of ways of achieving success divided by the total number of possible outcomes.
Example: The probability of flipping a fair coin and getting tails is 0.50, or 50%. If a coin is flipped 10 times, there is no guarantee that exactly 5 tails will be observed; the proportion of tails can range from 0 to 1.
10.2.2 Basic Definitions

1. An experiment is defined as any planned process of data collection.
2. A probability experiment is a chance process that leads to well-defined results called outcomes, for example flipping or tossing a coin, rolling a die, or drawing a card from a deck.
3. An outcome is the result of a single trial of a probability experiment. For example, when a coin is tossed, there are two possible outcomes: head (H) or tail (T). In the roll of a single die, there are six possible outcomes: 1, 2, 3, 4, 5, 6.
4. A sample space is the set of all possible outcomes of a probability experiment. It is denoted by S.
Example 1. Find the sample space for tossing one coin.
Solution: The sample space is S = {H, T}.
Example 2. Find the sample space for the gender of the children if a family has three children. Use B for boy and G for girl.
Solution: There are two genders, male and female, and each child could be either gender. Hence, there are eight possibilities, as shown here.
S = {BBB, BBG, BGB, GBB, GGG, GGB, GBG, BGG}.
In the above examples, the sample spaces were found by observation and
reasoning; however, another way to find all possible outcomes of a probability
experiment is to use a tree diagram.
Use a tree diagram to find the sample space for the gender of three chil-
dren in a family, as in Example 2.
Solution Since there are two possibilities (boy or girl) for the first child,
draw two branches from a starting point and label one B and the other G.
Then if the first child is a boy, there are two possibilities for the second child
(boy or girl), so draw two branches from B and label one B and the other G.
Do the same if the first child is a girl. Follow the same procedure for the third
child. The completed tree diagram is shown in Figure 1. To find the outcomes
for the sample space, trace through all the possible branches, beginning at the
starting point for each one.
Figure 1: Tree Diagram
5. An event is defined to be any subset of the sample space, and events are
usually denoted by capital letters, A, B and so forth.
An event can be one outcome or more than one outcome. For example, if a
die is rolled and a 6 shows, this result is called an outcome, since it is a result
of a single trial. An event with one outcome is called a simple event. The
event of getting an odd number when a die is rolled is called a compound
event, since it consists of three outcomes or three simple events. In general,
a compound event consists of two or more outcomes or simple events.
6. A conditional probability is the probability of one event given that another
event has occurred.
7. In probability, OR means union (either event can occur) and AND means intersection (both events must occur). Two events are mutually exclusive if they cannot occur simultaneously.
Formula used to calculate the probability of an event. When all outcomes are equally likely, the probability of any event A is
$$\frac{\text{Number of outcomes in } A}{\text{Total number of outcomes in the sample space}} \tag{13}$$
This probability is denoted by
$$P(A) = \frac{n(A)}{n(S)} \tag{14}$$
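Tracing a tree diagram amounts to enumerating every combination of branches. The sketch below lists the sample space of Example 2 with `itertools.product` and applies formula (14); the helper name `probability` is introduced here for illustration and assumes equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# Sample space for the genders of three children (Example 2).
sample_space = ["".join(c) for c in product("BG", repeat=3)]

def probability(event, space):
    """Classical probability n(A)/n(S), assuming equally likely outcomes."""
    favourable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favourable), len(space))

# Event: exactly two of the three children are girls.
p_two_girls = probability(lambda o: o.count("G") == 2, sample_space)

print(sample_space)
print(p_two_girls)
```

Using `Fraction` keeps the result exact (3/8) instead of a rounded decimal.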
10.2.3 Rules of probability

1. The probability of any event E is a number (either a fraction or decimal)
between and including 0 and 1. This is denoted by 0 ≤ P (E) ≤ 1.
2. The sum of the probabilities of all simple events must be 1.
3. P (∅) = 0 and P (S) = 1
4. If A ⊆ B, then P(A) ≤ P(B).
5. If A and B are not mutually exclusive, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) (additive law).
6. For disjoint events, A ∩ B = ∅, and the addition rule takes the simple form
P(A ∪ B) = P(A) + P(B).
7. If A and B are independent, then P(A ∩ B) = P(A)P(B) (multiplicative law).
8. For conditional probability, $P(A|B) = \frac{P(A \cap B)}{P(B)}$ (for P(B) > 0) or $P(B|A) = \frac{P(B \cap A)}{P(A)}$ (for P(A) > 0).
9. If A ⊆ B, then A ∩ B = A, so $P(A|B) = \frac{P(A)}{P(B)}$.
10. The complement of an event A is the set of outcomes in the sample space that are not included in the outcomes of event A. The complement of A is denoted A^c; P(A^c) = 1 − P(A), P(A) = 1 − P(A^c), therefore
P(A) + P(A^c) = 1.
The multiplication rule for probabilities when events are not independent can be used to derive one form of an important formula called Bayes' theorem. Since P(A ∩ B) is the same as P(B ∩ A), then
$$P(A|B)P(B) = P(B|A)P(A) \tag{15}$$
so that
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)} \tag{16}$$
or
$$P(B|A) = \frac{P(A|B)P(B)}{P(A)} \tag{17}$$
Note that this is an extension of the multiplicative rule.
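Equation (16) can be checked on a small sample space. In this sketch the events A and B are chosen only for illustration (they do not come from the notes), and the helpers `P` and `P_given` are hypothetical names.

```python
from fractions import Fraction

# One roll of a fair die; the events below are illustrative choices.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # even number
B = {4, 5, 6}      # number greater than 3

def P(event):
    """Classical probability n(event)/n(S)."""
    return Fraction(len(event), len(S))

def P_given(event, given):
    """Conditional probability P(event | given) = P(event ∩ given) / P(given)."""
    return P(event & given) / P(given)

direct = P_given(A, B)                       # rule 8
via_bayes = P_given(B, A) * P(A) / P(B)      # equation (16)

print(direct, via_bayes)
```

Both routes give 2/3, as equation (15) requires.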
Example 3. If a family has three children, find the probability that exactly two of the three children are girls.
Solution:
The sample space for the gender of the children for a family that has three children has eight outcomes, that is,
S = {BBB, BBG, BGB, GBB, GGG, GGB, GBG, BGG}.
Since there are three ways to have two girls, namely GGB, GBG, BGG, then
$$P(\text{two girls}) = \frac{3}{8}.$$
Example 4. On a college campus, suppose that 2600 of the 4000 undergraduate students are men, and that 800 of the 2000 undergraduates under the age of 25 are men. If one student is selected at random from this population of undergraduates, what is the probability that the student is either a man or under the age of 25?
Solution:
Let us define two events as follows:

A: The selected undergraduate student is male.
B: The selected undergraduate student is under the age of 25.
One can observe that
$$P(A) = \frac{2600}{4000}, \quad P(B) = \frac{2000}{4000}, \quad P(A \cap B) = \frac{800}{4000} \tag{18}$$
Then,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{2600 + 2000 - 800}{4000} = \frac{3800}{4000} = \frac{19}{20} \tag{19}$$
Example 5. A die is rolled and the number showing is recorded. Given that the number rolled was even, what is the probability that it was a six?
Solution:
Let A denote the event "even" and B denote the event "a six". Here, the sample space is given by
S = {1, 2, 3, 4, 5, 6},
$$P(A) = \frac{3}{6} = \frac{1}{2} \quad \text{and} \quad P(B) = \frac{1}{6}.$$
Clearly B ⊆ A, so B ∩ A = B and
$$P(B|A) = \frac{P(B)}{P(A)} = \frac{1/6}{1/2} = \frac{1}{3} \approx 0.33.$$
Example 5. For events A and B, you are given that P(A) = 2/3, P(B) = 2/5 and P(A ∪ B) = 3/4. Find P(A^c), P(B^c), P(A ∩ B).
Solution
$$P(A^c) = 1 - P(A) = 1 - \frac{2}{3} = \frac{1}{3}, \quad P(B^c) = 1 - P(B) = 1 - \frac{2}{5} = \frac{3}{5}$$
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
$$\Leftrightarrow P(A \cap B) = P(A) + P(B) - P(A \cup B)$$
$$\Leftrightarrow P(A \cap B) = \frac{2}{3} + \frac{2}{5} - \frac{3}{4} = \frac{19}{60}$$
Example 6. A coin is flipped and a die is rolled. Find the probability of getting a
head on the coin and a 4 on the die.
Solution
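$$P(\text{head and } 4) = P(\text{head}) \cdot P(4) = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12}$$
Note that the sample space for the coin is {H, T} and for the die it is {1, 2, 3, 4, 5, 6}. The coin and the die do not influence each other, so the two events are independent and the multiplicative law applies.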
Example 7. A recent survey asked 100 people if they thought women in the armed forces should be permitted to participate in combat. The results of the survey are shown below.
Gender   Yes   No   Total
Male     32    18   50
Female   8     42   50
Total    40    60   100
Find these probabilities.
a. The respondent answered yes, given that the respondent was a female.
b. The respondent was a male, given that the respondent answered no.
Solution:
Let
M: respondent was a male; Y: respondent answered yes
F: respondent was a female; N: respondent answered no
a. The problem is to find P(Y|F). The rule states
$$P(Y|F) = \frac{P(F \text{ and } Y)}{P(F)}$$
The probability P(F and Y) is the number of females who responded yes, divided by the total number of respondents:
$$P(F \text{ and } Y) = \frac{8}{100}$$
The probability P(F) is the probability of selecting a female:
$$P(F) = \frac{50}{100}$$
Then
$$P(Y|F) = \frac{P(F \text{ and } Y)}{P(F)} = \frac{8/100}{50/100} = \frac{4}{25}$$
b. The problem is to find P(M|N):
$$P(M|N) = \frac{P(M \text{ and } N)}{P(N)} = \frac{18/100}{60/100} = \frac{3}{10}$$
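The same conditional probabilities can be read off the survey table programmatically. This is a minimal sketch; the dictionary layout and the helper `p` are illustrative choices, not part of the notes.

```python
from fractions import Fraction

# Survey counts from Example 7, keyed by (gender, answer).
counts = {
    ("M", "Y"): 32, ("M", "N"): 18,
    ("F", "Y"): 8,  ("F", "N"): 42,
}
total = sum(counts.values())

def p(pred):
    """Probability that a randomly chosen respondent satisfies pred."""
    return Fraction(sum(c for key, c in counts.items() if pred(key)), total)

# P(Y | F) = P(F and Y) / P(F)
p_yes_given_female = p(lambda k: k == ("F", "Y")) / p(lambda k: k[0] == "F")
# P(M | N) = P(M and N) / P(N)
p_male_given_no = p(lambda k: k == ("M", "N")) / p(lambda k: k[1] == "N")

print(p_yes_given_female, p_male_given_no)
```

The results, 4/25 and 3/10, agree with parts a and b.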
10.3 Random variables

10.3.1 Some definitions
1. A variable is a characteristic or attribute that can assume different values.
2. A random variable is a variable whose values are determined by chance.
3. A discrete random variable can take only a finite or countable number of values, e.g. sex, parity, race.
4. A continuous random variable can take any value within a specified interval, e.g. blood pressure, weight.
5. Probability distribution: every random variable has a corresponding probability distribution, which uses the theory of probability to describe the behavior of the random variable.
Example 1: Construct a probability distribution for rolling a single die.
Solution
Since the sample space is S = {1, 2, 3, 4, 5, 6} and each outcome has a probability of 1/6, the distribution is as shown.
Outcome X          1     2     3     4     5     6
Probability P(X)   1/6   1/6   1/6   1/6   1/6   1/6
Example 2:
Construct a probability distribution for tossing a coin three times, where X is the random variable for the number of heads.
Solution
The sample space for tossing a coin three times is S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}.
Hence, the probability of getting no heads is 1/8, one head is 3/8, two heads is 3/8, and three heads is 1/8. From these values, a probability distribution can be constructed by listing the values of X and assigning the probability of each, as shown here.
Number of heads X   0     1     2     3
Probability P(X)    1/8   3/8   3/8   1/8
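The distribution of the number of heads can also be built by enumerating the eight outcomes and counting. A minimal sketch, assuming a fair coin so that all outcomes are equally likely:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely outcomes of tossing a fair coin three times.
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# X = number of heads; tally how many outcomes give each value of X.
tally = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(count, len(outcomes))
                for x, count in sorted(tally.items())}

for x, prob in distribution.items():
    print(x, prob)
```

The probabilities are exact fractions and add up to 1, as any probability distribution must.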
Example 3: Consider the data below, which show the frequency distribution of the number of babies that women have had. Let X be the discrete random variable representing the number of children a woman has had: if the study child was her first child then X = 1, if the second then X = 2, and so on. Table 1 shows the probability distribution of X, i.e. P(X), which is the proportion of women with 1, 2, ... children (the frequency distribution). We observe all possible outcomes of X, so the probabilities add up to 1 (the outcomes are exhaustive).
Table 1: Number of children

Parity   Count   Percentage
1        6076    43.06
2        4267    30.24
3        1977    14.01
4        955     6.77
5        479     3.39
6        216     1.53
7        86      0.61
8        31      0.22
9        13      0.09
10       9       0.06
11       1       0.01
Suppose we want to know the probability that a particular mother's child is her 3rd child. From the table, P(X = 3) = 0.1401 ≈ 0.14.
Question
What is the probability that a chosen mother's child is her 4th or 5th?
Two requirements for a probability distribution
• The sum of the probabilities of all the events in the sample space must equal 1; that is, $\sum_{i=1}^{n} P(X = x_i) = 1$.
• The probability of each event in the sample space must be between 0 and 1 inclusive. That is,
$$0 \le P(X) \le 1. \tag{20}$$
Example
Determine whether each distribution is a probability distribution.
1.
X      0     5     10    15    25
P(X)   1/5   1/5   1/5   1/5   1/5
2.
X      0     2     4     6
P(X)   −1    1.5   0.3   0.2
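Both requirements can be checked mechanically. The function name `is_probability_distribution` and the tolerance `tol` below are illustrative choices for this sketch.

```python
from fractions import Fraction

def is_probability_distribution(probs, tol=1e-9):
    """Check both requirements: each P(x) in [0, 1] and the total equal to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = abs(sum(probs) - 1) <= tol
    return in_range and sums_to_one

first = [Fraction(1, 5)] * 5       # distribution 1
second = [-1, 1.5, 0.3, 0.2]       # distribution 2

print(is_probability_distribution(first))
print(is_probability_distribution(second))
```

Note that the second set of values does add up to 1, yet it fails the check because −1 and 1.5 lie outside the interval from 0 to 1. Both requirements must hold.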
Note
If the frequency groups or categories become numerous, it is difficult to use a frequency distribution as in the above example. Instead, one appeals to known theoretical probability distributions. Many real-life measurements approximately follow known theoretical distributions.
Example: weight and age have roughly normal distributions.
10.3.2 Probability Density function

Definition
In probability theory, a probability density function (pdf), or density, of a continuous random variable is a function that describes the relative likelihood for this random variable to take on a given value. The probability for the random variable to fall within a particular region is given by the integral of this variable's density over the region. The probability density function is non-negative everywhere, and its integral over the entire space is equal to one. So, for an absolutely continuous univariate distribution,
1. $f(x) \ge 0$ for all $x \in \mathbb{R}$,
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
3. $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$.
Example:
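Suppose that the error in the reaction temperature, in degrees Celsius, for a controlled laboratory experiment is a continuous random variable X having the probability density function
$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$
a. Verify that f(x) is a density function.
b. Find P(0 ≤ X ≤ 1).
Solution
a. Obviously, f(x) ≥ 0. Moreover,
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \left[\frac{x^3}{9}\right]_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1.$$
Therefore, f(x) is a density function.
b.
$$P(0 \le X \le 1) = \int_{0}^{1} \frac{x^2}{3}\,dx = \left[\frac{x^3}{9}\right]_{0}^{1} = \frac{1}{9}$$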
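When an integral is awkward to do by hand, the two checks can be approximated numerically. The sketch below uses composite Simpson's rule, a technique chosen here for illustration (the notes integrate analytically). The density is coded with closed endpoints, which changes it only at two points and so leaves every integral unchanged.

```python
def f(x):
    """Density from the example: x**2 / 3 on [-1, 2], zero elsewhere."""
    return x * x / 3 if -1 <= x <= 2 else 0.0

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

total_area = simpson(f, -1, 2)     # should be close to 1
p_0_to_1 = simpson(f, 0, 1)        # should be close to 1/9

print(total_area, p_0_to_1)
```

Simpson's rule is exact for polynomials up to degree three, so for this quadratic density the only error is floating-point rounding.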
10.3.3 Expectation and Variance
Let X be a random variable with probability distribution f(x).
1. If X is a discrete random variable, then
i. the mean or expected value of X is
$$\mu = E(X) = \sum_{x} x f(x)$$
ii. the variance of X is
$$\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 f(x).$$
2. If X is a continuous random variable, then
i. the mean or expected value of X is
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
ii. the variance of X is
$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
Note that $Var(X) = \sigma^2 = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$.

Example 1: Let the random variable X represent the number of defective parts for a machine when 3 parts are sampled from a production line and tested. The following is the probability distribution of X. Calculate Var(X) = σ².
x      0      1      2      3
f(x)   0.51   0.38   0.10   0.01
Solution:
Let us first compute the mean or expected value of X:
$$\mu = E(X) = \sum_{x} x f(x) = 0 \times 0.51 + 1 \times 0.38 + 2 \times 0.1 + 3 \times 0.01 = 0.61.$$
Second, the variance of X:
$$\sigma^2 = E[(X - \mu)^2] = E(X^2) - [E(X)]^2 = \sum_{x} x^2 f(x) - [E(X)]^2$$
$$= 0^2 \times 0.51 + 1^2 \times 0.38 + 2^2 \times 0.1 + 3^2 \times 0.01 - (0.61)^2 = 0.87 - 0.3721 = 0.4979.$$
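The calculation for the defective-parts distribution can be reproduced in a few lines. The sketch computes the variance twice, once from the definition and once from the shortcut E(X²) − [E(X)]², to show that they agree.

```python
# Probability distribution of the number of defective parts (Example 1).
xs = [0, 1, 2, 3]
fs = [0.51, 0.38, 0.10, 0.01]

mean = sum(x * f for x, f in zip(xs, fs))                 # E(X)
second_moment = sum(x**2 * f for x, f in zip(xs, fs))     # E(X^2)

var_shortcut = second_moment - mean**2                    # E(X^2) - [E(X)]^2
var_definition = sum((x - mean) ** 2 * f for x, f in zip(xs, fs))

print(mean, var_shortcut, var_definition)
```

Up to floating-point rounding, both variance computations give 0.4979.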
Example 2: The weekly demand for a drinking-water product, in thousands of litres, from a local chain of efficiency stores is a continuous random variable X having the probability density
$$f(x) = \begin{cases} 2(x - 1), & 1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$
Find the mean and variance of X.
Solution:
The expectation of the random variable X is given by
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{1}^{2} 2x(x - 1)\,dx = \frac{5}{3}.$$
The variance of the random variable X is given by
$$\sigma^2 = E[(X - \mu)^2] = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$$
$$= \int_{-\infty}^{\infty} x^2 f(x)\,dx - [E(X)]^2$$
$$= 2\int_{1}^{2} x^2 (x - 1)\,dx - \left(\frac{5}{3}\right)^2$$
$$= \frac{17}{6} - \frac{25}{9} = \frac{1}{18}$$
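For readers who want the intermediate steps, the two integrals in this example can be evaluated as follows (a worked sketch using only the density given above):

```latex
\begin{aligned}
E(X)   &= 2\int_{1}^{2} (x^2 - x)\,dx
        = 2\left[\frac{x^3}{3} - \frac{x^2}{2}\right]_{1}^{2}
        = 2\left[\left(\frac{8}{3} - 2\right) - \left(\frac{1}{3} - \frac{1}{2}\right)\right]
        = 2 \cdot \frac{5}{6} = \frac{5}{3}, \\
E(X^2) &= 2\int_{1}^{2} (x^3 - x^2)\,dx
        = 2\left[\frac{x^4}{4} - \frac{x^3}{3}\right]_{1}^{2}
        = 2\left[\left(4 - \frac{8}{3}\right) - \left(\frac{1}{4} - \frac{1}{3}\right)\right]
        = 2 \cdot \frac{17}{12} = \frac{17}{6}, \\
\sigma^2 &= E(X^2) - [E(X)]^2 = \frac{17}{6} - \frac{25}{9} = \frac{51 - 50}{18} = \frac{1}{18}.
\end{aligned}
```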
Properties of the mean and variance
For a and b constants, X and Y random variables.
1. E[(aX + b)] = E[aX] + E[b] = aE[X] + b
2. E(X + Y ) = E(X) + E(Y )
3. E(XY ) = E(X)E(Y ) for X and Y independent
4. V ar(a) = 0
5. V ar(aX) = a2 V ar(X)
6. Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y); in particular, Var(X + Y) = Var(X) + Var(Y) for X and Y independent
10.3.4 Some probability distributions

The Bernoulli, binomial and Poisson distributions are examples of discrete distributions.
1. Bernoulli distribution
Suppose that a trial, or an experiment, whose outcome can be classified as either a success or a failure is performed. If we let X equal 1 if the outcome is a success and 0 if it is a failure, then the probability mass function of X is given by
$$P(X = 1) = p \tag{21}$$
$$P(X = 0) = 1 - p \tag{22}$$
where p, 0 ≤ p ≤ 1, is the probability that the trial is a success.

2. Binomial distribution
Many types of probability problems have only two outcomes or can be reduced to two outcomes. For example, when a coin is tossed, it can land heads or tails. When a baby is born, it will be either male or female. In a basketball game, a team either wins or loses. A true/false item can be answered in only two ways, true or false. Other situations can be reduced to two outcomes. For example, a medical treatment can be classified as effective or ineffective, depending on the results. A person can be classified as having normal or abnormal blood pressure, depending on the measure of the blood pressure gauge. A multiple-choice question, even though there are four or five answer choices, can be classified as correct or incorrect. Situations like these are called binomial experiments.
A binomial experiment and its results give rise to a special probability distribution called the binomial distribution. The binomial distribution is used when there are only two outcomes for each trial, there is a fixed number of trials, n, the probability of success is the same for each trial, and the outcomes of the trials are independent of one another.
The probability mass function of the binomial distribution is defined by
$$P(X = x) = \frac{n!}{x!(n - x)!}\, p^x q^{n - x}, \quad x = 0, 1, 2, 3, \ldots, n \tag{23}$$
where p, 0 ≤ p ≤ 1, is the probability that a trial is a success and q = 1 − p is the probability of failure. The mean and variance of the binomial distribution are given, respectively, by
$$\mu = np \tag{24}$$
$$\sigma^2 = npq \tag{25}$$
Examples
1. The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that
a. at least 10 survive,
b. from 3 to 8 survive, and
c. exactly 5 survive?
Solution
a. n = 15 and p = 0.4. "At least 10 survive" means X = 10, 11, 12, 13, 14 or 15. That is,
$$P(X = 10) = \frac{15!}{10!\,(15 - 10)!}\, 0.4^{10} \times 0.6^{5} = 0.024$$
$$P(X = 11) = \frac{15!}{11!\,(15 - 11)!}\, 0.4^{11} \times 0.6^{4} = 0.007$$
$$P(X = 12) = \frac{15!}{12!\,3!}\, 0.4^{12} \times 0.6^{3} = 0.002$$
$$P(X = 13) = \frac{15!}{13!\,2!}\, 0.4^{13} \times 0.6^{2} \approx 0$$
$$P(X = 14) = \frac{15!}{14!\,1!}\, 0.4^{14} \times 0.6^{1} \approx 0$$
$$P(X = 15) = \frac{15!}{15!\,0!}\, 0.4^{15} \times 0.6^{0} \approx 0$$
Therefore, the probability that at least 10 survive is
P(X = 10) + P(X = 11) + P(X = 12) + P(X = 13) + P(X = 14) + P(X = 15) ≈ 0.0338 (the rounded terms above add up to 0.033).
b. The probability that from 3 to 8 survive is
P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) = 0.8779.
c. The probability that exactly 5 survive is
$$P(X = 5) = \frac{15!}{5!\,10!}\, 0.4^{5} \times 0.6^{10} = 0.1859.$$
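Summing many binomial terms by hand is tedious and loses accuracy to rounding. A short sketch of formula (23) using `math.comb` from the standard library reproduces the three answers. The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) random variable, equation (23)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 15, 0.4
at_least_10 = sum(binomial_pmf(x, n, p) for x in range(10, n + 1))   # part a
from_3_to_8 = sum(binomial_pmf(x, n, p) for x in range(3, 9))        # part b
exactly_5 = binomial_pmf(5, n, p)                                    # part c

print(round(at_least_10, 4), round(from_3_to_8, 4), round(exactly_5, 4))
```

Because the terms are added at full precision, part a comes out slightly above the 0.033 obtained by adding the three-decimal values.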
These values can also be found using a binomial distribution table. For example, suppose that n = 3, x = 2, and p = 0.5; then
$$P(X = 2) = \frac{3!}{2!\,1!}\, 0.5^{2} \times 0.5^{1} = 0.375.$$
The value 0.375 is read from the table as shown in Figure 2.
Figure 2: Using binomial table
2. A coin is tossed 3 times. Find the probability of getting exactly two heads.
Solution
Here n = 3, x = 2, and the probability of a success (heads) is 1/2 on each toss.
$$P(X = 2) = \frac{3!}{2!\,1!}\, 0.5^{2} \times 0.5^{1} = 0.375$$
3. A die is rolled 360 times. Find the mean, variance, and standard deviation of
the number of 4s that will be rolled.
Solution
This is a binomial experiment since getting a 4 is a success and not getting a 4 is considered a failure. Hence n = 360, p = 1/6, q = 1 − 1/6 = 5/6,
$$\mu = np = 360 \times \frac{1}{6} = 60$$
$$\sigma^2 = npq = 360 \times \frac{1}{6} \times \frac{5}{6} = 50$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{50} \approx 7.07$$
4. The Statistical Bulletin published by Metropolitan Life Insurance Co. reported that 2% of all American births result in twins. If a random sample of 8000 births is taken, find the mean, variance, and standard deviation of the number of births that would result in twins (Answer: µ = 160, σ² = 156.8, σ = 12.5).
3. Poisson distribution
A discrete probability distribution that is useful when the number of trials, n, is large and the probability of success, p, is small, and when independent events occur over a period of time, is called the Poisson distribution. It gives the probability that an outcome occurs a specified number of times. The probability of x occurrences in an interval of time, volume, area, etc., for a variable where λ (Greek letter lambda) is the mean number of occurrences per unit (time, volume, area, etc.) is
$$P(X = x, \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad \text{where } x = 0, 1, 2, \ldots \tag{26}$$
The letter e is a constant approximately equal to 2.7183. The mean and the variance of the Poisson distribution are the same and are given by
$$\mu = \sigma^2 = \lambda, \tag{27}$$
where λ = np when the Poisson distribution is used to approximate a binomial distribution.
Examples
1. During a laboratory experiment, the average number of radioactive particles

passing through a counter in 1 millisecond is 4. What is the probability that
6 particles enter the counter in a given millisecond?
Solution
Here, λ = 4, x = 6
$$P(X = 6, 4) = \frac{e^{-4}\, 4^{6}}{6!} = 0.1042$$
2. If there are 200 typographical errors randomly distributed in a 500-page

manuscript, find the probability that a given page contains exactly 3 errors.
Solution
First, find the mean number λ of errors. Since there are 200 errors distributed over 500 pages, each page has an average of
$$\lambda = \frac{200}{500} = 0.4$$
or 0.4 errors per page. Since x = 3, substituting into the formula yields
$$P(X = x, \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \;\Rightarrow\; P(X = 3, 0.4) = \frac{(2.7183)^{-0.4} \times 0.4^{3}}{3!} = 0.0072 \tag{28}$$
Thus, there is less than a 1% chance that any given page will contain exactly 3 errors. The Poisson distribution table can be used to calculate the probability as follows (Figure 3).
Figure 3: Using Poisson table
3. Ten is the average number of oil tankers arriving each day at a certain port.
The facilities at the port can handle at most 15 tankers per day. What is the
probability that on a given day tankers have to be turned away? ( answer:
0.0487)
4. Suppose we are interested in the number of people who visit the clinic in city "X" in a given year, out of a total population of 5,000,000, and let the probability that someone in the city visits the clinic be 0.00001. The mean number of visitors is then np = 5,000,000 × 0.00001 = 50, which is also the variance. For this example calculate:
1. The probability that no one in this population visits the clinic in a given year
2. The probability that fewer than 5 people visit the clinic in a given year
5. If approximately 2% of the people in a room of 200 people are left-handed, find

the probability that exactly 5 people there are left-handed (answer: 0.1563 ).
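The Poisson answers quoted in Examples 1, 2, 3 and 5 can be checked with a direct implementation of formula (26). The helper names `poisson_pmf` and `poisson_cdf` are illustrative; the cumulative probability is obtained here simply by summing the mass function.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam, equation (26)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

particles = poisson_pmf(6, 4)               # Example 1
typos = poisson_pmf(3, 200 / 500)           # Example 2, lambda = 0.4
turned_away = 1 - poisson_cdf(15, 10)       # Example 3: more than 15 tankers
left_handed = poisson_pmf(5, 200 * 0.02)    # Example 5, lambda = np = 4

print(round(particles, 4), round(typos, 4),
      round(turned_away, 4), round(left_handed, 4))
```

In Example 3, tankers are turned away exactly when more than 15 arrive, which is why the complement of P(X ≤ 15) is used.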
10.3.6 Normal distribution
Definition: The normal distribution is a special distribution that is used very often for continuous random variables. To see why an idealized curve is useful, consider a circle: it can be used to represent many physical objects, such as a wheel or a gear. Even though it is not possible to manufacture a wheel that is perfectly round, the equation and the properties of a circle can be used to study many aspects of the wheel, such as area, velocity, and acceleration. In a similar manner, the theoretical curve, called a normal distribution curve, can be used to study many variables that are not perfectly normally distributed but are nevertheless approximately normal. The properties of the normal distribution are the following:
(a) It can take on any value (not just integers, as do the binomial and Poisson
distribution)
(b) A normal distribution curve is bell-shaped.
(c) The mean, median, and mode are equal and are located at the center of
the distribution.
(d) A normal distribution curve is unimodal (i.e., it has only one mode).
(e) The curve is symmetric about the mean, which is equivalent to saying
that its shape is the same on both sides of a vertical line passing through
the center.
Figure 4: Normal curve
(f) The curve is continuous; that is, there are no gaps or holes. For each
value of X, there is a corresponding value of Y.
(g) The curve never touches the x axis. Theoretically, no matter how far in
either direction the curve extends, it never meets the x axis—but it gets
increasingly closer.
(h) The total area under a normal distribution curve is equal to 1.00, or
100%. This fact may seem unusual, since the curve never touches the x
axis, but one can prove it mathematically by using calculus. (The proof
is beyond the scope of this textbook.)
(i) The area under the part of a Normal distribution Curve that lies within
1 standard deviation of the mean is approximately 0.68, or 68%; within
2 standard deviations, about 0.95, or 95%; and within 3 standard devia-
tions, about 0.997, or 99.7%. See Figure 5, which also shows the area in
each region.
Figure 5: Areas Under a Normal Distribution Curve
Formula: Given a normal random variable X with mean µ and variance σ² that can take on any value between negative and positive infinity (−∞ to +∞), the normal distribution formula is as follows:
$$f(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right], \tag{29}$$
where −∞ < x < ∞ and π ≈ 3.14159.
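The density (29) and the 68%, 95% and 99.7% areas of property (i) can be checked numerically. The sketch below relies on the standard identity P(|Z| < k) = erf(k/√2) for a standard normal Z, which is not stated in the notes. The function names are illustrative.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, equation (29)."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X, via the error function."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
```

The areas do not depend on µ or σ, which is why the same three percentages apply to every normal distribution.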
Standard normal distribution
The standard normal distribution is the normal distribution with mean 0 and standard deviation 1.
Figure 6: Standard Normal Distribution
Remark: Notice that the curve in Figure 6 is perfectly symmetric about 0. If a distribution is normal but not standard, we can convert a value to a standard normal Z score by finding how many standard deviations the value is from the mean. The number of standard deviations from the mean is called the z-score and can be found by the formula:
$$Z = \frac{X - \mu}{\sigma} \quad \text{or} \quad Z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \tag{30}$$
The Z Score and Area
Often we want to find the probability that a z-score will be less than a given
value, greater than a given value, or in between two values. To accomplish
this, we use the Table from the textbook and a few properties about the
normal distribution.
Example 1: Given a standard normal distribution, find the area under the
curve that lies
(a) to the right of z = 1.84

(b) between z = −1.97 and z = 0.86
Solution
(a) The area in Figure 7(a) to the right of z = 1.84 is equal to 1 minus the
area in the table of the standard normal distribution to the left of z = 1.84,
namely, 1 − 0.9671 = 0.0329. Alternatively, this sub-question can be
written as
P (Z > 1.84) = 1 − P (Z ≤ 1.84) = 1 − 0.9671 = 0.0329 (31)
(b) The area in Figure 7(b) between z = −1.97 and z = 0.86 is equal to
the area to the left of z = 0.86 minus the area to the left of z = −1.97.
From the table of the standard normal distribution, we find the desired area to be
0.8051 − 0.0244 = 0.7807. This sub-question can be written as
P (−1.97 ≤ Z ≤ 0.86) = P (Z ≤ 0.86) − P (Z ≤ −1.97)
⇒ P (−1.97 ≤ Z ≤ 0.86) = 0.8051 − 0.0244 = 0.7807
Figure 7: Area under the curves for example 1.
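The table look-ups in Example 1 can be cross-checked in software. The sketch below assumes Python's standard-library statistics.NormalDist as a substitute for the printed table; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a) area to the right of z = 1.84
right_tail = 1 - Z.cdf(1.84)

# (b) area between z = -1.97 and z = 0.86
between = Z.cdf(0.86) - Z.cdf(-1.97)

print(round(right_tail, 4), round(between, 4))
```

Rounded to four decimal places, the two results agree with the table-based answers 0.0329 and 0.7807.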
Example 2: Given a standard normal distribution, find the value of k such
that
(a) P (Z > k) = 0.3015
(b) P (k < Z < −0.18) = 0.4197.
Solution
(a) In Figure 8(a), we see that the k value leaving an area of 0.3015 to
the right must then leave an area of 0.6985 to the left. From the table of
the standard normal distribution it follows that k = 0.52. Alternatively,
P (Z > k) = 1 − P (Z ≤ k)
⇒ P (Z ≤ k) = 1 − P (Z > k) = 1 − 0.3015 = 0.6985
From the table of the standard normal distribution it follows that k = 0.52.
(b) From the table of the standard normal distribution, we note that the total area to the
left of −0.18 is equal to 0.4286. In Figure 8(b), we see that the area between
k and −0.18 is 0.4197, so the area to the left of k must be 0.4286 − 0.4197 =
0.0089. Since this area is less than 0.5, k must be negative. Hence, from the
table of the standard normal distribution, we have k = −2.37.
Alternatively,
P (Z ≤ −0.18) − P (Z ≤ k) = 0.4197
⇒ P (Z ≤ k) = P (Z ≤ −0.18) − 0.4197 = 0.4286 − 0.4197 = 0.0089
Hence, from the table of the standard normal distribution, we have k = −2.37.
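Finding k from a given area is the inverse of a table look-up. The sketch below assumes the standard-library statistics.NormalDist and its inv_cdf method in place of the printed table; the variable names are illustrative. Note that a left-hand area of 0.0089 is far below 0.5, so the corresponding k lies in the left tail and is negative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a) P(Z > k) = 0.3015  means  P(Z <= k) = 1 - 0.3015
k_a = Z.inv_cdf(1 - 0.3015)

# (b) P(k < Z < -0.18) = 0.4197  means  P(Z <= k) = P(Z <= -0.18) - 0.4197
k_b = Z.inv_cdf(Z.cdf(-0.18) - 0.4197)

print(round(k_a, 2), round(k_b, 2))
```

Rounded to two decimal places, this gives 0.52 for part (a) and −2.37 for part (b).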
Example 3: Given a random variable X having a normal distribution with µ = 50
and σ = 10, find the probability that X assumes a value between 45 and 62.
Solution
The z values corresponding to x1 = 45 and x2 = 62 are
\[ z_1 = \frac{45 - 50}{10} = -0.5 \quad\text{and}\quad z_2 = \frac{62 - 50}{10} = 1.2, \]
respectively.
Therefore, P (45 < X < 62) = P (−0.5 < Z < 1.2).
P (−0.5 < Z < 1.2) is shown by the area of the shaded region in Figure 9. This area
may be found by subtracting the area to the left of the ordinate z = −0.5 from the
entire area to the left of z = 1.2. Using the standard normal distribution table, we
have
P (45 < X < 62) = P (−0.5 < Z < 1.2) = P (Z < 1.2) − P (Z < −0.5)
⇒ P (45 < X < 62) = 0.8849 − 0.3085 = 0.5764.
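Example 3 can be reproduced in two equivalent ways: directly on X, or after standardizing to Z. The sketch below assumes the standard-library statistics.NormalDist in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50, 10
X = NormalDist(mu, sigma)   # the distribution of X in Example 3
Z = NormalDist()            # standard normal

# Route 1: work directly with X.
p_direct = X.cdf(62) - X.cdf(45)

# Route 2: standardize first, z = (x - mu) / sigma, then use Z.
z1 = (45 - mu) / sigma      # lower bound, below the mean so negative
z2 = (62 - mu) / sigma      # upper bound
p_standardized = Z.cdf(z2) - Z.cdf(z1)

print(round(p_direct, 4), round(p_standardized, 4))
```

Both routes give the same probability, which rounds to 0.5764. The lower bound standardizes to a negative z value because 45 lies below the mean.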