Notes Stat324
Abdullah Al-Shiha
Lectures' Notes
(Unrevised First Draft)
STAT – 324
Probability and Statistics for
Engineers
Textbook:
Probability and Statistics for Engineers and Scientists
By: R. E. Walpole, R. H. Myers, and S. L. Myers
(Sixth Edition)
1.1 Introduction:
* Populations and Samples:
                 Population                   Sample (= observations)
  Described by:  some unknown parameters      some statistics that we calculate
  Example:       KSU students                 20 students from KSU
  Quantity:      mean height                  sample mean
  Size:          N = population size          n = sample size
Example:
Suppose that the following sample represents the ages (in years)
of a sample of 3 men:
x1 = 30, x2 = 35, x3 = 27. (n=3)
Then, the sample mean is:
$\bar{x} = \frac{30 + 35 + 27}{3} = \frac{92}{3} = 30.67$ (years)
Note: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.
$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$ (unit)
Example:
Compute the sample variance and standard deviation of the
following observations (ages in years): 10, 21, 33, 53, 54.
Solution:
n = 5
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{10 + 21 + 33 + 53 + 54}{5} = \frac{171}{5} = 34.2$ (years)
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{5} (x_i - 34.2)^2}{5-1}$
$= \frac{(10-34.2)^2 + (21-34.2)^2 + (33-34.2)^2 + (53-34.2)^2 + (54-34.2)^2}{4}$
$= \frac{1506.8}{4} = 376.7$ (years)$^2$
The sample standard deviation is:
$S = \sqrt{S^2} = \sqrt{376.7} = 19.41$ (years)
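Equivalently, using the computational formula for the sample variance:
$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}{n-1} = \frac{7355 - (5)(34.2)^2}{5-1} = \frac{1506.8}{4} = 376.7$ (years)$^2$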
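The calculation above can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names are illustrative and not part of the notes.

```python
import statistics

ages = [10, 21, 33, 53, 54]  # observations from the example (years)
n = len(ages)

mean = statistics.mean(ages)          # sample mean
variance = statistics.variance(ages)  # sample variance, divisor n - 1
std_dev = statistics.stdev(ages)      # positive square root of the variance

# Computational formula: (sum of squares - n * mean^2) / (n - 1)
shortcut = (sum(x * x for x in ages) - n * mean ** 2) / (n - 1)

print(round(mean, 2), round(variance, 2), round(std_dev, 2), round(shortcut, 2))
```

Both formulas should agree up to floating-point rounding.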
Chapter 2: Probability:
A∩B ≠ φ: A and B are not mutually exclusive
A∩B = φ: A and B are mutually exclusive (disjoint)
$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$ ; r = 0, 1, 2, …, n
Notes:
• $\binom{n}{r}$ is read as "n choose r".
• $\binom{n}{n} = 1$, $\binom{n}{0} = 1$, $\binom{n}{1} = n$, $\binom{n}{r} = \binom{n}{n-r}$
• $\binom{n}{r}$ = The number of different ways of selecting r objects
from n distinct objects.
• $\binom{n}{r}$ = The number of different ways of dividing n distinct
objects into two subsets; one subset contains r
objects and the other contains the remaining (n−r) objects.
Example:
If we have 10 equal–priority operations and only 4 operating
rooms are available, in how many ways can we choose the 4
patients to be operated on first?
Solution:
n = 10 r=4
The number of different ways for selecting 4 patients from 10
patients is
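$\binom{10}{4} = \frac{10!}{4!\,(10-4)!} = \frac{10!}{4! \times 6!} = \frac{10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{(4 \times 3 \times 2 \times 1) \times (6 \times 5 \times 4 \times 3 \times 2 \times 1)}$
= 210 (different ways)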
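As a quick check of the count above, the following sketch evaluates the binomial coefficient two ways in Python: with the built-in `math.comb` and directly from the factorial formula. The variable names are illustrative.

```python
import math

n, r = 10, 4

ways = math.comb(n, r)  # n choose r
by_factorials = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(ways, by_factorials)
```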
$P(A) = \frac{n(A)}{n(S)} = \frac{n(A)}{N} = \frac{\text{no. of outcomes in } A}{\text{no. of outcomes in } S}$
Example 2.25:
A mixture of candies consists of 6 mints, 4 toffees, and 3
chocolates. If a person makes a random selection of one of these
candies, find the probability of getting:
(a) a mint
(b) a toffee or chocolate.
Solution:
Define the following events:
M = {getting a mint}
T = {getting a toffee}
C = {getting a chocolate}
Experiment: selecting a candy at random from 13 candies
n(S) = no. of outcomes of the experiment of selecting a candy.
= no. of different ways of selecting a candy from 13 candies.
= $\binom{13}{1} = 13$
The outcomes of the experiment are equally likely because the
selection is made at random.
(a) M = {getting a mint}
n(M) = no. of different ways of selecting a mint candy
from 6 mint candies
= $\binom{6}{1} = 6$
$P(M) = P(\{\text{getting a mint}\}) = \frac{n(M)}{n(S)} = \frac{6}{13}$
(b) T∪C = {getting a toffee or chocolate}
n(T∪C) = no. of different ways of selecting a toffee
or a chocolate candy
= no. of different ways of selecting a toffee
candy + no. of different ways of selecting a
chocolate candy
= $\binom{4}{1} + \binom{3}{1} = 4 + 3 = 7$
= no. of different ways of selecting a candy
from 7 candies
= $\binom{7}{1} = 7$
$P(T \cup C) = P(\{\text{getting a toffee or chocolate}\}) = \frac{n(T \cup C)}{n(S)} = \frac{7}{13}$
Example 2.26:
In a poker hand consisting of 5 cards, find the probability of
holding 2 aces and 3 jacks.
Solution:
Experiment: selecting 5 cards from 52 cards.
n(S) = no. of outcomes of the experiment of selecting 5 cards
from 52 cards.
= $\binom{52}{5} = \frac{52!}{5! \times 47!} = 2598960$
The outcomes of the experiment are equally likely because the
selection is made at random.
Define the event A = {holding 2 aces and 3 jacks}
n(A) = no. of ways of selecting 2 aces and 3 jacks
= (no. of ways of selecting 2 aces) × (no. of
ways of selecting 3 jacks)
= (no. of ways of selecting 2 aces from 4 aces) × (no.
of ways of selecting 3 jacks from 4 jacks)
= $\binom{4}{2} \times \binom{4}{3}$
= $\frac{4!}{2! \times 2!} \times \frac{4!}{3! \times 1!} = 6 \times 4 = 24$
P(A) = P({holding 2 aces and 3 jacks})
= $\frac{n(A)}{n(S)} = \frac{24}{2598960} = 0.000009$
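The counting argument of Example 2.26 can be reproduced as follows. This is a small sketch that uses `math.comb` for the counts and `fractions.Fraction` to keep the ratio exact; the names `n_S`, `n_A`, and `p_A` are illustrative.

```python
from fractions import Fraction
from math import comb

n_S = comb(52, 5)              # number of 5-card hands
n_A = comb(4, 2) * comb(4, 3)  # 2 aces out of 4, times 3 jacks out of 4
p_A = Fraction(n_A, n_S)       # equally likely outcomes

print(n_S, n_A, float(p_A))
```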
= $\frac{2}{3} - \frac{1}{4} = \frac{5}{12}$
(c) Probability of failing both courses is:
$P(M^C \cap E^C) = 1 - P(M \cup E) = 1 - \frac{31}{36} = \frac{5}{36}$
Theorem 2.12:
If A and $A^C$ are complementary events, then:
$P(A) + P(A^C) = 1 \Leftrightarrow P(A^C) = 1 - P(A)$
Definition 2.9:
The conditional probability of the
event A given the event B is defined by:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ ; P(B) > 0
Notes:
1. $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{n(A \cap B)/n(S)}{n(B)/n(S)} = \frac{n(A \cap B)}{n(B)}$ ; for equally likely outcomes case
(In the Venn diagram, P(S) = total area = 1.)
2. $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
3. $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$
(Multiplicative Rule = Theorem 2.13)
Example:
339 physicians are classified as given in the table below. A
physician is to be selected at random.
(1) Find the probability that:
(a) the selected physician is aged 40 – 49
(b) the selected physician smokes occasionally
(c) the selected physician is aged 40 – 49 and smokes
occasionally
Department of Statistics and O.R. − 13 − King Saud University
Unrevised First Draft
STAT – 324 Second Semester 1424/1425 Dr. Abdullah Al-Shiha
                              Smoking Habit
Age             Daily (B1)   Occasionally (B2)   Not at all (B3)   Total
20 - 29 (A1)        31               9                   7           47
30 - 39 (A2)       110              30                  49          189
40 - 49 (A3)        29              21                  29           79
50+ (A4)             6               0                  18           24
Total              176              60                 103          339
Solution:
n(S) = 339 equally likely outcomes.
Define the following events:
A3 = the selected physician is aged 40 – 49
B2 = the selected physician smokes occasionally
A3 ∩ B2 = the selected physician is aged 40 – 49 and smokes
occasionally
(1) (a) A3 = the selected physician is aged 40 – 49
$P(A_3) = \frac{n(A_3)}{n(S)} = \frac{79}{339} = 0.2330$
(b) B2 = the selected physician smokes occasionally
$P(B_2) = \frac{n(B_2)}{n(S)} = \frac{60}{339} = 0.1770$
(c) A3 ∩ B2 = the selected physician is aged 40 – 49 and
smokes occasionally.
$P(A_3 \cap B_2) = \frac{n(A_3 \cap B_2)}{n(S)} = \frac{21}{339} = 0.06195$
(2) A3|B2 = the selected physician is aged 40 – 49 given that the
physician smokes occasionally
(i) $P(A_3 \mid B_2) = \frac{P(A_3 \cap B_2)}{P(B_2)} = \frac{0.06195}{0.1770} = 0.35$
(ii) $P(A_3 \mid B_2) = \frac{n(A_3 \cap B_2)}{n(B_2)} = \frac{21}{60} = 0.35$
(iii) We can use the restricted table directly: $P(A_3 \mid B_2) = \frac{21}{60} = 0.35$
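The three routes to P(A3|B2) can be reproduced from the raw counts. The sketch below stores the table as a nested dictionary (an illustrative layout, not something prescribed by the notes) and computes the marginal, joint, and conditional probabilities.

```python
# counts[age_group][smoking_habit], copied from the physicians table
counts = {
    "A1": {"B1": 31, "B2": 9, "B3": 7},
    "A2": {"B1": 110, "B2": 30, "B3": 49},
    "A3": {"B1": 29, "B2": 21, "B3": 29},
    "A4": {"B1": 6, "B2": 0, "B3": 18},
}

n_S = sum(sum(row.values()) for row in counts.values())  # total physicians
n_A3 = sum(counts["A3"].values())                        # row total
n_B2 = sum(row["B2"] for row in counts.values())         # column total
n_A3_B2 = counts["A3"]["B2"]                             # single cell

p_A3 = n_A3 / n_S
p_B2 = n_B2 / n_S
p_A3_and_B2 = n_A3_B2 / n_S
p_A3_given_B2 = p_A3_and_B2 / p_B2  # same value as n_A3_B2 / n_B2

print(round(p_A3, 4), round(p_B2, 4), round(p_A3_and_B2, 5), round(p_A3_given_B2, 2))
```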
Independent Events:
Definition 2.10:
Two events A and B are independent if and only if P(A|B)=P(A)
and P(B|A)=P(B). Otherwise A and B are dependent.
Example:
In the previous example, we found that P(A3|B2) ≠ P(A3).
Therefore, the events A3 and B2 are dependent, i.e., they are not
independent. Also, we can verify that P(B2| A3) ≠ P(B2).
Theorem 2.14:
Two events A and B are independent if and only if
P(A∩B)= P(A) P(B)
*(Multiplicative Rule for independent events)
Note:
Two events A and B are independent if one of the following
conditions is satisfied:
(i) P(A|B)=P(A)
⇔ (ii) P(B|A)=P(B)
⇔ (iii) P(A∩B)= P(A) P(B)
Example 2.36:
Three cards are drawn in succession, without replacement, from
an ordinary deck of playing cards. Find P(A1∩ A2 ∩A3), where
the events A1, A2 , and A3 are defined as follows:
A1 = {the 1-st card is a red ace}
A2 = {the 2-nd card is a 10 or a jack}
A3 = {the 3-rd card is a number greater than 3 but less than 7}
Solution:
P(A1) = 2/52
P(A2 |A1) = 8/51
P(A3| A1 ∩A2) = 12/50
P(A1∩ A2 ∩A3)
= P(A1) P(A2| A1) P(A3| A1∩ A2)
= $\frac{2}{52} \times \frac{8}{51} \times \frac{12}{50} = \frac{192}{132600} = 0.0014479$
Tree Diagram
Example 2.38:
Three machines A1, A2, and A3 make 20%, 30%, and 50%,
respectively, of the products. It is known that 1%, 4%, and 7%
of the products made by each machine, respectively, are
defective. If a finished product is randomly selected, what is the
probability that it is defective?
Solution:
Define the following events:
B = {the selected product is defective}
A1 = {the selected product is made by machine A1}
A2 = {the selected product is made by machine A2}
A3 = {the selected product is made by machine A3}
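$P(A_1) = \frac{20}{100} = 0.2$ ; $P(B \mid A_1) = \frac{1}{100} = 0.01$
$P(A_2) = \frac{30}{100} = 0.3$ ; $P(B \mid A_2) = \frac{4}{100} = 0.04$
$P(A_3) = \frac{50}{100} = 0.5$ ; $P(B \mid A_3) = \frac{7}{100} = 0.07$
$P(B) = \sum_{k=1}^{3} P(A_k)\,P(B \mid A_k) = (0.2)(0.01) + (0.3)(0.04) + (0.5)(0.07) = 0.002 + 0.012 + 0.035 = 0.049$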
Question:
If it is known that the selected product is defective, what is the
probability that it is made by machine A1?
Answer:
$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{P(A_1)\,P(B \mid A_1)}{P(B)} = \frac{0.2 \times 0.01}{0.049} = \frac{0.002}{0.049} = 0.0408$
This rule is called Bayes' rule.
Example 2.39:
In Example 2.38, if it is known that the selected product is
defective, what is the probability that it is made by:
(a) machine A2?
(b) machine A3?
Solution:
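(a) $P(A_2 \mid B) = \frac{P(A_2)\,P(B \mid A_2)}{\sum_{k=1}^{3} P(A_k)\,P(B \mid A_k)} = \frac{P(A_2)\,P(B \mid A_2)}{P(B)} = \frac{0.3 \times 0.04}{0.049} = \frac{0.012}{0.049} = 0.2449$
(b) $P(A_3 \mid B) = \frac{P(A_3)\,P(B \mid A_3)}{P(B)} = \frac{0.5 \times 0.07}{0.049} = \frac{0.035}{0.049} = 0.7143$
Note:
P(A1|B) = 0.0408, P(A2|B) = 0.2449, P(A3|B) = 0.7143
• $\sum_{k=1}^{3} P(A_k \mid B) = 1$
• If the selected product is found to be defective, we should
check machine A3 first; if it is OK, we should check
machine A2; and if that is also OK, we should check machine A1.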
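The total-probability and Bayes' rule calculations of Examples 2.38 and 2.39 can be sketched in Python as follows. The dictionary names `priors` and `defect_rates` are illustrative.

```python
priors = {"A1": 0.20, "A2": 0.30, "A3": 0.50}        # P(A_k)
defect_rates = {"A1": 0.01, "A2": 0.04, "A3": 0.07}  # P(B | A_k)

# Total probability: P(B) = sum over k of P(A_k) P(B | A_k)
p_B = sum(priors[k] * defect_rates[k] for k in priors)

# Bayes' rule: P(A_k | B) = P(A_k) P(B | A_k) / P(B)
posteriors = {k: priors[k] * defect_rates[k] / p_B for k in priors}

# Machines ordered from most to least likely source of a defective item
check_order = sorted(posteriors, key=posteriors.get, reverse=True)

print(round(p_B, 3), {k: round(v, 4) for k, v in posteriors.items()}, check_order)
```

Sorting the posterior probabilities gives the order in which the machines would be checked.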
Definition 3.4:
The function f(x) is a probability function of a discrete random
variable X if, for each possible value x, we have:
1. f(x) ≥ 0
2. $\sum_{\text{all } x} f(x) = 1$
3. f(x) = P(X=x)
Note:
• $P(X \in A) = \sum_{\text{all } x \in A} f(x) = \sum_{\text{all } x \in A} P(X = x)$
Example:
For the previous example, we have:
x                0     1     2     Total
f(x) = P(X=x)    4/9   4/9   1/9   $\sum_{x=0}^{2} f(x) = 1$
P(X<1) = P(X=0)=4/9
P(X≤1) = P(X=0) + P(X=1) = 4/9+4/9 = 8/9
P(X≥0.5) = P(X=1) + P(X=2) = 4/9+1/9 = 5/9
P(X>8) = P(φ) = 0
P(X<10) = P(X=0) + P(X=1) + P(X=2) = P(S) = 1
Example 3.3:
A shipment of 8 similar microcomputers to a retail outlet contains
3 that are defective and 5 that are non-defective. If a school makes
a random purchase of 2 of these computers, find the probability
distribution of the number of defectives.
Solution:
We need to find the probability distribution of the random
variable: X = the number of defective computers purchased.
Experiment: selecting 2 computers at random out of 8
$n(S) = \binom{8}{2}$ equally likely outcomes
The possible values of X are: x=0, 1, 2.
Consider the events:
(X=0) = {0D and 2N} ⇒ $n(X=0) = \binom{3}{0} \times \binom{5}{2}$
(X=1) = {1D and 1N} ⇒ $n(X=1) = \binom{3}{1} \times \binom{5}{1}$
(X=2) = {2D and 0N} ⇒ $n(X=2) = \binom{3}{2} \times \binom{5}{0}$
$f(0) = P(X=0) = \frac{n(X=0)}{n(S)} = \frac{\binom{3}{0} \times \binom{5}{2}}{\binom{8}{2}} = \frac{10}{28}$
$f(1) = P(X=1) = \frac{n(X=1)}{n(S)} = \frac{\binom{3}{1} \times \binom{5}{1}}{\binom{8}{2}} = \frac{15}{28}$
$f(2) = P(X=2) = \frac{n(X=2)}{n(S)} = \frac{\binom{3}{2} \times \binom{5}{0}}{\binom{8}{2}} = \frac{3}{28}$
In general, for x = 0, 1, 2, we have:
$f(x) = P(X=x) = \frac{n(X=x)}{n(S)} = \frac{\binom{3}{x} \times \binom{5}{2-x}}{\binom{8}{2}}$
The probability distribution of X is:
x                0       1       2       Total
f(x) = P(X=x)    10/28   15/28   3/28    1.00
The probability distribution of X can be written as a formula as
follows (Hypergeometric Distribution):
$f(x) = P(X=x) = \begin{cases} \frac{\binom{3}{x} \times \binom{5}{2-x}}{\binom{8}{2}} ; & x = 0, 1, 2 \\ 0 ; & \text{otherwise} \end{cases}$
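The formula of Example 3.3 translates directly into code. The sketch below uses exact fractions; the function name `f` mirrors the notation of the notes, while the parameter names (`defective`, `good`, `chosen`) are illustrative.

```python
from fractions import Fraction
from math import comb

def f(x, defective=3, good=5, chosen=2):
    """P(X = x): x defectives among `chosen` items drawn without replacement."""
    if not 0 <= x <= chosen:
        return Fraction(0)  # the "otherwise" branch of the formula
    return Fraction(comb(defective, x) * comb(good, chosen - x),
                    comb(defective + good, chosen))

pmf = {x: f(x) for x in range(3)}
print(pmf, sum(pmf.values()))
```

Note that `Fraction` reduces 10/28 to 5/14 when printing; the values are the same.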
Definition 3.5:
The cumulative distribution function (CDF), F(x), of a discrete
random variable X with the probability function f(x) is given by:
$F(x) = P(X \le x) = \sum_{t \le x} f(t) = \sum_{t \le x} P(X = t)$ ; for −∞ < x < ∞
Example:
Find the CDF of the random variable X with the probability
function:
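x      0       1       2
f(x)   10/28   15/28   3/28
Solution:
F(x) = P(X≤x) for −∞<x<∞
$F(x) = \begin{cases} 0 ; & x < 0 \\ \frac{10}{28} ; & 0 \le x < 1 \\ \frac{25}{28} ; & 1 \le x < 2 \\ 1 ; & x \ge 2 \end{cases}$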
Result:
P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
Example:
In the previous example,
$P(0.5 < X \le 1.5) = F(1.5) - F(0.5) = \frac{25}{28} - \frac{10}{28} = \frac{15}{28}$
$P(1 < X \le 2) = F(2) - F(1) = 1 - \frac{25}{28} = \frac{3}{28}$
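A discrete CDF is just a running sum of the probability function, which the following sketch makes explicit. The dictionary `pmf` holds the distribution from the example; the function name `F` follows the notation of the notes.

```python
from fractions import Fraction

pmf = {0: Fraction(10, 28), 1: Fraction(15, 28), 2: Fraction(3, 28)}

def F(x):
    """CDF: sum of f(t) over all support points t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

p1 = F(1.5) - F(0.5)  # P(0.5 < X <= 1.5)
p2 = F(2) - F(1)      # P(1 < X <= 2)
print(p1, p2)
```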
$P(a < X < b) = \int_a^b f(x)\,dx$ = area under the curve of f(x) and over the interval (a,b)
$P(X \in A) = \int_A f(x)\,dx$ = area under the curve of f(x) and over the region A
$f: \mathbb{R} \to [0, \infty)$
Definition 3.6:
The function f(x) is a probability density function (pdf) for a
continuous random variable X, defined on the set of real
numbers, if:
1. f(x) ≥ 0 ∀ x ∈ R
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le X \le b) = \int_a^b f(x)\,dx$ ∀ a, b ∈ R; a ≤ b
Note:
For a continuous random variable X, we have:
1. f(x) ≠ P(X=x) (in general)
2. P(X=a) = 0 for any a∈R
3. P(a ≤ X ≤ b)= P(a < X ≤ b)= P(a ≤ X < b)= P(a < X < b)
4. $P(X \in A) = \int_A f(x)\,dx$
[Figures: areas under the curve of f(x)]
Total area = $\int_{-\infty}^{\infty} f(x)\,dx = 1$
area = $P(a \le X \le b) = \int_a^b f(x)\,dx$
area = $P(X \ge b) = \int_b^{\infty} f(x)\,dx$
area = $P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$
Example 3.6:
Suppose that the error in the reaction temperature, in °C, for a
controlled laboratory experiment is a continuous random
variable X having the following probability density function:
$f(x) = \begin{cases} \frac{1}{3}x^2 ; & -1 < x < 2 \\ 0 ; & \text{elsewhere} \end{cases}$
1. Verify that (a) f(x) ≥ 0 and (b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
2. Find P(0<X≤1)
Solution:
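X = the error in the reaction temperature in °C.
X is a continuous r. v.
$f(x) = \begin{cases} \frac{1}{3}x^2 ; & -1 < x < 2 \\ 0 ; & \text{elsewhere} \end{cases}$
1. (a) f(x) ≥ 0 because $\frac{1}{3}x^2 \ge 0$ for −1 < x < 2 and f(x) = 0 elsewhere.
(b) $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{1}{3}x^2\,dx = \left[\frac{1}{9}x^3\right]_{x=-1}^{x=2} = \frac{1}{9}(8 - (-1)) = 1$
2. $P(0 < X \le 1) = \int_0^1 f(x)\,dx = \int_0^1 \frac{1}{3}x^2\,dx = \left[\frac{1}{9}x^3\right]_{x=0}^{x=1} = \frac{1}{9}(1 - 0) = \frac{1}{9}$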
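The integrals in Example 3.6 can also be checked numerically. The sketch below uses a hand-written composite Simpson's rule (a technique chosen here for illustration, not one used in the notes) applied to the nonzero branch of the density; the helper names are hypothetical.

```python
def density_on_support(x):
    """Nonzero branch of f(x): x^2 / 3 on (-1, 2); f is 0 elsewhere."""
    return x * x / 3

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

total_area = simpson(density_on_support, -1, 2)  # should be close to 1
p_0_1 = simpson(density_on_support, 0, 1)        # P(0 < X <= 1), close to 1/9
print(total_area, p_0_1)
```

Simpson's rule is exact for polynomials up to degree 3, so here the numerical values agree with the analytic ones up to floating-point rounding.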
Definition 3.7:
The cumulative distribution function (CDF), F(x), of a
continuous random variable X with probability density function
f(x) is given by:
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ ; for −∞ < x < ∞
Result:
P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
Example:
In Example 3.6,
1. Find the CDF
2. Using the CDF, find P(0<X≤1).
Solution:
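$f(x) = \begin{cases} \frac{1}{3}x^2 ; & -1 < x < 2 \\ 0 ; & \text{elsewhere} \end{cases}$
1. For x < −1:
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} 0\,dt = 0$
For −1 ≤ x < 2:
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{-1} 0\,dt + \int_{-1}^{x} \frac{1}{3}t^2\,dt = \int_{-1}^{x} \frac{1}{3}t^2\,dt$
$= \left[\frac{1}{9}t^3\right]_{t=-1}^{t=x} = \frac{1}{9}(x^3 - (-1)) = \frac{1}{9}(x^3 + 1)$
For x ≥ 2:
$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{-1} 0\,dt + \int_{-1}^{2} \frac{1}{3}t^2\,dt + \int_{2}^{x} 0\,dt = \int_{-1}^{2} \frac{1}{3}t^2\,dt = 1$
Therefore, the CDF is:
$F(x) = P(X \le x) = \begin{cases} 0 ; & x < -1 \\ \frac{1}{9}(x^3 + 1) ; & -1 \le x < 2 \\ 1 ; & x \ge 2 \end{cases}$
2. $P(0 < X \le 1) = F(1) - F(0) = \frac{1}{9}(1 + 1) - \frac{1}{9}(0 + 1) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9}$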
$E(X) = \mu_X = \sum_{x=0}^{2} x\,f(x) = (0)\frac{10}{28} + (1)\frac{15}{28} + (2)\frac{3}{28}$
$= \frac{15}{28} + \frac{6}{28} = \frac{21}{28} = 0.75$ (computers)
Example 4.3:
Let X be a continuous random variable that represents the life
(in hours) of a certain electronic device. The pdf of X is given
by:
$f(x) = \begin{cases} \frac{20{,}000}{x^3} ; & x > 100 \\ 0 ; & \text{elsewhere} \end{cases}$
Find the expected life of this type of device.
Solution:
$E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_{100}^{\infty} x\,\frac{20000}{x^3}\,dx = 20000 \int_{100}^{\infty} \frac{1}{x^2}\,dx$
$= 20000 \left[-\frac{1}{x}\right]_{x=100}^{x=\infty} = -20000 \left[0 - \frac{1}{100}\right] = 200$ (hours)
Therefore, we expect this type of electronic device to last,
on average, 200 hours.
Theorem 4.1:
Let X be a random variable with a probability distribution f(x),
and let g(X) be a function of the random variable X. The mean
(or expected value) of the random variable g(X) is denoted by
µg(X) (or E[g(X)]) and is defined by:
$E[g(X)] = \mu_{g(X)} = \begin{cases} \sum_{\text{all } x} g(x)\,f(x) ; & \text{if X is discrete} \\ \int_{-\infty}^{\infty} g(x)\,f(x)\,dx ; & \text{if X is continuous} \end{cases}$
Example:
Let X be a discrete random variable with the following
probability distribution
x      0       1       2
f(x)   10/28   15/28   3/28
Find E[g(X)], where $g(X) = (X-1)^2$.
Solution:
$g(X) = (X-1)^2$
$E[g(X)] = \mu_{g(X)} = \sum_{x=0}^{2} g(x)\,f(x) = \sum_{x=0}^{2} (x-1)^2 f(x)$
$= (0-1)^2 f(0) + (1-1)^2 f(1) + (2-1)^2 f(2)$
$= (-1)^2\,\frac{10}{28} + (0)^2\,\frac{15}{28} + (1)^2\,\frac{3}{28}$
$= \frac{10}{28} + 0 + \frac{3}{28} = \frac{13}{28}$
Example:
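In Example 4.3, find $E\left(\frac{1}{X}\right)$.
Solution:
$f(x) = \begin{cases} \frac{20{,}000}{x^3} ; & x > 100 \\ 0 ; & \text{elsewhere} \end{cases}$
$g(X) = \frac{1}{X}$
$E\left(\frac{1}{X}\right) = E[g(X)] = \mu_{g(X)} = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{x}\,f(x)\,dx$
$= \int_{100}^{\infty} \frac{1}{x}\,\frac{20000}{x^3}\,dx = 20000 \int_{100}^{\infty} \frac{1}{x^4}\,dx = \frac{20000}{-3} \left[\frac{1}{x^3}\right]_{x=100}^{x=\infty}$
$= \frac{-20000}{3} \left[0 - \frac{1}{1000000}\right] = 0.0067$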
Definition 4.3:
Let X be a random variable with a probability distribution f(x)
and mean µ. The variance of X is defined by:
$\mathrm{Var}(X) = \sigma_X^2 = E[(X-\mu)^2] = \begin{cases} \sum_{\text{all } x} (x-\mu)^2 f(x) ; & \text{if X is discrete} \\ \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx ; & \text{if X is continuous} \end{cases}$
Definition:
The positive square root of the variance of X, $\sigma_X = \sqrt{\sigma_X^2}$, is
called the standard deviation of X.
Note:
Var(X) = E[g(X)], where $g(X) = (X-\mu)^2$
Theorem 4.2:
The variance of the random variable X is given by:
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2$
where $E(X^2) = \begin{cases} \sum_{\text{all } x} x^2 f(x) ; & \text{if X is discrete} \\ \int_{-\infty}^{\infty} x^2 f(x)\,dx ; & \text{if X is continuous} \end{cases}$
Example 4.9:
Let X be a discrete random variable with the following
probability distribution
x 0 1 2 3
f(x) 0.51 0.38 0.10 0.01
Find $\mathrm{Var}(X) = \sigma_X^2$.
Solution:
$\mu = \sum_{x=0}^{3} x\,f(x)$ = (0) f(0) + (1) f(1) + (2) f(2) + (3) f(3)
= (0)(0.51) + (1)(0.38) + (2)(0.10) + (3)(0.01)
= 0.61
1. First method:
$\mathrm{Var}(X) = \sigma_X^2 = \sum_{x=0}^{3} (x-\mu)^2 f(x) = \sum_{x=0}^{3} (x-0.61)^2 f(x)$
$= (0-0.61)^2 f(0) + (1-0.61)^2 f(1) + (2-0.61)^2 f(2) + (3-0.61)^2 f(3)$
$= (-0.61)^2 (0.51) + (0.39)^2 (0.38) + (1.39)^2 (0.10) + (2.39)^2 (0.01)$
= 0.4979
2. Second method:
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2$
$E(X^2) = \sum_{x=0}^{3} x^2 f(x) = (0^2) f(0) + (1^2) f(1) + (2^2) f(2) + (3^2) f(3)$
= (0)(0.51) + (1)(0.38) + (4)(0.10) + (9)(0.01)
= 0.87
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2 = 0.87 - (0.61)^2 = 0.4979$
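Both methods of Example 4.9 are easy to compare in code. This is a minimal sketch with the distribution stored in a dictionary named `pmf` (an illustrative name).

```python
pmf = {0: 0.51, 1: 0.38, 2: 0.10, 3: 0.01}  # x -> f(x) from Example 4.9

mu = sum(x * p for x, p in pmf.items())

# First method: definition E[(X - mu)^2]
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Second method: E(X^2) - mu^2
e_x2 = sum(x * x * p for x, p in pmf.items())
var_shortcut = e_x2 - mu ** 2

print(round(mu, 2), round(e_x2, 2), round(var_definition, 4), round(var_shortcut, 4))
```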
Example 4.10:
Let X be a continuous random variable with the following pdf:
$f(x) = \begin{cases} 2(x-1) ; & 1 < x < 2 \\ 0 ; & \text{elsewhere} \end{cases}$
Find the mean and the variance of X.
Solution:
$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_1^2 x\,[2(x-1)]\,dx = 2\int_1^2 x(x-1)\,dx = 5/3$
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_1^2 x^2\,[2(x-1)]\,dx = 2\int_1^2 x^2(x-1)\,dx = 17/6$
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2 = 17/6 - (5/3)^2 = 1/18$
Theorem 4.5:
If X is a random variable with mean µ=E(X), and if a and b are
constants, then:
E(aX±b) = a E(X) ± b
⇔
$\mu_{aX \pm b} = a\,\mu_X \pm b$
Corollary 1: E(b) = b (a=0 in Theorem 4.5)
Corollary 2: E(aX) = a E(X) (b=0 in Theorem 4.5)
Example 4.16:
Let X be a random variable with the following probability
density function:
$f(x) = \begin{cases} \frac{1}{3}x^2 ; & -1 < x < 2 \\ 0 ; & \text{elsewhere} \end{cases}$
Find E(4X+3).
Solution:
$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_{-1}^{2} x\left[\frac{1}{3}x^2\right]dx = \frac{1}{3}\int_{-1}^{2} x^3\,dx = \frac{1}{3}\left[\frac{1}{4}x^4\right]_{x=-1}^{x=2} = 5/4$
E(4X+3) = 4 E(X) + 3 = 4(5/4) + 3 = 8
Another solution:
$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$ ; g(X) = 4X+3
$E(4X+3) = \int_{-\infty}^{\infty} (4x+3)\,f(x)\,dx = \int_{-1}^{2} (4x+3)\left[\frac{1}{3}x^2\right]dx = \cdots = 8$
Theorem:
If X1, X2, …, Xn are n random variables and a1, a2, …, an are
constants, then:
E(a1X1+a2X2+ … +anXn) = a1E(X1)+ a2E(X2)+ …+anE(Xn)
⇔
$E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i\,E(X_i)$
Corollary:
If X and Y are random variables, then:
E(X ± Y) = E(X) ± E(Y)
Theorem 4.9:
If X is a random variable with variance $\mathrm{Var}(X) = \sigma_X^2$ and if a and
b are constants, then:
$\mathrm{Var}(aX \pm b) = a^2\,\mathrm{Var}(X)$
⇔
$\sigma_{aX \pm b}^2 = a^2 \sigma_X^2$
Theorem:
If X1, X2, …, Xn are n independent random variables and a1, a2,
…, an are constants, then:
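$\mathrm{Var}(a_1X_1 + a_2X_2 + \cdots + a_nX_n)$
$= a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) + \cdots + a_n^2\,\mathrm{Var}(X_n)$
⇔
$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$
⇔
$\sigma^2_{a_1X_1 + a_2X_2 + \cdots + a_nX_n} = a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + \cdots + a_n^2\sigma_{X_n}^2$
Corollary:
If X and Y are independent random variables, then:
• $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$
• $\mathrm{Var}(aX - bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$
• $\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$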
Example:
Let X and Y be two independent random variables such that
E(X)=2, Var(X)=4, E(Y)=7, and Var(Y)=1. Find:
1. E(3X+7) and Var(3X+7)
2. E(5X+2Y−2) and Var(5X+2Y−2).
Solution:
1. E(3X+7) = 3E(X) + 7 = 3(2) + 7 = 13
$\mathrm{Var}(3X+7) = 3^2\,\mathrm{Var}(X) = 3^2 (4) = 36$
2. E(5X+2Y−2) = 5E(X) + 2E(Y) − 2 = (5)(2) + (2)(7) − 2 = 22
$\mathrm{Var}(5X+2Y-2) = \mathrm{Var}(5X+2Y) = 5^2\,\mathrm{Var}(X) + 2^2\,\mathrm{Var}(Y)$
= (25)(4) + (4)(1) = 104
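The rules for means and variances of linear combinations can be wrapped in two small helper functions. The names `mean_linear` and `var_linear` are hypothetical, and the variance helper is valid only under the independence assumption stated in the theorem.

```python
def mean_linear(coeffs, means, const=0.0):
    """E(a1*X1 + ... + an*Xn + const) = a1*E(X1) + ... + an*E(Xn) + const."""
    return sum(a * m for a, m in zip(coeffs, means)) + const

def var_linear(coeffs, variances):
    """Var(a1*X1 + ... + an*Xn + const) for independent X1, ..., Xn.

    An added constant does not change the variance, so it is not a parameter.
    """
    return sum(a * a * v for a, v in zip(coeffs, variances))

# E(X) = 2, Var(X) = 4, E(Y) = 7, Var(Y) = 1
print(mean_linear([3], [2], 7), var_linear([3], [4]))               # 3X + 7
print(mean_linear([5, 2], [2, 7], -2), var_linear([5, 2], [4, 1]))  # 5X + 2Y - 2
```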
Example 4.22:
Let X be a random variable having an unknown distribution
with mean µ=8 and variance σ2=9 (standard deviation σ=3).
Find the following probability:
(a) P(−4 <X< 20)
(b) P(|X−8| ≥ 6)
Solution:
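(a) P(−4 < X < 20) = ??
By Chebyshev's theorem:
$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$
Here µ − kσ = 8 − 3k = −4 and µ + kσ = 8 + 3k = 20, so k = 4.
$1 - \frac{1}{k^2} = 1 - \frac{1}{16} = \frac{15}{16}$
Therefore, $P(-4 < X < 20) \ge \frac{15}{16}$; that is, the probability is at least 15/16.
Because the distribution is unknown, the theorem gives only this lower bound, not the exact value.
(b) P(|X−8| ≥ 6) = ??
|X−8| < 6 ⇔ 2 < X < 14, so µ − kσ = 8 − 3k = 2 and k = 2.
$1 - \frac{1}{k^2} = 1 - \frac{1}{4} = \frac{3}{4}$
$P(2 < X < 14) \ge \frac{3}{4} \Leftrightarrow P(|X-8| < 6) \ge \frac{3}{4}$
$\Leftrightarrow 1 - P(|X-8| < 6) \le 1 - \frac{3}{4}$
$\Leftrightarrow 1 - P(|X-8| < 6) \le \frac{1}{4}$
$\Leftrightarrow P(|X-8| \ge 6) \le \frac{1}{4}$
Therefore, $P(|X-8| \ge 6) \le \frac{1}{4}$; that is, the probability is at most 1/4.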
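The two parts of Example 4.22 follow the same recipe: express the interval as µ ± kσ, then evaluate 1 − 1/k². A minimal sketch is given below; the function name `chebyshev_lower_bound` is hypothetical, and it assumes an interval that is symmetric about the mean.

```python
def chebyshev_lower_bound(mu, sigma, low, high):
    """Lower bound on P(low < X < high) for an interval symmetric about mu."""
    k = (high - mu) / sigma
    if k <= 0 or abs((mu - low) / sigma - k) > 1e-12:
        raise ValueError("interval must be of the form mu +/- k*sigma with k > 0")
    return 1 - 1 / k ** 2

bound_a = chebyshev_lower_bound(8, 3, -4, 20)     # k = 4: P(-4 < X < 20) >= 15/16
upper_b = 1 - chebyshev_lower_bound(8, 3, 2, 14)  # k = 2: P(|X - 8| >= 6) <= 1/4
print(bound_a, upper_b)
```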
$\mathrm{Var}(X) = \sigma^2 = \frac{\sum_{i=1}^{k} (x_i - \mu)^2}{k}$
Example 5.3:
Find E(X) and Var(X) in Example 5.2.
Solution:
$E(X) = \mu = \frac{\sum_{i=1}^{k} x_i}{k} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$
$\mathrm{Var}(X) = \sigma^2 = \frac{\sum_{i=1}^{k} (x_i - \mu)^2}{k} = \frac{\sum_{i=1}^{k} (x_i - 3.5)^2}{6}$
$= \frac{(1-3.5)^2 + (2-3.5)^2 + \cdots + (6-3.5)^2}{6} = \frac{35}{12}$
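The sketch below repeats the computation of Example 5.3 with exact fractions. It assumes, as the values 1 to 6 in the solution suggest, that each of the k = 6 values is equally likely; the variable names are illustrative.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]  # assumed equally likely values (k = 6)
k = len(values)

mu = Fraction(sum(values), k)                     # mean of the k values
var = sum((x - mu) ** 2 for x in values) / k      # average squared deviation

print(mu, var)
```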
Bernoulli Trial:
• A Bernoulli trial is an experiment with only two possible
outcomes.
• The two possible outcomes are labeled:
success (s) and failure (f)
• The probability of success is P(s)=p and the probability of
failure is P(f)= q = 1−p.
• Examples:
1. Tossing a coin (success=H, failure=T, and p=P(H))
2. Inspecting an item (success=defective, failure=non-defective,
and p=P(defective))
Bernoulli Process:
A Bernoulli process is an experiment that must satisfy the
following properties:
1. The experiment consists of n repeated Bernoulli trials.
2. The probability of success, P(s)=p, remains constant from
trial to trial.
3. The repeated trials are independent; that is, the
outcome of one trial has no effect on the outcome of
any other trial.
Binomial Random Variable:
Consider the random variable :
X = The number of successes in the n trials in a Bernoulli
process
Example:
Suppose that 25% of the products of a manufacturing process
are defective. Three items are selected at random, inspected, and
classified as defective (D) or non-defective (N). Find the
probability distribution of the number of defective items.
Solution:
• Experiment: selecting 3 items at random, inspecting them, and
classifying each as (D) or (N).
• The sample space is
S={DDD,DDN,DND,DNN,NDD,NDN,NND,NNN}
• Let X = the number of defective items in the sample
• We need to find the probability distribution of X.
(1) First Solution:
Outcome   Probability                 x
NNN       3/4 × 3/4 × 3/4 = 27/64     0
NND       3/4 × 3/4 × 1/4 = 9/64      1
NDN       3/4 × 1/4 × 3/4 = 9/64      1
NDD       3/4 × 1/4 × 1/4 = 3/64      2
DNN       1/4 × 3/4 × 3/4 = 9/64      1
DND       1/4 × 3/4 × 1/4 = 3/64      2
DDN       1/4 × 1/4 × 3/4 = 3/64      2
DDD       1/4 × 1/4 × 1/4 = 1/64      3
The probability distribution of X is:
x    f(x) = P(X=x)
0    27/64
1    9/64 + 9/64 + 9/64 = 27/64
2    3/64 + 3/64 + 3/64 = 9/64
3    1/64
(2) Second Solution:
The Bernoulli trial is the process of inspecting an item. The results
are success=D or failure=N, with probability of success
P(s)=25/100=1/4=0.25.
The experiment is a Bernoulli process with:
• number of trials: n=3
• Probability of success: p=1/4=0.25
• X ~ Binomial(n,p)=Binomial(3,1/4)
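The distribution obtained by enumeration in the first solution can be reproduced from the standard binomial probability function, b(x; n, p) = C(n, x) p^x (1−p)^(n−x). The sketch below assumes that formula and uses exact fractions; the function name `binomial_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, Fraction(1, 4)
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
print(pmf)
```

The four values should match the table built from the eight outcomes.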
Theorem 5.2:
The mean and the variance of the binomial distribution b(x;n,p)
are:
$\mu = np$
$\sigma^2 = np(1-p)$
Example:
In the previous example, find the expected value (mean) and the
variance of the number of defective items.
Solution:
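Applying Theorem 5.2 with n = 3 and p = 1/4 (the values from the previous example) can be sketched as follows; the variable names are illustrative.

```python
from fractions import Fraction

n, p = 3, Fraction(1, 4)

mean = n * p                # mu = n p
variance = n * p * (1 - p)  # sigma^2 = n p (1 - p)

print(float(mean), float(variance))
```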
$f(x) = P(X=x) = h(x; N, n, K) = \begin{cases} \frac{\binom{K}{x} \times \binom{N-K}{n-x}}{\binom{N}{n}} ; & x = 0, 1, 2, …, n \\ 0 ; & \text{otherwise} \end{cases}$
Note that the values of X must satisfy:
0 ≤ x ≤ K and 0 ≤ n−x ≤ N−K
⇔
0 ≤ x ≤ K and n−N+K ≤ x ≤ n
$f(x) = P(X=x) = h(x; 40, 5, 3) = \begin{cases} \frac{\binom{3}{x} \times \binom{37}{5-x}}{\binom{40}{5}} ; & x = 0, 1, 2, …, 5 \\ 0 ; & \text{otherwise} \end{cases}$
But the values of X must satisfy:
0 ≤ x ≤ K and n−N+K ≤ x ≤ n ⇔ 0 ≤ x ≤ 3 and −32 ≤ x ≤ 5
Therefore, the probability distribution of X is given by:
$f(x) = P(X=x) = h(x; 40, 5, 3) = \begin{cases} \frac{\binom{3}{x} \times \binom{37}{5-x}}{\binom{40}{5}} ; & x = 0, 1, 2, 3 \\ 0 ; & \text{otherwise} \end{cases}$
Now, the probability that exactly one defective is found in the
sample is
$f(1) = P(X=1) = h(1; 40, 5, 3) = \frac{\binom{3}{1} \times \binom{37}{5-1}}{\binom{40}{5}} = \frac{\binom{3}{1} \times \binom{37}{4}}{\binom{40}{5}} = 0.3011$
Theorem 5.3:
The mean and the variance of the hypergeometric distribution
h(x;N,n,K) are:
$\mu = n\,\frac{K}{N}$
$\sigma^2 = n\,\frac{K}{N}\left(1 - \frac{K}{N}\right)\frac{N-n}{N-1}$
Example 5.10:
In Example 5.9, find the expected value (mean) and the variance
of the number of defectives in the sample.
Solution:
• X = number of defectives in the sample
• We need to find E(X) = µ and $\mathrm{Var}(X) = \sigma^2$
• We found that X ~ h(x;40,5,3)
• N=40, n=5, and K=3
The expected number of defective items is
$E(X) = \mu = n\,\frac{K}{N} = 5 \times \frac{3}{40} = 0.375$
The variance of the number of defective items is
$\mathrm{Var}(X) = \sigma^2 = n\,\frac{K}{N}\left(1 - \frac{K}{N}\right)\frac{N-n}{N-1} = 5 \times \frac{3}{40}\left(1 - \frac{3}{40}\right)\frac{40-5}{40-1} = 0.311298$
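The formulas of Theorem 5.3 can be cross-checked against the distribution itself. The sketch below computes the mean and variance both ways for h(x; 40, 5, 3), using exact fractions; the function name `h` follows the notation of the notes and the other names are illustrative.

```python
from fractions import Fraction
from math import comb

N, n, K = 40, 5, 3

def h(x):
    """Hypergeometric probability h(x; N, n, K)."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

# Admissible values: max(0, n - N + K) <= x <= min(n, K)
support = range(max(0, n - N + K), min(n, K) + 1)

mean_formula = Fraction(n * K, N)
var_formula = mean_formula * (1 - Fraction(K, N)) * Fraction(N - n, N - 1)

mean_pmf = sum(x * h(x) for x in support)
var_pmf = sum((x - mean_pmf) ** 2 * h(x) for x in support)

print(float(h(1)), float(mean_formula), float(var_formula))
```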
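$h(x; N, n, K) = \frac{\binom{K}{x} \times \binom{N-K}{n-x}}{\binom{N}{n}} \approx \binom{n}{x}\left(\frac{K}{N}\right)^{x}\left(1 - \frac{K}{N}\right)^{n-x}$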
Note:
If n is small compared to N and K, then there will be almost no
difference between selection without replacement and selection
with replacement ($\frac{K}{N} \approx \frac{K-1}{N-1} \approx \cdots \approx \frac{K-n+1}{N-n+1}$).
Note:
• λ is the average (mean) of the distribution in the unit
time (t=1).
• If X=The number of calls received in a month (unit
time t=1 month) and X~Poisson(λ), then:
Theorem 6.1:
The mean and the variance of the continuous uniform
distribution on the interval [A, B] are:
$\mu = \frac{A+B}{2}$
$\sigma^2 = \frac{(B-A)^2}{12}$
Example 6.1:
Suppose that, for a certain company, the conference time, X, has
a uniform distribution on the interval [0,4] (hours).
(a) What is the probability density function of X?
(b) What is the probability that any conference lasts at
least 3 hours?
Solution:
(a) $f(x) = f(x; 0, 4) = \begin{cases} \frac{1}{4} ; & 0 \le x \le 4 \\ 0 ; & \text{elsewhere} \end{cases}$
(b) $P(X \ge 3) = \int_3^4 f(x)\,dx = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$
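For the continuous uniform distribution, the probabilities and moments have simple closed forms. The sketch below encodes them for the interval [0, 4] of Example 6.1; the function name `uniform_cdf` is illustrative.

```python
def uniform_cdf(x, A, B):
    """P(X <= x) for X uniform on [A, B]."""
    if x < A:
        return 0.0
    if x > B:
        return 1.0
    return (x - A) / (B - A)

A, B = 0, 4
p_at_least_3 = 1 - uniform_cdf(3, A, B)  # P(X >= 3)
mean = (A + B) / 2                       # Theorem 6.1
variance = (B - A) ** 2 / 12             # Theorem 6.1

print(p_at_least_3, mean, variance)
```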
[Figures: normal curves with µ1 = µ2, σ1 < σ2, and with µ1 < µ2, σ1 < σ2]
Some properties of the normal curve f(x) of N(µ,σ):
1. f(x) is symmetric about the mean µ.
2. f(x) has two points of inflection at x= µ±σ.
3. The total area under the curve of f(x) is 1.
4. The highest point of the curve of f(x) is at the mean µ.
Definition 6.1:
The Standard Normal Distribution:
• The normal distribution with mean µ=0 and variance
$\sigma^2 = 1$ is called the standard normal distribution and is
denoted by Normal(0,1) or N(0,1). If the random variable
Z has the standard normal distribution, we write
Z~Normal(0,1) or Z~N(0,1).
• The pdf of Z~N(0,1) is given by:
$f(z) = n(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}$
$P(X < a) = \int_{-\infty}^{a} f(x)\,dx$
$P(X > b) = \int_{b}^{\infty} f(x)\,dx$
$P(a < X < b) = \int_{a}^{b} f(x)\,dx$
Probabilities of Z~N(0,1):
Suppose Z ~ N(0,1).
Example:
Suppose Z~N(0,1).
(1) P(Z≤1.50) = 0.9332 (table: row 1.5, column 0.00)
Example:
Suppose Z~N(0,1). Find the value of k such that
P(Z≤k) = 0.0207.
Solution:
k = −2.04 (table: the entry 0.0207 is in row −2.0, column 0.04)
Probabilities of X~N(µ,σ):
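Result: X ~ N(µ,σ) ⇔ $Z = \frac{X - \mu}{\sigma}$ ~ N(0,1)
$X \le a \Leftrightarrow \frac{X - \mu}{\sigma} \le \frac{a - \mu}{\sigma} \Leftrightarrow Z \le \frac{a - \mu}{\sigma}$
1. $P(X \le a) = P\left(Z \le \frac{a - \mu}{\sigma}\right)$
2. $P(X \ge a) = 1 - P(X \le a) = 1 - P\left(Z \le \frac{a - \mu}{\sigma}\right)$
3. $P(a \le X \le b) = P(X \le b) - P(X \le a) = P\left(Z \le \frac{b - \mu}{\sigma}\right) - P\left(Z \le \frac{a - \mu}{\sigma}\right)$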
4. P(X=a)=0 for every a.
5. P(X≤µ) = P(X≥µ)=0.5
Example:
Suppose that the hemoglobin level for healthy adult males has a
normal distribution with mean µ=16 and variance $\sigma^2 = 0.81$
(standard deviation σ=0.9).
(a) Find the probability that a randomly chosen healthy adult
male has hemoglobin level less than 14.
(b) What is the percentage of healthy adult males who have
hemoglobin level less than 14?
Solution:
Let X = the hemoglobin level for a healthy adult male
X ~ N(µ,σ) = N(16, 0.9).
(a) $P(X \le 14) = P\left(Z \le \frac{14 - \mu}{\sigma}\right) = P\left(Z \le \frac{14 - 16}{0.9}\right)$
= P(Z ≤ −2.22) = 0.0132
(b) The percentage of healthy adult
males who have hemoglobin level less
than 14 is
P(X ≤14) × 100% = 0.01320× 100%
=1.32%
Therefore, 1.32% of healthy adult males
have hemoglobin level less than 14.
Example:
Suppose that the birth weight of Saudi babies has a normal
distribution with mean µ=3.4 and standard deviation σ=0.35.
(a) Find the probability that a randomly chosen Saudi baby has a
birth weight between 3.0 and 4.0 kg.
(b) What is the percentage of Saudi babies who have a birth
weight between 3.0 and 4.0 kg?
Solution:
X = birth weight of a Saudi baby
µ = 3.4, σ = 0.35 ($\sigma^2 = 0.35^2 = 0.1225$)
X ~ N(3.4, 0.35)
(a) P(3.0<X<4.0) = P(X<4.0) − P(X<3.0)
$= P\left(Z \le \frac{4.0 - \mu}{\sigma}\right) - P\left(Z \le \frac{3.0 - \mu}{\sigma}\right)$
$= P\left(Z \le \frac{4.0 - 3.4}{0.35}\right) - P\left(Z \le \frac{3.0 - 3.4}{0.35}\right)$
= P(Z ≤ 1.71) − P(Z ≤ −1.14)
= 0.9564 − 0.1271
= 0.8293
(b) The percentage of Saudi babies who have a birth weight
between 3.0 and 4.0 kg is
P(3.0<X<4.0) × 100%= 0.8293× 100%= 82.93%
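The table lookups in the two examples above can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). In this sketch, the helper `table_prob` (a hypothetical name) rounds z to two decimals to mimic the table; the unrounded values differ slightly in the third or fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def table_prob(a, mu, sigma):
    """P(X <= a) with z rounded to two decimals, as when reading the table."""
    z = round((a - mu) / sigma, 2)
    return Z.cdf(z)

# Hemoglobin example: X ~ N(16, 0.9), P(X <= 14)
p_hemo_table = table_prob(14, 16, 0.9)
p_hemo_exact = NormalDist(mu=16, sigma=0.9).cdf(14)

# Birth weight example: X ~ N(3.4, 0.35), P(3.0 < X < 4.0)
birth = NormalDist(mu=3.4, sigma=0.35)
p_birth_table = table_prob(4.0, 3.4, 0.35) - table_prob(3.0, 3.4, 0.35)
p_birth_exact = birth.cdf(4.0) - birth.cdf(3.0)

print(round(p_hemo_table, 4), round(p_hemo_exact, 4))
print(round(p_birth_table, 4), round(p_birth_exact, 4))
```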
Notation:
$P(Z \ge Z_A) = A$
Result:
$Z_A = -Z_{1-A}$
Example:
Z ~ N(0,1)
$P(Z \ge Z_{0.025}) = 0.025$
$P(Z \ge Z_{0.95}) = 0.95$
$P(Z \ge Z_{0.90}) = 0.90$
Example:
Z ~ N(0,1)
$Z_{0.025} = 1.96$ (table: the entry 0.975 is in row 1.9, column 0.06, so P(Z ≥ 1.96) = 0.025)
$Z_{0.95} = -1.645$
$Z_{0.90} = -1.285$
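Since P(Z ≥ Z_A) = A is the same as P(Z ≤ Z_A) = 1 − A, the value Z_A is the (1 − A) quantile of the standard normal distribution. The sketch below uses `NormalDist.inv_cdf` from the standard library; the function name `z_upper` is hypothetical. The exact quantiles differ slightly from values interpolated from the table (for example, about −1.2816 rather than −1.285 for A = 0.90).

```python
from statistics import NormalDist

def z_upper(A):
    """Return Z_A, the value with upper-tail area A under N(0, 1)."""
    return NormalDist().inv_cdf(1 - A)

for A in (0.025, 0.95, 0.90):
    print(A, round(z_upper(A), 4))
```

The symmetry result Z_A = −Z_(1−A) can be checked by comparing `z_upper(A)` with `-z_upper(1 - A)`.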
Example 6.9:
In an industrial process, the diameter of a ball bearing is an
important component part. The buyer sets specifications on the
diameter to be 3.00±0.01 cm. The implication is that no part
falling outside these specifications will be accepted. It is known
that, in the process, the diameter of a ball bearing has a normal
distribution with mean 3.00 cm and standard deviation 0.005
cm. On the average, how many manufactured ball bearings will
be scrapped?
Solution:
µ=3.00
σ=0.005
X=diameter
X~N(3.00, 0.005)
The specification limits are:
3.00±0.01
x1=Lower limit=3.00−0.01=2.99
x2=Upper limit=3.00+0.01=3.01
P(x1<X<x2) = P(2.99<X<3.01) = P(X<3.01) − P(X<2.99)
$= P\left(Z \le \frac{3.01 - \mu}{\sigma}\right) - P\left(Z \le \frac{2.99 - \mu}{\sigma}\right)$
$= P\left(Z \le \frac{3.01 - 3.00}{0.005}\right) - P\left(Z \le \frac{2.99 - 3.00}{0.005}\right)$
= P(Z ≤ 2.00) − P(Z ≤ −2.00)
= 0.9772 − 0.0228
= 0.9544
Therefore, on the average, 95.44% of manufactured ball
bearings will be accepted and 4.56% will be scrapped.
Example 6.10:
Gauges are used to reject all components where a certain
dimension is not within the specifications 1.50±d. It is known
TABLE:
Areas Under The Standard Normal Curve
Z~Normal(0,1)
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
-0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
Definition:
The continuous random variable X has an
exponential distribution with parameter β,
if its probability density function is given
by:
\[
f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\[4pt] 0, & \text{elsewhere} \end{cases}
\]
and we write X~Exp(β)
Theorem:
If the random variable X has an exponential distribution with
parameter β, i.e., X~Exp(β), then the mean and the variance of
X are:
E(X) = µ = β
Var(X) = σ² = β²
Example 6.17:
Suppose that a system contains a certain type of component
whose time in years to failure is given by T. The random
variable T is modeled nicely by the exponential distribution with
mean time to failure β=5. If 5 of these components are installed
in different systems, what is the probability that at least 2 are
still functioning at the end of 8 years?
Solution:
β=5
T~Exp(5)
The pdf of T is:
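\[
f(t) = \begin{cases} \dfrac{1}{5}\, e^{-t/5}, & t > 0 \\[4pt] 0, & \text{elsewhere} \end{cases}
\]
The probability that a given component is still functioning after
8 years is given by:
\[
P(T > 8) = \int_{8}^{\infty} f(t)\,dt = \int_{8}^{\infty} \frac{1}{5}\, e^{-t/5}\,dt = e^{-8/5} \approx 0.2
\]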
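The worked solution breaks off here in these notes. A minimal Python sketch of how the question could be finished, assuming the 5 components fail independently so that the number still functioning after 8 years is Binomial(5, p) with p = P(T > 8); the function names `exp_survival` and `binom_at_least` are illustrative, not from the notes:

```python
import math

def exp_survival(t, beta):
    """P(T > t) for T ~ Exp(beta), where beta is the mean time to failure."""
    return math.exp(-t / beta)

def binom_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    below = sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k))
    return 1.0 - below

p_exact = exp_survival(8, 5)   # e^(-8/5), rounded to 0.2 in the notes
p_rounded = 0.2

# Probability that at least 2 of the 5 components still function after 8 years
# (assumes independent components).
print(round(binom_at_least(2, 5, p_rounded), 4))
print(round(binom_at_least(2, 5, p_exact), 4))
```

Using the rounded value p = 0.2 gives about 0.2627; keeping p = e^(−8/5) unrounded changes the third decimal place slightly.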
\[
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}
\]
Note:
• S² is a statistic because it is a function of the random
sample X1, X2, …, Xn.
• S² measures the variability in the sample.
Definition 8.10:
The sample standard deviation is defined to be the statistic:
\[
S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} \quad \text{(unit)}
\]
Example 8.1: Reading Assignment
Example 8.8: Reading Assignment
Example 8.9: Reading Assignment
• \( Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \iff \bar{X} \sim N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right) \)
• We consider n large when n≥30.
Example 8.13:
An electric firm manufactures light bulbs that have a length of
life that is approximately normally distributed with mean equal
to 800 hours and a standard deviation of 40 hours. Find the
probability that a random sample of 16 bulbs will have an
average life of less than 775 hours.
Solution:
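X = the length of life
µ = 800, σ = 40
X ~ N(800, 40)
n = 16
\[
\mu_{\bar{X}} = \mu = 800, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{16}} = 10
\]
\[
\bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = N(800, 10)
\iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - 800}{10} \sim N(0,1)
\]
\[
\begin{aligned}
P(\bar{X} < 775) &= P\!\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{775 - \mu}{\sigma/\sqrt{n}}\right) \\
&= P\!\left(\frac{\bar{X} - 800}{10} < \frac{775 - 800}{10}\right) \\
&= P\!\left(Z < \frac{775 - 800}{10}\right) \\
&= P(Z < -2.50) \\
&= 0.0062
\end{aligned}
\]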
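The table lookup above can be checked numerically. The following is a small sketch using Python's standard-library `statistics.NormalDist` in place of the printed Z-table; the helper name `prob_sample_mean_below` is an illustrative choice:

```python
import math
from statistics import NormalDist

def prob_sample_mean_below(x, mu, sigma, n):
    """Return (z, P(Xbar < x)) when Xbar ~ N(mu, sigma/sqrt(n))."""
    se = sigma / math.sqrt(n)          # standard deviation of the sample mean
    z = (x - mu) / se                  # standardized value
    return z, NormalDist().cdf(z)      # area to the left of z under N(0,1)

z, p = prob_sample_mean_below(775, 800, 40, 16)
print(z, round(p, 4))
```

The computed area agrees with the table entry in row −2.5, column 0.00.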
Theorem 8.3:
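If n1 and n2 are large, then the sampling distribution of \( \bar{X}_1 - \bar{X}_2 \)
is approximately normal with mean
\[
E(\bar{X}_1 - \bar{X}_2) = \mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2
\]
and variance
\[
Var(\bar{X}_1 - \bar{X}_2) = \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
\]
that is:
\[
\bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)
\iff
Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)
\]
Note:
\[
\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma^2_{\bar{X}_1 - \bar{X}_2}} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\ \ne\ \sqrt{\frac{\sigma_1^2}{n_1}} + \sqrt{\frac{\sigma_2^2}{n_2}} = \frac{\sigma_1}{\sqrt{n_1}} + \frac{\sigma_2}{\sqrt{n_2}}
\]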
Example 8.16:
The television picture tubes of manufacturer A have a mean
lifetime of 6.5 years and standard deviation of 0.9 year, while
those of manufacturer B have a mean lifetime of 6 years and
standard deviation of 0.8 year. What is the probability that a
random sample of 36 tubes from manufacturer A will have a
mean lifetime that is at least 1 year more than the mean lifetime
of a random sample of 49 tubes from manufacturer B?
Solution:
Population A Population B
µ1=6.5 µ2=6.0
σ1=0.9 σ2=0.8
n1=36 n2=49
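• We need to find the probability that the mean lifetime of
manufacturer A is at least 1 year more than the mean
lifetime of manufacturer B, which is \( P(\bar{X}_1 \ge \bar{X}_2 + 1) \).
• The sampling distribution of \( \bar{X}_1 - \bar{X}_2 \) is
\[
\bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)
\]
• \( E(\bar{X}_1 - \bar{X}_2) = \mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 = 6.5 - 6.0 = 0.5 \)
• \( Var(\bar{X}_1 - \bar{X}_2) = \sigma^2_{\bar{X}_1 - \bar{X}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2} = \dfrac{(0.9)^2}{36} + \dfrac{(0.8)^2}{49} = 0.03556 \)
• \( \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{0.03556} = 0.189 \)
• \( \bar{X}_1 - \bar{X}_2 \sim N(0.5, 0.189) \)
• Recall \( Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \)
\[
\begin{aligned}
P(\bar{X}_1 \ge \bar{X}_2 + 1) &= P(\bar{X}_1 - \bar{X}_2 \ge 1) \\
&= P\!\left(\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \ge \frac{1 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right) \\
&= P\!\left(Z \ge \frac{1 - 0.5}{0.189}\right) \\
&= P(Z \ge 2.65) \\
&= 1 - P(Z < 2.65) \\
&= 1 - 0.9960 \\
&= 0.0040
\end{aligned}
\]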
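As a cross-check of the arithmetic above, here is a short Python sketch of the same calculation. It assumes the normal approximation of Theorem 8.3 and uses the standard-library normal CDF instead of the Z-table; the function name `diff_of_means_tail` is illustrative:

```python
import math
from statistics import NormalDist

def diff_of_means_tail(d, mu1, sigma1, n1, mu2, sigma2, n2):
    """Return (se, z, P(Xbar1 - Xbar2 >= d)) under the normal approximation."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # std. dev. of Xbar1 - Xbar2
    z = (d - (mu1 - mu2)) / se                        # standardized threshold
    return se, z, 1.0 - NormalDist().cdf(z)           # right-tail area

se, z, p = diff_of_means_tail(1.0, 6.5, 0.9, 36, 6.0, 0.8, 49)
print(round(se, 3), round(z, 2), round(p, 4))
```

Because the code does not round the standard error or Z before the lookup, its result can differ from the hand calculation in the last decimal place.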
8.7 t-Distribution:
• Recall that, if X1, X2, …, Xn is a random sample of size n
from a normal distribution with mean µ and variance σ²,
i.e. N(µ,σ), then
\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)
\]
• We can apply this result only when σ² is known!
• If σ² is unknown, we replace the population variance σ²
with the sample variance
\[
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}
\]
Result:
If X1, X2, …, Xn is a random sample of size n from a normal
distribution with mean µ and variance σ², i.e. N(µ,σ), then the
statistic
\[
T = \frac{\bar{X} - \mu}{S/\sqrt{n}}
\]
has a t-distribution with ν = n−1 degrees of freedom (df), and we
write T ~ t(ν).
Note:
• t-distribution is a continuous
distribution.
• The shape of t-distribution is
similar to the shape of the standard
normal distribution.
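Notation:
tα is the t-value (with ν df) leaving an area of α to the
right; i.e., P(T>tα)=α. By symmetry, t1−α = −tα.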
Example:
Find the t-value with ν=14 (df) that leaves an area of:
(a) 0.95 to the left.
(b) 0.95 to the right.
Solution:
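ν = 14 (df); T ~ t(14)
(a) An area of 0.95 to the left means an area of 0.05 to the
right, so the value is t0.05 = 1.761.
(b) An area of 0.95 to the right gives t0.95 = −t0.05 = −1.761.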
Example:
For ν = 10 degrees of freedom (df), find t0.10 and t0.85.
Solution:
t0.10 = 1.372
t0.85 = −t1−0.85 = −t0.15 = −1.093 (since t0.15 = 1.093)
α
ν 0.40 0.30 0.20 0.15 0.10 0.05 0.025
1 0.325 0.727 1.376 1.963 3.078 6.314 12.706
2 0.289 0.617 1.061 1.386 1.886 2.920 4.303
3 0.277 0.584 0.978 1.250 1.638 2.353 3.182
4 0.271 0.569 0.941 1.190 1.533 2.132 2.776
5 0.267 0.559 0.920 1.156 1.476 2.015 2.571
6 0.265 0.553 0.906 1.134 1.440 1.943 2.447
7 0.263 0.549 0.896 1.119 1.415 1.895 2.365
8 0.262 0.546 0.889 1.108 1.397 1.860 2.306
9 0.261 0.543 0.883 1.100 1.383 1.833 2.262
10 0.260 0.542 0.879 1.093 1.372 1.812 2.228
11 0.260 0.540 0.876 1.088 1.363 1.796 2.201
12 0.259 0.539 0.873 1.083 1.356 1.782 2.179
13 0.259 0.537 0.870 1.079 1.350 1.771 2.160
14 0.258 0.537 0.868 1.076 1.345 1.761 2.145
15 0.258 0.536 0.866 1.074 1.341 1.753 2.131
16 0.258 0.535 0.865 1.071 1.337 1.746 2.120
17 0.257 0.534 0.863 1.069 1.333 1.740 2.110
18 0.257 0.534 0.862 1.067 1.330 1.734 2.101
19 0.257 0.533 0.861 1.066 1.328 1.729 2.093
20 0.257 0.533 0.860 1.064 1.325 1.725 2.086
21 0.257 0.532 0.859 1.063 1.323 1.721 2.080
22 0.256 0.532 0.858 1.061 1.321 1.717 2.074
23 0.256 0.532 0.858 1.060 1.319 1.714 2.069
24 0.256 0.531 0.857 1.059 1.318 1.711 2.064
25 0.256 0.531 0.856 1.058 1.316 1.708 2.060
26 0.256 0.531 0.856 1.058 1.315 1.706 2.056
27 0.256 0.531 0.855 1.057 1.314 1.703 2.052
28 0.256 0.530 0.855 1.056 1.313 1.701 2.048
29 0.256 0.530 0.854 1.055 1.311 1.699 2.045
30 0.256 0.530 0.854 1.055 1.310 1.697 2.042
40 0.255 0.529 0.851 1.050 1.303 1.684 2.021
60 0.254 0.527 0.848 1.045 1.296 1.671 2.000
120 0.254 0.526 0.845 1.041 1.289 1.658 1.980
∞ 0.253 0.524 0.842 1.036 1.282 1.645 1.960
α
ν 0.02 0.015 0.01 0.0075 0.005 0.0025 0.0005
1 15.895 21.205 31.821 42.434 63.657 127.322 636.590
2 4.849 5.643 6.965 8.073 9.925 14.089 31.598
3 3.482 3.896 4.541 5.047 5.841 7.453 12.924
4 2.999 3.298 3.747 4.088 4.604 5.598 8.610
5 2.757 3.003 3.365 3.634 4.032 4.773 6.869
6 2.612 2.829 3.143 3.372 3.707 4.317 5.959
7 2.517 2.715 2.998 3.203 3.499 4.029 5.408
8 2.449 2.634 2.896 3.085 3.355 3.833 5.041
9 2.398 2.574 2.821 2.998 3.250 3.690 4.781
10 2.359 2.527 2.764 2.932 3.169 3.581 4.587
11 2.328 2.491 2.718 2.879 3.106 3.497 4.437
12 2.303 2.461 2.681 2.836 3.055 3.428 4.318
13 2.282 2.436 2.650 2.801 3.012 3.372 4.221
14 2.264 2.415 2.624 2.771 2.977 3.326 4.140
15 2.249 2.397 2.602 2.746 2.947 3.286 4.073
16 2.235 2.382 2.583 2.724 2.921 3.252 4.015
17 2.224 2.368 2.567 2.706 2.898 3.222 3.965
18 2.214 2.356 2.552 2.689 2.878 3.197 3.922
19 2.205 2.346 2.539 2.674 2.861 3.174 3.883
20 2.197 2.336 2.528 2.661 2.845 3.153 3.850
21 2.189 2.328 2.518 2.649 2.831 3.135 3.819
22 2.183 2.320 2.508 2.639 2.819 3.119 3.792
23 2.177 2.313 2.500 2.629 2.807 3.104 3.768
24 2.172 2.307 2.492 2.620 2.797 3.091 3.745
25 2.167 2.301 2.485 2.612 2.787 3.078 3.725
26 2.162 2.296 2.479 2.605 2.779 3.067 3.707
27 2.158 2.291 2.473 2.598 2.771 3.057 3.690
28 2.154 2.286 2.467 2.592 2.763 3.047 3.674
29 2.150 2.282 2.462 2.586 2.756 3.038 3.659
30 2.147 2.278 2.457 2.581 2.750 3.030 3.646
40 2.125 2.250 2.423 2.542 2.704 2.971 3.551
60 2.099 2.223 2.390 2.504 2.660 2.915 3.460
120 2.076 2.196 2.358 2.468 2.617 2.860 3.373
∞ 2.054 2.170 2.326 2.432 2.576 2.807 3.291
9.1 Introduction:
• Suppose we have a population with some unknown
parameter(s).
Example: Normal(µ,σ)
µ and σ are parameters.
• We need to draw conclusions (make inferences) about the
unknown parameters.
• We select samples, compute some statistics, and make
inferences about the unknown parameters based on the
sampling distributions of the statistics.
* Statistical Inference
(1) Estimation of the parameters (Chapter 9)
→ Point Estimation
→ Interval Estimation (Confidence Interval)
(2) Tests of hypotheses about the parameters (Chapter 10)
Point Estimation:
A point estimate of some population parameter θ is a single
value θ̂ of a statistic Θ̂. For example, the value x̄ of the statistic
X̄ computed from a sample of size n is a point estimate of the
population mean µ.
Notation:
Za is the Z-value leaving an area of a to the
right; i.e., P(Z>Za)=a or equivalently,
P(Z<Za)=1−a
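A (1−α)100% confidence interval for µ when σ is known is:
\[
\left( \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\ ,\ \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right)
\]
\[
\iff \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\]
\[
\iff \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\]
where \( Z_{\alpha/2} \) is the Z-value leaving an area of α/2 to the
right; i.e., \( P(Z > Z_{\alpha/2}) = \alpha/2 \), or equivalently,
\( P(Z < Z_{\alpha/2}) = 1 - \alpha/2 \).
Note:
We are (1−α)100% confident that
\( \mu \in \left( \bar{X} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\ ,\ \bar{X} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \right) \).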
Example 9.2:
The average zinc concentration recorded from a sample of zinc
measurements in 36 different locations is found to be 2.6
grams per milliliter. Find the 95% and 99% confidence intervals
(C.I.) for the mean zinc concentration in the river. Assume that
the population standard deviation is 0.3.
Solution:
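µ = the mean zinc concentration in the river.
Population: µ = ??, σ = 0.3
Sample: n = 36, X̄ = 2.6
First, a point estimate for µ is X̄ = 2.6.
\( Z_{\alpha/2} = Z_{0.025} = 1.96 \)
A 95% C.I. for µ is
\[
\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\]
\[
\iff \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\]
\[
\iff 2.6 - (1.96)\left(\frac{0.3}{\sqrt{36}}\right) < \mu < 2.6 + (1.96)\left(\frac{0.3}{\sqrt{36}}\right)
\]
\[
\iff 2.6 - 0.098 < \mu < 2.6 + 0.098
\]
\[
\iff 2.502 < \mu < 2.698
\]
\[
\iff \mu \in (2.502, 2.698)
\]
We are 95% confident that µ ∈ (2.502, 2.698).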
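The same interval can be computed programmatically. The sketch below is an illustration rather than part of the original solution: it obtains the Z-value from Python's `statistics.NormalDist` instead of the table, and the name `z_interval` is an assumption. It also prints the 99% interval that the example asks for.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """(1 - alpha)100% C.I. for mu when sigma is known: xbar +/- z * sigma/sqrt(n)."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{alpha/2}, area alpha/2 to the right
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

for level in (0.95, 0.99):
    lo, hi = z_interval(2.6, 0.3, 36, level)
    print(f"{level:.0%} C.I.: ({lo:.3f}, {hi:.3f})")
```

The 99% interval is wider than the 95% interval, as expected: more confidence requires a larger Z-value and hence a larger margin.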
Theorem 9.1:
If X̄ is used as an estimate of µ, we can then be (1−α)100%
confident that the error (in estimation) will not exceed
\( Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \).
Example:
In Example 9.2, we are 95% confident that the sample mean
X̄ = 2.6 differs from the true mean µ by an amount less than
\[
Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = (1.96)\left(\frac{0.3}{\sqrt{36}}\right) = 0.098 .
\]
Note:
Let e be the maximum amount of the error, that is
\( e = Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \); then:
\[
e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\iff \sqrt{n} = Z_{\alpha/2}\frac{\sigma}{e}
\iff n = \left( Z_{\alpha/2}\frac{\sigma}{e} \right)^{2}
\]
Theorem 9.2:
If X̄ is used as an estimate of µ, we can then be (1−α)100%
confident that the error (in estimation) will not exceed a
specified amount e when the sample size is
\[
n = \left( Z_{\alpha/2}\frac{\sigma}{e} \right)^{2} .
\]
Note:
1. All fractional values of \( n = (Z_{\alpha/2}\,\sigma/e)^2 \) are rounded up to the
next whole number.
2. If σ is unknown, we could take a preliminary sample of
size n≥30 to provide an estimate of σ. Then, using
\( S = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2/(n-1)} \)
as an approximation for σ in Theorem 9.2, we could
determine approximately how many observations are
needed to provide the desired degree of accuracy.
Example 9.3:
How large a sample is required in Example 9.2 if we want to be
95% confident that our estimate of µ is off by less than 0.05?
Solution:
We have σ = 0.3, \( Z_{\alpha/2} = 1.96 \), e = 0.05. Then by Theorem 9.2,
\[
n = \left( Z_{\alpha/2}\frac{\sigma}{e} \right)^{2} = \left( 1.96 \times \frac{0.3}{0.05} \right)^{2} = 138.3 ,
\]
which is rounded up to n = 139.
Therefore, we can be 95% confident that a random sample of
size n=139 will provide an estimate X̄ differing from µ by an
amount less than e=0.05.
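The rounding-up rule is easy to get wrong by ordinary rounding, so here is a minimal Python sketch of the sample-size formula of Theorem 9.2. The function name `required_sample_size` is illustrative, and the Z-value is passed in as read from the table:

```python
import math

def required_sample_size(z, sigma, e):
    """Smallest n with z * sigma / sqrt(n) <= e, i.e. ceil((z * sigma / e)^2)."""
    return math.ceil((z * sigma / e) ** 2)

# Example 9.3: sigma = 0.3, Z_{0.025} = 1.96, e = 0.05
print(required_sample_size(1.96, 0.3, 0.05))  # 139
```

Using `math.ceil` rather than `round` guarantees that the error bound is met even when the fractional part is small.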
Solution:
\[
n = 7, \qquad \bar{X} = \sum_{i=1}^{n} X_i / n = 10.0, \qquad S = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2/(n-1)} = 0.283
\]
First, a point estimate for µ is \( \bar{X} = \sum_{i=1}^{n} X_i / n = 10.0 \).
Now, we need to find a confidence interval for µ.
α = ??
95% = (1−α)100% ⇔ 0.95 = (1−α) ⇔ α = 0.05 ⇔ α/2 = 0.025
\( t_{\alpha/2} = t_{0.025} = 2.447 \) (with ν = n−1 = 6 degrees of freedom; t-table row ν = 6, column α = 0.025)
A 95% C.I. for µ is
\[
\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}}
\]
\[
\iff \bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}
\]
\[
\iff 10.0 - (2.447)\left(\frac{0.283}{\sqrt{7}}\right) < \mu < 10.0 + (2.447)\left(\frac{0.283}{\sqrt{7}}\right)
\]
\[
\iff 10.0 - 0.262 < \mu < 10.0 + 0.262
\]
\[
\iff 9.74 < \mu < 10.26
\]
\[
\iff \mu \in (9.74, 10.26)
\]
We are 95% confident that µ ∈ (9.74, 10.26).
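A short Python sketch of the same t-interval follows. Python's standard library has no t-quantile function, so this sketch takes the critical value as an input read from the t-table above (2.447 for ν = 6, α/2 = 0.025); the function name `t_interval` is illustrative:

```python
import math

def t_interval(xbar, s, n, t_crit):
    """C.I. for mu when sigma is unknown: xbar +/- t_crit * s / sqrt(n).

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, read from a t-table.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

lo, hi = t_interval(10.0, 0.283, 7, 2.447)
print(round(lo, 2), round(hi, 2))
```

Passing the table value explicitly keeps the example free of third-party dependencies; a statistics library could supply the quantile instead.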
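or
\[
(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\]
or
\[
\left( (\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\ ,\ (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right)
\]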
Solution:
Wire A: nA = 6, X̄A = 140.67, S²A = 7.86690
Wire B: nB = 6, X̄B = 138.50, S²B = 7.10009
A point estimate for µA−µB is X̄A − X̄B = 140.67 − 138.50 = 2.17.
95% = (1−α)100% ⇔ 0.95 = (1−α) ⇔ α = 0.05 ⇔ α/2 = 0.025
ν = df = nA + nB − 2 = 10
\( t_{\alpha/2} = t_{0.025} = 2.228 \)
\[
S_p^2 = \frac{(n_A - 1) S_A^2 + (n_B - 1) S_B^2}{n_A + n_B - 2}
= \frac{(6 - 1)(7.86690) + (6 - 1)(7.10009)}{6 + 6 - 2} = 7.4835
\]
\[
S_p = \sqrt{S_p^2} = \sqrt{7.4835} = 2.7356
\]
A 95% C.I. for µA−µB is
\[
(\bar{X}_A - \bar{X}_B) - t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}} < \mu_A - \mu_B < (\bar{X}_A - \bar{X}_B) + t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}
\]
or
\[
(\bar{X}_A - \bar{X}_B) \pm t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}
\]
\[
(140.67 - 138.50) \pm (2.228)(2.7356)\sqrt{\frac{1}{6} + \frac{1}{6}}
\]
\[
2.17 \pm 3.51890
\]
\[
-1.35 < \mu_A - \mu_B < 5.69
\]
We are 95% confident that µA−µB ∈ (−1.35, 5.69).
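The pooled-variance steps above can be sketched in Python as follows. As before, the t critical value (2.228 for ν = 10) is supplied from the table, and the function name `pooled_t_interval` is an illustrative assumption:

```python
import math

def pooled_t_interval(x1, var1, n1, x2, var2, n2, t_crit):
    """C.I. for mu1 - mu2 assuming equal (unknown) population variances.

    var1 and var2 are the sample variances; t_crit is t_{alpha/2} with
    n1 + n2 - 2 degrees of freedom, read from a t-table.
    """
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)  # pooled variance
    margin = t_crit * math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp2, diff - margin, diff + margin

sp2, lo, hi = pooled_t_interval(140.67, 7.86690, 6, 138.50, 7.10009, 6, 2.228)
print(round(sp2, 4), round(lo, 2), round(hi, 2))
```

Since the resulting interval contains 0, the data do not rule out equal means for the two wires at the 95% confidence level.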
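\[
\left( \hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\ ,\ \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \right)
\]
or
\[
\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}
\]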
Example 9.10:
In a random sample of n=500 families owning television sets in
the city of Hamilton, Canada, it was found that x=340
subscribed to HBO. Find a 95% confidence interval for the actual
proportion of families in this city who subscribe to HBO.
Solution:
p = proportion of families in this city who subscribe to HBO.
n = sample size = 500
X = no. of families in the sample who subscribe to HBO = 340.
p̂ = proportion of families in the sample who subscribe to HBO
\[
= \frac{X}{n} = \frac{340}{500} = 0.68
\]
\[
\hat{q} = 1 - \hat{p} = 1 - 0.68 = 0.32
\]
A point estimate for p is
\[
\hat{p} = \frac{X}{n} = \frac{340}{500} = 0.68
\]
Now,
95% = (1−α)100% ⇔ 0.95 = (1−α) ⇔ α = 0.05 ⇔ α/2 = 0.025
\( Z_{\alpha/2} = Z_{0.025} = 1.96 \)
A 95% confidence interval for p is:
\[
\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\ ; \qquad \hat{q} = 1 - \hat{p}
\]
\[
0.68 \pm 1.96\sqrt{\frac{(0.68)(0.32)}{500}}
\]
\[
0.68 \pm 0.04
\]
\[
0.64 < p < 0.72
\]
We are 95% confident that p ∈ (0.64, 0.72).
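A minimal Python sketch of the large-sample interval for a proportion, using the standard-library normal quantile in place of the table value 1.96 (the function name `proportion_interval` is illustrative):

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, level):
    """Large-sample C.I. for p: p_hat +/- Z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - margin, p_hat + margin

p_hat, lo, hi = proportion_interval(340, 500, 0.95)
print(round(p_hat, 2), round(lo, 2), round(hi, 2))
```

The approximation relies on n being large, as stated in the notes; for small samples the interval can be unreliable.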
Result:
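(1) \( E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2 \)
(2) \( Var(\hat{p}_1 - \hat{p}_2) = \dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2} \); \( q_1 = 1 - p_1 \), \( q_2 = 1 - p_2 \)
(3) For large n1 and n2, we have
\[
\hat{p}_1 - \hat{p}_2 \sim N\!\left( p_1 - p_2,\ \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \right) \quad \text{(Approximately)}
\]
\[
Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}} \sim N(0,1) \quad \text{(Approximately)}
\]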
10.1-10.3: Introduction +:
• A statistical hypothesis is a conjecture concerning (or a
statement about) the population.
• For example, if θ is an unknown parameter of the
population, we may be interested in testing the conjecture
that θ > θo for some specific value θo.
• We usually test the null hypothesis:
Ho : θ = θo (Null Hypothesis)
against one of the following alternative hypotheses
(Alternative Hypothesis or Research Hypothesis):
\[
H_1 : \begin{cases} \theta \ne \theta_o \\ \theta > \theta_o \\ \theta < \theta_o \end{cases}
\]
Example 10.3:
A random sample of 100 recorded deaths in the United States
during the past year showed an average life span of 71.8 years. Assuming
a population standard deviation of 8.9 years, does this seem to
indicate that the mean life span today is greater than 70 years?
Use a 0.05 level of significance.
Solution:
n = 100, X̄ = 71.8, σ = 8.9
µ = average (mean) life span
µo = 70
Hypotheses:
Ho: µ = 70
H1: µ > 70
T.S.:
\[
Z = \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} = \frac{71.8 - 70}{8.9/\sqrt{100}} = 2.02
\]
Level of significance:
α = 0.05
R.R.:
Zα = Z0.05 = 1.645
Reject Ho if Z > Zα = Z0.05 = 1.645
Decision:
Since Z=2.02 ∈ R.R., i.e., Z=2.02 > Z0.05, we reject Ho at
α=0.05 and accept H1: µ > 70. Therefore, we conclude that the
mean life span today is greater than 70 years.
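The test procedure (hypotheses, test statistic, rejection region, decision) can be sketched as one Python function. This is an illustration, not part of the notes: the name `z_test` and the labels passed to `alternative` are assumptions, and the critical values come from `statistics.NormalDist` instead of the Z-table.

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """One-sample Z-test of Ho: mu = mu0 with known sigma.

    alternative is one of "greater" (H1: mu > mu0), "less" (H1: mu < mu0)
    or "two-sided" (H1: mu != mu0). Returns (z, reject_Ho).
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))      # test statistic
    std = NormalDist()
    if alternative == "greater":
        reject = z > std.inv_cdf(1 - alpha)        # R.R.: Z > Z_alpha
    elif alternative == "less":
        reject = z < -std.inv_cdf(1 - alpha)       # R.R.: Z < -Z_alpha
    elif alternative == "two-sided":
        reject = abs(z) > std.inv_cdf(1 - alpha / 2)  # R.R.: |Z| > Z_{alpha/2}
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject

z, reject = z_test(71.8, 70, 8.9, 100, 0.05, "greater")
print(round(z, 2), reject)
```

The same function covers a two-sided test by passing `alternative="two-sided"`, in which case α is split equally between the two tails.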
Example 10.4:
A manufacturer of sports equipment has developed a new
synthetic fishing line that he claims has a mean breaking
strength of 8 kilograms with a standard deviation of 0.5
kilograms. Test the hypothesis that µ=8 kg against the
alternative that µ≠8 kg if a random sample of 50 lines is tested
and found to have a mean breaking strength of 7.8 kg. Use a
0.01 level of significance.
Solution:
n = 50, X̄ = 7.8, σ = 0.5
µ = mean breaking strength
µo = 8
Hypotheses:
Ho: µ = 8
H1: µ ≠ 8
T.S.:
\[
Z = \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{0.5/\sqrt{50}} = -2.83
\]
Level of significance:
α = 0.01, α/2 = 0.005
R.R.:
Zα/2 = Z0.005 = 2.575 and −Zα/2 = −Z0.005 = −2.575
Reject Ho if Z > 2.575 or Z < −2.575
Decision:
Since Z = −2.83 ∈ R.R., i.e., Z = −2.83 < −Z0.005, we reject
Ho at α=0.01 and accept H1: µ ≠ 8. Therefore, we conclude that
the claim is not correct.
Example 10.5:
… If a random sample of 12 homes included in a planned study
indicates that vacuum cleaners expend an average of 42
kilowatt-hours per year with a standard deviation of 11.9
kilowatt-hours, does this suggest at the 0.05 level of significance
that the vacuum cleaners expend, on the average, less than 46
Department of Statistics and O.R. − 99 − King Saud University
Unrevised First Draft
STAT – 324 Second Semester 1424/1425 Dr. Abdullah Al-Shiha
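Ho : µ1 = µ2 ⇔ Ho : µ1 − µ2 = 0
Generally, suppose we need to test
Ho : µ1 − µ2 = d (for some specific value d)
against one of the following alternative hypotheses:
\[
H_1 : \begin{cases} \mu_1 - \mu_2 \ne d \\ \mu_1 - \mu_2 > d \\ \mu_1 - \mu_2 < d \end{cases}
\]
Hypotheses:
(1) Ho : µ1 − µ2 = d against H1 : µ1 − µ2 ≠ d
(2) Ho : µ1 − µ2 = d against H1 : µ1 − µ2 > d
(3) Ho : µ1 − µ2 = d against H1 : µ1 − µ2 < d
Test Statistic (T.S.):
\[
Z = \frac{(\bar{X}_1 - \bar{X}_2) - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \quad \{\text{if } \sigma_1^2 \text{ and } \sigma_2^2 \text{ are known}\}
\]
or
\[
T = \frac{(\bar{X}_1 - \bar{X}_2) - d}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2) \quad \{\text{if } \sigma_1^2 = \sigma_2^2 = \sigma^2 \text{ is unknown}\}
\]
R.R. and A.R. of Ho: [Figures: rejection and acceptance regions of Ho for each of the three alternatives, in terms of Z or T]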
Example 10.6:
An experiment was performed to compare the abrasive wear of
Recall:
• p = Population proportion of elements of Type A in the
population
\[
= \frac{A}{A + B} = \frac{\text{no. of elements of type A}}{\text{Total no. of elements}}
\]
• n = sample size
• X = no. of elements of type A in the sample of size n.
• p̂ = Sample proportion of elements of Type A in the sample
\( = \dfrac{X}{n} \)
• For large n, we have
\[
Z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{X - np}{\sqrt{npq}} \sim N(0,1) \quad \text{(Approximately, } q = 1 - p\text{)}
\]
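Hypotheses:
(1) Ho: p = po against H1: p ≠ po
(2) Ho: p = po against H1: p > po
(3) Ho: p = po against H1: p < po
Test Statistic (T.S.):
\[
Z = \frac{\hat{p} - p_o}{\sqrt{\dfrac{p_o q_o}{n}}} = \frac{X - n p_o}{\sqrt{n p_o q_o}} \sim N(0,1) \quad (q_o = 1 - p_o)
\]
R.R. and A.R. of Ho: [Figures: rejection and acceptance regions of Ho for each of the three alternatives]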
Example 10.10:
A builder claims that heat pumps are installed in 70% of all
homes being constructed today in the city of Richmond. Would
you agree with this claim if a random survey of new homes in
the city shows that 8 out of 15 homes had heat pumps installed?
Use a 0.10 level of significance.
Solution:
p = Proportion of homes with heat pumps installed in the city.
n = 15
X = no. of homes with heat pumps installed in the sample = 8
p̂ = proportion of homes with heat pumps installed in the
sample \( = \dfrac{X}{n} = \dfrac{8}{15} = 0.5333 \)
Hypotheses:
Ho: p = 0.7 (po = 0.7)
H1: p ≠ 0.7
Level of significance:
α = 0.10
T.S.:
\[
Z = \frac{\hat{p} - p_o}{\sqrt{\dfrac{p_o q_o}{n}}} = \frac{0.5333 - 0.70}{\sqrt{\dfrac{(0.7)(0.3)}{15}}} = -1.41
\]
or
\[
Z = \frac{X - n p_o}{\sqrt{n p_o q_o}} = \frac{8 - (15)(0.7)}{\sqrt{(15)(0.7)(0.3)}} = -1.41
\]
R.R.:
Zα/2 = Z0.05 = 1.645
Reject Ho if Z > 1.645 or Z < −1.645
Decision:
Since Z = −1.41 ∈ A.R., we accept (do not reject) Ho: p=0.7
and reject H1: p ≠ 0.7 at α=0.1. Therefore, we agree with the
claim.
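A short Python sketch of the two-sided proportion test above. It follows the large-sample Z statistic used in these notes; the function name `one_proportion_z_test` is illustrative, and the critical value is taken from `statistics.NormalDist` rather than the table.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0, alpha):
    """Two-sided large-sample test of Ho: p = p0 against H1: p != p0.

    Uses Z = (X - n*p0) / sqrt(n*p0*q0). Returns (z, z_crit, reject_Ho).
    """
    q0 = 1.0 - p0
    z = (x - n * p0) / math.sqrt(n * p0 * q0)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{alpha/2}
    return z, z_crit, abs(z) > z_crit

z, z_crit, reject = one_proportion_z_test(8, 15, 0.7, 0.10)
print(round(z, 2), round(z_crit, 3), reject)
```

Note that n = 15 is a small sample, so the normal approximation behind this statistic is rough here; the sketch simply mirrors the calculation shown above.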
Example 10.12:
A vote is to be taken among the residents of a town and the
surrounding county to determine whether a proposed chemical
plant should be constructed. The construction site is within the
town limits and for this reason many voters in the county feel
that the proposal will pass because of the large proportion of
town voters who favor the construction. To determine if there is
a significant difference in the proportion of town voters and
county voters favoring the proposal, a poll is taken. If 120 of
200 town voters favor the proposal and 240 of 500 county voters
favor it, would you agree that the proportion of town voters
favoring the proposal is higher than the proportion of county
voters? Use a 0.025 level of significance.
Solution:
p1 = proportion of town voters favoring the proposal
p2 = proportion of county voters favoring the proposal
p̂1 = sample proportion of town voters favoring the proposal
p̂2 = sample proportion of county voters favoring the proposal
Town: n1 = 200, X1 = 120, \( \hat{p}_1 = \dfrac{X_1}{n_1} = \dfrac{120}{200} = 0.60 \), \( \hat{q}_1 = 1 - 0.60 = 0.40 \)
County: n2 = 500, X2 = 240, \( \hat{p}_2 = \dfrac{X_2}{n_2} = \dfrac{240}{500} = 0.48 \), \( \hat{q}_2 = 1 - 0.48 = 0.52 \)
The pooled estimate of the common proportion under Ho is
\[
\hat{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{120 + 240}{200 + 500} = 0.51, \qquad \hat{q} = 1 - \hat{p} = 0.49
\]
Hypotheses:
Ho: p1 − p2 = 0
H1: p1 − p2 > 0
Level of significance:
α = 0.025
R.R.:
Zα = Z0.025 = 1.96
Reject Ho if Z > Zα = 1.96
T.S.:
\[
Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.60 - 0.48}{\sqrt{(0.51)(0.49)\left(\dfrac{1}{200} + \dfrac{1}{500}\right)}} = 2.869
\]
Decision:
Since Z = 2.869 ∈ R.R. (Z = 2.869 > Zα = Z0.025 = 1.96), we
reject Ho: p1 = p2 and accept H1: p1 > p2 at α=0.025. Therefore,
we agree that the proportion of town voters favoring the
proposal is higher than the proportion of county voters.
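The calculation above can be reproduced with a small Python sketch. It assumes the pooled estimate p̂ = (X1 + X2)/(n1 + n2), which is what the values 0.51 and 0.49 in the test statistic correspond to; the function name `two_proportion_z` is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for Ho: p1 = p2 using the pooled proportion estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                 # pooled estimate under Ho
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

z = two_proportion_z(120, 200, 240, 500)
z_crit = NormalDist().inv_cdf(1 - 0.025)           # Z_{0.025} for H1: p1 > p2
print(round(z, 2), round(z_crit, 2), z > z_crit)
```

The code keeps the pooled proportion unrounded (360/700), so its Z value matches the hand calculation to about two decimal places.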