Complete Lectures PME
CONTENTS
1. Introduction to Statistics
2. Measures of Central Tendency
3. Probability
4. Conditional Probability & Mathematical Expectation
5. Probability Distributions
6. Probability Densities
7. Sampling Distributions
8. Regression & Correlation Analysis
9. Estimation of Parameters
10. Testing of Hypotheses
Agenda
Introduction of Course
Contents of Course
Grade Criteria
Course Prerequisite:
Calculus (Integral & Differential)
Algebra (Solution of system of Equation)
Recommended Books
1. Applied Statistics and Probability for Engineers, By Douglas C. Montgomery
2. Probability for Engineers By Irwin Miller, John E Freund
3. Statistical methods for Engineering & Scientists, By Walpole & Myers
4. Introduction to statistics Theory, By Sher Mohammad
Student's Efforts: Besides class hours (only 48 hrs), every student should devote at least 6 hours a week to grasp the content of the book and the class notes, and to work out the examples.
Why you should study Engineering Analysis & Statistics
3. Propose a model for the problem, using scientific or engineering knowledge of the phenomenon being
studied. State any limitations or assumptions of the model.
4. Conduct appropriate experiments and collect data to test or validate the tentative model or conclusions
made in steps 2 and 3.
7. Conduct an appropriate experiment to confirm that the proposed solution to the problem is both
effective and efficient.
8. Draw conclusions or make recommendations based on the problem solution.
Why you should study probability
The mark you will get on the final
Can you make money by playing the Lottery?
Your average annual income over the next several years
Whether it will rain tomorrow
[Figure: Population (Parameters) and Sample (Statistic), linked by Inference]
Graphical display is further divided into two types:
Graph
Diagram
Frequency Distribution
Class Limit
Class Boundary
Class Interval
Class Mark ( Midpoint value)
Make a Classification of data into groups
106, 107, 76, 82, 109, 107, 115, 93, 187, 95, 123, 125, 111, 92, 86, 70, 110, 126,
68, 130, 129, 139, 115, 128, 100, 186, 84, 99, 113, 204, 111, 141, 136, 123, 90,
115, 98, 110, 78, 185, 162, 178, 140, 152, 173, 146, 158, 194, 148, 90, 107, 181,
[Figure: bar chart by year, 1965 to 1969; horizontal axis: Years; vertical axis: 0 to 60,000]
Multiple Bar Chart
Example:
[Figure: multiple bar chart of Area and Production for 1965-66, 1970-71 and 1975-76; vertical axis: 0 to 4,000]
Component Bar Chart
Rawalpindi 40 21 19
Sargodha 60 32 28
Lahore 65 35 30
[Figure: component bar chart for Peshawar, Rawalpindi, Sargodha and Lahore, with Male and Female components; vertical axis: 0 to 70]
Pie Diagram
A pie diagram consists of a circle divided into sectors whose areas are proportional to the various parts into which the whole quantity is divided.
Example: Represent the total expenditure and expenditure of
various items of a family by the Pie diagram.
Historigram:
[Figure: historigram of No. of Cars by year, 1929 to 1931; vertical axis: 0 to 120]
Histogram
The sum of the deviations of all the observations taken from their arithmetic mean is zero,
i.e. $\sum (x - \bar{x}) = 0$
The sum of the squares of the deviations of the observations taken from the arithmetic mean is minimum,
i.e. $\sum (x - \bar{x})^2$ is minimum
or $\sum (x - \bar{x})^2 \le \sum (x - a)^2$
where a is an arbitrary value.
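The two properties of the arithmetic mean can be checked numerically. The sketch below uses illustrative values (the list `data` is an assumption, not taken from the lecture) and compares the sum of squared deviations about the mean with the sum about a few other values of a.

```python
from statistics import mean

data = [15, 20, 25, 32, 48]          # illustrative values, not from the lecture
x_bar = mean(data)

# Property 1: deviations from the mean sum to zero.
sum_dev = sum(x - x_bar for x in data)

def sum_sq(a):
    """Sum of squared deviations of the data from an arbitrary value a."""
    return sum((x - a) ** 2 for x in data)

# Property 2: the sum of squares is smallest when a equals the mean.
candidates = [x_bar - 5, x_bar - 1, x_bar, x_bar + 1, x_bar + 5]
best = min(candidates, key=sum_sq)
```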
Combined mean & Weighted mean
Combined mean
If a series of n observations consists of two components having n1 and n2 observations (n1 + n2 = n), and means $\bar{x}_1$ and $\bar{x}_2$ respectively, then the combined mean of the n observations is given by
$\bar{X}_C = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
Arithmetic Mean
Merits
Arithmetic mean is the most popular among averages used in statistical analysis.
It is very simple to understand and easy to calculate.
The calculation of A.M is based on all the observations in the series.
The A.M is amenable to further algebraic treatment.
It is strictly defined.
It provides a good means of comparison.
Demerits
The A.M is affected by the extreme (lowest & highest) values in a series.
In case of a missing observation in a series it is not possible to calculate the A.M.
In case of a frequency distribution with open-end classes the calculation of A.M is impossible.
GEOMETRIC MEAN
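The geometric mean of n positive observations is defined as the nth root of the product of all the observations.
$G.M = (X_1 \times X_2 \times \dots \times X_n)^{1/n}$
or $G.M = \text{antilog}\left\{ \frac{\sum \log X_i}{n} \right\}$
For a frequency distribution:
$G = (X_1^{f_1} \times X_2^{f_2} \times \dots \times X_n^{f_n})^{1/\sum f_i}$
or $G = \text{antilog}\left\{ \frac{\sum f_i \log x_i}{\sum f_i} \right\}$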
Combined Geometric mean
$G_c = G_1^{w_1} \times G_2^{w_2}$
where $w_1 = \frac{n_1}{n_1 + n_2}$ and $w_2 = \frac{n_2}{n_1 + n_2}$
Geometric mean
Merits Demerits
It is rigidly defined and its value is a It cannot be calculated if any of the
precise figure. observation is zero or negative.
$H.M = \frac{\sum f}{\sum (f/x)}$
Harmonic Mean
Merits
Its value is based on all the observations of the data.
It is less affected by the extreme values.
It is suitable for further mathematical treatment.
It is strictly defined.
Demerits
It is neither simple to calculate nor easy to understand.
It can not be calculated if one of the observations is zero.
The H.M is never greater than the A.M and G.M.
Relation between A.M, G.M, and H.M
It is the value that divides the arranged set of data into two equal parts.
For n observations x1, x2, ..., xn, first arrange the data in ascending or descending order.
For odd n, the $\left(\frac{n+1}{2}\right)$th observation is the median.
For even n, the median is the A.M of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n+2}{2}\right)$th values.
For grouped data, first choose the median group by locating n/2 in the cumulative frequencies.
Median=
Merits
The median is useful in case of a frequency distribution with open-end classes.
The median is recommended if the distribution has unequal classes.
Extreme values do not affect the median.
Demerits
For calculating the median it is necessary to arrange the data, whereas other averages do not need arrangement.
Since it is a positional average, its value is not determined by all the observations in the series.
It can not be calculated if the first class is chosen as the median class.
For grouped data, first choose the mode group by observing the largest frequency.
$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
where, l= lower class boundary of mode group.
fm =Frequency of mode group
f1 =Frequency of preceding group of mode group
f2 =Frequency of following group of mode group
h=class interval
Mode
Merits
It is easy to calculate and simple to understand.
It is not affected by the extreme values.
Its value can be determined in case of open-end class interval.
The mode is strictly defined.
Demerits
It is not suitable for further mathematical treatments.
The value of mode cannot always be determined.
The value of mode is not based on each and every value of the series.
It can not be calculated if first or last class is chosen as mode class.
Relative position of mean, median, mode for three distribution
Symmetric distribution
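Variance:
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$
or
$\sigma^2 = \frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2$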
Properties of variance
Var(X) cannot be negative.
Var(a) = 0 , where a is constant.
Var(aX) = a² Var(X) , where a is constant.
Var(X + a) = Var(X) , where a is constant.
Var(aX + b) = a² Var(X) , where a & b are constants.
Var(X + Y) = Var(X) + Var(Y) , where X and Y are independent.
Var(X - Y) = Var(X) + Var(Y) , where X and Y are independent.
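These properties can be verified exactly on a small case. The sketch below takes X and Y to be the points on two independent fair dice (an illustrative assumption) and enumerates all 36 outcomes; for independent variables both Var(X + Y) and Var(X - Y) come out equal to Var(X) + Var(Y).

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)                  # each (x, y) pair of two fair dice

def var(g):
    """Variance of g(x, y) over the 36 equally likely outcomes."""
    m = sum(p * g(x, y) for x, y in product(faces, faces))
    return sum(p * (g(x, y) - m) ** 2 for x, y in product(faces, faces))

var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)
var_sum = var(lambda x, y: x + y)
var_diff = var(lambda x, y: x - y)
var_lin = var(lambda x, y: 3 * x + 4)   # a = 3, b = 4
```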
Standard Deviation
GM=39.67
Find harmonic mean of 15, 20, 25.
HM = 3 / (1/15 + 1/20 + 1/25) = 19.15
Find Median of
(i) 15, 20, 25 (ii) 2, 2, 3, 3, 3, 4, 4, 4, 5, 5
(i) Median = 20 (ii) Median = (3 + 4)/2 = 3.5
Find mode of 2, 3, 4, 5, 6, 7, 8, 9,10 mode does not exist
Find mode of 2, 3, 3, 3, 4, 4, 4, 4, 4 mode=4
Find mode of 2, 3, 3, 3, 4, 4, 4, 5, 6 mode=3, 4
Find mode of 2, 2, 3, 3, 4, 4, 5, 5, 6 mode=2, 3, 4, 5
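The ungrouped examples above can be reproduced with Python's standard library. This is a minimal sketch: `modes` is a hypothetical helper written for these notes, and it follows the convention used here that a data set in which no value repeats has no mode.

```python
from collections import Counter
from statistics import harmonic_mean, median

def modes(values):
    """Return the most frequent values, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

hm = harmonic_mean([15, 20, 25])
med_odd = median([15, 20, 25])
med_even = median([2, 2, 3, 3, 3, 4, 4, 4, 5, 5])
```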
EXAMPLE (Grouped Data)
Find the A.M, G.M, H.M, Median and Mode from data given below
Class limits Frequency
65-84 9
85-104 10
105-124 17
125-144 10
145-164 5
165-184 4
185-204 5
A.M (Solution)
G.M = antilog(124.248 / 60) = 117.7
H.M (Solution)
H.M = 60 / 0.5304 = 113.1
Median (Solution)
Highest frequency is 17
Mode = 104.5 + [(17 - 10) / ((17 - 10) + (17 - 10))] × 20 = 114.5
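The grouped-data calculations can be sketched in a few lines, using the class marks as representative values. The code assumes integer class limits, so each lower class boundary is taken as the lower limit minus 0.5, and it assumes the modal class is neither the first nor the last class.

```python
from math import log10

# (lower limit, upper limit, frequency) from the table in the example
classes = [(65, 84, 9), (85, 104, 10), (105, 124, 17), (125, 144, 10),
           (145, 164, 5), (165, 184, 4), (185, 204, 5)]

marks = [(lo + hi) / 2 for lo, hi, _ in classes]   # class marks (midpoints)
freqs = [f for _, _, f in classes]
n = sum(freqs)

am = sum(f * x for f, x in zip(freqs, marks)) / n
gm = 10 ** (sum(f * log10(x) for f, x in zip(freqs, marks)) / n)
hm = n / sum(f / x for f, x in zip(freqs, marks))

# Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h
i = freqs.index(max(freqs))
l = classes[i][0] - 0.5                  # lower class boundary
h = classes[i][1] - classes[i][0] + 1    # class interval
fm, f1, f2 = freqs[i], freqs[i - 1], freqs[i + 1]
mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h
```

With these data the ordering H.M ≤ G.M ≤ A.M can be read directly off `hm`, `gm` and `am`.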
CO-EFFICIENT OF VARIATION
1.08229
σA=1.08229 x A 1.0566 C.V for team A
1.0566
100 1.09 0 0
The proportion of the values that fall within k standard deviations of the mean will be at least $1 - \frac{1}{k^2}$, where k is any number greater than 1.
"Within k standard deviations" is interpreted as the interval $\bar{x} - ks$ to $\bar{x} + ks$.
Chebyshev's Theorem is true for any sample set, no matter what the distribution.
Empirical Rule
The empirical rule is only valid for bell-shaped (normal) distributions. The following statements
are true.
Approximately 68% of the data values fall within one standard deviation of the mean.
Approximately 95% of the data values fall within two standard deviations of the mean.
Approximately 99.7% of the data values fall within three standard deviations of the mean.
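Chebyshev's bound can be compared with the proportion actually observed in a data set. The sketch below uses the first 18 values of the data listed earlier in these notes; the helper names are hypothetical, and the data are not claimed to be bell-shaped, so only the Chebyshev bound (not the empirical rule) is guaranteed to hold.

```python
from statistics import mean, stdev

def chebyshev_bound(k):
    """Minimum proportion within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of data within k sample standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return sum(m - k * s <= x <= m + k * s for x in data) / len(data)

data = [106, 107, 76, 82, 109, 107, 115, 93, 187,
        95, 123, 125, 111, 92, 86, 70, 110, 126]
```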
Exercise
2. Find A.M, G.M, H.M, median and mode of following data
H .M G.M A.M
Wages in rupees Less than 10 Less than 20 Less than 30 Less than 40 Less than 50
No. of workers 5 17 20 22 25
_
x - x x -a
_
x x 0
Exercise
4. What is the least value of arithmetic mean for which f3 must exist
What is the most value of arithmetic mean for which f3 must exist
Groups 60-62 63-65 66-68 69-71 72-74
Frequencies 15 54 f3 81 24
5. How are the arithmetic mean and median affected if every value of the variable is increased by 2 and multiplied by 5?
6. The G.M of 10 items of a series was 16.2. It was later found that one of the items was wrongly taken as 12.9 instead of
21.9. Calculate the correct G.M.
7. If n1=2, n2 = 3 and n3=5 and GM1=8, GM2=10 and GM3=15. find the combined geometric mean of all the observations.
8. Find the average rate of growth of a population which increased by 20% in the first decade, by 30% in the second decade and by 45% in the third.
9. Write a precise note on “Proper choice of measures of central tendency” for different type of data.
Probability
Outline
Basic definitions
Counting principles
Probability
Elementary theorem of probability
Examples of Probability
OBSERVATION AND VARIABLE
Any numerical recording of information is known as an observation.
A characteristic that varies with an individual or object is called a variable.
Toss a die
X=points showing on upper face of die
Test a light bulb
X=lifetime of bulb
Test 20 light bulbs
X=average lifetime of bulbs
Types of Random variable
1) Discrete random variable:
For example:
Number of students in class.
Number of Facebook friends.
2) Continuous random variable:
For example:
Height, weight and temperature
Lifetime of a bulb
Time t > 0 or [0, t]
Experiment
An experiment is a process whose results yield a set of data.
For example: Tossing of a coin.
Random Experiment:
An experiment which produces different results under similar conditions is called a random experiment.
Experiment: Outcomes
Flip a coin: Heads, Tails
Exam Marks: Numbers: 0, 1, 2, ..., 100
Course Grades: F, C, C+, B-, B, B+, A-, A
Sample Space
Permutation
Choose r objects out of n objects with order (choice without replacement)
nPr = n!/(n-r)!
Combination
Choose r objects out of n objects without order (choice without replacement)
nCr = n!/[r!(n-r)!]
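Both counting formulas are available in Python's `math` module (version 3.8 or later). The values n = 5 and r = 2 below are illustrative only.

```python
from math import comb, factorial, perm

n, r = 5, 2

n_p_r = perm(n, r)     # ordered selections: n! / (n - r)!
n_c_r = comb(n, r)     # unordered selections: n! / (r! (n - r)!)
```

Each unordered selection of r objects can be arranged in r! orders, which is why nPr = nCr × r!.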
Product rule
EXAMPLE 1
(Assuming a fair die) S = {1, 2, 3, 4, 5, 6}
P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
Then:
P(EVEN) = P(2) + P(4) + P(6)
= 1/6 + 1/6 + 1/6
= 3/6
= 1/2
Example 2
A fair coin is tossed two times. What is the probability that at least one head appears?
SOLUTION:
Sample space S= {HH, HT, TH, TT}
n(S) = 2 × 2 = 2² = 4
Let A be an event that at least one head appears
A= {HH, HT, TH}
n (A) =3
We know that
P(A)=n(A)/n(S)
P(A)=3/4
So the probability that at least one head appears is 3/4
Example 3
An employer wishes to hire 3 people from a group of 15 equally qualified applicants, including 8 men and 7 women.
If he selects 3 candidates randomly, what is the probability that
a) All three selected are women
b) At-least one woman is selected.
Solution (a) Let A be an event that all three selected are women.
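One way to work both parts is by counting combinations, since every group of 3 applicants is equally likely. The sketch below is not taken from the slides; the variable names are hypothetical, and part (b) uses the complement "no woman is selected".

```python
from fractions import Fraction
from math import comb

men, women, hired = 8, 7, 3
total = comb(men + women, hired)            # all equally likely groups of 3

p_all_women = Fraction(comb(women, hired), total)
p_at_least_one_woman = 1 - Fraction(comb(men, hired), total)
```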
3. If A, B and C are any three events, then the probability that at least one of them occurs is
P(AUBUC) = P(A)+P(B)+P(C)-P(A∩B)-P(B∩C)-P(A∩C)+P(A∩B∩C)
4. If A1, A2, ..., Ak are any k events, then the probability that at least one of them occurs is
$P(A_1 \cup A_2 \cup \dots \cup A_k) = \sum_i P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<l} P(A_i \cap A_j \cap A_l) - \dots + (-1)^{k-1} P(A_1 \cap A_2 \cap \dots \cap A_k)$
Solution: n(S) = 2⁴ = 16
S = {HHHH, HHHT, HHTT, HTTT, TTTT, HTHH, HHTH, THHH, HTTH, HTHT, THTH, TTHH, THHT, TTTH, TTHT, THTT}
Let A be the event that at least one head occurred
A = {HHHH, HHHT, HHTT, HTTT, HTHH, HHTH, THHH, HTTH, HTHT, THTH, TTHH, THHT, TTTH, TTHT, THTT}
n(A)=15
P(A)=15/16
Alternative Solution
By using Theorem 1
P(A)= 1-P(Ā)= 1- 1/16= 15/16
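The direct enumeration and the complement rule can be compared for any number of tosses. This is a minimal sketch with hypothetical function names; it lists all 2^n outcomes, so it is only practical for small n.

```python
from fractions import Fraction
from itertools import product

def p_at_least_one_head(n):
    """Enumerate all 2**n outcomes of n fair coin tosses."""
    outcomes = list(product("HT", repeat=n))
    favourable = [o for o in outcomes if "H" in o]
    return Fraction(len(favourable), len(outcomes))

def p_by_complement(n):
    """1 - P(no head), the complement-rule route."""
    return 1 - Fraction(1, 2 ** n)
```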
Example 5
Sum of points on two dice:
     1  2  3  4  5  6
1    2  3  4  5  6  7
2    3  4  5  6  7  8
3    4  5  6  7  8  9
4    5  6  7  8  9 10
5    6  7  8  9 10 11
6    7  8  9 10 11 12
P(5) = 4/36
P(11) = 2/36
P(A) = 0.07
P(B) = 0.12
P(C) = 0.17
P(D) = 0.32
P(E) = 0.21
P(F) = 0.11
(i) P(AUBUCUD)=?
P(DUEUF)=?
Since Events D, E & F are disjoint so by Theorem 2
P(DUEUF) = P(D)+P(E)+P(F)
= 0.32+ 0.21+0.11 = 0.64
Example 7
A card is drawn randomly from a deck of playing cards. What is the probability that it is a diamond card, a face card or a king?
Information
A deck of cards consists of 52 cards in 4 suits: Diamond, Spade, Club, Heart
13 Diamond cards
13 Heart cards
Solution
P(AUBUC)=P(A)+P(B)+P(C)-P(A∩B)-P(A∩C)-P(B∩C)+P(A∩B∩C)
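$P(A \cup B \cup C) = \frac{13}{52} + \frac{12}{52} + \frac{4}{52} - \frac{3}{52} - \frac{1}{52} - \frac{4}{52} + \frac{1}{52} = \frac{22}{52} = \frac{11}{26}$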
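The same answer can be obtained by building the deck and counting. In this sketch A is taken as the diamond cards, B as the face cards (J, Q, K) and C as the kings; the rank and suit labels are illustrative choices.

```python
from fractions import Fraction
from itertools import product

suits = ["Diamond", "Spade", "Club", "Heart"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = list(product(ranks, suits))

A = {c for c in deck if c[1] == "Diamond"}          # diamond cards
B = {c for c in deck if c[0] in ("J", "Q", "K")}    # face cards
C = {c for c in deck if c[0] == "K"}                # kings

def P(event):
    return Fraction(len(event), len(deck))

direct = P(A | B | C)
by_theorem = (P(A) + P(B) + P(C) - P(A & B) - P(A & C) - P(B & C)
              + P(A & B & C))
```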
Example 8
Two coins are tossed. What is the conditional probability that two heads appear given that there is at least one head?
Solution:
Sample space for two coins S = {HH, HT, TH, TT}
Let A be the event that 2 heads appear
A = {HH}, P(A) = 1/4
Let B be the event that at least 1 head appears
B = {HH, HT, TH}, P(B) = 3/4
P(A|B) = ?
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
A ∩ B = {HH}, P(A ∩ B) = 1/4
$P(A \mid B) = \frac{1/4}{3/4} = \frac{1}{3}$
NOTE THAT (!)
$P(A \mid B) \ne P(A)$
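The definition P(A|B) = P(A∩B)/P(B) can be applied mechanically to the enumerated sample space. A minimal sketch, with `P` as a hypothetical helper for equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=2))            # sample space for two coins
A = {o for o in S if o.count("H") == 2}      # two heads
B = {o for o in S if o.count("H") >= 1}      # at least one head

def P(event):
    return Fraction(len(event), len(S))

p_a_given_b = P(A & B) / P(B)
```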
Exercise
A maintenance firm has gathered the following information regarding the failure
mechanisms for air conditioning systems:
Evidence of gas leaks
YES NO
NO 32 3
The units without evidence of gas leaks or electrical failure showed other types of
failure. If this is a representative sample of AC failure, find the probability
(a) That failure involves a gas leak
(b) That there is evidence of electrical failure given that there was a gas leak
(c) That there is evidence of a gas leak given that there is evidence of electrical
failure
Conditional Probability
Outline
Conditional Probabilities
Total Probability
Bayes' Theorem
Mathematical Expectation
Decision Making
Conditional Probability
$P(A_1 \cap A_2 \cap \dots \cap A_k) = P(A_1) P(A_2) \cdots P(A_k)$
Total Probability Theorem
In a certain assembly plant, three machines, B1, B2, and B3, make 30%, 45%, and
25% of the products respectively. It is known from past experience that 2%, 3%,
and 2% of the products made by each machine are defective respectively. Now
suppose that a finished product is randomly selected. What is the probability that
it is defective?
Solution: Consider the following events
A: The product is defective
B1: The product is made by machine B1
B2: The product is made by machine B2
B3: The product is made by machine B3
and hence
P(A) = 0.006 + 0.0135 + 0.005 = 0.0245
$P(B_3 \mid A) = \frac{P(A \mid B_3) P(B_3)}{P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) + \dots + P(A \mid B_k) P(B_k)} = \frac{0.005}{0.0245} \approx 0.204$
In view of the fact that a defective product was selected, this result suggests that it probably was not made by machine B3.
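The total probability and Bayes calculations for the three machines can be sketched as follows. The dictionary names are hypothetical; the percentages are those given in the problem statement.

```python
prior = {"B1": 0.30, "B2": 0.45, "B3": 0.25}        # P(Bi): share of production
defect_rate = {"B1": 0.02, "B2": 0.03, "B3": 0.02}  # P(A | Bi)

# Total probability: P(A) = sum of P(A | Bi) P(Bi)
p_defective = sum(defect_rate[m] * prior[m] for m in prior)

# Bayes' theorem: P(Bi | A) = P(A | Bi) P(Bi) / P(A)
posterior = {m: defect_rate[m] * prior[m] / p_defective for m in prior}
```

The posterior probabilities sum to 1, and the largest one identifies the machine most likely to have produced a defective item.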
$0 \le p(x) \le 1$,
$\sum p(x) = 1$
The function f(x) is a probability density function for a continuous random variable X defined over the set of real numbers R, if
1. $f(x) \ge 0$ for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le x \le b) = \int_a^b f(x)\,dx$
Mathematical Expectation
Discrete: $E(x) = \sum x P(x)$
Continuous: $E(x) = \int_{-\infty}^{\infty} x f(x)\,dx$
Properties of Mathematical Expectation
What is the expected value of the number of heads when three fair coins are tossed?
tossed.
X Outcome P(x) xP(x)
E=∑x p(x)
=12/8=1.5
A salesman can earn Rs 1000 per day if the day is rainy; otherwise he loses Rs 200. What is the expected earning of the salesman if the probability of rain is 0.3?
E=∑x p(x)
= 1000(0.3) + (-200)(0.7)
=160
Expected earning of salesman is Rs 160 per day
Decision Making problem
Two new product designs are to be compared on the basis of revenue potential. Marketing
feels that the revenue from design A can be predicted quite accurately to be $3 million. The
revenue potential of design B is more difficult to assess. Marketing concludes that there is a
probability of 0.3 that the revenue from design B will be $7 million, but there is a 0.7 probability that
the revenue will be only $2 million. Which design do you prefer?
Let X denote the revenue from design A. Because there is no uncertainty in the revenue
from design A, we can model the distribution of the random variable X as $3 million with
probability 1. Therefore, E(x)= $3 million .
Let Y denote the revenue from design B. The expected value of Y in millions of dollars is
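The expected values in the salesman and product-design problems follow directly from E(x) = ∑ x p(x). The sketch below uses a hypothetical helper `expected_value`; the text above stops before evaluating E(Y), so the figure computed for design B is simply the arithmetic implied by the stated probabilities.

```python
def expected_value(dist):
    """dist maps each value x to its probability p(x)."""
    assert abs(sum(dist.values()) - 1) < 1e-9
    return sum(x * p for x, p in dist.items())

salesman = {1000: 0.3, -200: 0.7}    # rupees per day
design_a = {3: 1.0}                  # revenue in $ million
design_b = {7: 0.3, 2: 0.7}

e_salesman = expected_value(salesman)
e_a = expected_value(design_a)
e_b = expected_value(design_b)
```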
Let the continuous random variable X denote the current measured in a thin copper wire in milliamperes. Assume that the range of X is [0, 20 mA], and assume that the probability density function of X is f(x) = 0.05 for 0 < x < 20. What is the probability that a current measurement is less than 10 milliamperes?
Is f(x) a probability density function?
$\int_0^{20} 0.05\,dx = 1$
$P(0 \le x \le 10) = \int_0^{10} 0.05\,dx = 0.5$
Mathematical Expectation(for continuous variable)
Probability distribution
Uniform distribution
Binomial distribution
Hypergeometric distribution
Geometric distribution
Poisson distribution
The mean of a probability distribution
Standard deviation of a probability distribution
Probability distributions
For a discrete random variable, the probability for each outcome x to occur is denoted by f(x), known as a probability distribution if it satisfies
$0 \le f(x) \le 1$
$\sum f(x) = 1$
Uniform distribution
x 1 2 3 4 5 6
?
x f(x)
0 ¼
1 ½
Toss a coin ten time. X=# of heads
2 1/4
?
Pick up 2 cards from a deck of cards. X=# of aces
x f(x)
Example
Verify that for the number of heads obtained in four flips of a balanced coin the probability distribution is given by
$f(x) = \frac{\binom{4}{x}}{16}$, for x = 0, 1, 2, 3, and 4
In many applied problems, we are interested in the probability that an event will occur x times out of n.
Roll a die 3 times. X=# of sixes
x f(x)
0 (5/6)³
1 3 (1/6) (5/6)²
2 3 (1/6)² (5/6)
3 (1/6)³
$f(x) = \binom{3}{x} \left(\frac{1}{6}\right)^x \left(\frac{5}{6}\right)^{3-x}$
Toss a die 5 times. X = # of sixes. Find P(X = 2).
S = six, N = not a six
SSNNN 1/6*1/6*5/6*5/6*5/6 = (1/6)²(5/6)³
SNSNN 1/6*5/6*1/6*5/6*5/6 = (1/6)²(5/6)³
SNNSN 1/6*5/6*5/6*1/6*5/6 = (1/6)²(5/6)³
SNNNS, NSSNN, NSNSN, NSNNS, NNSSN, NNSNS, NNNSS, etc.
10 ways to choose 2 of 5 places for S:
$\binom{5}{2} = \frac{5!}{2!(5-2)!} = \frac{5!}{2!\,3!} = \frac{5 \times 4 \times 3!}{2 \times 1 \times 3!} = 10$
$P(x = 2) = 10 \times \left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right)^3$
In general, each arrangement has probability [P(S)]^(# of S) × [1 - P(S)]^(5 - # of S).
n independent trials; p probability of a success; x=# of successes
A trial with only two possible outcomes is used so frequently as a building block of a random experiment
that it is called a Bernoulli trial.
A random experiment consists of n Bernoulli trials such that
1) There are a fixed number of trials. This is denoted by n.
2) The n trials are independent and repeated under identical conditions.
3) Each trial results in only two possible outcomes, labeled as “success’’ and “failure’’
4) The probability of a success in each trial, denoted as p, remains constant
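The random variable X is a binomial random variable with parameters n and p. The probability function of X is
$f(x) = \binom{n}{x} p^x (1-p)^{n-x}$
where $\binom{n}{x}$ is the number of ways to choose x places for S, and $p^x (1-p)^{n-x}$ is the probability of each such arrangement.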
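The binomial probability function translates directly into code. The sketch below uses exact fractions and checks the die example (n = 5 tosses, p = 1/6, x = 2 sixes); `binomial_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p_six = Fraction(1, 6)
p_two_sixes_in_five = binomial_pmf(2, 5, p_six)
total = sum(binomial_pmf(x, 5, p_six) for x in range(6))
```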
Roll a die 20 times. X=# of 6’s, n=20, p=1/6
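$f(x) = \binom{20}{x} \left(\frac{1}{6}\right)^x \left(\frac{5}{6}\right)^{20-x}$
$P(x = 4) = \binom{20}{4} \left(\frac{1}{6}\right)^4 \left(\frac{5}{6}\right)^{16}$
For n = 10, p = 1/2:
$f(x) = \binom{10}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{10-x} = \binom{10}{x} \left(\frac{1}{2}\right)^{10}$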
Geometric distribution
If we sample with replacement and the trials are all independent, the
binomial distribution applies.
In the box: a successes, b non-successes
n picked
X = # of successes
The probability of getting x successes (white balls):
p(x) = (# of ways to pick n balls with x successes) / (total # of ways to pick n balls)
# of ways to pick n balls with x successes
= (# of ways to choose x successes) × (# of ways to choose n-x non-successes)
= $\binom{a}{x} \binom{b}{n-x}$
A sample of n objects is selected randomly (without replacement) from the a+b objects.
Let the random variable X denote the number of successes in the sample. Then X is a hypergeometric random variable and its probability function is defined as
$f(x) = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{a+b}{n}}, \quad x = 0, 1, 2, \dots, a$
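A small sketch of the hypergeometric probability function, using exact fractions. The box of a = 3 white and b = 2 red balls with n = 2 draws is an illustrative choice; `hypergeom_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, n, a, b):
    """f(x) = C(a, x) C(b, n - x) / C(a + b, n)."""
    return Fraction(comb(a, x) * comb(b, n - x), comb(a + b, n))

# Draw n = 2 balls from a = 3 white and b = 2 red
dist = {x: hypergeom_pmf(x, 2, 3, 2) for x in range(3)}
```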
Example
$P(X = 2) = \frac{\binom{4}{2} \binom{48}{3}}{\binom{52}{5}}$
Example
$P(X = 8) = \frac{\binom{98}{8} \binom{2}{2}}{\binom{100}{10}}$
Poisson distribution
This distribution is used to model the number of “rare” events that occur in a time
interval, volume, area, length, etc…
Example: Number of deaths from horse kicks in the Army in different years
Given an interval of real numbers, assume counts occur at random throughout the interval.
If the interval can be partitioned into subintervals of small enough length such that
The number of successes in a fixed subinterval, follows a Poisson process provided the
following conditions are met
1. The probability of two or more successes in any sufficiently small subinterval is 0.
2. The probability of success is the same for any two subintervals of equal length.
3. The number of successes in any subinterval is independent of the number of successes in
any other subinterval provided the subintervals are not overlapping.
Poisson distribution
The random variable X that equals the number of counts in the interval is a Poisson random variable with parameter λ, and the probability function of X is
$f(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \dots$
When there is a large number of trials, but a small probability of success, binomial
calculation becomes impractical
Limiting case of Binomial dist
Radioactive decay
x = # of particles/min
λ = 2 particles per minute
$P(x = 3) = \frac{2^3 e^{-2}}{3!}$
Example
Radioactive decay
X = # of particles/hour
λ = 2 particles/min × 60 min/hour = 120 particles/hr
$P(x = 125) = \frac{120^{125} e^{-120}}{125!}$
The Poisson Distribution
Emission of α-particles
In 1910, Ernest Rutherford and Hans Geiger recorded the number of α-particles emitted from a polonium source in successive intervals of one-eighth of a minute. The results are reported in a table. Does a Poisson probability function accurately describe the number of α-particles emitted?
No. of α-particles   Observed
0         57
1         203
2         383
3         525
4         532
5         408
6         273
7         139
8         45
9         27
10        10
11        4
12        0
13        1
14        1
Over 14   0
Total     2608
Source: Rutherford, Sir Ernest; Chadwick, James; and Ellis, C.D. Radiations from Radioactive Substances. London, Cambridge University Press, 1951, p. 172.
Calculation of λ:
λ = No. of particles per interval
= 10097/2608
= 3.87
Expected values = $2608 \times \frac{e^{-3.87} (3.87)^x}{x!}$
No. of α-particles   Observed   Expected
0         57     54
1         203    210
2         383    407
3         525    525
4         532    508
5         408    394
6         273    254
7         139    140
8         45     68
9         27     29
10        10     11
11        4      4
12        0      1
13        1      1
14        1      1
Over 14   0      0
Total     2608   2608
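The expected counts can be regenerated from the observed table. This sketch estimates λ as total particles divided by total intervals and multiplies each Poisson probability by the number of intervals; it uses the unrounded λ, so individual values may differ slightly from a table computed with λ = 3.87.

```python
from math import exp, factorial

observed = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1]
n_intervals = sum(observed)                                  # 2608
n_particles = sum(x * f for x, f in enumerate(observed))     # 10097
lam = n_particles / n_intervals

def poisson_pmf(x, lam):
    """f(x) = lam^x e^(-lam) / x!"""
    return lam ** x * exp(-lam) / factorial(x)

expected = [n_intervals * poisson_pmf(x, lam) for x in range(len(observed))]
```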
The mean of a probability distribution
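$E(X) = \sum_{x=0}^{3} x \binom{3}{x} \left(\frac{1}{6}\right)^x \left(\frac{5}{6}\right)^{3-x}$
$= 0 \cdot \left(\frac{5}{6}\right)^3 + 1 \cdot 3 \left(\frac{1}{6}\right) \left(\frac{5}{6}\right)^2 + 2 \cdot 3 \left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right) + 3 \cdot \left(\frac{1}{6}\right)^3 = \frac{1}{2}$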
In general
Population mean=3.5
Box of equal number of
1’s 2’s 3’s
4’s 5’s 6’s
E(X)=(1)(1/6)+(2)(1/6)+(3)(1/6)+
(4)(1/6)+(5)(1/6)+(6)(1/6)
=3.5
X=# of heads in 2 coin tosses
X 0 1 2
P(x) 1/4 ½ 1/4
Population Mean=1
For probability distribution
For example,
3 white balls, 2 red balls
Pick 2 without replacement
X = # of white balls
x P(x)
0 P(RR) = 2/5 × 1/4 = 2/20 = 0.1
1 P(RW or WR) = P(RW U WR) = P(RW) + P(WR) = 2/5 × 3/4 + 3/5 × 2/4 = 0.6
2 P(WW) = 3/5 × 2/4 = 6/20 = 0.3
μ = E(X) = (0)(0.1) + (1)(0.6) + (2)(0.3) = 1.2
The mean of a probability distribution
Binomial distribution
n= # of trials,
p=probability of success on each trial
X=# of successes
$E(x) = \sum x \binom{n}{x} p^x (1-p)^{n-x} = np$
a – successes
b – non-successes
pick n balls without replacement
X = # of successes
$E(x) = \sum x \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{a+b}{n}} = n \cdot \frac{a}{a+b}$
Example
50 balls
20 red
30 blue
n = 10 chosen without replacement
X = # of red
$E(x) = 10 \times \frac{20}{50} = 10 \times 0.4 = 4$
Since 40% of the balls in our box are red, we expect on average
40% of the chosen balls to be red. 40% of 10=4.
Standard Deviation of a Probability Distribution
Variance:
σ² = weighted average of (x - μ)² by the probability of each possible x value = $\sum (x - \mu)^2 f(x)$
Standard deviation:
$\sigma = \sqrt{\sum (x - \mu)^2 f(x)}$
Example
σ² = np(1-p)
where n is # of trials and p is probability of a success.
From the previous example, n = 2, p = 0.5
Then
σ² = np(1-p) = 2 × 0.5 × (1 - 0.5) = 0.5
Variance for Hypergeometric distributions
Hypergeometric:
$\sigma^2 = n \cdot \frac{a}{a+b} \cdot \frac{b}{a+b} \cdot \frac{a+b-n}{a+b-1}$
= np(1 - p) × finite population correction factor, with p = a/(a+b)
Alternative formula
σ² = ∑x²f(x) - μ²
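Both variance formulas can be evaluated side by side for any discrete distribution. The sketch below uses the number of heads in two coin tosses, with exact fractions so the two results can be compared for equality; `mean_and_variance` is a hypothetical helper name.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """dist maps x to f(x); returns (mu, variance by definition, variance by alternative formula)."""
    mu = sum(x * f for x, f in dist.items())
    var_def = sum((x - mu) ** 2 * f for x, f in dist.items())
    var_alt = sum(x ** 2 * f for x, f in dist.items()) - mu ** 2
    return mu, var_def, var_alt

two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mu, var_def, var_alt = mean_and_variance(two_coins)
```

For this binomial case (n = 2, p = 0.5) the result agrees with np and np(1-p).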
Probability densities
Uniform distribution
Gamma Function
Exponential distribution
Gamma distribution
Beta distribution
Normal distribution
Standardized Normal distribution
Normal approximate to binomial
Probability densities
3. $P(a \le x \le b) = \int_a^b f(x)\,dx$
4. $P(x = a) = 0$
Fig (A)
The total loading between points a and b is determined as the integral of the density function
from a to b. This integral is the area under density function over interval as shown in Fig (B).
Fig (B)
Correspondence of Area Probability
Similarly, the area under f(x) over any interval equals the true probability that a
measurement falls in the interval.
Example 1 (Density function)
Let the continuous random variable X denote the current measured in a thin copper wire in milliamperes. Assume that the range of X is [0, 20 mA], and assume that the probability density function of X is f(x) = 0.05 for 0 < x < 20.
What is the probability that a current measurement is between 5 and 10 milliamperes?
Is f(x) a probability density function?
$\int_0^{20} 0.05\,dx = 1$
$P(5 \le x \le 10) = \int_5^{10} 0.05\,dx = 0.25$
Mathematical Expectation(for continuous variable)
$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{elsewhere} \end{cases}$
Mean & Variance of Uniform distribution
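$E(x) = \int_a^b x \frac{1}{b-a}\,dx = \frac{1}{b-a} \left[\frac{x^2}{2}\right]_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{(b-a)(b+a)}{2(b-a)} = \frac{b+a}{2}$
$E(x^2) = \int_a^b x^2 \frac{1}{b-a}\,dx = \frac{1}{b-a} \left[\frac{x^3}{3}\right]_a^b = \frac{b^3 - a^3}{3(b-a)} = \frac{(b-a)(a^2 + b^2 + ab)}{3(b-a)} = \frac{a^2 + b^2 + ab}{3}$
$Var(x) = E(x^2) - [E(x)]^2 = \frac{a^2 + b^2 + ab}{3} - \left(\frac{b+a}{2}\right)^2 = \frac{4(a^2 + b^2 + ab) - 3(b^2 + a^2 + 2ab)}{12} = \frac{a^2 + b^2 - 2ab}{12} = \frac{(b-a)^2}{12}$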
Gamma Function
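$\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\,dy$
$\Gamma(\alpha+1) = \int_0^\infty y^{\alpha} e^{-y}\,dy$
Integrating by parts:
$u = y^{\alpha} \Rightarrow du = \alpha y^{\alpha-1}\,dy$
$dv = e^{-y}\,dy \Rightarrow v = -e^{-y}$
$\Gamma(\alpha+1) = \int_0^\infty y^{\alpha} e^{-y}\,dy = \left. uv \right|_0^\infty - \int_0^\infty v\,du = \left. -y^{\alpha} e^{-y} \right|_0^\infty + \alpha \int_0^\infty y^{\alpha-1} e^{-y}\,dy = (0 - 0) + \alpha \Gamma(\alpha) = \alpha \Gamma(\alpha)$ (Recursive Property)
Changing variables: $y = x/\beta \Rightarrow dy = dx/\beta$
$\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\,dy = \int_0^\infty \left(\frac{x}{\beta}\right)^{\alpha-1} e^{-x/\beta} \frac{1}{\beta}\,dx = \frac{1}{\beta^{\alpha}} \int_0^\infty x^{\alpha-1} e^{-x/\beta}\,dx$
so that $\int_0^\infty x^{\alpha-1} e^{-x/\beta}\,dx = \beta^{\alpha} \Gamma(\alpha)$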
Exponential Distribution
$f(x) = \begin{cases} \frac{1}{\theta} e^{-x/\theta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$
Exponential Distribution Mean & Variance
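$E(x) = \int_0^\infty x \frac{1}{\theta} e^{-x/\theta}\,dx = \frac{1}{\theta} \int_0^\infty x e^{-x/\theta}\,dx = \frac{1}{\theta} \int_0^\infty x^{2-1} e^{-x/\theta}\,dx = \frac{1}{\theta} \Gamma(2) \theta^2 = \theta (2-1)! = \theta$
$E(x^2) = \int_0^\infty x^2 \frac{1}{\theta} e^{-x/\theta}\,dx = \frac{1}{\theta} \int_0^\infty x^2 e^{-x/\theta}\,dx = \frac{1}{\theta} \int_0^\infty x^{3-1} e^{-x/\theta}\,dx = \frac{1}{\theta} \Gamma(3) \theta^3 = \theta^2 (3-1)! = 2\theta^2$
$Var(x) = E(x^2) - [E(x)]^2 = 2\theta^2 - (\theta)^2 = \theta^2$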
Gamma Distribution
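$f(x) = \begin{cases} \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta} & x > 0,\ \alpha, \beta > 0 \\ 0 & \text{otherwise} \end{cases}$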
Gamma Distribution Mean & Variance
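$E(x) = \int_0^\infty x \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^\infty x^{(\alpha+1)-1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \Gamma(\alpha+1) \beta^{\alpha+1} = \frac{\alpha \Gamma(\alpha) \beta}{\Gamma(\alpha)} = \alpha\beta$
$E(x^2) = \int_0^\infty x^2 \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^\infty x^{(\alpha+2)-1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \Gamma(\alpha+2) \beta^{\alpha+2} = \frac{(\alpha+1)\alpha\Gamma(\alpha)\beta^2}{\Gamma(\alpha)} = \alpha(\alpha+1)\beta^2$
$Var(x) = E(x^2) - [E(x)]^2 = \alpha(\alpha+1)\beta^2 - (\alpha\beta)^2 = \alpha^2\beta^2 + \alpha\beta^2 - \alpha^2\beta^2 = \alpha\beta^2$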
Beta Distribution
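$f(x) = \begin{cases} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1} & 0 \le x \le 1,\ \alpha, \beta > 0 \\ 0 & \text{otherwise} \end{cases}$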
Normal Distribution
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$
Normal Distribution – Normalizing Constant
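Consider the integral: $\int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = k$ (we want to solve for k)
Changing variables: $z = \frac{x-\mu}{\sigma} \Rightarrow \frac{dz}{dx} = \frac{1}{\sigma} \Rightarrow dx = \sigma\,dz$
$k = \int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \sigma \int_{-\infty}^{\infty} e^{-z^2/2}\,dz \Rightarrow \frac{k}{\sigma} = \int_{-\infty}^{\infty} e^{-z^2/2}\,dz$
$\left(\frac{k}{\sigma}\right)^2 = \int_{-\infty}^{\infty} e^{-z_1^2/2}\,dz_1 \int_{-\infty}^{\infty} e^{-z_2^2/2}\,dz_2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\frac{z_1^2 + z_2^2}{2}}\,dz_1\,dz_2$
Changing to polar coordinates ($z_1 = r\cos\theta$, $z_2 = r\sin\theta$):
$= \int_0^{2\pi} \int_0^\infty r e^{-r^2/2}\,dr\,d\theta = \int_0^{2\pi} \left[-e^{-r^2/2}\right]_{r=0}^{\infty} d\theta = \int_0^{2\pi} (0 - (-1))\,d\theta = \int_0^{2\pi} d\theta = 2\pi$
$\Rightarrow \frac{k^2}{\sigma^2} = 2\pi \Rightarrow k^2 = 2\pi\sigma^2 \Rightarrow k = \sqrt{2\pi\sigma^2}$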
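Obtaining Value of Γ(1/2)
From the previous slide, we get: $\int_{-\infty}^{\infty} e^{-z^2/2}\,dz = \sqrt{2\pi} \Rightarrow \int_0^\infty e^{-z^2/2}\,dz = \frac{\sqrt{2\pi}}{2}$
Now, consider: $\Gamma\left(\frac{1}{2}\right) = \int_0^\infty y^{\frac{1}{2}-1} e^{-y}\,dy = \int_0^\infty y^{-1/2} e^{-y}\,dy$
Changing variables: $y = \frac{z^2}{2} \Rightarrow dy = z\,dz$
$\Gamma\left(\frac{1}{2}\right) = \int_0^\infty \left(\frac{z^2}{2}\right)^{-1/2} e^{-z^2/2} z\,dz = \sqrt{2} \int_0^\infty \frac{1}{z} e^{-z^2/2} z\,dz = \sqrt{2} \int_0^\infty e^{-z^2/2}\,dz = \sqrt{2} \cdot \frac{\sqrt{2\pi}}{2} = \sqrt{\pi}$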
Standardized Normal Distribution Mean& Variance
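$Z \sim N(0,1) \Rightarrow f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$
$E(Z) = \int_{-\infty}^{\infty} z \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \left[-e^{-z^2/2}\right]_{-\infty}^{\infty} = \frac{1}{\sqrt{2\pi}} (0 - 0) = 0$
$E(Z^2) = \int_{-\infty}^{\infty} z^2 \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz = \frac{2}{\sqrt{2\pi}} \int_0^\infty z^2 e^{-z^2/2}\,dz$
Changing variables: $y = z^2 \Rightarrow dy = 2z\,dz \Rightarrow \frac{1}{2}\,dy = z\,dz$
$\frac{2}{\sqrt{2\pi}} \int_0^\infty z^2 e^{-z^2/2}\,dz = \frac{2}{\sqrt{2\pi}} \int_0^\infty z e^{-z^2/2} z\,dz = \frac{2}{\sqrt{2\pi}} \int_0^\infty y^{1/2} e^{-y/2} \frac{1}{2}\,dy = \frac{1}{\sqrt{2\pi}} \int_0^\infty y^{\frac{3}{2}-1} e^{-y/2}\,dy$
$= \frac{1}{\sqrt{2\pi}} \Gamma\left(\frac{3}{2}\right) 2^{3/2} = \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{2} \Gamma\left(\frac{1}{2}\right) \cdot 2\sqrt{2} = \frac{\sqrt{2}\sqrt{\pi}}{\sqrt{2\pi}} = 1$
$Var(Z) = E(Z^2) - [E(Z)]^2 = 1 - 0^2 = 1$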
Example
(a) Suppose the current measurements in a strip of wire are assumed to follow a normal distribution with a mean of 10 milliamperes and a variance of 4 (milliamperes)². What is the probability that a measurement will exceed 13 milliamperes?
Let X denote the current in milliamperes. The required probability can be represented as P(X > 13).
Let Z = (X - 10)/2. Then X > 13 corresponds to Z > 1.5.
P(Z > 1.5) = 0.06681
(b) Determine the value for which the probability that a current measurement is below this value is 0.98 We
need the value of x such that P(X < x) = 0.98. By standardizing, this probability expression can be written
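Both parts of this example can be checked with `statistics.NormalDist` from the Python standard library (3.8 or later) instead of a printed table. The sketch below is only a numerical cross-check; the variable names are hypothetical.

```python
from statistics import NormalDist

current = NormalDist(mu=10, sigma=2)     # variance 4, so sigma = 2

p_exceeds_13 = 1 - current.cdf(13)       # part (a)
x_98 = current.inv_cdf(0.98)             # part (b): P(X < x) = 0.98
z_98 = (x_98 - 10) / 2                   # the corresponding standardized value
```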
Estimation: We can estimate the value of a population parameter.
For example, the mean of the data from a sample is used to give information about
the overall mean in the population from which that sample was drawn
Our interest is to know something about the population, but because our time,
resources, and efforts are limited, we can take a sample to learn about the population.
Need of sample
(otherwise for parameter calculation we have to pay the following)
Large sample could alter the nature of population, e.g. opinion surveys.
Destruction of population, e.g. crash test only a small sample of
automobiles.
Cooperation of respondents – individuals, firms, administrative
agencies.
In some cases partial data is all that is available, e.g. fossils and historical
records, climate change.
Choice of sample
n is the symbol given for the size of the sample or the number of
elements in the sample.
Sampling without replacement
Sampling distribution
Statistic
Parameter
Sampling distribution
E(x)
Sampling distribution approximately a normal distribution
1+2+3+4+5+6 = 21.
µ=21/6 = 3.5
$\sigma^2 = \frac{91}{6} - \left(\frac{21}{6}\right)^2$
There is only 1 way to get a mean of 1, but 6 ways to get a mean of
3.5
Sample with mean
$\sigma_{\bar{x}}^2 = \sum \bar{x}^2 P(\bar{x}) - \left[\sum \bar{x} P(\bar{x})\right]^2 = \frac{493.5}{36} - \left(\frac{126}{36}\right)^2 = 1.45833$
With replacement
$\mu_{\bar{x}} = \mu = 3.5$
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{2.9166}{2} = 1.4583$
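The sampling distribution of the mean for samples of size 2 drawn with replacement from the population 1 to 6 can be built by enumerating all 36 ordered samples. This sketch uses exact fractions so the relations can be compared for equality.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]
n = 2

samples = list(product(population, repeat=n))          # 36 ordered samples
means = [Fraction(sum(s), n) for s in samples]

mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)
```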
Example 2
$\mu = \frac{30}{5} = 6$
$\sigma^2 = 18$
Sample with mean
$\sigma_{\bar{x}}^2 = \sum \bar{x}^2 P(\bar{x}) - \left[\sum \bar{x} P(\bar{x})\right]^2 = \frac{390}{10} - \left(\frac{60}{10}\right)^2 = 3$
$\mu_{\bar{x}} = \mu = 6$
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} = \frac{18}{3} \cdot \frac{2}{4} = 3$
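The finite population correction can be checked by enumerating every sample drawn without replacement. The population values are not shown in the text above, so the list below is an assumption: five values chosen only because they reproduce μ = 6 and σ² = 18 with N = 5 and n = 3.

```python
from fractions import Fraction
from itertools import combinations

population = [0, 3, 6, 9, 12]    # assumed values: chosen so that mu = 6, sigma^2 = 18
N, n = len(population), 3

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

means = [Fraction(sum(s), n) for s in combinations(population, n)]   # 10 samples
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

fpc = Fraction(N - n, N - 1)     # finite population correction
```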
Central limit theorem
Sampling error
The sample cannot be fully representative of the population
As such, there is variability due to chance
We could have a thousand sample means and none of them equal exactly the
population mean.
The sampling error is the difference between the point estimate (value of
the estimator) and the value of the parameter. This is the error caused by
sampling only a subset of elements of a population, rather than all
elements in a population. Our interest lies in minimizing the sampling
error, but all samples have some such error associated with them.
Central limit theorem
If $X \sim \text{Normal}(\mu, \sigma^2)$, then $\bar{X} \sim \text{Normal}\left(\mu, \frac{\sigma^2}{n}\right)$
Central limit theorem(CLT)
Correlation
Rank Correlation
(a) Linear
(b) Linear
(c) Curvilinear
(d) Curvilinear
(e) No Relationship
Correlation
in some direction.
is said to be Positive.
y = dependent variable
Example of correlation coefficient
y x yx y2 x2
487 3 1,461 237,169 9
445 5 2,225 198,025 25
272 2 544 73,984 4
641 8 5,128 410,881 64
187 2 374 34,969 4
440 6 2,640 193,600 36
346 7 2,422 119,716 49
238 1 238 56,644 1
312 4 1,248 97,344 16
269 2 538 72,361 4
655 9 5,895 429,025 81
563 6 3,378 316,969 36
Sum 4,855 55 26,091 2,240,687 329
$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$
$r = \frac{12(26{,}091) - 55(4{,}855)}{\sqrt{[12(329) - (55)^2][12(2{,}240{,}687) - (4{,}855)^2]}} = 0.8325$
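The column sums and the correlation coefficient can be recomputed from the twelve (x, y) pairs in the table. A minimal sketch using only the standard library:

```python
from math import sqrt

y = [487, 445, 272, 641, 187, 440, 346, 238, 312, 269, 655, 563]
x = [3, 5, 2, 8, 2, 6, 7, 1, 4, 2, 9, 6]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)
syy = sum(yi * yi for yi in y)

r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
```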
Regression
Regression analysis describes the relationship between two (or more) variables.
Examples: Income and educational level
Demand for electricity and the weather
Regression tells us how to draw the straight line described by the correlation
Definition: The relationship between the expected value of the dependent variable Y and the independent variable X is known as the regression line of Y on X.
A good line is one that minimizes the sum of squared differences between the points and the
line.
Interpretation of Regression line
241
[Figure: the regression line Y = bX + a plotted with Y against X]
b = Slope = Change in Y / Change in X
a = Y-intercept
Interpretation of Regression coefficient
242
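$$\hat{y} = a + bx$$
$$b = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2} \quad\text{or}\quad b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} \quad\text{or}\quad b = r\,\frac{S_y}{S_x}$$
$$a = \bar{y} - b\bar{x}$$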
ŷ = Estimated, or predicted, y value
a = Unbiased estimate of the regression intercept
b = Unbiased estimate of the regression slope
x = Value of the independent variable
Example of regression line Y on X
244
    y     x      xy         y²    x²
  487     3   1,461    237,169     9
  445     5   2,225    198,025    25
  272     2     544     73,984     4
  641     8   5,128    410,881    64
  187     2     374     34,969     4
  440     6   2,640    193,600    36
  346     7   2,422    119,716    49
  238     1     238     56,644     1
  312     4   1,248     97,344    16
  269     2     538     72,361     4
  655     9   5,895    429,025    81
  563     6   3,378    316,969    36
Sum 4,855  55  26,091  2,240,687   329
245
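$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{26{,}091 - \frac{55(4{,}855)}{12}}{329 - \frac{(55)^2}{12}} = 49.9101$$
$$a = \bar{y} - b\bar{x} = 175.8288$$
$$\hat{y} = 175.8288 + 49.9101\,x$$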
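The slope and intercept of the fitted line can be reproduced from the raw data. This is a minimal sketch of the least-squares formulas for a single independent variable; `fit_line` is a hypothetical helper name introduced here for illustration.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(x)
    # Corrected sum of cross products and corrected sum of squares of x
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    b = sxy / sxx                     # slope
    a = sum(y) / n - b * sum(x) / n   # intercept: y_bar - b * x_bar
    return a, b

y = [487, 445, 272, 641, 187, 440, 346, 238, 312, 269, 655, 563]
x = [3, 5, 2, 8, 2, 6, 7, 1, 4, 2, 9, 6]

a, b = fit_line(x, y)
print(round(a, 4), round(b, 4))
```

A prediction for a new x is then simply `a + b * x`.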
The principle of least squares
246
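$$\sum (y - \hat{y})^2$$
should be minimized, where $y - \hat{y}$ is the vertical distance (error) between an observed
point and the fitted line.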
Error Analysis
247
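$$SSE = \sum (y - \hat{y})^2$$
or simply
$$SSE = \sum y^2 - a\sum y - b\sum xy$$
$$TSS = \sum (y - \bar{y})^2$$
$$SSR = \sum (\hat{y} - \bar{y})^2$$
$$TSS = SSR + SSE$$
$$R^2 = \frac{SSR}{TSS} = \frac{191{,}600.62}{276{,}434.90} = 0.6931$$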
69.31% of the variation in the data for this sample can be explained by
the linear relationship between X and Y
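The sums of squares behind this R² value can be recomputed from the example data. The sketch below refits the line and then evaluates SSE, SSR and TSS from their definitions; all variable names are illustrative.

```python
y = [487, 445, 272, 641, 187, 440, 346, 238, 312, 269, 655, 563]
x = [3, 5, 2, 8, 2, 6, 7, 1, 4, 2, 9, 6]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # explained variation
tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation

r_squared = ssr / tss
print(round(ssr, 2), round(sse, 2), round(tss, 2), round(r_squared, 4))
```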
249
$$R^2 = r^2$$
where:
R2 = Coefficient of determination
r = Simple correlation coefficient
MEAN SQUARE REGRESSION
250
$$MSR = \frac{SSR}{k}$$
MEAN SQUARE ERROR
$$MSE = \frac{SSE}{n - k - 1}$$
where:
SSE = Sum of squares error
n = Sample size
k = Number of independent variables
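As an illustration, the two mean squares can be evaluated for the regression example, using the SSR and TSS values quoted earlier with n = 12 observations and k = 1 independent variable. This is a sketch; `mean_squares` is a hypothetical helper name, and SSE is obtained here as TSS - SSR.

```python
def mean_squares(ssr, sse, n, k):
    """Return (MSR, MSE) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse

ssr = 191_600.62      # sum of squares regression, from the example
tss = 276_434.90      # total sum of squares, from the example
sse = tss - ssr       # sum of squares error

msr, mse = mean_squares(ssr, sse, n=12, k=1)
print(round(msr, 2), round(mse, 2))
```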
Curve fitting by least squares
251
If ranks are tied, suppose m individuals share the same rank; then for each tie add the
quantity $\frac{m^3 - m}{12}$ to $\sum d_i^2$.
Example 1
253
A Preparatory. 25
B Primary. 10
C College 8
D secondary 15
E Illiterate 50
F University. 60
Without tie
254
$$\sum d_i^2 = 38$$
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2-1)} = 1 - \frac{6(38)}{6(35)} = -0.0857$$
Example 2
255
$$\sum d_i^2 = 64$$
$$r_s = 1 - \frac{6\,(64 + 0.5 + 0.5 + 0.5)}{7(48)} = -0.1696$$
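Both rank-correlation examples follow the same arithmetic, which can be sketched as a small function. `spearman_rs` is a hypothetical helper name; for Example 2 the tie sizes (three ties of two individuals each) are an assumption inferred from the three 0.5 corrections, since (2³ - 2)/12 = 0.5.

```python
def spearman_rs(sum_d2, n, tie_sizes=()):
    """Spearman rank correlation from the sum of squared rank differences.

    tie_sizes lists, for every group of tied ranks, how many individuals
    share that rank; each group adds (m**3 - m) / 12 to the sum of d squared.
    """
    correction = sum((m ** 3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Example 1: no ties, n = 6, sum of d squared = 38
rs_example_1 = spearman_rs(38, 6)

# Example 2: n = 7, sum of d squared = 64, three ties of size 2 (assumed)
rs_example_2 = spearman_rs(64, 7, tie_sizes=(2, 2, 2))

print(round(rs_example_1, 4), round(rs_example_2, 4))
```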
Exercise
257
The following table gives the distribution of the total population and
those who are totally or partially blind among them. Find out if there is
any correlation between age and blindness.
No. of persons (in thousands):  100   60   40   36   24   11    6    3
Blind:                           55   40   40   40   36   22   18    5
Exercise
258
For the data given on {Slide 16}, fit the following (IF POSSIBLE) and decide which is the
best fit:
X = c + dY              Linear Curve
Y = a + bX + cX²        Parabolic Curve
Y = a + bX + cX² + dX³  Cubic Curve
Y = a e^(bX)            Exponential Curve
Y = a X^b               Power Curve
Y = 1/(a + bX)          Hyperbolic Curve
Is r = √(b × d)?
Also find the Mean Square Regression, the Mean Square Error and the Coefficient of
Determination.
ESTIMATION OF PARAMETERS
259
Outline
260
Confidence Intervals
261
[Diagram: Statistical Methods divide into Descriptive Statistics and Inferential Statistics;
Inferential Statistics divides into Estimation and Hypothesis Testing]
Estimators & Estimates
262
For example, the sample mean estimator is
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
but if the observed values of X are 1, 2, 3, and 6, the estimate is 3.
So the estimator is a formula; the estimate is a number.
Properties of a Good Estimator
263
1. Unbiasedness
2. Efficiency
3. Sufficiency
4. Consistency
Unbiasedness
264
An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if
$$E(\hat{\theta}) = \theta$$
In other words, on average the estimator is right on target.
Examples
265
Recall that
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}.$$
If we divided by n instead of by n-1, we would not have an unbiased estimator of σ². That is why s² is
defined the way it is.
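The effect of dividing by n-1 rather than n can be seen exactly by averaging both versions over every possible sample. The sketch below reuses the fair-die population with samples of size 2 drawn with replacement; that choice of population and the helper name `sum_sq_dev` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # population: one fair die
n = 2                 # sample size

mu = Fraction(sum(faces), 6)
sigma2 = sum((x - mu) ** 2 for x in faces) / 6   # population variance

def sum_sq_dev(sample):
    """Sum of squared deviations of a sample from its own mean."""
    x_bar = Fraction(sum(sample), len(sample))
    return sum((x - x_bar) ** 2 for x in sample)

samples = list(product(faces, repeat=n))   # all 36 equally likely samples

# Expected value of each estimator = average over all samples
mean_s2_unbiased = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, mean_s2_unbiased, mean_s2_biased)
```

Dividing by n-1 gives an average equal to the population variance, while dividing by n gives an average that is too small by the factor (n-1)/n.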
Bias
266
$$\text{bias} = E(\hat{\theta}) - \theta$$
Note: The bias of an unbiased estimator is zero
Mean Squared Error (MSE)
267
$$MSE = E[(\hat{\theta} - \theta)^2]$$
The most efficient estimator is the one with the smallest MSE.
Efficiency
269
Since $MSE = \sigma_{\hat{\theta}}^2 + \text{bias}^2$,
if you have two estimators, one of which has a small bias and a small variance while
the other has no bias but a large variance, the more efficient one may be the one that is just slightly
off on average but is more frequently in the right vicinity.
Example: sample mean & median
270
As we have found, the sample mean is an unbiased estimator of μ.
It turns out that, for a symmetric population such as the normal, the sample median is also
an unbiased estimator of μ.
We know the variance of the sample mean is σ²/n.
For large samples from a normal population, the variance of the sample median is
approximately (π/2)(σ²/n).
Since π is about 3.14, π/2 > 1.
So the variance of the sample median is greater than σ²/n, the variance of
the sample mean.
Since both estimators are unbiased, the one with the smaller variance
(the sample mean) is the more efficient one.
In fact, for a normal population, among all unbiased estimators of μ, the sample mean is the one
with the smallest variance.
Sufficiency
271
Example 2: The sample mean, however, uses all the information, and
therefore is a sufficient estimator.
Consistency
273
In other words, as the sample size increases, the estimator spends more
and more of its time closer and closer to the parameter value.
One way that an estimator can be consistent is for its bias and its variance
to approach zero as the sample size approaches infinity.
Example of a consistent estimator
274
Example (Sample Mean)
275
We know that the mean of $\bar{X}$ is μ.
So its bias not only goes to zero as n approaches infinity, its bias is
always zero.
It is unlikely, however, that our estimate will precisely equal our
parameter.
We, therefore, may prefer to report something like this: We are 95%
certain that the parameter is between “a” and “b.”
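We know that P(0 < Z < 1.96) = 0.4750.
Then P(-1.96 < Z < 1.96) = 0.95.
[Figure: standard normal curve with area 0.4750 between 0 and 1.96, and the same area between -1.96 and 0]
We also know that $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is distributed as a standard normal (Z).
So there is a 95% probability that
$$-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96$$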
279
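Multiplying through by $\sigma/\sqrt{n}$,
$$-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}$$
Subtracting off $\bar{X}$,
$$-\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
Multiplying by -1 and flipping the inequalities appropriately,
$$\bar{X} + 1.96\frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96\frac{\sigma}{\sqrt{n}}$$
Flipping the entire expression,
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$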
So we have a 95% Confidence Interval for the
Population Mean μ
280
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
Example
281
Suppose a sample of 25 students at a university has a sample mean IQ of 127. If the population standard
deviation is 5.4, calculate the 95% confidence interval for the population mean.
(An intelligence quotient, or IQ, is a score derived from one of several standardized tests
designed to assess intelligence.)
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
$$127 - 1.96\frac{5.4}{\sqrt{25}} < \mu < 127 + 1.96\frac{5.4}{\sqrt{25}}$$
$$127 - 2.12 < \mu < 127 + 2.12$$
$$124.88 < \mu < 129.12$$
We are 95% certain that the population mean is between 124.88 & 129.12
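The interval in this example can be reproduced in a few lines. This is a minimal sketch for the case where the population standard deviation is known; `z_confidence_interval` is a hypothetical helper name, and the critical value comes from the standard library's `statistics.NormalDist` instead of a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for the population mean when sigma is known."""
    # Critical value z such that P(-z < Z < z) = level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# IQ example: n = 25 students, sample mean 127, population sd 5.4
lower, upper = z_confidence_interval(127, 5.4, 25)
print(round(lower, 2), round(upper, 2))
```

Passing a different `level` (for example 0.98 or 0.90) gives the wider or narrower intervals discussed on the following slides.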
When we say we are 95% certain that the population mean μ is between
124.88 & 129.12, it means that
282
The population mean μ is a fixed number, but we don’t know what it is.
Our confidence intervals, however, vary with the random sample that
we take.
[Figure: standard normal curve with area 0.4900 between 0 and k, where -k = -2.33 and k = 2.33]
Suppose we want a 98% confidence interval.
We need to find 2 values, call them –k and k, such that Z is between them 98% of the time.
Then Z will be between 0 and k with probability half of 0.98, which is 0.49 .
Look in the body of the Z table for the value closest to 0.49, which is 0.4901; it corresponds to z = 2.33.
So k = 2.33 is the number you use for Z in your confidence interval.
Sometimes 2 numbers in the Z table are equally close to the value you want.
285
For example, if you want a 90% confidence interval, you look for half of 0.90 in the body of the Z
table, that is, 0.45. The entries 0.4495 (z = 1.64) and 0.4505 (z = 1.65) are equally close to it.
Usually in these cases, we use the average of 1.64 and 1.65, which is 1.645
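The table lookups described above can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed Z table; `z_for_confidence` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Return k such that P(-k < Z < k) equals the given confidence level."""
    # The area to the left of k is 0.5 plus half of the confidence level
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.98):
    print(level, round(z_for_confidence(level), 3))
```

The computed values agree with the table-based values of 1.645, 1.96 and 2.33 to within the rounding of a two-decimal table.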
Thus, when the confidence level needs to be very high (such as 99%), the
interval needs to be wide.
Let’s redo the IQ example with a different confidence level
287
[Diagram: Statistical Methods divide into Descriptive Statistics and Inferential Statistics;
Inferential Statistics divides into Estimation and Hypothesis Testing]
Objectives
292
Given a claim, identify the null hypothesis, and the alternative hypothesis, and
express them both in symbolic form.
Given a claim and sample data, calculate the value of the test statistic.
Identify the type I and type II errors that could be made when testing a given
claim.
Hypothesis Testing
293
Null hypothesis:
Represented by H0
Statement about the value of a population parameter that is under investigation.
Always stated as an equality.
It asserts there is no change.
For the NTS form, the null hypothesis is: H0: μ = 15 minutes

Alternative hypothesis:
Represented by H1
Statement about the value of a population parameter that must be true if the null hypothesis is false.
Stated in one of three forms: >, <, ≠
For the NTS form, the alternative hypothesis is: H1: μ > 15 minutes
Null Hypothesis: H0
298
If you are conducting a study and want to use a hypothesis test to support
your claim, the claim must be worded so that it becomes the alternative
hypothesis. This means your claim must be expressed using only <, >, or ≠.
Type of Test to Use
301
This depends on what you suspect. For the NTS form, you suspected
the mean time was greater than claimed, so you would
lean to a right-tailed test.
If you suspect that the average length of time you get from your
phone battery is less than claimed, you would use a left-tailed test.
If the distribution is not normal, you will need a sample size of at least
30 to test the mean. If the population standard deviation is known, you
will use z.
If the distribution is normal (or the sample size is larger than 30) and
the standard deviation is known, then
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Hypothesis Tests when σ is unknown
304
The procedure is the same, except that we use the t-distribution with d.f. = n-1, and the test statistic will be
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
Test Statistic For Proportion
305
The test statistic is a value computed from the sample data, and it is
used in making the decision about the rejection of the null
hypothesis.
For a claim about a variance or standard deviation, the test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
Conclusions
307
Every hypothesis test ends with the experimenters (you and I) either
Rejecting the Null Hypothesis, or
Failing to Reject the Null Hypothesis
The null hypothesis is rejected if the P-value is very small, such as 0.05
or less. The smaller the P-value, the stronger the evidence against H0
Interpreting the p-value
309
The smaller the p-value, the more statistical evidence exists to support
the alternative hypothesis.
If the p-value is less than 1%, there is overwhelming evidence that
supports the alternative hypothesis.
If the p-value is between 1% and 5%, there is strong evidence that
supports the alternative hypothesis.
If the p-value is between 5% and 10%, there is weak evidence that
supports the alternative hypothesis.
If the p-value exceeds 10%, there is no evidence that supports the
alternative hypothesis.
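These rules of thumb can be written as a small lookup. This is only a sketch: `evidence_for_h1` is a hypothetical function name, and the handling of values that fall exactly on 1%, 5% or 10% is an assumption, since the slide does not say which side the boundaries belong to.

```python
def evidence_for_h1(p_value):
    """Map a p-value to the strength of evidence for the alternative hypothesis."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "overwhelming evidence (highly significant)"
    if p_value <= 0.05:      # boundary handling is an assumption
        return "strong evidence (significant)"
    if p_value <= 0.10:      # boundary handling is an assumption
        return "weak evidence (not significant)"
    return "no evidence (not significant)"

for p in (0.004, 0.03, 0.08, 0.25):
    print(p, evidence_for_h1(p))
```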
Interpreting the p-value
310
p-value below 0.01:    Overwhelming Evidence (Highly Significant)
p-value 0.01 to 0.05:  Strong Evidence (Significant)
p-value 0.05 to 0.10:  Weak Evidence (Not Significant)
p-value above 0.10:    No Evidence (Not Significant)
If we fail to reject the null hypothesis, we conclude that there is not
enough statistical evidence to infer that the alternative hypothesis is
true.
This does not mean that we have proven that the null hypothesis is true.
Types of Errors
312
Referring to H0, the Null Hypothesis: True | False
A common level of significance is α = 0.05 (that means that, when the null hypothesis is actually true,
we accept a 5% probability of rejecting it by mistake).
Determine the null and alternative hypotheses and set the level of
significance α.
If P-value ≤ α, then reject H0; if P-value > α, then do not reject H0
Example
316
317
Critical Region (or Rejection Region)
318
Set of all values of the test statistic that would cause a rejection of the
null hypothesis
Right-tailed Test
319
H0: =
H1: > (the inequality points right)
[Figure: critical region in the right tail, containing the values that differ significantly from H0]
Left-tailed Test
320
H0: =
H1: < (the inequality points left)
[Figure: critical region in the left tail, containing the values that differ significantly from H0]
Critical Region Method
321
As with the previous method for hypothesis tests, determine H0, H1 and α.
[Figure: standard normal curve with the critical values -z.025 and +z.025 marked on the z axis]
323
Since z = 1.19 is not greater than 1.96, nor less than –1.96 we cannot reject
the null hypothesis in favor of H1.
Example
324
A survey of n = 880 randomly selected adult drivers showed
that 56% of those respondents admitted to running red lights.
Find the value of the test statistic for the claim that the
majority of all adult drivers admit to running red lights.
Solution
325
The preceding example showed that the given claim results in the following null and alternative hypotheses:
H0: p = 0.5 and H1: p > 0.5
Because we work under the assumption that the null hypothesis is true with p = 0.5, we get the following test
statistic:
$$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.56 - 0.5}{\sqrt{\frac{(0.5)(0.5)}{880}}} \approx 3.56$$
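The test statistic for the red-light survey can be computed as follows. This is a minimal sketch of a right-tailed one-sample z test for a proportion; `proportion_z_test` is a hypothetical helper name, and the P-value is taken from the standard library's `statistics.NormalDist` instead of a Z table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(p_hat, p0, n):
    """Right-tailed z test of H0: p = p0 against H1: p > p0."""
    q0 = 1 - p0
    z = (p_hat - p0) / sqrt(p0 * q0 / n)   # standard error uses the H0 value
    p_value = 1 - NormalDist().cdf(z)      # area to the right of z
    return z, p_value

# Survey: n = 880 drivers, 56% admitted to running red lights
z, p_value = proportion_z_test(0.56, 0.5, 880)
print(round(z, 2), p_value)
```

With a significance level of 0.05, the P-value is far below α, so H0: p = 0.5 would be rejected in favor of H1: p > 0.5.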
326
Decision Criterion
327
Traditional method:
Reject H0 if the test statistic falls within the critical region.
Fail to reject H0 if the test statistic does not fall within the critical region.
P-value method:
Reject H0 if P-value ≤ α (where α is the significance level, such as 0.05).
Fail to reject H0 if P-value > α.
Decision Criterion
328
Confidence Intervals:
Because a confidence interval estimate of a population parameter
contains the likely values of that parameter,
Reject a claim that the population parameter has a value that is not
included in the confidence interval.
Decision
329
Decision | The null hypothesis is true | The null hypothesis is false
We decide to reject the null hypothesis | Type I error (rejecting a true null hypothesis) | Correct decision
We fail to reject the null hypothesis | Correct decision | Type II error (failing to reject a false null hypothesis)
Controlling Type I and Type II Errors
330