Sampling
9.1
Sampling Distributions…
A sampling distribution is created by, as the name suggests,
sampling.
9.2
Sampling Distribution of the Mean…
A fair die is thrown infinitely many times,
with the random variable X = # of spots on any throw.
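As a quick check of the population being sampled, here is a minimal sketch that computes the mean and variance of X for a fair die. The variable names are illustrative and not part of the slides.

```python
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

mu = sum(x * p for x in faces)               # population mean
var = sum((x - mu) ** 2 * p for x in faces)  # population variance

print(mu, var, float(var) ** 0.5)
```

Exact fractions are used so the results (7/2 and 35/12) carry no rounding error.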
9.3
Sampling Distribution of Two Dice
A sampling distribution is created by looking at
all samples of size n=2 (i.e. two dice) and their means…
Sample mean    Probability
1.0            1/36
1.5            2/36
2.0            3/36
2.5            4/36
3.0            5/36
3.5            6/36
4.0            5/36
4.5            4/36
5.0            3/36
5.5            2/36
6.0            1/36
[Figure: histogram of the sample mean over 1.0, 1.5, ..., 6.0]
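The sampling distribution for two dice can be rebuilt by listing all 36 equally likely outcomes. The sketch below is one way to do that; the names `counts` and `dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair dice, reduced to their means.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))
dist = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

for mean in dist:
    print(f"{float(mean):.1f}  {counts[mean]}/36")

# Mean and variance of the sampling distribution itself.
mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())
print(mean_of_means, var_of_means)
```

The mean of the sample means equals the population mean (7/2), while the variance is half the population variance (35/24 instead of 35/12).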
9.5
Compare…
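Compare the distribution of X (one throw) with the sampling distribution of $\bar{X}$ (the mean of two throws)…
[Figure: histogram of X over 1, 2, ..., 6 beside histogram of $\bar{X}$ over 1.0, 1.5, ..., 6.0]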
9.6
Generalize…
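We can generalize the mean and variance of the sampling distribution for
two dice:
$$\mu_{\bar{x}} = \mu, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{2}$$
…to n dice:
$$\mu_{\bar{x}} = \mu, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$
The standard deviation of the
sampling distribution is
called the standard error:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$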
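The n-dice result follows from the laws of expected value and variance. A short derivation, assuming the n throws $X_1, \dots, X_n$ are independent and each has mean $\mu$ and variance $\sigma^2$:

```latex
\begin{align*}
E(\bar{X}) &= E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{n\mu}{n} = \mu \\
V(\bar{X}) &= V\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \\
\sigma_{\bar{x}} &= \sqrt{V(\bar{X})} = \frac{\sigma}{\sqrt{n}}
\end{align*}
```

Independence is what lets the variance of the sum be written as the sum of the variances.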
9.7
Central Limit Theorem…
The sampling distribution of the mean of a random sample
drawn from any population is approximately normal for a
sufficiently large sample size.
The larger the sample size, the more closely the sampling
distribution of $\bar{X}$ will resemble a normal distribution.
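A small simulation can illustrate the theorem. This is only a sketch: the exponential population (mean 1, standard deviation 1), the seed, the replication count, and the helper names `sample_means` and `skewness` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, reps=20_000):
    """Means of `reps` samples of size n from an exponential population (mean 1, sd 1)."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

def skewness(xs):
    """Third standardized moment; 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

for n in (1, 5, 30):
    means = sample_means(n)
    # Columns: n, mean of sample means, sd of sample means, skewness of sample means
    print(n, round(statistics.fmean(means), 3),
          round(statistics.pstdev(means), 3), round(skewness(means), 3))
```

As n grows, the standard deviation of the sample means should shrink toward 1/√n and the skewness should move toward 0, which is the behavior the theorem describes.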
9.8
Central Limit Theorem…
If the population is normal, then $\bar{X}$ is normally distributed
for all values of n.
9.9
Sampling Distribution of the Sample Mean
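1. $\mu_{\bar{x}} = \mu$
2. $\sigma^2_{\bar{x}} = \sigma^2 / n$ and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$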
9.10
Sampling Distribution of the Sample Mean
We can express the sampling distribution of the mean simply
as
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
9.11
Sampling Distribution of the Sample Mean
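The summaries above assume that the population is infinitely
large. However, if the population is finite, the standard error is
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
where N is the population size and
$$\sqrt{\frac{N - n}{N - 1}}$$
is the finite population correction factor.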
9.12
Sampling Distribution of the Sample Mean
If the population size is large relative to the sample size, the
finite population correction factor is close to 1 and can be
ignored.
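To see how quickly the correction factor approaches 1, here is a minimal sketch. The helper `standard_error` and the population sizes are hypothetical; σ = 100 and n = 25 are borrowed from the chapter-opening example purely for illustration.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population correction when N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

for N in (50, 500, 5000, 50000):   # illustrative population sizes
    fpc = math.sqrt((N - 25) / (N - 1))
    print(N, round(fpc, 4), round(standard_error(100, 25, N), 3))
```

With N = 50 the correction is substantial, while for the larger populations the corrected standard error is nearly the uncorrected value of 20.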
9.14
Example 9.1(a)…
We want to find P(X > 32), where X is normally distributed
with µ = 32.2 and σ = .3:
$$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{.3}\right) = P(Z > -.67) = 1 - .2514 = .7486$$
9.15
Example 9.1(b)…
The foreman of a bottling plant has observed that the
amount of soda in each “32-ounce” bottle is actually a
normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.
9.16
Example 9.1(b)…
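We want to find $P(\bar{X} > 32)$, where X is normally distributed
with µ = 32.2 and σ = .3
Things we know:
1) X is normally distributed, therefore so is $\bar{X}$.
2) $\mu_{\bar{x}} = \mu = 32.2$ oz.
3) $\sigma_{\bar{x}} = \sigma / \sqrt{n} = .3 / \sqrt{4} = .15$ oz.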
9.17
Example 9.1(b)…
If a customer buys a carton of four bottles, what is the
probability that the mean amount of the four bottles will be
greater than 32 ounces?
9.18
Graphically Speaking…
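[Figure: two normal curves, both with mean = 32.2]
First panel (distribution of X): what is the probability that one bottle will contain more than 32 ounces?
Second panel (distribution of $\bar{X}$): what is the probability that the mean of four bottles will exceed 32 oz?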
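Both bottle questions can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Because it does not round z to two decimals as a printed table does, the results can differ slightly in the third or fourth decimal place from table-based answers.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32.2, 0.3, 4

one_bottle = NormalDist(mu, sigma)             # distribution of X
carton_mean = NormalDist(mu, sigma / sqrt(n))  # distribution of the sample mean, n = 4

p_one = 1 - one_bottle.cdf(32)    # P(X > 32)
p_four = 1 - carton_mean.cdf(32)  # P(sample mean > 32)
print(round(p_one, 4), round(p_four, 4))
```

The second probability is larger because the sample mean has a smaller standard deviation (.15 rather than .3), so less of its distribution lies below 32.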
9.19
Chapter-Opening Example
Salaries of a Business School’s Graduates
In the advertisements for a large university, the dean of
the School of Business claims that the average salary
of the school’s graduates one year after graduation is
$800 per week with a standard deviation of $100.
9.20
Chapter-Opening Example
Salaries of a Business School’s Graduates
He does a survey of 25 people who graduated one year ago
and determines their weekly salary.
We want to find
$$P(\bar{X} < 750)$$
The distribution of X, the weekly income, is likely to be
positively skewed, but not sufficiently so to make the
distribution of $\bar{X}$ nonnormal. As a result, we may assume that $\bar{X}$
is normal with mean
$$\mu_{\bar{x}} = \mu = 800$$
and standard deviation
$$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 100 / \sqrt{25} = 20$$
9.22
Chapter-Opening Example
Thus,
$$P(\bar{X} < 750) = P\left(\frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} < \frac{750 - 800}{20}\right) = P(Z < -2.5) = .5 - .4938 = .0062$$
The probability of observing a sample mean as low as $750 when
the population mean is $800 is extremely small. Because this event
is quite unlikely, we would have to conclude that the dean's claim is
not justified.
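The same probability can be reproduced in a few lines. This is a sketch using the standard library's `NormalDist`, with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 100, 25
x_bar_dist = NormalDist(mu, sigma / sqrt(n))  # standard error = 100 / 5 = 20

p_low = x_bar_dist.cdf(750)  # P(sample mean < 750)
print(round(p_low, 4))       # 0.0062
```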
9.23
Using the Sampling Distribution for Inference
Here’s another way of expressing the probability calculated from a
sampling distribution.
P(-1.96 < Z < 1.96) = .95
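Substituting the formula for the sampling distribution:
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = .95$$
Rearranging to isolate $\bar{X}$:
$$P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95$$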
9.24
Using the Sampling Distribution for Inference
Returning to the chapter-opening example where µ = 800, σ = 100,
and n = 25, we compute
$$P\left(800 - 1.96 \frac{100}{\sqrt{25}} < \bar{X} < 800 + 1.96 \frac{100}{\sqrt{25}}\right) = .95$$
or
$$P(760.8 < \bar{X} < 839.2) = .95$$
This tells us that there is a 95% probability that a sample mean will
fall between 760.8 and 839.2. Because the sample mean was
computed to be $750, we would have to conclude that the dean's
claim is not supported by the statistic.
9.25
Using the Sampling Distribution for Inference
Changing the probability from .95 to .90 changes the probability
statement to
$$P\left(\mu - 1.645 \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.645 \frac{\sigma}{\sqrt{n}}\right) = .90$$
9.26
Using the Sampling Distribution for Inference
We can also produce a general form of this statement
$$P\left(\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
In this formula α (Greek letter alpha) is the probability that
$\bar{X}$ does not fall into the interval.
9.27
Using the Sampling Distribution for Inference
For example, with µ = 800, σ = 100, n = 25 and α= .01, we
produce
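$$P\left(\mu - z_{.005} \frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{.005} \frac{\sigma}{\sqrt{n}}\right) = 1 - .01$$
$$P\left(800 - 2.575 \frac{100}{\sqrt{25}} < \bar{X} < 800 + 2.575 \frac{100}{\sqrt{25}}\right) = .99$$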
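The general probability statement can be evaluated for any α. The sketch below is one possible implementation; the function name `mean_interval` is hypothetical, and `NormalDist().inv_cdf` supplies the z value in place of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def mean_interval(mu, sigma, n, alpha):
    """Interval that contains the sample mean with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width

for alpha in (0.10, 0.05, 0.01):
    lo, hi = mean_interval(800, 100, 25, alpha)
    print(f"alpha={alpha:.2f}: {lo:.1f} < X-bar < {hi:.1f}")
```

For α = .05 this gives the 760.8 to 839.2 interval from the chapter-opening example. A smaller α widens the interval.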
9.28
Sampling Distribution of a Proportion…
The estimator of a population proportion of successes is the
sample proportion. That is, we count the number of
successes in a sample and compute:
$$\hat{P} = \frac{X}{n}$$
where X is the number of successes and n is the sample size.
9.29
Normal Approximation to Binomial…
Binomial distribution with n=20 and p=.5 with a normal
approximation superimposed (µ = 10 and σ = 2.24)
9.30
Normal Approximation to Binomial…
Binomial distribution with n=20 and p=.5 with a normal
approximation superimposed (µ = 10 and σ = 2.24)
Hence:
$$\mu = np = 20(.5) = 10$$
and
$$\sigma = \sqrt{np(1-p)} = \sqrt{20(.5)(.5)} = 2.24$$
9.31
Normal Approximation to Binomial…
The normal approximation to the binomial works best when the
number of experiments n (the sample size) is large and the
probability of success p is close to 0.5.
9.32
Normal Approximation to Binomial…
To calculate P(X=10) using the
normal distribution, we can find
the area under the normal curve
between 9.5 & 10.5
9.33
Normal Approximation to Binomial…
In fact, for the binomial random variable X:
P(X = 10) = .176
while for the approximating normal random variable Y:
P(9.5 < Y < 10.5) = .1742
so the approximation is quite good.
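The comparison can be reproduced directly. This is a sketch with illustrative variable names. Note that the unrounded normal area comes out near .177; a table lookup with z rounded to two decimals gives the slightly smaller .1742.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5
exact = comb(n, 10) * p**10 * (1 - p)**(n - 10)         # binomial P(X = 10)

approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))  # mean 10, sd about 2.24
approx = approx_dist.cdf(10.5) - approx_dist.cdf(9.5)   # P(9.5 < Y < 10.5)

print(round(exact, 4), round(approx, 4))
```

The interval from 9.5 to 10.5 is the continuity correction: the discrete value 10 is represented by the unit-wide strip centered on it.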
9.34
Sampling Distribution of a Sample Proportion…
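Using the laws of expected value and variance, we can
determine the mean, variance, and standard deviation of $\hat{P}$:
$$E(\hat{P}) = p, \qquad V(\hat{P}) = \sigma^2_{\hat{p}} = \frac{p(1-p)}{n}, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
(The standard deviation of $\hat{P}$ is called the standard error of
the proportion.)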
9.35
Example 9.2
In the last election a state representative received 52% of the
votes cast.
9.36
Example 9.2
The number of respondents who would vote for the representative
is a binomial random variable with n = 300 and p = .52.
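With n = 300 and p = .52, the standard error of the sample proportion is the square root of p(1 − p)/n. The sketch below computes it and then, purely as an illustration, the probability that the sample proportion exceeds a cutoff. The cutoff of .50 is a hypothetical value chosen here, not one stated on these slides.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.52
se = sqrt(p * (1 - p) / n)      # standard error of the sample proportion
p_hat_dist = NormalDist(p, se)  # normal approximation for the sample proportion

threshold = 0.50                # hypothetical cutoff, not taken from the slides
p_above = 1 - p_hat_dist.cdf(threshold)
print(round(se, 4), round(p_above, 4))
```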
9.37
Example 9.2
Thus, we calculate
9.39
Sampling Distribution: Difference of two means
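The mean and standard deviation of the sampling
distribution of $\bar{X}_1 - \bar{X}_2$ are given by:
mean:
$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$
standard deviation (the standard error of the difference between two means):
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$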
9.41
Example 9.3…
Starting salaries for MBA grads at two universities are
normally distributed with the following means and standard
deviations. Samples from each school are taken…
                 University 1    University 2
Mean             62,000 $/yr     60,000 $/yr
Std. Dev.        14,500 $/yr     18,300 $/yr
Sample size n    50              60
9.42
Example 9.3…
“What is the probability that the sample mean starting salary
of University #1 graduates will exceed that of the #2 grads?”
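$$P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(Z > \frac{0 - (62{,}000 - 60{,}000)}{\sqrt{\frac{14{,}500^2}{50} + \frac{18{,}300^2}{60}}}\right) = P\left(Z > \frac{-2{,}000}{3{,}128}\right) = P(Z > -.64) = .7389$$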
“there is about a 74% chance that the sample mean
starting salary of U. #1 grads will exceed that of U. #2”
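The 74% figure can be checked with the table values above. This is a minimal sketch using `NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50
mu2, sigma2, n2 = 60_000, 18_300, 60

diff_mean = mu1 - mu2                            # mean of the difference in sample means
diff_se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of the difference

p_exceeds = 1 - NormalDist(diff_mean, diff_se).cdf(0)  # P(difference > 0)
print(round(diff_se), round(p_exceeds, 4))
```

The standard error is about 3,128 $/yr, and the probability agrees with the quoted "about 74%" up to table rounding.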
9.43
From Here to Inference
In Chapters 7 and 8 we introduced probability distributions,
which allowed us to make probability statements about
values of the random variable.
9.44
From Here to Inference
In Example 7.9, we needed to know that the probability that
Pat Statsdud guesses the correct answer is 20% (p = .2) and
that the number of correct answers (successes) in 10
questions (trials) is a binomial random variable.
9.45
From Here to Inference
In Example 8.2, we needed to know that the return on
investment is normally distributed with a mean of 10% and a
standard deviation of 5%.
9.46
From Here to Inference
The figure below symbolically represents the use of
probability distributions.
9.47
From Here to Inference
In this chapter we developed the sampling distribution,
wherein knowledge of the parameter(s) and some
information about the distribution allow us to make
probability statements about a sample statistic.
9.48
From Here to Inference
Statistical inference works by reversing the direction of the flow of
knowledge in the previous figure. The next figure displays
the character of statistical inference.
9.49
From Here to Inference
Sampling distribution
Statistic ------ Parameter
9.50