TOPIC 6 Sampling Distribution and Point Estimation of Parameters

Sampling Distribution and Point Estimation of Parameters
Objectives
At the end of the lesson, the students are expected to
• Explain the general concepts of estimating the
parameters of a population or a probability
distribution;
• Explain the important role of the normal distribution as
a sampling distribution; and
• Understand the central limit theorem.
Point Estimator
• In statistics, a point estimator is a statistic used to estimate an
unknown parameter of a population. Its value computed from a particular
sample, the point estimate, is a single number that serves as an
approximation or best guess of the true parameter value.

Example 1: Estimating Population Mean


• Let's say you want to estimate the average height of students in a
school. You randomly select 50 students and measure their heights.
The sample mean height of these 50 students (e.g., 165 cm) is used
as a point estimator for the population mean height.
• Parameter: Population mean height (unknown).
• Point Estimator: Sample mean height (its observed value, the point estimate, is 165 cm).
• Interpretation: You're using the average height of the sampled
students as an estimate of the average height of the entire
student population.
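
The idea in Example 1 can be sketched in a few lines of Python. The heights below are hypothetical values chosen only for illustration (they are not data from the lesson), and the function name `point_estimate_mean` is likewise an assumption. The sketch computes the sample mean as a point estimate of the population mean and the sample variance as a point estimate of the population variance.

```python
from statistics import mean, variance

# Hypothetical heights in cm (illustrative values, not data from the lesson).
heights_cm = [162.0, 171.5, 158.0, 166.5, 169.0, 163.0]


def point_estimate_mean(sample):
    """Return the sample mean, the usual point estimate of the population mean."""
    return mean(sample)


x_bar = point_estimate_mean(heights_cm)
# statistics.variance uses the n - 1 divisor, i.e. the sample variance s^2.
s2 = variance(heights_cm)

print(f"point estimate of the mean: {x_bar:.2f} cm")
print(f"point estimate of the variance: {s2:.2f} cm^2")
```

A different random sample would generally give a different point estimate, which is why the sampling distribution of the estimator matters.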
Point Estimator
• A point estimate of some population parameter $\theta$ is a
single numerical value $\hat{\theta}$ of a statistic $\hat{\Theta}$. The statistic
$\hat{\Theta}$ is called the point estimator.

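Estimation problems occur frequently in engineering. We often need to estimate:

• The mean μ of a single population
• The variance σ² (or standard deviation σ) of a single
population
• The proportion p of items in a population that belong
to a class of interest
• The difference in means of two populations, μ1 - μ2
• The difference in two population proportions, p1 - p2

Reasonable point estimates:
• For μ, the sample mean $\bar{x}$
• For σ², the sample variance $s^2$
• For p, the sample proportion $\hat{p} = x/n$, where x is the number of
items in a sample of size n that belong to the class of interest
• For μ1 - μ2, the difference in sample means $\bar{x}_1 - \bar{x}_2$
• For p1 - p2, the difference in sample proportions $\hat{p}_1 - \hat{p}_2$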
Sampling Distribution
• Imagine you have a population and you're interested in
some characteristics, called parameters (like the mean or
standard deviation). Instead of surveying the entire
population, you take samples. The distribution of these
sample statistics, like means or proportions, is called the
sampling distribution. It helps us understand the variability
and behavior of these statistics.
Sampling Distribution
• The random variables X1, X2, …, Xn are a random sample of size n if
(a) the Xi’s are independent random variables, and (b)
every Xi has the same probability distribution.
• A statistic is any function of the observations in a
random sample.
• The probability distribution of a statistic is called a
sampling distribution.
- For example, the probability distribution of $\bar{X}$ is called
the sampling distribution of the mean.
Central Limit Theorem
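Consider determining the sampling distribution of the
sample mean $\bar{X}$. Suppose that a random sample of size n is
taken from a normal population with mean μ and variance σ².
The sample mean

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

has a normal distribution with mean

$$\mu_{\bar{X}} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu$$

and variance

$$\sigma^2_{\bar{X}} = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n}$$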
Central Limit Theorem
• If we are sampling from a population that has an
unknown probability distribution, the sampling distribution
of the sample mean will still be approximately normal
with mean μ and variance σ²/n, if the sample size n is
large.
• In inferential statistics, n ≥ 40 is considered a large
sample (Montgomery and Runger, 2011). Otherwise, the
sample is considered small.
• Alternatively, n ≥ 30 is considered a large sample
(Walpole et al., 2012).
Central Limit Theorem
If X1, X2, …, Xn is a random sample of size n taken from a population
(either finite or infinite) with mean μ and finite variance σ², and
if $\bar{X}$ is the sample mean, the limiting form of the
distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

as n → ∞, is the standard normal distribution.
Central Limit Theorem
The normal approximation for $\bar{X}$ depends on the sample
size n. Figure 7-1(a) shows the distribution obtained
for throws of a single, six-sided true die. The
probabilities are equal (1/6) for all the values obtained,
1, 2, 3, 4, 5, or 6. Figure 7-1(b) shows the distribution
of the average score obtained when tossing two dice,
and Fig. 7-1(c), 7-1(d), and 7-1(e) show the
distributions of average scores obtained when tossing
three, five, and ten dice, respectively. Notice that, while
the population (one die) is relatively far from normal,
the distribution of averages is approximated reasonably
well by the normal distribution for sample sizes as
small as five. (The dice throw distributions are discrete,
however, while the normal is continuous.)
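
The dice illustration can be checked exactly rather than by simulation. The sketch below is one possible approach, not the method used to draw Fig. 7-1: it builds the exact distribution of the average of n fair dice by repeated convolution and confirms that its mean stays at 3.5 while its variance shrinks as (35/12)/n. The function names are assumptions made for this example.

```python
from fractions import Fraction


def average_of_dice_distribution(n_dice):
    """Exact distribution of the average score of n_dice fair six-sided dice."""
    face_prob = Fraction(1, 6)
    sums = {0: Fraction(1)}  # distribution of the running total, starting at 0
    for _ in range(n_dice):
        nxt = {}
        for total, prob in sums.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + prob * face_prob
        sums = nxt
    # Convert totals to averages by dividing by the number of dice.
    return {Fraction(total, n_dice): prob for total, prob in sums.items()}


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


for n in (1, 2, 3, 5, 10):
    mu, var = mean_and_variance(average_of_dice_distribution(n))
    print(f"n = {n:2d}: mean = {float(mu):.3f}, variance = {float(var):.4f}")
```

Exact fractions are used so that the agreement with μ = 3.5 and σ²/n is not blurred by rounding or by simulation noise.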
Central Limit Theorem
• With larger sample sizes, the sampling distribution of the
sample means will tend to resemble a bell-shaped (normal)
curve, even if the individual outcomes (rolling the die) are
not normally distributed.

• Thus, the Central Limit Theorem (CLT) states that with a large
enough sample size, the sampling distribution of the sample
mean approaches a normal distribution, regardless of the
shape of the population's distribution (provided its variance
is finite). It's a powerful concept that allows us to make
inferences about a population based on sample data.
Central Limit Theorem
Examples:
1. An electronics company manufactures resistors that
have a mean resistance of 100 ohms and a standard
deviation of 10 ohms. The distribution of resistance is
normal. Find the probability that a random sample of
n = 25 resistors will have an average resistance less
than 95 ohms.

This problem involves the sampling distribution of the sample mean. Because the
resistance distribution is itself normal, the sample mean is exactly normally
distributed, so the Central Limit Theorem (CLT) is not needed as an
approximation here. The probability follows from the properties of the normal
distribution.
Given:
Population mean (μ) = 100 ohms
Population standard deviation (σ) = 10 ohms
Sample size (n) = 25 resistors
We need to find the probability that the average resistance of the sample ($\bar{X}$) is
less than 95 ohms.

Steps to Solve:
1. Standardize the Distribution: To use the normal distribution, convert the
sample mean to a z-score using the formula for the standard error of the mean
(SEM).

2. Calculate the Probability: Once you have the z-score, find the probability
using a standard normal distribution table.
The desired probability corresponds to the shaded area in Fig. 7-2.
Solution:
First, calculate the standard error of the mean (SEM):

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = 2$$

Next, find the z-score:

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{X}}} = \frac{95 - 100}{2} = -2.5$$

Using a standard normal distribution table, find the probability
corresponding to a z-score of -2.5. The probability of a z-score being
less than -2.5 represents the probability that the average resistance of
the sample is less than 95 ohms.

$$P(\bar{X} < 95) = P(Z < -2.5) \approx 0.0062$$

The probability corresponding to a z-score of -2.5 is approximately
0.0062 or 0.62%.
• Practical Conclusion: This example shows that if the distribution of
resistance is normal with mean 100 ohms and standard deviation of
10 ohms, then finding a random sample of 25 resistors with a
sample mean smaller than 95 ohms is a rare event. If this actually
happens, it casts doubt on whether the true mean is really 100
ohms or whether the true standard deviation is really 10 ohms.
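
The resistor calculation can be reproduced without a table. This is a minimal sketch using `NormalDist` from the Python standard library. The variable names are assumptions, and the numbers are those given in Example 1.

```python
from math import sqrt
from statistics import NormalDist

mu = 100.0      # population mean resistance, ohms
sigma = 10.0    # population standard deviation, ohms
n = 25          # sample size
threshold = 95.0

sem = sigma / sqrt(n)            # standard error of the mean
z = (threshold - mu) / sem       # z-score of the threshold
p = NormalDist().cdf(z)          # P(Z < z) for a standard normal Z

print(f"SEM = {sem}, z = {z}, P(sample mean < {threshold}) = {p:.4f}")
```

The same probability can be obtained in one step as `NormalDist(mu, sem).cdf(threshold)`, which skips the explicit standardization.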
Central Limit Theorem
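2. Suppose that a random variable X has a continuous
uniform distribution on an interval from a to b:

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

Find the distribution of the sample mean of a random
sample of size n = 40.
Central Limit Theorem

• The mean and variance of X are $\mu = (a + b)/2$ and $\sigma^2 = (b - a)^2/12$.

• The central limit theorem indicates that the distribution of $\bar{X}$ is
approximately normal with mean $\mu_{\bar{X}} = (a + b)/2$

• and variance $\sigma^2_{\bar{X}} = \sigma^2/n = (b - a)^2/[12(40)] = (b - a)^2/480$.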
Central Limit Theorem
3. Suppose that the random variable X has the continuous
uniform distribution

Suppose that a random sample of n = 12 observations is
selected from this distribution. What is the approximate
probability distribution of ? Find the mean and variance of
this quantity.
Solution:
Central Limit Theorem
4. Suppose that X has a discrete uniform distribution

A random sample of n = 36 is selected from this
population. Find the probability that the sample mean is
greater than 2.1 but less than 2.5, assuming that the
sample mean would be measured to the nearest tenth.
Solution:
Central Limit Theorem
5. The amount of time that a customer spends waiting at
an airport check-in counter is a random variable with
mean 8.2 minutes and standard deviation 1.5 minutes.
Suppose that a random sample of n = 49 customers is
observed. Find the probability that the average time
waiting in line for these customers is
(a) Less than 10 minutes
(b) Between 5 and 10 minutes
(c) Less than 6 minutes
Solution:

Using the central limit theorem, $\bar{X}$ is approximately normally distributed
with mean 8.2 minutes and standard deviation $1.5/\sqrt{49} \approx 0.214$ minutes.
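
The three probabilities in Example 5 can be sketched numerically. The sketch assumes that the central limit theorem approximation is adequate for n = 49, so the sample mean is treated as normal with mean 8.2 and standard deviation 1.5/√49. The variable names are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

mu = 8.2       # mean waiting time, minutes
sigma = 1.5    # standard deviation of waiting time, minutes
n = 49         # number of customers sampled

# Approximate sampling distribution of the sample mean (CLT assumption).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_a = xbar_dist.cdf(10.0)                        # (a) less than 10 minutes
p_b = xbar_dist.cdf(10.0) - xbar_dist.cdf(5.0)   # (b) between 5 and 10 minutes
p_c = xbar_dist.cdf(6.0)                         # (c) less than 6 minutes

print(f"(a) {p_a:.6f}  (b) {p_b:.6f}  (c) {p_c:.6f}")
```

Because 10 and 5 lie more than eight standard errors from 8.2, parts (a) and (b) are essentially 1, while 6 lies about ten standard errors below the mean, so part (c) is essentially 0.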


Difference in Sample Means
If we have two independent populations with means μ1
and μ2 and variances $\sigma_1^2$ and $\sigma_2^2$, and if $\bar{X}_1$ and $\bar{X}_2$ are the sample
means of two independent random samples of sizes n1
and n2 from these populations, then the sampling
distribution of

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

is approximately standard normal, if the conditions of the central
limit theorem apply. If the two populations are normal,
the sampling distribution of Z is exactly standard normal.
Difference in Sample Means
Examples:
6. Aircraft Engine Life The effective life of a component
used in a jet-turbine aircraft engine is a random variable
with mean 5000 hours and standard deviation 40 hours.
The distribution of effective life is fairly close to a normal
distribution. The engine manufacturer introduces an
improvement into the manufacturing process for this
component that increases the mean life to 5050 hours
and decreases the standard deviation to 30 hours.
Suppose that a random sample of n1 = 16 components is
selected from the “old” process and a random sample of
n2 = 25 components is selected
from the “improved” process. What is the probability that
the difference in the two sample means is at least 25
hours? Assume that the old and improved processes can
be regarded as independent populations.
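To solve this problem, we first note that the distribution of $\bar{X}_1$ is normal with mean
$\mu_1 = 5000$ hours and standard deviation $\sigma_1/\sqrt{n_1} = 40/\sqrt{16} = 10$ hours,
and the distribution of $\bar{X}_2$ is normal with mean $\mu_2 = 5050$ hours
and standard deviation $\sigma_2/\sqrt{n_2} = 30/\sqrt{25} = 6$ hours.

Now the distribution of $\bar{X}_2 - \bar{X}_1$ is normal with mean
$\mu_2 - \mu_1 = 5050 - 5000 = 50$ hours
and variance $\sigma_2^2/n_2 + \sigma_1^2/n_1 = 6^2 + 10^2 = 136$ hours².
This sampling distribution is shown in Fig. 7-4.

The probability that $\bar{X}_2 - \bar{X}_1 \ge 25$ is the shaded portion of the normal distribution in
this figure.

Corresponding to the value $\bar{x}_2 - \bar{x}_1 = 25$ in Fig. 7-4, we find that

$$z = \frac{25 - 50}{\sqrt{136}} = -2.14$$

and consequently,

$$P(\bar{X}_2 - \bar{X}_1 \ge 25) = P(Z \ge -2.14) = 0.9838$$

• Therefore, there is a high probability (0.9838) that the
difference in sample means between the new and the old
process will be at least 25 hours if the sample sizes are
$n_1 = 16$ and $n_2 = 25$.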
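
The aircraft engine calculation can be checked numerically. This is a minimal sketch using the means, standard deviations, and sample sizes stated in Example 6. The variable names are assumptions. The result differs from the table value 0.9838 only in the fourth decimal place, because the table lookup rounds z to -2.14.

```python
from math import sqrt
from statistics import NormalDist

mu_old, sigma_old, n_old = 5000.0, 40.0, 16   # "old" process
mu_new, sigma_new, n_new = 5050.0, 30.0, 25   # "improved" process

# Mean and variance of the difference in sample means (new minus old),
# assuming the two samples are independent.
diff_mean = mu_new - mu_old
diff_var = sigma_new**2 / n_new + sigma_old**2 / n_old

z = (25.0 - diff_mean) / sqrt(diff_var)
p_at_least_25 = 1.0 - NormalDist().cdf(z)

print(f"mean = {diff_mean}, variance = {diff_var}, z = {z:.4f}, P = {p_at_least_25:.4f}")
```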
Difference in Sample Means
7. A random sample of size n1 = 16 is selected from a
normal population with a mean of 75 and a standard
deviation of 8. A second random sample of size n2 = 9 is
taken from another normal population with mean 70 and
standard deviation 12. Let $\bar{X}_1$ and $\bar{X}_2$ be the two sample
means. Find:
(a) The probability that $\bar{X}_1 - \bar{X}_2$ exceeds 4
(b) The probability that
Solution:
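
Part (a) can be sketched numerically under the assumption that it asks about the difference of the two sample means (first minus second). Both populations are stated to be normal, so the difference is exactly normal. Part (b) is not attempted because its bounds are not given above. The variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 75.0, 8.0, 16   # first normal population and sample size
mu2, sigma2, n2 = 70.0, 12.0, 9   # second normal population and sample size

# Assumption: part (a) concerns the difference (first sample mean) - (second sample mean).
diff_mean = mu1 - mu2
diff_sd = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

diff_dist = NormalDist(diff_mean, diff_sd)
p_exceeds_4 = 1.0 - diff_dist.cdf(4.0)

print(f"mean = {diff_mean}, sd = {diff_sd:.4f}, P(difference > 4) = {p_exceeds_4:.4f}")
```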
Summary
• The probability distribution of a statistic is called its
sampling distribution. For example, the sampling
distribution of the sample mean is normal when the
population is normal, and approximately normal for
large n otherwise.
• The simplest form of the central limit theorem states
that the sum of n independently distributed random
variables tends to be normally distributed as n becomes
large. It is a necessary and sufficient condition that
none of the variances of the individual random
variables are large in comparison to their sum.
Summary
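• Sampling Distribution of the Mean

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

• Approximate Sampling Distribution of a Difference in
Sample Means

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$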
References
• Montgomery and Runger. Applied Statistics and
Probability for Engineers, 5th Ed. © 2011
• Walpole, et al. Probability and Statistics for Engineers
and Scientists 9th Ed. © 2012, 2007, 2002
