Sampling Distribution and Estimation

1) The document discusses sampling distributions and how they relate to estimating parameters from a population based on a sample. It provides examples of common sampling distributions like t, chi-square, and F distributions. 2) A manager conducted a survey of 40 customers to estimate the average amount spent on pizza and proportion of youth customers after a promotional campaign. She wants to use the data to calculate 95% confidence intervals for each estimate and test if the values have increased. 3) The central limit theorem states that as sample size increases, the sampling distribution of the sample mean approaches a normal distribution, allowing inferences to be made about the population mean.

Uploaded by

AKSHAY NANGIA


Sampling Distribution and Estimation

[Diagram: A sample summary (statistic/estimator) is used to make inference about the population summary (parameter). A sampling distribution (Z, t, chi-square, F) is used to make inference about the population distribution (binomial, Poisson, normal).]
Case
A restaurant chain has made improvements to its pizzas by
including a new soft and tasty crust, more and bigger toppings,
more cheese and a new heavier imported tomato sauce. The
chain makes a promotional campaign by offering the product to
its customers at half price expecting a large footfall of youths
and families in the outlets. The offer is limited to one month.

The manager of one of the chain's outlets wants to measure the
effectiveness of the campaign. She has been running the restaurant
for the last ten years and claims from her experience that 75% of
its customers are youth and that customers spend an average of
INR 350 on pizza.
Case
In an attempt to measure the persisting impact of the marketing campaign
on the amount that customers spend on pizza, and on the proportion of
youths who visit the restaurant, the manager conducts a survey of 40 of her
pizza customers. The survey is carried out two months after the promotion
ends, to eliminate bias in the experiment. The survey reveals that 32
(80%) of the customers are youths, and that a customer spends an average of
INR 375 on pizza. Moreover, the estimated standard deviation is
found to be INR 50.
The manager wants to use the above data to
a) obtain 95% confidence interval for average amount spent on pizza.
b) obtain 95% confidence interval for proportion of youth customers
ordering pizza.
c) test whether the average amount spent has increased due to the
campaign.
d) test whether the proportion of youths buying pizza has increased.
Example

• Mean amount spent is a random variable

• Proportion of youth customers ordering pizza is a random
variable
Concept of sampling distribution

Suppose you select all possible random samples of
customers. Each of those samples will yield a value
of the average amount spent ($\bar{x}$). If you construct a
histogram of those values, what you get is
precisely the sampling distribution of the mean
amount spent.
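The idea of "all possible samples" can be made concrete on a very small population. The sketch below is only an illustration: the five amounts and the sample size of 2 are hypothetical values, not data from the case, and samples are drawn without replacement.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of amounts spent (INR); not data from the case.
population = [250, 300, 350, 400, 450]
sample_size = 2

# Every possible sample of the given size (without replacement) and its mean.
sample_means = [mean(sample) for sample in combinations(population, sample_size)]

# Tally how often each sample mean occurs: this is the sampling distribution.
sampling_distribution = {}
for m in sample_means:
    sampling_distribution[m] = sampling_distribution.get(m, 0) + 1

# Text histogram of the sampling distribution of the mean.
for value in sorted(sampling_distribution):
    print(f"{value:6.1f}  {'*' * sampling_distribution[value]}")

print("population mean:     ", mean(population))
print("mean of sample means:", mean(sample_means))
```

With these values there are 10 possible samples, and the mean of the sample means coincides with the population mean.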
Sampling Distributions
• t-distribution
• Chi-square distribution
• F distribution
[Figure: the t distribution with various degrees of freedom]
Chi-square
The chi-square distribution depends on one parameter, its degrees of freedom (df
or ν). As df gets large, the curve becomes less skewed and more nearly normal.
F Distribution
Car Mileage Case
Hybrid and electric cars are a vital part of reducing the US's gasoline consumption.
The most effective way to conserve gasoline is to design gasoline-powered cars that are
more fuel efficient. Virtually every gasoline-powered midsize car equipped with
an automatic transmission has an EPA combined city and highway mileage estimate of
26 miles/gallon or less. Suppose that the government has decided to offer a tax credit to
any automaker selling a midsize model which achieves an EPA estimate of at least 31 mpg.

Consider an automaker that has recently introduced a new midsize model and wants to
show that this model qualifies for the tax credit. Consider the population of all cars of
this type that will or could potentially be produced. The automaker will choose a sample
of 50 of these cars. The manufacturer's production operation runs 8-hour shifts, with 100
midsize cars produced on each shift. When all start-up problems have been
corrected, the automaker selects 1 car at random from each of 50 shifts, and these cars are
subjected to the EPA test.
Sampling Distribution of the Sample Mean

The sampling distribution of the sample mean is the
probability distribution of the population of the sample
means obtainable from all possible samples of size n from a
population.
Example: The Population of Sample Means
Example: A Graph of the Probability
Distribution
Standard Error

• Variation in the values of a statistic from sample to
sample is called sampling fluctuation and is
measured by the STANDARD ERROR
Sampling Distribution of Mean

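$E(\bar{x}) = \mu$
$\hat{\mu} = \bar{x}$

$\text{s.e.}(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$

As sample size increases, standard error decreases.

Result

If $X \sim N(\mu, \sigma^2)$, then $\bar{x} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$.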
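The standard error formula can be traced back to the independence of the observations. A short derivation, assuming $X_1, \dots, X_n$ are independent draws from a population with mean $\mu$ and variance $\sigma^2$ (the indexed notation $X_i$ is introduced here for illustration):

```latex
\begin{aligned}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
            = \frac{n\,\sigma^{2}}{n^{2}} = \frac{\sigma^{2}}{n}, \\
\text{s.e.}(\bar{x}) &= \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

The factor $1/\sqrt{n}$ is why quadrupling the sample size only halves the standard error.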
Example
The foreman of a bottling plant has observed that the amount
of soda in each “32-ounce” bottle is actually a normally
distributed random variable, with a mean of 32.2 ounces and
a standard deviation of .3 ounce.

If a customer buys one bottle, what is the probability that the

bottle will contain more than 32 ounces?
Example

We want to find P(X > 32), where X is normally distributed with µ = 32.2 and σ = 0.3.

$P(X > 32) = P\!\left(\dfrac{X - \mu}{\sigma} > \dfrac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 1 - 0.2514 = 0.7486$

“There is about a 75% chance that a single bottle of soda contains more than
32 oz.”
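The same probability can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Because the code does not round z to −0.67, the result differs slightly in the third decimal place from the table-based 0.7486.

```python
from statistics import NormalDist

mu, sigma = 32.2, 0.3                  # mean and standard deviation from the example
bottle = NormalDist(mu=mu, sigma=sigma)

# P(X > 32) = 1 - P(X <= 32)
p_more_than_32 = 1 - bottle.cdf(32)
print(f"P(X > 32) = {p_more_than_32:.4f}")
```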
Example

The foreman of a bottling plant has observed that the amount

of soda in each “32-ounce” bottle is actually a normally
distributed random variable, with a mean of 32.2 ounces and
a standard deviation of .3 ounce.

If a customer buys a carton of four bottles, what is the

probability that the mean amount of the four bottles will be
greater than 32 ounces?
Example

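We want to find P($\bar{X}$ > 32), where X is normally distributed
with µ = 32.2 and σ = 0.3.

Things we know:
X is normally distributed, therefore so is $\bar{X}$.
$\mu_{\bar{x}} = \mu = 32.2$ oz.
$\sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.3/\sqrt{4} = 0.15$ oz.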
Example

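If a customer buys a carton of four bottles, what is the probability that
the mean amount of the four bottles will be greater than 32 ounces?

$P(\bar{X} > 32) = P\!\left(Z > \dfrac{32 - 32.2}{0.3/\sqrt{4}}\right) = P(Z > -1.33) = 1 - 0.0918 = 0.9082$

“There is about a 91% chance the mean of the four bottles will exceed
32 oz.”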
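For the carton of four bottles, the only change is that the spread is the standard error σ/√n rather than σ. A minimal sketch with the standard library follows; the variable names are illustrative, and the unrounded z-value gives a result that differs slightly in the third decimal place from a table lookup at z = −1.33.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32.2, 0.3, 4
standard_error = sigma / sqrt(n)        # 0.3 / 2 = 0.15

# Sampling distribution of the mean of four bottles.
carton_mean = NormalDist(mu=mu, sigma=standard_error)

p_mean_more_than_32 = 1 - carton_mean.cdf(32)
print(f"P(mean of 4 bottles > 32) = {p_mean_more_than_32:.4f}")
```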
[Figure: two normal curves with mean = 32.2. Left panel: what is the probability that one bottle will contain more than 32 ounces? Right panel: what is the probability that the mean of four bottles will exceed 32 oz?]
Central Limit Theorem (CLT)

If a random sample of size n is drawn from a
population with mean µ and standard deviation σ,
the distribution of the sample mean ($\bar{x}$) approaches a
normal distribution with mean µ and standard
deviation $\sigma/\sqrt{n}$ as the sample size (n) increases,

i.e. $\bar{x} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ approximately, for large n.

If the population is normal, the distribution of the
sample mean is normal regardless of sample size.
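A small simulation can make the theorem visible. The sketch below is an illustration under stated assumptions: the population is exponential with mean 1 (chosen only because it is strongly skewed), and the helper names `skewness` and `simulate_sample_means` are hypothetical. Skewness near 0 indicates a symmetric, more normal-looking distribution.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

def simulate_sample_means(n, repetitions=2000):
    """Means of `repetitions` samples of size n from an exponential population (mean 1)."""
    return [fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(repetitions)]

skew_by_n = {}
for n in (2, 10, 50):
    means = simulate_sample_means(n)
    skew_by_n[n] = skewness(means)
    print(f"n={n:3d}  mean={fmean(means):.3f}  "
          f"sd={pstdev(means):.3f}  skewness={skew_by_n[n]:.3f}")
```

As n grows, the printed standard deviation should shrink roughly like 1/√n and the skewness should move towards 0.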
WHY CLT IS USEFUL

• When the sampling distribution of $\bar{x}$ is approximately
normal, we can use the Empirical rule to predict how
close sample means will be to the true population
mean.
• Since the CLT holds for a large number of population
distributions, it helps us to make inferences about the
population means regardless of the shape of the
population distribution. This is often helpful in practice
since we usually do not know the true shape of the
population distribution (and often it is skewed).
How Large?

• How large is “large enough?”

• If the sample size is at least 30, then for most
populations, the sampling distribution of sample
means is approximately normal
• For skewed distributions, a sample size of 50 or more may be needed
• For heavy-tailed distributions, even more may be needed (100 or more)
• If the population is normal, then the sampling
distribution of sample mean is normal regardless of
the sample size
Data Analysis

Mean 31.56
Standard Error 0.112812
Median 31.55
Mode 31.4
Standard Deviation 0.797701
Sample Variance 0.636327
Kurtosis -0.51125
Skewness -0.03422
Range 3.5
Minimum 29.8
Maximum 33.3
Sum 1578
Count 50
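Several entries in this output are linked by simple formulas, which gives a quick consistency check. The sketch below only re-derives values from the numbers listed above; the variable names are illustrative.

```python
from math import sqrt

# Values copied from the descriptive-statistics output above.
total, count = 1578, 50
standard_deviation = 0.797701

sample_mean = total / count                          # Sum / Count
sample_variance = standard_deviation ** 2            # s squared
standard_error = standard_deviation / sqrt(count)    # s / sqrt(n)

print(f"mean           = {sample_mean:.2f}")
print(f"variance       = {sample_variance:.6f}")
print(f"standard error = {standard_error:.6f}")
```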
How to estimate parameters?

Already seen that

$\hat{\mu} = \bar{x}$ (use the sample mean to estimate the population mean)

For the car mileage case, sample mean = 31.56
How to estimate population standard deviation σ?

$\hat{\sigma} = s = \text{sample sd} = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
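Both point estimates are easy to compute by hand or with the standard library. In the sketch below the five mileage readings are hypothetical values used only for illustration; the manual formula with divisor n − 1 is compared against `statistics.stdev`, which uses the same divisor.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical mileage readings (mpg), for illustration only.
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

n = len(mileages)
x_bar = mean(mileages)                                 # point estimate of mu
sum_of_squares = sum((x - x_bar) ** 2 for x in mileages)
s = sqrt(sum_of_squares / (n - 1))                     # point estimate of sigma

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}")
print("matches statistics.stdev:", abs(s - stdev(mileages)) < 1e-9)
```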

Note: These estimates are point estimates and may not be perfect.

Use “Interval Estimates”

Confidence Intervals
Interval Estimate = Point Estimate ± Margin of Error

Margin of Error = (point of the sampling distribution) × (standard error)
Confidence Intervals for a Mean: σ Known

• A confidence interval for a population mean is an
interval constructed around the sample mean so that we
are reasonably sure that it contains the population
mean
• Any confidence interval is based on a confidence
level
Elements of Interval Estimation

[Diagram: a confidence interval drawn around the sample statistic (e.g. the sample mean), running from the lower confidence limit to the upper confidence limit. The confidence level is the probability that the population parameter falls somewhere within the interval.]

The Car Mileage Case

• Automaker conducted mileage tests on n=50 cars

• Sample mean is 31.56
• This is a point estimate of the population mean
• Do not know how good this estimate is
• Will use a confidence interval
The Car Mileage Case

• There are many possible samples of 50 cars
• Each would give a different mean
• Consider the probability distribution of all the
sample means
• Called the sampling distribution

$\hat{\mu} = \bar{x}$
$\text{se}(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$
The Car Mileage Case

1. Because the sampling distribution of the sample mean is
a normal distribution, we can use the normal distribution
to compute probabilities about the sample mean
2. The 95 percent confidence interval is

$\left[\bar{x} \pm 1.96\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}\right]$
What is happening?
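[Figure: Sampling Distribution of the Mean. A normal curve f(x) with 95% of the area in the centre and 2.5% in each tail, with the limits marked at a distance of $1.96\,\sigma/\sqrt{n}$ on either side of the centre. Of the sample means $\bar{x}$ plotted beneath the curve, 95% fall within the interval, 2.5% fall below the interval and 2.5% fall above the interval.]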
2/6/2021 Statistical Inference
Generalizing

• The probability that the confidence interval will contain the
population mean μ is denoted by 1 – α
• 1 – α is referred to as the confidence coefficient
• (1 – α) × 100% is called the confidence level
• It is usual to state 1 – α as a probability to two decimal places
• Here, focus on 1 – α = 0.95 or 0.99
General Confidence Interval

• In general, the probability is 1 – α that the population
mean μ is contained in the interval

$\left[\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right]$

• The normal point $z_{\alpha/2}$ gives a right-hand tail area under
the standard normal curve equal to α/2
• The normal point $-z_{\alpha/2}$ gives a left-hand tail area under
the standard normal curve equal to α/2
• The area under the standard normal curve between $-z_{\alpha/2}$
and $z_{\alpha/2}$ is 1 – α
General Confidence Interval

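• If a population has standard deviation σ (known),
• and if the population is normal or if the sample size is large (n ≥ 30), then …
• … a (1 – α)100% confidence interval for μ is

$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = \left[\bar{x} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right]$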
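The general interval translates directly into a few lines of code. This is a sketch rather than a definitive implementation: the function name `z_confidence_interval` and the example inputs (a sample mean of 100 from 36 observations, with σ = 12 assumed known) are hypothetical. The normal point is obtained from `statistics.NormalDist().inv_cdf`.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # normal point z_{alpha/2}
    margin = z * sigma / sqrt(n)              # z_{alpha/2} * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs, for illustration only.
lower, upper = z_confidence_interval(x_bar=100, sigma=12, n=36)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```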
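95% Confidence Interval

$\left[\bar{x} \pm z_{0.025}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}\right] = \left[\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right]$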
99% Confidence Interval

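• For 99% confidence, need the normal point $z_{0.005}$
• (1 – 0.99) / 2 = 0.005
• $z_{0.005}$ = 2.575
• The 99% confidence interval is

$\left[\bar{x} \pm z_{0.005}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm 2.575\,\dfrac{\sigma}{\sqrt{n}}\right] = \left[\bar{x} - 2.575\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 2.575\,\dfrac{\sigma}{\sqrt{n}}\right]$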
The Effect of α on Confidence Interval Width
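The effect of α on the width can be seen by computing the normal points for the two confidence levels discussed here. In this sketch the helper name `interval_width` and the values σ = 1 and n = 50 are hypothetical, chosen only to compare widths.

```python
from math import sqrt
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Full width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

sigma, n = 1.0, 50                      # hypothetical values, for comparison only

z_95 = NormalDist().inv_cdf(0.975)      # normal point for 95% confidence
z_99 = NormalDist().inv_cdf(0.995)      # normal point for 99% confidence
width_95 = interval_width(sigma, n, 0.95)
width_99 = interval_width(sigma, n, 0.99)

print(f"z_0.025 = {z_95:.3f}, width = {width_95:.3f}")
print(f"z_0.005 = {z_99:.3f}, width = {width_99:.3f}")
```

A smaller α (higher confidence) gives a larger normal point and therefore a wider interval for the same data.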
t-Based Confidence Intervals for a Mean:
σ Unknown

• If σ is unknown (which is usually the case), we can construct a
confidence interval for μ based on the sampling distribution of

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

• If the population is normal, then for any sample size n, this sampling
distribution is called the t distribution
The t Distribution

• The curve of the t distribution is similar to that of the
standard normal curve
• Symmetrical and bell-shaped
• The t distribution is more spread out than the standard
normal distribution
• The spread of the t distribution depends on the number of degrees of
freedom, which is determined by the sample size
• Denoted by df
• For a sample of size n, there is one fewer degree of
freedom, that is, df = n – 1
Degrees of Freedom and the
t-Distribution

As the number of degrees of freedom increases, the spread
of the t distribution decreases and the t curve approaches
the standard normal curve.
t and Right Hand Tail Areas

• Use a t point denoted by tα

• tα is the point on the horizontal axis under the t curve that
gives a right hand tail equal to α
• So the value of tα in a particular situation depends on the
right hand tail area α and the number of degrees of freedom
• df = n – 1
• 1 – α is the specified confidence coefficient
t and Right Hand Tail Areas
t-Based Confidence Intervals for a Mean:
σ Unknown

• If the sampled population is normally distributed with
mean μ, then a (1 – α)100% confidence interval for μ is

$\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

• $t_{\alpha/2}$ is the t point giving a right-hand tail area of α/2
under the t curve having n – 1 degrees of freedom
Car Mileage estimation:
• Recall from the previous example, $\bar{x}$ = 31.56 mpg
for a sample of size n = 50 and s = 0.8

$\hat{\sigma}_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{0.8}{\sqrt{50}} = 0.113$

$t_{0.025,\,49} = 2.010$
Car Mileage: 95% Confidence interval of mean mileage

$\bar{x} \pm t_{\alpha/2;\,n-1}\,\dfrac{s}{\sqrt{n}} = 31.56 \pm (2.010 \times 0.113) = 31.56 \pm 0.22713$

95% CI of μ: [31.33, 31.79]
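The arithmetic can be reproduced in a few lines. The Python standard library has no t quantile function, so this sketch takes the t point 2.010 as a table value, as the slides do; the variable names are illustrative.

```python
from math import sqrt

x_bar, s, n = 31.56, 0.8, 50
t_point = 2.010                      # t_{0.025, 49}, taken from a t table

standard_error = s / sqrt(n)         # estimated standard error of the mean
margin = t_point * standard_error    # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error = {standard_error:.3f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```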
Practice Problem 1:
• A manufacturer of light bulbs claims that its light bulbs have a mean life of μ hours
with a standard deviation of 85 hours. A random sample of 40 such bulbs is
selected for testing. If the sample produces a mean value of 1505 hours, find the
95% confidence interval of μ.
Solution: Given n = 40 (large), σ = 85 (known), 1 – α = 0.95, α = 0.05,

$\bar{x} = 1505$
$z_{\alpha/2} = z_{0.025} = 1.96$

The 95% CI of μ is given by

$\left[1505 - 1.96\,\dfrac{85}{\sqrt{40}},\ 1505 + 1.96\,\dfrac{85}{\sqrt{40}}\right] = [1478.66,\ 1531.34]$
Practice Problem 2:
• Waiting times at a popular restaurant, recorded for a sample of 50 customers,
have a mean of 1.52 hours and a standard deviation of 2.25 hours.
Construct the 99% confidence interval for the population mean.
Solution: Given n = 50 (large), s = 2.25 (estimated), 1 – α = 0.99, α = 0.01,

$z_{\alpha/2} = z_{0.005} = 2.58$
$\bar{x} = 1.52$

Therefore, the 99% CI of μ is given by

$\left[1.52 - 2.58\,\dfrac{2.25}{\sqrt{50}},\ 1.52 + 2.58\,\dfrac{2.25}{\sqrt{50}}\right] = [0.70,\ 2.34]$

Use the t-based confidence interval and observe the difference (assuming a normal
population).
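One way to carry out the suggested comparison is sketched below. The z point 2.58 is the one used in the solution; the t point 2.680 for 49 degrees of freedom is an assumption read from a standard t table and is not given in the slides.

```python
from math import sqrt

x_bar, s, n = 1.52, 2.25, 50
standard_error = s / sqrt(n)

z_point = 2.58     # z_{0.005}, as used in the solution
t_point = 2.680    # t_{0.005, 49} from a t table (assumed value)

z_interval = (x_bar - z_point * standard_error, x_bar + z_point * standard_error)
t_interval = (x_bar - t_point * standard_error, x_bar + t_point * standard_error)

print(f"z-based 99% CI: [{z_interval[0]:.2f}, {z_interval[1]:.2f}]")
print(f"t-based 99% CI: [{t_interval[0]:.2f}, {t_interval[1]:.2f}]")
```

Because the t point is larger than the z point, the t-based interval is slightly wider; with n = 50 the difference is small.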
