Lecture7 - Sampling Distribution - 0930
Yunduan Lin
Assistant Professor
Department of Decisions, Operations and Technology
CUHK Business School
Agenda
o Either A or B = Union (the case where both A and B happen is counted only once)
o Both A and B = Intersection
Homework 1 – 3(e)
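▪ 10% of those who do not have the disease will get a positive result: P(positive | no disease) = 0.10
   Fact: no disease. What we care about: positive result.
▪ The probability that a person has the disease given a positive report: P(disease | positive)
   What we care about: disease. Fact: positive report.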
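A minimal sketch of how the two conditional probabilities are connected by Bayes' rule. Only the 10% false positive rate comes from the problem statement above; the function name `posterior_given_positive`, the prior, and the sensitivity are hypothetical placeholders, not the homework's values.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive) via Bayes' rule."""
    p_pos_and_disease = prior * sensitivity                  # P(positive and disease)
    p_pos_and_healthy = (1 - prior) * false_positive_rate    # P(positive and no disease)
    return p_pos_and_disease / (p_pos_and_disease + p_pos_and_healthy)

# false_positive_rate = 0.10 is P(positive | no disease) from the problem statement.
# prior and sensitivity are HYPOTHETICAL placeholders chosen only for illustration.
posterior = posterior_given_positive(prior=0.05, sensitivity=0.90, false_positive_rate=0.10)
print(round(posterior, 4))
```

The point of the sketch is the direction of conditioning: the inputs condition on the disease status, while the output conditions on the test result.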
Quiz 1 - 1
Combinations (true or false):
Bernoulli
o Binary outcome
Binomial
o Count of successes over repeated independent discrete trials
Poisson
o Count of events over a continuous time interval
o Euler's number e ≈ 2.718
o Binomial approaches Poisson when n is very large and p is very small
o Can be used to approximate the binomial and is easy to calculate, because it has only one parameter
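A short sketch comparing the two probability mass functions when n is large and p is small. The values n = 1000 and p = 0.003 are illustrative assumptions, and the Poisson parameter is set to λ = np.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003      # illustrative values: large n, small p
lam = n * p             # matching Poisson parameter

# Largest pointwise gap between the two PMFs over k = 0, ..., 20
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```

The binomial needs two parameters (n, p), while the Poisson needs only λ.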
Recap - Continuous Random Variable
Mean Variance PDF
Exponential
o Time between independent random events
o Poisson: event count -> exponential: time between events
o Memoryless property
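The memoryless property says P(T > s + t | T > s) = P(T > t). A quick numerical check, assuming an exponential waiting time with an illustrative rate of 0.5 and illustrative times s = 2, t = 3:

```python
from math import exp

def survival(t, rate):
    """P(T > t) for an exponential waiting time with the given rate."""
    return exp(-rate * t)

rate, s, t = 0.5, 2.0, 3.0    # illustrative values

conditional = survival(s + t, rate) / survival(s, rate)   # P(T > s + t | T > s)
unconditional = survival(t, rate)                          # P(T > t)
print(conditional, unconditional)
```

Having already waited s units does not change the distribution of the remaining waiting time.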
Normal
Population and Sample
Population
Sample
o Subset of population
Goal of Inference
Population
Sample
o an observation from population
Random Sample
X1, …, Xn is a simple random sample if
o X1, …, Xn are independent random variables, and
o X1, …, Xn follow the same probability function: P(x) (probability mass function) or f(x) (probability density function)
Simple Random Sample - Property
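o For every observation Xi, E[Xi] = μ and Var(Xi) = σ², where μ and σ² are the population mean and variance
A simple random sample in fact has an even stronger property:
o Each observation follows the same distribution as the population
o This includes all summary statistics (mean, variance, and so on)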
Other Sampling Methods
o Online surveys likely exclude seniors who do not use internet often
o Samples from offline surveys are likely to be dependent due to geographical correlation (e.g.,
economic condition, location preference)
o Advanced sampling method to reduce sampling error: stratified sampling. Divide the population into
subgroups (strata), draw a simple random sample within each subgroup, and combine the subgroup
results as a weighted average
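A minimal sketch of stratified sampling as described above. The function name `stratified_mean`, the two age groups, and their values are hypothetical, and each subgroup is weighted by its share of the population (one common choice, assumed here).

```python
import random

def stratified_mean(strata, n_per_stratum, rng):
    """Weighted average of within-stratum sample means.

    strata: dict mapping a subgroup name to the list of values in that subgroup.
    Each subgroup's weight is its share of the whole population.
    """
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    for values in strata.values():
        sample = rng.sample(values, n_per_stratum)    # random sample without replacement
        estimate += (len(values) / total) * (sum(sample) / n_per_stratum)
    return estimate

rng = random.Random(0)    # fixed seed so the run is repeatable
# HYPOTHETICAL population: two age groups with different typical values.
population = {
    "young": [rng.gauss(20, 2) for _ in range(8000)],
    "senior": [rng.gauss(50, 2) for _ in range(2000)],
}
true_mean = sum(sum(values) for values in population.values()) / 10000
estimate = stratified_mean(population, n_per_stratum=100, rng=rng)
print(true_mean, estimate)
```

Sampling within each subgroup guarantees that a small group (here, seniors) is represented, which a convenience sample such as an online survey may fail to do.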
Statistics - Definition
Statistics:
o Data summary
o Data reduction (simplification)
Sample mean: X̄ = (X1 + … + Xn)/n
It is a useful estimate of the population mean
Intuition:
o If we sample many times, the average of all sample means approaches the population mean
Amy rolls a die 5 times; Charlie rolls a die 10 times
o Population mean (values 1, 2, 3 equally likely): μ = (1 + 2 + 3)/3 = 2
o Consider a sample of size 1: the sample mean can be one of {1, 2, 3} with equal probability.
Expectation of the sample mean for size 1 is (1 + 2 + 3)/3 = 2
o Consider a sample of size 2: the sample mean can be one of the following 9 results with equal
probability. Expectation of the sample mean for size 2 is 18/9 = 2
x1\x2 1 2 3
1 1 1.5 2
2 1.5 2 2.5
3 2 2.5 3
The sample size can be larger, even larger than 3, and there are then more possibilities.
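The table can be extended to larger sample sizes by brute force. A small sketch, assuming the population {1, 2, 3} with equal probabilities from the example; the function name `expected_sample_mean` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def expected_sample_mean(values, n):
    """Average the sample mean over all equally likely ordered samples of size n."""
    samples = list(product(values, repeat=n))          # 3**n samples for this population
    means = [Fraction(sum(sample), n) for sample in samples]
    return sum(means) / len(means)

population = [1, 2, 3]
results = {n: expected_sample_mean(population, n) for n in range(1, 6)}
print(results)
```

Exact fractions are used so that the comparison with the population mean is not blurred by rounding.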
Sample Mean – Expectation Proof
x1\x2 1 2 3
1 1 1.5 2
2 1.5 2 2.5
3 2 2.5 3
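A sketch of the standard argument, assuming X1, …, Xn is a simple random sample with E[Xi] = μ and using only linearity of expectation (the slide's own derivation was not preserved in this text):

```latex
\begin{aligned}
\mathbb{E}[\bar{X}]
  &= \mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
   = \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}[X_i]
   = \frac{1}{n}\cdot n\mu
   = \mu .
\end{aligned}
```

In the table, this is why the 9 equally likely sample means average to 2, the population mean.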
Sample Mean – Variance Proof
Transformation of variance: Var(aX + b) = a² Var(X)
o As Var(X̄) = σ²/n → 0 when n gets larger, the sample mean is eventually very close to the population
mean, that is, X̄ → μ
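A sketch of the usual variance derivation, assuming X1, …, Xn are independent with Var(Xi) = σ² (the slide's own steps were not preserved in this text):

```latex
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right)
   && \text{since } \operatorname{Var}(aX) = a^2\operatorname{Var}(X) \\
  &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   && \text{since the } X_i \text{ are independent} \\
  &= \frac{1}{n^2}\cdot n\sigma^2
   = \frac{\sigma^2}{n}.
\end{aligned}
```

Independence is what allows the variance of the sum to split into a sum of variances.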
Law of Large Numbers
Let X1, …, Xn be a random sample from a distribution with mean μ and variance σ². Then X̄ → μ as n → ∞.
Or more rigorously, for any ε > 0, P(|X̄ − μ| ≥ ε) → 0 as n → ∞.
Loosely speaking, when the sample size is large, variation disappears and the sample mean approaches
the population mean. In other words, with a larger sample, the sample mean is closer to the population
mean, and it can be made as close as we want.
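A small simulation sketch of this statement, assuming a fair six-sided die as the population (mean 3.5). The function name `sample_mean_of_die` and the seed are arbitrary choices.

```python
import random

def sample_mean_of_die(n, rng):
    """Mean of n independent rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

rng = random.Random(42)       # fixed seed so the run is repeatable
population_mean = 3.5         # (1 + 2 + ... + 6) / 6

for n in (10, 100, 10_000, 100_000):
    x_bar = sample_mean_of_die(n, rng)
    print(n, x_bar, abs(x_bar - population_mean))
```

The gap in the last column tends to shrink as n grows, although any single run can fluctuate.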
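Law of Large Numbers – Proof
Markov inequality
Consider a nonnegative random variable Y; then for all t > 0, P(Y ≥ t) ≤ E[Y]/t
Chebyshev inequality
Consider Y = (X̄ − μ)²; then by Markov inequality, for any ε > 0,
P(|X̄ − μ| ≥ ε) = P((X̄ − μ)² ≥ ε²) ≤ E[(X̄ − μ)²]/ε² = Var(X̄)/ε² = σ²/(nε²)
Taking the limit n → ∞ on both sides, we arrive at the law of large numbers.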
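A numerical sanity check of the bound σ²/(nε²), sketched for a fair coin (Bernoulli with p = 0.5, so σ² = 0.25). The tolerance ε = 0.125 and the function names are illustrative assumptions; the tail probability is computed directly from the binomial distribution.

```python
from math import comb

def tail_probability(n, eps, p=0.5):
    """P(|X_bar - p| >= eps) when X_bar is the mean of n Bernoulli(p) draws."""
    total = 0.0
    for k in range(n + 1):                 # k = number of successes
        if abs(k / n - p) >= eps:
            total += comb(n, k) * p**k * (1 - p)**(n - k)
    return total

def chebyshev_bound(n, eps, p=0.5):
    """sigma^2 / (n * eps^2), with sigma^2 = p(1 - p) for one Bernoulli(p) draw."""
    return p * (1 - p) / (n * eps**2)

eps = 0.125    # illustrative tolerance
for n in (10, 100, 1000):
    print(n, tail_probability(n, eps), chebyshev_bound(n, eps))
```

The bound is loose (it can even exceed 1 for small n), but it goes to 0 as n grows, which is all the proof needs.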
Sample Mean – Large Samples
o But, sample mean itself is still a random variable. What is the distribution function of sample
mean when n becomes larger?
The distribution of sample mean RATHER THAN the distribution of a sample itself
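Let's look at the CDF of the sample mean for different sample sizes.
Central limit theorem: when n is large, X̄ is approximately N(μ, σ²/n), or equivalently (X̄ − μ)/(σ/√n) is approximately N(0, 1).
No matter what the population distribution is (as long as its variance is finite), we can use a normal
distribution to approximate the distribution of the sample mean with size 100.
o Population mean: μ
o Population variance: σ²
X̄ is approximately N(μ, σ²/100), or equivalently (X̄ − μ)/(σ/10) is approximately N(0, 1).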
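A simulation sketch of the normal approximation for samples of size 100. The exponential population with rate 1 (mean 1, standard deviation 1), the number of repetitions, and the seed are all illustrative assumptions; the population is deliberately skewed.

```python
import random
import statistics

rng = random.Random(7)        # fixed seed so the run is repeatable
n, repetitions = 100, 5000
mu, sigma = 1.0, 1.0          # exponential population with rate 1: mean 1, standard deviation 1

# Draw many samples of size n and record each sample mean
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

standard_error = sigma / n**0.5     # sigma / sqrt(n) = 0.1
# If the sample mean is roughly normal, about 95% of sample means fall within 1.96 standard errors
within = sum(abs(m - mu) <= 1.96 * standard_error for m in sample_means) / repetitions
print(statistics.fmean(sample_means), statistics.stdev(sample_means), within)
```

The three printed numbers can be compared with μ = 1, σ/√n = 0.1, and 0.95 respectively.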
Let X follow a binomial distribution with n = 100 and p = 0.6. What is the probability that X is less
than 55?
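One way to check an answer, sketched below: compute P(X < 55) = P(X ≤ 54) exactly from the binomial PMF and compare it with the normal approximation N(np, np(1 − p)) = N(60, 24). The continuity-corrected variant is an addition for comparison and may not be the method intended in the lecture.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

n, p = 100, 0.6
mean, sd = n * p, sqrt(n * p * (1 - p))     # 60 and sqrt(24)

# Exact: P(X < 55) = P(X <= 54), summing the binomial PMF over k = 0, ..., 54
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(55))

approx_plain = normal_cdf(55, mean, sd)          # normal approximation evaluated at 55
approx_corrected = normal_cdf(54.5, mean, sd)    # with continuity correction (an addition here)
print(exact, approx_plain, approx_corrected)
```

Because X is discrete, evaluating the normal CDF at 54.5 rather than 55 usually tracks the exact value more closely.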
https://docs.google.com/forms/d/e/1FAIpQLSfsEgnMFLypI_KW6GF7j_FXtVY5E4Jrmf2P_BDwaG8GXWDc0A/viewform?usp=sf_link