- Module 4-Sampling 2
             Parameters   Statistics
Vary         No           Yes
Calculated   No           Yes
Sampling Distribution
A sampling distribution acts as a frame of reference for
statistical decision making.
It is a theoretical probability distribution of the possible values of some sample
statistic that would occur if we were to draw all possible samples of a fixed size
from a given population.
The sampling distribution allows us to determine whether, given the variability
among all possible sample means, the one we observed is a common outcome or a
rare outcome.
Imagine that each one of you asks a random sample of 10 people
in this class what their height is.
You each calculate the average height of your sample to get the
sample mean.
When you report back, would you expect all of your sample
means to be the same?
How much would you expect them to differ?
Sampling Distribution of the Mean
• Random samples rarely represent the underlying population exactly. We rely
on sampling distributions to judge whether the sample we’ve observed is a
common or a rare outcome.
• Sampling distribution of the mean: the probability distribution of the means
of ALL possible random samples OF A GIVEN SIZE from some population. It
describes the behavior of the sample mean.
• ALL possible samples is a lot!
• Example: the number of possible samples of size 5 from a class of 90 is
C(90, 5) = 43,949,268.
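The count in the example is the binomial coefficient "90 choose 5". A minimal check in Python, assuming the samples are drawn without replacement and the order of selection does not matter:

```python
import math

# Number of distinct samples of size 5 from a class of 90
# (assumption: sampling without replacement, order ignored).
n_samples = math.comb(90, 5)
print(n_samples)
```

Changing either argument shows how quickly the number of possible samples grows with class size or sample size.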
Sampling Distribution of the Mean
The value of the sample mean X̄ varies from one random sample to
another.
- The value of X̄ is random and it depends on the random
sample.
- The sample mean X̄ is a random variable.
- The probability distribution of X̄ is called the sampling
distribution of the sample mean X̄.
- Questions:
What is the sampling distribution of the sample mean X̄?
What is the mean of the sample mean X̄?
What is the variance of the sample mean X̄?
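For reference, the standard textbook answers to the last two questions can be written as follows. This sketch assumes the sample consists of n independent observations from a population with mean μ and variance σ² (symbols introduced here for illustration):

```latex
\[
E(\bar{X}) = \mu, \qquad
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}, \qquad
\operatorname{SD}(\bar{X}) = \frac{\sigma}{\sqrt{n}}
\]
```

If the population itself is normal, the sample mean is exactly normal with this mean and variance.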
Sampling Distribution of the Mean
• A population of 10 people with $0–$9
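A small population like this one makes it possible to list every sample. The sketch below assumes the 10 people hold $0, $1, ..., $9 (one whole-dollar amount each) and uses a hypothetical sample size of 2; neither detail is confirmed by the slide. It enumerates all possible samples and compares the mean of the sample means with the population mean.

```python
from itertools import combinations

# Assumption: the 10 people hold $0, $1, ..., $9, one amount each.
population = list(range(10))
sample_size = 2  # hypothetical choice, for illustration only

# Mean of every possible sample of this size (drawn without replacement).
sample_means = [sum(s) / sample_size
                for s in combinations(population, sample_size)]

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(sample_means))                      # number of possible samples
print(population_mean, mean_of_sample_means)  # these two agree
print(min(sample_means), max(sample_means))   # individual means still vary
```

Individual sample means range widely, but their average matches the population mean, which is the behavior the sampling distribution describes.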
Sampling Distribution of the Mean
We use this result when sampling from a normal distribution with known
variance σ².
Some Results on the Sampling Distribution of X̄
Statistical Hypothesis
An assumption or statement about population parameters, expressed in
numerical form, is called a statistical hypothesis.
E.g., the mean height of Indian soldiers is 6 feet.
or = where s is given
Std deviation = 36 = 40
Paired t-test for difference of mean.
• n1 = n2 = n (the sample sizes are the same).
• The two samples are not independent (they are correlated).
• The sample observations are paired together.
Examples of related populations are:
1. Height of the father and height of his son.
2. Mark of the student in MATH and his mark in STAT.
3. Pulse rate of the patient before and after the medical treatment.
4. Hemoglobin level of the patient before and after the medical treatment.
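Test Statistic:
\[ t = \frac{\bar{D}}{S_D/\sqrt{n}}, \qquad \bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad S_D^2 = \frac{1}{n-1}\sum_{i=1}^{n} (D_i - \bar{D})^2 \]
Where
Di = Xi - Yi (i=1, 2, …, n)
Under Ho: μ1 = μ2, with normally distributed differences, t follows a
t distribution with n-1 degrees of freedom.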
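The paired statistic is computed from the differences Di alone. A minimal sketch using only the Python standard library; the function name `paired_t` and the data are hypothetical and are not taken from the worked example:

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t, df) for the paired t statistic on equal-length samples."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same size")
    d = [xi - yi for xi, yi in zip(x, y)]  # D_i = X_i - Y_i
    n = len(d)
    d_bar = mean(d)                        # mean of the differences
    s_d = stdev(d)                         # sample SD (divisor n - 1)
    t = d_bar / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements, for illustration only.
before = [11.0, 12.0, 13.0, 14.0, 15.0]
after = [10.0, 10.0, 10.0, 10.0, 10.0]
t_stat, df = paired_t(before, after)
print(t_stat, df)
```

The returned t is then compared with the tabulated t value for df = n - 1 at the chosen significance level.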
Do these data provide sufficient evidence to allow us to conclude that the diet
program is effective? Use α=0.05 and assume that the populations are normal.
Solution:
μ1 = the mean of weights before the diet program
μ2 = the mean of weights after the diet program
Hypotheses:
Ho: μ1 = μ2 (Ho: the diet program is not effective)
H1: μ1 ≠ μ2 (H1: the diet program is effective)
Degrees of freedom: df= ν= n-1 = 10-1=9
Significance level: α=0.05
Rejection Region of Ho:
Critical values: ±t 0.025 = ±2.262 (i.e., -2.262 and 2.262)
Critical Region: t < -2.262 or t > 2.262
Decision:
Since t= 2.43 ∈R.R., i.e., t=2.43 > 2.262,
we reject: Ho: μ1 = μ2 (the diet program is not effective) and
we accept: H1: μ1 ≠ μ2 (the diet program is effective)
Consequently, we conclude that the diet program is effective at
α=0.05.
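The decision rule used here can be sketched as a small function. The helper name `two_sided_decision` is hypothetical; the numbers are the ones quoted in the solution (t = 2.43, critical value 2.262).

```python
def two_sided_decision(t_value, critical_value):
    """Reject Ho when the statistic falls in either tail of the rejection region."""
    if abs(t_value) > critical_value:
        return "reject Ho"
    return "do not reject Ho"

# Values quoted in the worked example: t = 2.43, critical value 2.262 (df = 9).
decision = two_sided_decision(2.43, 2.262)
print(decision)
```

Taking the absolute value covers both tails at once, which matches the critical region t < -2.262 or t > 2.262.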
Chi-Square (χ²)
• Non-parametric test
• A measure of the difference between the observed and expected
frequencies of the outcomes of a set of events or variables.
• χ² depends on the size of the difference between the observed and expected
values, the degrees of freedom, and the sample size.
• Can be used to test whether two variables are related or independent from
one another or to test the goodness-of-fit between an observed distribution
and a theoretical distribution of frequencies.
• Can be applied only to categorical data, e.g., data grouped by gender,
marital status, inoculation status, age group, etc.
• Data to be presented in tabular form.
Statistical Inferences:
(Estimation and Hypotheses Testing)
It is the procedure by which we reach a conclusion about a population on the
basis of the information contained in a sample drawn from that population.
There are two main purposes of statistics:
• Descriptive Statistics: Organization & summarization of the data
• Statistical Inference: Answering research questions about some unknown
population parameters.
(1) Estimation: Approximating (or estimating) the actual values of the
unknown parameters:
- Point Estimate: A point estimate is a single value used to estimate the
corresponding population parameter.
- Interval Estimate (or Confidence Interval): An interval estimate
consists of two numerical values defining a range of values that most
likely includes the parameter being estimated with a specified degree of
confidence.
(2) Hypothesis Testing: Answering research questions about the unknown
parameters of the population (confirming or denying some conjectures or
statements about the unknown parameters).
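To make the difference between a point estimate and an interval estimate concrete, here is a minimal sketch of a confidence interval for a population mean. It assumes a normal population with known standard deviation; the function name `z_confidence_interval` and all input values are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval estimate for a population mean when sigma is known."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical inputs, for illustration only.
point_estimate = 50.0
lower, upper = z_confidence_interval(point_estimate, sigma=10.0, n=25)
print(point_estimate, (lower, upper))
```

The point estimate is the single value 50.0, while the interval estimate is the pair (lower, upper) centered on it.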
Hypothesis Testing Using Chi-Square (χ²)
1. Set up the Null Hypothesis (no significant difference between the observed
and expected values / no association between the attributes in question)
and the Alternate Hypothesis.
2. Identify the degrees of freedom: n-1 for goodness of fit (n = no. of
categories) or (r-1)(c-1) for a contingency table, where r = no. of rows,
c = no. of columns.
3. Test statistic: \[ \chi^2_c = \sum_i \frac{(O_i - E_i)^2}{E_i} \]
• where: c = degrees of freedom; O_i = observed value(s); E_i = expected
value(s)
4. Determine the critical value of χ² from the table.
5. Compare the calculated and tabulated values.
6. Make the decision: Ho is rejected if the calculated test statistic falls in
the rejection region; otherwise Ho is not rejected.
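The steps above can be sketched for a test of independence on an r x c table. The sketch assumes the usual expected count under independence, row total times column total divided by the grand total, which the steps do not spell out. The function name and the table are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r-1)(c-1)
    return chi2, df, expected

# Hypothetical 2 x 2 table, for illustration only.
observed = [[10, 20], [30, 40]]
chi2, df, expected = chi_square_independence(observed)
print(chi2, df, expected)
```

The calculated value is then compared with the tabulated χ² value for the returned degrees of freedom (steps 4 to 6).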
Problem 1: Chi-square test for goodness of fit
• A die is thrown 132 times with the following results:
Number turned up:    1    2    3    4    5    6
Frequency:          16   20   25   14   29   28
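A sketch of the goodness-of-fit calculation for these frequencies. It assumes the null hypothesis is that the die is fair (equal expected frequency for every face) and uses α = 0.05, neither of which is stated in the problem as given.

```python
observed = [16, 20, 25, 14, 29, 28]  # frequencies for faces 1..6
total = sum(observed)                # 132 throws
expected = total / len(observed)     # fair die: same count for each face

chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

# Tabulated critical value of chi-square for df = 5.
# Assumption: alpha = 0.05, which the problem does not state.
critical = 11.070

print(chi2, df)
print("reject Ho" if chi2 > critical else "do not reject Ho")
```

Under these assumptions the statistic stays below the critical value, so the fair-die hypothesis is not rejected.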
                        Infection   No Infection   Total
Administered the drug   144         312            456