Hypothesis Testing: Reject or Fail To Reject? That Is The Question!
[Figure: a sample is drawn from the population; statistics of the sample (such as X̄) are used to make inferences about the parameters of the population.]
Samples and Populations Contd.
Samples must match characteristics of the
population.
Similarity results in generalizability.
Type of sample impacts research quality
– Systematic Sample
– Random Sample
– Sample of Convenience
– Volunteerism
Pitfalls of Samples
Sample Bias
– Overrepresents some subgroups.
– Not representative of population.
Sampling Error
– Sample group is not an accurate picture of the population.
– Reduce by enlarging sample.
– Larger sample, less error.
Standard Error
– Measure of how much sampling error is likely to occur
when a sample is drawn from the population
– Standard deviation of the values of the sampling
distribution.
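A minimal sketch of the standard error of the mean, estimated as s / sqrt(n). The measurements are hypothetical and invented for illustration only; the point is that a larger sample with a similar spread gives a smaller standard error.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical measurements, not taken from the slides.
small_sample = [4.0, 6.0, 5.0, 7.0]
large_sample = small_sample * 4  # same values repeated: four times as many observations

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)
print(se_small, se_large)  # the larger sample has the smaller standard error
```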
Null and Research Hypotheses
Hypothesis
– Educated guess
– Reflects the research problem being
investigated
– Determines the techniques for testing the
research questions
– Should be grounded in theory
Purposes of the Null Hypothesis
A good hypothesis:
– is stated in declarative form and not as a question.
– posits an expected relationship between variables.
What Makes a Good Hypothesis?
A good hypothesis:
– reflects the theory or literature on which it is based.
– is brief and to the point.
– is testable, which means that it can carry out the intent of the question reflected by the hypothesis.
Six Steps of Hypothesis Testing
Type II error: designates the mistake made if H0 is not rejected when the null is actually false.
Step 4: Collection and Analysis
of Sample Data
The summary of the sample data will always lead to a single numerical value,
which is referred to as the calculated value (r, t, or F).
The computer calculates the probability of the above value in the form of p = ____.
Step 5: The Criterion for
Evaluating the Sample Evidence
Two Methods:
– Compare the calculated and critical values.
– Compare the data-based p-value against a preset point on the 0-1 scale
below which p must fall (the level of significance).
Step 6: Make a Decision!
[Figure: two Z distributions centered at 0. Left panel: the sample statistic must be significantly below 0 to reject H0. Right panel: small values don't contradict H0, so don't reject H0!]
t-Test: σ Unknown
Assumptions
– Population is normally distributed
– If not normal, only slightly skewed & a large
sample taken (Central limit theorem applies)
Parametric test procedure
t test statistic, with n-1 degrees of freedom:
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
Degrees of Freedom
• Number of observations in the sample minus the number of parameters that must be
estimated before the test statistic can be computed.
– For a single sample t-test, we must first estimate the
mean before we can estimate the standard deviation.
– Once the mean is estimated, only n-1 of the values are free to vary,
since we know that the nth value is equal to
$$x_n = n\bar{x} - \sum_{i=1}^{n-1} x_i$$
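As a small illustration of why one degree of freedom is lost, the sketch below recovers the last value of a sample from the mean and the other n-1 values. The sample is hypothetical.

```python
import statistics

values = [3.0, 7.0, 8.0, 10.0]  # hypothetical sample
n = len(values)
mean = statistics.mean(values)

# Given the mean and the first n-1 values, the nth value is forced:
# x_n = n * mean - (sum of the other n-1 values)
recovered_last = n * mean - sum(values[:-1])
print(recovered_last, values[-1])
```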
Example: One Tail t-Test
Does an average box of cereal
contain more than 368 grams
of cereal? A random sample of
36 boxes showed X̄ = 372.5
and S = 15. Test at the α = 0.01
level.
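The cereal example can be worked through as a sketch, assuming S = 15 is the sample standard deviation and the test is upper-tailed (H0: μ ≤ 368 against H1: μ > 368). The helper names `t_pdf` and `upper_tail_p` are illustrative; the p-value is obtained by numerically integrating the t density rather than by looking it up in a table.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n, x_bar, s, mu0, alpha = 36, 372.5, 15.0, 368.0, 0.01

t_stat = (x_bar - mu0) / (s / math.sqrt(n))  # (372.5 - 368) / (15 / 6)
df = n - 1
p_value = upper_tail_p(t_stat, df)
reject_h0 = p_value < alpha

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

With t = 1.8 on 35 degrees of freedom, the one-tailed p-value lies between 0.025 and 0.05, which is above 0.01, so H0 is not rejected at this level.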
Data
x: 5 4 4 3 5 4 5
y: 3 2 4 1 2 1
Test statistic: U = 15.5
Probability of a result at least this extreme if H0 is true: p = 0.03
Is p above critical level?
– Y: Do not reject H0
– N: Reject H0
This particular test:
The Mann-Whitney U test is a non-parametric test
which examines whether 2 columns of data could
have come from the same population (i.e. “should” be
the same).
It generates a test statistic called U (no idea why it’s
U). By hand we look U up in tables; PCs give you an
exact probability.
It requires 2 sets of data - these need not be paired,
nor need they be normally distributed, nor need there
be equal numbers in each set.
How to do it
1: Rank all data into ascending order, then re-code the data set, replacing raw data with ranks.
2: Harmonize ranks where the same value occurs more than once (tied values share the average of their ranks).
NB This test is unusual in one feature: here low values of the test statistic are
significant, which is not true of most other tests.
In this case:
Data (value, position in sorted order = rank)
x                 y
5  #13 = 12       3  #5 = 5.5
4  #10 = 8.5      2  #4 = 3.5
4  #9  = 8.5      4  #7 = 8.5
3  #6  = 5.5      1  #2 = 1.5
5  #12 = 12       2  #3 = 3.5
4  #8  = 8.5      1  #1 = 1.5
5  #11 = 12
rx = 67           ry = 24
Check: rx + ry = 91, and 13*13/2 + 13/2 = 91. CHECK.
Ux = 6*7 + 7*8/2 - 67 = 3
Uy = 6*7 + 6*7/2 - 24 = 39
Lowest U value is 3.
Critical value of U (7,6) = 4 at p = 0.01.
Calculated U is < tabulated U so reject H0.
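The hand calculation can be reproduced with a short sketch. It uses the x and y columns from the worked example; the helper name `average_ranks` is illustrative, and tied values are given the average of the positions they occupy.

```python
def average_ranks(values):
    """Map each distinct value to its rank in ascending order; ties share the average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 1-based positions i+1 .. j are tied; use their average
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

x = [5, 4, 4, 3, 5, 4, 5]
y = [3, 2, 4, 1, 2, 1]

rank_of = average_ranks(x + y)
rx = sum(rank_of[v] for v in x)
ry = sum(rank_of[v] for v in y)

nx, ny = len(x), len(y)
ux = nx * ny + nx * (nx + 1) / 2 - rx
uy = nx * ny + ny * (ny + 1) / 2 - ry
u = min(ux, uy)  # the smaller U is compared with the tabulated critical value

print(rx, ry, ux, uy, u)
```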
Do males differ from females?
Do results differ between these sites?
Wilcoxon Rank Sum Test
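[Figure: probability distributions of the sampled populations (including Distribution B) plotted against X, with each mean μ marked.]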
Wilcoxon
1b. Small samples, independent
groups
Wilcoxon Rank Sum Test
– first, combine the two samples and rank order
all the observations.
– smallest number has rank 1, largest number
has rank N (= sum of n1 and n2).
– separate samples and add up the ranks for the
smaller sample. (If n1 = n2, choose either one.)
– test statistic : rank sum T for smaller sample.
Wilcoxon – One-tailed Hypotheses
H0: Prob. distributions for 2 sampled
populations are identical.
HA: Prob. distribution for Population A
shifted to right of distribution for
Population B. (Note: could be to the left,
but must be one or the other, not both.)
Wilcoxon – Two-tailed Hypotheses
H0: Prob. distributions for 2 sampled
populations are identical.
HA: Prob. distribution for Population A
shifted to right or left of distribution for
Population B.
Wilcoxon – Rejection region:
Wilcoxon for n1 ≥ 10 and n2 ≥ 10:
Test statistic:
$$Z = \frac{T_A - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
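A minimal sketch of the large-sample Z statistic for the rank sum. The function name `rank_sum_z` and the inputs (T_A = 130 with n1 = n2 = 10) are hypothetical, chosen only to show the arithmetic.

```python
import math
from statistics import NormalDist

def rank_sum_z(t_a, n1, n2):
    """Normal approximation for the rank sum T_A of the sample of size n1."""
    mean_t = n1 * (n1 + n2 + 1) / 2
    sd_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (t_a - mean_t) / sd_t

# Hypothetical rank sum and sample sizes, for illustration only.
z = rank_sum_z(t_a=130, n1=10, n2=10)
p_upper = 1 - NormalDist().cdf(z)  # one-tailed (upper) p-value
p_two_tailed = 2 * min(p_upper, 1 - p_upper)

print(f"z = {z:.3f}, one-tailed p = {p_upper:.4f}, two-tailed p = {p_two_tailed:.4f}")
```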
Wilcoxon for n1≥ 10 and n2 ≥ 10
Rejection region:
One-tailed Two-tailed
Example 1
Test of hypothesis of equal
variances
H0: σ₁² = σ₂²
HA: σ₁² ≠ σ₂²
Statistical test: T
Example 1 – Wilcoxon Rank
Sum Test
Rejection region:
Reject H0 if T_Cajun > 66 (or if T_Creole < 39)
     Cajun             Creole
Value    Rank     Value    Rank
3500      6.5     3100      4.5
4200     11.5     4700     13.5
4100      9.5     2700      3
4700     13.5     3500      6.5
4200     11.5     2000      2
3705      8       3100      4.5
4100      9.5     1550      1
Σ        70       Σ        35
Calculation check: T_Cajun + T_Creole = 70 + 35 = 105 = (14)(15)/2
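The rank sums for Example 1 can be reproduced with the sketch below, using the Cajun and Creole values from the example. The helper name `rank_sums` is illustrative; the limit 66 is the upper rejection limit quoted for this example.

```python
def rank_sums(sample_a, sample_b):
    """Rank the combined samples (ties get average ranks) and return (T_a, T_b)."""
    combined = sorted(sample_a + sample_b)

    def avg_rank(v):
        first = combined.index(v) + 1         # 1-based position of the first tied value
        last = first + combined.count(v) - 1  # 1-based position of the last tied value
        return (first + last) / 2

    return sum(map(avg_rank, sample_a)), sum(map(avg_rank, sample_b))

cajun = [3500, 4200, 4100, 4700, 4200, 3705, 4100]
creole = [3100, 4700, 2700, 3500, 2000, 3100, 1550]

t_cajun, t_creole = rank_sums(cajun, creole)

# Calculation check: the two rank sums must add up to N(N + 1)/2.
n_total = len(cajun) + len(creole)
assert t_cajun + t_creole == n_total * (n_total + 1) / 2

reject_h0 = t_cajun > 66  # upper limit from the rejection region for Example 1
print(t_cajun, t_creole, reject_h0)
```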
Example 2 – Wilcoxon Rank
Sum Test
H0: σ₁² = σ₂²
HA: σ₁² ≠ σ₂²
Reject H0 – do Wilcoxon
H0: Prob. distributions for female and male
populations are identical.
HA: Prob. distribution for females is shifted to
left of distribution for males.
Statistical test: T
Rejection region: T♂ > TU = 90 (or T♀ < TL = 54)
     Males            Females
Value    Rank     Value    Rank
6.4      16       2.7       3
1.7       1       3.9      10
3.2       5       4.6      12
5.9      15       3.0       4
2.0       2       3.4       6.5
3.6       8       4.1      11
5.4      14       3.4       6.5
7.2      17       4.7      13
                  3.8       9
Σ        78       Σ        75
T♂ = 78 < TU = 90, so do not reject H0.
Example 3 – Wilcoxon Rank
Sum Test
H0: σ₁² = σ₂²
HA: σ₁² ≠ σ₂²
= 13.74
Reject H0 – do Wilcoxon
H0: Prob. distributions for Hoodoo and
Mukluk populations are identical.
HA: Prob. distribution for Hoodoos is shifted
to right or left of distribution for Mukluks.
Statistical test: T
Rejection region: TH > 52 or TH < 26
     Hoodoo            Mukluk
Value    Rank     Value    Rank
 2        1        6        5
 6        5        8        9.5
 4        2.5      7        7.5
23       12       10       11
 7        7.5      8        9.5
 6        5        4        2.5
Σ        33       Σ        45
Check: TH + TM = 33 + 45 = 78, and (12)(13)/2 = 78
Parametric vs Non-parametric
Chi-Square
– 1 way
– 2 way
Parametric Tests
fo = observed frequency
fe = expected frequency
Chi Square Statistic
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
One-way Chi Square
Interpretation
If our calculated value of chi square is less than
the table value, retain (fail to reject) H0
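A minimal sketch of a one-way chi-square calculation and the decision rule. The observed and expected frequencies are hypothetical; the critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 level, used here as an assumed significance level.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Hypothetical frequencies for two categories (40 cases, equal split expected).
observed = [30, 10]
expected = [20, 20]

chi2 = chi_square(observed, expected)
df = len(observed) - 1

critical_value = 3.841  # chi-square table value for df = 1, alpha = 0.05 (assumed level)
retain_h0 = chi2 < critical_value

print(f"chi-square = {chi2:.2f}, df = {df}, retain H0: {retain_h0}")
```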
Total 20 20 40
Two-Way Chi-Square Example