Hypothesis Testing
Introduction
• Inferential statistics enables us to measure behaviour in samples in order to learn about behaviour in populations that are often too large or inaccessible to study directly.
• Hypothesis testing is a procedure in statistics whereby an analyst tests an assumption
regarding a population parameter. The methodology employed by the analyst
depends on the nature of the data used and the reason for the analysis.
• Hypothesis testing is used to assess the plausibility of a hypothesis by using
sample data which come from a larger population, or from a data-generating
process.
• The test provides evidence concerning the plausibility of the hypothesis, given
the data.
• Statistical analysts test a hypothesis by measuring and examining a random
sample of the population being analysed.
Introduction
• The method in which we select samples to learn more about characteristics in a
given population is called hypothesis testing.
• Hypothesis testing is really a systematic way to test claims or ideas about a group
or population.
Hypothesis testing or significance testing is a method for testing a
claim or hypothesis about a parameter in a population, using data
measured in a sample. In this method, we test some hypothesis by
determining the likelihood that a sample statistic could have been
selected, if the hypothesis regarding the population parameter were
true.
Basic Terms
• Population: all possible values
• Sample: a portion of the population
• Statistical inference: generalizing from a sample to a population with a calculated degree of certainty
• Two forms of statistical inference:
  • Hypothesis testing
  • Estimation
• Parameter: a characteristic of the population, e.g., population mean µ
• Statistic: a value calculated from data in the sample, e.g., sample mean x̄
Distinctions Between Parameters and Statistics
             Parameters   Statistics
Vary         No           Yes
Calculated   No           Yes
Sampling Distributions of a Mean
x̄ ~ N(µ, SE(x̄)), where SE(x̄) = σ/√n
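As a quick check on this result, the simulation below draws repeated samples and confirms that the sample means cluster around µ with spread close to σ/√n. The values µ = 170, σ = 40, n = 64 are illustrative (they match the worked examples later in these notes):

```python
import random
import statistics

# Illustrative values: mu = 170, sigma = 40, n = 64 (so SE = 40/8 = 5).
mu, sigma, n = 170, 40, 64
se = sigma / n ** 0.5            # standard error of the mean: sigma / sqrt(n)

random.seed(1)
# Draw many samples of size n and record each sample mean.
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(5000)]

print(round(se, 2))                       # 5.0
print(round(statistics.fmean(means)))     # close to mu = 170
print(round(statistics.stdev(means), 1))  # close to SE = 5
```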
Hypothesis Testing
• A statistical hypothesis is an assertion or conjecture concerning one or more populations.
• To prove that a hypothesis is true, or false, with absolute certainty, we would need absolute knowledge. That is, we would have to examine the entire population.
• Instead, hypothesis testing concerns how to use a random sample to judge whether it provides evidence that supports the hypothesis or not.
Hypothesis testing is formulated in terms of two hypotheses:
• H0: the null hypothesis;
• H1: the alternate hypothesis.
Hypothesis Testing
• Tests a claim about a parameter using evidence (data in a sample).
• The goal of hypothesis testing is to assess how likely it is that a claim about a
population parameter, such as the mean, is true
• The technique is introduced by considering a one-
sample z test
• The procedure is broken into four steps
• Each element of the procedure must be understood
Hypothesis Testing Steps
1. State the hypotheses. The first step is for the analyst to state the two hypotheses so that only one can be right.
2. Set the criteria for a decision (significance level). The next step is to formulate an analysis plan, which outlines how the data will be evaluated.
3. Compute the test statistic and corresponding P-value. The third step is to carry out the plan and physically analyze the sample data.
4. Make a conclusion / decision. The fourth and final step is to analyze the results and either reject the null hypothesis, or state that the null hypothesis is plausible, given the data.
State the hypotheses
• Convert the research question to null and alternative hypotheses
• The null hypothesis (H0) is a statement of no effect, relationship, or difference between two or
more groups or factors. In research studies, a researcher is usually interested in disproving the
null hypothesis
• The null hypothesis (H0) is a claim of “no difference in the population”
• The alternative hypothesis (Ha) claims “H0 is false”
• The alternative hypothesis (H1) is the statement that there is an effect or difference. This is
usually the hypothesis the researcher is interested in proving.
• The alternative hypothesis can be one-sided (only provides one direction, e.g., lower) or two-
sided.
• Collect data and seek evidence against H0 as a way of bolstering Ha (deduction)
• Rather than trying to prove that the study hypothesis is true, we proceed in statistical
hypothesis testing by attempting to disprove the null hypothesis, H0, which is the converse of the
study hypothesis or alternative.
State the hypotheses
Two-Sided Tests
Usually, the alternative hypothesis states that a difference exists between the parameter values
but the direction of that difference is not known. It leads to a two-sided or a two-tailed test.
• Suppose a pharmaceutical company manufactures ibuprofen pills. They need to perform
some quality assurance to ensure they have the correct dosage, which is supposed to be 500
milligrams. This is a two-sided test because if the company's pills are deviating significantly in
either direction, meaning there are more than 500 milligrams or less than 500 milligrams, this
will indicate a problem.
One-Sided Tests
Very occasionally, however, we have sound prior knowledge that any difference between the treatments,
if it exists, can be in one direction only. This must not be based on hopes or expectations about a novel
treatment, but on an absolute certainty that the difference can only be in that direction, if the difference
is not zero. This gives rise to a one-sided or a one-tailed test in which the direction of the difference is
specified in the alternative hypothesis.
State the hypotheses
One-Sided Tests
― we'll look at the proportion of students who suffer from test anxiety. We want to
test the claim that fewer than half of students suffer from test anxiety.
― we will be testing the claim that women in a certain town are taller than the
average state height, which is 63.8 inches
Step 2: Set the Significance Level (α)
• Having specified the null and alternative hypotheses, we then collect
our sample data and set the significance level (denoted by the Greek
letter alpha, α), which is generally set at 0.05. This means that there is a
5% chance of rejecting the null hypothesis when it is actually true.
• The smaller the significance level, the greater the burden of proof
needed to reject the null hypothesis, or in other words, to support the
alternative hypothesis.
Step 3: The test statistic and the P-value
• From the data we calculate the value of a test statistic (an algebraic
expression particular to the hypothesis we are testing)
• Failure to reject H0 does not mean the null hypothesis is true. There is no formal
outcome that says “accept H0.” It only means that we do not have sufficient
evidence to support H1.
Step 4: Making a decision using the P-value
The P-value allows us to determine whether we have enough evidence to
reject the null hypothesis in favour of the alternative hypothesis.
• If the P-value is very small, then it is unlikely that we could have
obtained the observed results if the null hypothesis were true, so we
reject H0
• If the P-value is very large, then there is a high chance that we could
have obtained the observed results if the null hypothesis were true,
and we do not reject H0
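This decision rule can be sketched as a small helper function (a hypothetical illustration, not part of the notes):

```python
def decide(p_value, alpha=0.05):
    """Reject H0 if the P-value falls below the significance level alpha.

    Note: failing to reject H0 is not the same as accepting H0."""
    return "reject H0" if p_value < alpha else "do not reject H0"

print(decide(0.0052))   # reject H0
print(decide(0.2743))   # do not reject H0
```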
Case study
[Figure: sampling distribution under H0, showing the critical value and the critical region; do not reject H0 when the test statistic falls outside the critical region.]
Case study (cont.)
Because we are making a decision based on a finite sample, there is a possibility of reaching the wrong conclusion. The possible outcomes are:

             Truth
Decision     H0 true             H0 false
Retain H0    Correct retention   Type II error
Reject H0    Type I error        Correct rejection

α ≡ probability of a Type I error
β ≡ probability of a Type II error
Types of errors(cont.)
Definition
The acceptance of H1 when H0 is true is called a Type I error. The probability
of committing a type I error is called the level of significance and is denoted by
α.
The probability of making a Type I error is the probability of incorrectly
rejecting the null hypothesis. The null hypothesis will be rejected if the P-value obtained
from the test is less than the significance level, often denoted by α (alpha) and commonly taken as 0.05.
Thus the significance level is the maximum chance of making a Type I error. If the P-value is
equal to or greater than α, then we do not reject the null hypothesis and we are not making a
Type I error. Therefore, by choosing the significance level of the test to be α at the design
stage of the study, we are limiting the probability of a Type I error to be less than α.
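One way to see that the significance level caps the Type I error rate is to simulate data with H0 true and count how often the test falsely rejects. The sketch below assumes a two-sided one-sample z test with illustrative values µ0 = 100, σ = 15, n = 9 (chosen to match the z-test example later in these notes):

```python
import math
import random
import statistics

def two_sided_p(xbar, mu0, sigma, n):
    """Two-sided P-value for a one-sample z test (Normal CDF via math.erf)."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Illustrative values: mu0 = 100, sigma = 15, n = 9, alpha = 0.05.
mu0, sigma, n, alpha = 100, 15, 9, 0.05

random.seed(0)
trials = 20000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
    if two_sided_p(statistics.fmean(sample), mu0, sigma, n) < alpha:
        rejections += 1

# The false-rejection (Type I error) rate should be close to alpha = 0.05.
print(round(rejections / trials, 3))
```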
Types of errors(cont.)
Definition
Failure to reject H0 when H1 is true is called a Type II error. The probability of
committing a type II error is denoted by β.
Illustrative Example: z statistic
If we found a sample mean of 173, then
zstat = (x̄ − μ0) / SE(x̄) = (173 − 170) / 5 = 0.60
If we found a sample mean of 185, then
zstat = (x̄ − μ0) / SE(x̄) = (185 − 170) / 5 = 3.00
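These two calculations can be reproduced directly:

```python
def z_stat(xbar, mu0, se):
    # z = (sample mean - hypothesized mean) / standard error
    return (xbar - mu0) / se

print(z_stat(173, 170, 5))   # 0.6
print(z_stat(185, 170, 5))   # 3.0
```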
Reasoning Behind zstat
x̄ ~ N(170, 5): the sampling distribution of x̄ under H0: µ = 170 for n = 64
P-value
• The P-value answers the question: what is the probability of the observed test statistic, or one more extreme, when H0 is true?
• This corresponds to the area under the curve (AUC) in the tail of the Standard Normal distribution beyond the zstat.
• Convert z statistics to P-value :
For Ha: μ > μ0, P = Pr(Z > zstat) = right tail beyond zstat
For Ha: μ < μ0, P = Pr(Z < zstat) = left tail beyond zstat
For Ha: μ ≠ μ0, P = 2 × one-tailed P-value
• Use Table or software to find these probabilities
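Using software: a minimal Python sketch of the three conversions, using the error function from the standard library in place of a Normal table:

```python
import math

def norm_cdf(z):
    # Standard Normal CDF via the error function (no external packages).
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_value(z, tail):
    """Convert a z statistic to a P-value.

    tail: 'greater' for Ha: mu > mu0, 'less' for Ha: mu < mu0,
    'two-sided' for Ha: mu != mu0."""
    if tail == "greater":
        return 1 - norm_cdf(z)
    if tail == "less":
        return norm_cdf(z)
    return 2 * (1 - norm_cdf(abs(z)))

print(round(p_value(0.6, "greater"), 4))     # 0.2743
print(round(p_value(3.0, "two-sided"), 4))   # 0.0027
```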
[Figure: one-sided P-value for zstat of 0.6]
[Figure: one-sided P-value for zstat of 3.0]
Two-Sided P-Value
• One-sided Ha: AUC in the tail beyond zstat
• Two-sided Ha: consider potential deviations in both directions; double the one-sided P-value
Examples: If one-sided P = 0.0010, then two-sided P = 2 × 0.0010 = 0.0020.
If one-sided P = 0.2743, then two-sided P = 2 × 0.2743 = 0.5486.
Interpretation
• The P-value answers the question: what is the probability
of the observed test statistic … when H0 is true?
• Thus, smaller and smaller P-values provide stronger
and stronger evidence against H0
• Small P-value ⇒ strong evidence against H0
Interpretation
Conventions*
P > 0.10          non-significant evidence against H0
0.05 < P ≤ 0.10   marginally significant evidence against H0
0.01 < P ≤ 0.05   significant evidence against H0
P ≤ 0.01          highly significant evidence against H0
Examples
P =.27 non-significant evidence against H0
P =.01 highly significant evidence against H0
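These conventions can be encoded as a small lookup function (a hypothetical helper mirroring the table above):

```python
def significance_label(p):
    """Map a P-value to the conventional strength-of-evidence labels
    used in these notes."""
    if p > 0.10:
        return "non-significant evidence against H0"
    if p > 0.05:
        return "marginally significant evidence against H0"
    if p > 0.01:
        return "significant evidence against H0"
    return "highly significant evidence against H0"

print(significance_label(0.27))   # non-significant evidence against H0
print(significance_label(0.01))   # highly significant evidence against H0
```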
Illustrative example: one-sample z test of H0: μ = 100 with x̄ = 112.8, σ = 15, n = 9.
SE(x̄) = σ/√n = 15/√9 = 5
zstat = (x̄ − μ0) / SE(x̄) = (112.8 − 100) / 5 = 2.56
P-value: P = Pr(Z ≥ 2.56) = 0.0052
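The whole calculation can be reproduced in a few lines:

```python
import math

# One-sample z test: H0: mu = 100, sigma = 15, n = 9,
# observed sample mean 112.8 (values taken from the example above).
sigma, n, mu0, xbar = 15, 9, 100, 112.8
se = sigma / math.sqrt(n)                  # 15 / 3 = 5
z = (xbar - mu0) / se                      # (112.8 - 100) / 5 = 2.56
p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))   # right-tail P-value

print(se, round(z, 2), round(p, 4))   # 5.0 2.56 0.0052
```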
Power of the test (example: H0: μ0 = 170 vs μa = 190, σ = 40, n = 16, two-sided α = 0.05):
1 − β = Φ( −z(1−α/2) + |μ0 − μa|·√n / σ )
      = Φ( −1.96 + |170 − 190|·√16 / 40 )
      = Φ(0.04)
      = 0.5160
Reasoning Behind Power
• Competing sampling distributions
Top curve (next page) assumes H0 is true
Bottom curve assumes Ha is true
α is set to 0.05 (two-sided)
• We will reject H0 when a sample mean exceeds 189.6 (right tail, top
curve)
• The probability of getting a value greater than 189.6 on the bottom
curve is 0.5160, corresponding to the power of the test
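The power figure of 0.5160 can be reproduced from the formula 1 − β = Φ(−z(1−α/2) + |μ0 − μa|·√n / σ), using the example's values:

```python
import math

def norm_cdf(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Values from the example in these notes: H0: mu0 = 170 vs mu_a = 190,
# sigma = 40, n = 16, two-sided alpha = 0.05 (so the critical z is 1.96).
mu0, mua, sigma, n = 170, 190, 40, 16
z_crit = 1.96
power = norm_cdf(-z_crit + abs(mu0 - mua) * math.sqrt(n) / sigma)

print(f"{power:.4f}")   # 0.5160
```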
Sample Size Requirements
Sample size for one-sample z test:
n = σ² (z(1−β) + z(1−α/2))² / Δ²
where
1 – β ≡ desired power
α ≡ desired significance level (two-sided)
σ ≡ population standard deviation
Δ = μ0 – μa ≡ the difference worth detecting
Example: Sample Size Requirement
How large a sample is needed for a one-sample z test
with 90% power and α = 0.05 (two-tailed) when σ = 40?
Let H0: μ = 170 and Ha: μ = 190 (thus, Δ = μ0 − μa = 170 –
190 = −20)
n = σ² (z(1−β) + z(1−α/2))² / Δ²
  = 40² × (1.28 + 1.96)² / 20²
  = 41.99
Round up to 42 to ensure adequate power.
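The arithmetic can be checked directly:

```python
import math

# Sample size for a one-sample z test with 90% power, two-sided alpha = 0.05,
# sigma = 40 and a difference worth detecting of Delta = 20 (from the example).
sigma, delta = 40, 20
z_beta = 1.28          # z for 1 - beta = 0.90
z_alpha = 1.96         # z for 1 - alpha/2 = 0.975
n = sigma**2 * (z_beta + z_alpha)**2 / delta**2

print(round(n, 2), math.ceil(n))   # 41.99 42
```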
[Figure: illustration of the conditions for 90% power.]