Introduction to Statistical Hypothesis Testing in R
Example:
A company claims that its average sales for this quarter are exactly
1000 units. This is an example of a simple hypothesis, because it
specifies a single value for the parameter.
Suppose instead the company claims that its sales are in the range of
900 to 1000 units. This is a composite hypothesis, because it covers a
range of values.
If the sample statistic falls within the critical (rejection) region,
the null hypothesis is rejected in favor of the alternative hypothesis.
Example:
According to H1, the mean can be either greater than or less than 50.
Because H1 allows a deviation in both directions, this is an example
of a two-tailed test.
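As a minimal sketch of the two-tailed case, the snippet below runs a
one-sample t-test against a hypothesized mean of 50 using R's built-in
t.test(). The sample x is simulated and purely hypothetical; the
one-tailed call is included only for comparison.

```r
# Hypothetical sample: 30 simulated observations (not real data).
set.seed(1)
x <- rnorm(30, mean = 52, sd = 5)

# H0: mu = 50  versus  H1: mu != 50 (two-tailed test)
two_tailed <- t.test(x, mu = 50, alternative = "two.sided")

# For comparison, a one-tailed test with H1: mu > 50
upper_tailed <- t.test(x, mu = 50, alternative = "greater")

# The two-tailed p-value counts extreme results in both directions
two_tailed$p.value
upper_tailed$p.value
```

For the t-test, the two-tailed p-value is twice the smaller of the two
one-tailed p-values, which is why a two-tailed test needs stronger
evidence to reach the same significance level.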
Level of Significance
The level of significance, alpha (α), is the threshold used to decide
whether a test statistic is statistically significant. Alpha is the
probability of a Type I error (rejecting the null hypothesis when it
is in fact true) that the researcher is willing to accept. Because
alpha is a probability, it lies between 0 and 1.
In practice, the most commonly used alpha values are 0.01, 0.05, and
0.1, which correspond to a 1%, 5%, and 10% chance of a Type I error,
respectively.
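The link between alpha and the Type I error rate can be checked with a
small simulation. This is only a sketch under assumed settings: the
sample size (25), the number of replicates (2000), and the normal
population are arbitrary choices made for illustration.

```r
set.seed(42)
alpha <- 0.05   # chosen significance level
n_sim <- 2000   # number of simulated experiments (arbitrary)

# Every replicate draws a sample for which H0 (mu = 0) is actually true
p_values <- replicate(
  n_sim,
  t.test(rnorm(25, mean = 0, sd = 1), mu = 0)$p.value
)

# Proportion of replicates in which the true H0 is (wrongly) rejected
type1_rate <- mean(p_values < alpha)
type1_rate
```

Because the null hypothesis holds in every replicate, each rejection
is a Type I error, so type1_rate should land close to alpha.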
P-Value
A p-value is the probability, computed under the assumption that the
null hypothesis is true, of observing a difference at least as extreme
as the one actually observed. As the p-value decreases, the
statistical significance of the observed difference increases. If the
p-value is below the chosen level of significance (alpha), you reject
the null hypothesis.
Consider an example in which you are trying to test whether a new
advertising campaign has increased a product's sales.
The null hypothesis states that there is no change in sales due to the
new advertising campaign. The p-value is not the probability that this
null hypothesis is true. It is the probability of seeing a change in
sales at least as large as the observed one if the campaign really had
no effect. If the p-value is 0.30, a change that large would occur
about 30% of the time by chance alone, so the data give little
evidence of an effect. If the p-value is 0.03, a change that large
would occur only about 3% of the time by chance alone.
As you can see, the lower the p-value, the stronger the evidence
against the null hypothesis and in favor of the alternate hypothesis
that the new advertising campaign changes sales. A low p-value alone
does not prove that the campaign caused the change.
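The advertising example can be sketched in R as follows. The weekly
sales figures are invented for illustration, and the choice of a Welch
two-sample t-test (the default of t.test()) with a one-sided
alternative is an assumption, not a method prescribed by the text.

```r
# Hypothetical weekly unit sales; all numbers are invented.
before <- c(98, 102, 95, 101, 99, 97, 103, 100)
after  <- c(104, 108, 99, 107, 101, 103, 110, 105)

# H0: the campaign does not change mean sales
# H1: mean sales after the campaign are greater than before
result <- t.test(after, before, alternative = "greater")

# Compare the p-value with the chosen level of significance
p_value <- result$p.value
alpha <- 0.05
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

p_value
decision
```

The p-value here answers one question only: how often a difference
this large would appear if the campaign had no effect on sales.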
Hypothesis Testing in R
Statisticians use hypothesis testing to formally decide whether the
sample data provide enough evidence to reject the null hypothesis.
Hypothesis testing is conducted in the following manner:
1. State the Hypotheses – Stating the null and alternative
hypotheses.
2. Formulate an Analysis Plan – Deciding how the sample data
will be used to evaluate the null hypothesis, including the
test to apply and the level of significance.
3. Analyze Sample Data – Calculation and interpretation of
the test statistic, as described in the analysis plan.
4. Interpret Results – Application of the decision rule
described in the analysis plan.
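The four steps above can be sketched in R as follows, reusing the
earlier claim that average sales are 1000 units. The sales vector is
hypothetical, and the one-sample t-test and alpha = 0.05 are assumed
choices for the analysis plan.

```r
# Step 1: State the hypotheses.
# H0: mu = 1000  versus  H1: mu != 1000
mu0 <- 1000

# Step 2: Formulate an analysis plan.
# Assumed plan: two-tailed one-sample t-test at the 5% level.
alpha <- 0.05

# Step 3: Analyze sample data (hypothetical sales figures).
sales <- c(985, 1012, 990, 1005, 978, 1020, 995, 1001, 988, 1010)
test <- t.test(sales, mu = mu0, alternative = "two.sided")
test$statistic   # t statistic
test$p.value     # p-value

# Step 4: Interpret results using the decision rule from the plan.
reject_h0 <- test$p.value < alpha
reject_h0
```

Fixing alpha in step 2, before looking at the data, keeps the decision
rule in step 4 from being tuned to the result.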
Decision Errors in R
Two types of error can occur in hypothesis testing:
Type I Error – A Type I error occurs when the researcher
rejects a null hypothesis that is true. The probability of
committing a Type I error is called the significance level
and is represented by the symbol α (alpha).
Type II Error – Accepting a false null hypothesis H0 is