Introduction To Inferential Statistics
Inferential Statistics
The primary goal of statistical inference is to draw conclusions about a population
parameter from a sample of that population. That parameter may be a mean, a median, a standard
deviation, a correlation coefficient, or any of several other statistical measures. The purpose of
inferential statistics is to draw a conclusion (an inference) about conditions that exist in a population
(the complete set of observations) by studying a sample (a subset) drawn from that population (King
and Minium, 2018).
Psychologists use inferential statistics to draw conclusions and to make inferences that are based on
the numbers from a research study but that go beyond the numbers. For example, inferential statistics
allow researchers to make inferences about a large group of individuals based on a research study in
which a much smaller number of individuals took part.
1. Population - A population is the complete set of observations about which an investigator wishes to
draw conclusions.
2. Sample - A sample is a part of the population. In inferential statistics, we try to make inferences about the
population from the sample.
The use of NHST and its role in decision-making in healthcare research is controversial, not
least because it is typically misunderstood and misused. The idea that an intervention is effective, or
that exposure to a risk factor is important, only if the value of p is less than 0.05 is a reductionist view
that does not always reflect clinical importance. There have been frequent calls for statistics reform
regarding the use of NHST in decision-making, including abandoning the concept of statistical
significance altogether. Nonetheless, the binary approach to decision-making is a convenient one, and its use has
remained ubiquitous. Regardless of the future of statistical significance, there are calls for greater
focus on the magnitude and precision of the observed effects and their clinical importance. This
ultimately seems sensible and accords with the original intentions of Fisher. However, doing so will
bring challenges, not least because of the subjectivity involved in interpreting study results.
Journals have been cautious in their approach to calls for statistics reform, and it appears that
statistical significance will continue to play a role in decision-making in obstetrics and
gynecology. This may be acceptable if we continue to educate ourselves in the role of statistics,
including controlling the probabilities of Type I and Type II errors and the importance of sample size. In
particular, statisticians, researchers, and clinicians all need to recognize that a statistical answer based
on NHST to the question posed is not necessarily an answer to the scientific question asked. Statistical
inference does not automatically reflect clinical inference (Sedgwick and Hammer, 2022).
The mean of the random sampling distribution of the mean equals the mean of the population of scores:
μx̄ = μX
The standard deviation of the random sampling distribution of the mean, commonly referred
to as the standard error of the mean, is determined by the sample size and the population
standard deviation, σX:
σx̄ = σX / √n
This distribution includes all possible means of samples of a given size, and hence constitutes
the complete population of such means rather than just some of them (King & Minium, 2018).
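These two properties of the sampling distribution can be checked empirically. The sketch below draws many random samples from a normal population and confirms that the mean of the sample means approximates μ and that their standard deviation approximates σ/√n; the population parameters, sample size, and number of samples are illustrative choices, not taken from the text.

```python
# Empirical check of the sampling distribution of the mean:
# mean of sample means ~ mu, and their SD (the standard error) ~ sigma / sqrt(n).
import math
import random
import statistics

random.seed(42)

MU, SIGMA, N = 100.0, 15.0, 25   # illustrative population mean, SD, sample size
NUM_SAMPLES = 20_000             # number of random samples drawn

sample_means = [
    statistics.mean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

empirical_mean = statistics.mean(sample_means)   # estimates mu_xbar
empirical_se = statistics.stdev(sample_means)    # estimates sigma_xbar
theoretical_se = SIGMA / math.sqrt(N)            # sigma / sqrt(n)

print(f"mean of sample means: {empirical_mean:.2f} (population mean {MU})")
print(f"empirical SE: {empirical_se:.3f}, theoretical SE: {theoretical_se:.3f}")
```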
If the population of scores is normally distributed, then the sampling distribution of the mean
will also be normally distributed, irrespective of sample size. Rather than a single random
sampling distribution of X̄ corresponding to a given population, there is a family of such
distributions, one for each possible sample size.
According to the central limit theorem, "the approximation to the normal distribution
improves as sample size increases; the random sampling distribution of the mean tends
toward a normal distribution irrespective of the shape of the population of observations
sampled" (King & Minium, 2018).
Figure 1: A Normally Distributed Population of Scores and Random Sampling Distributions
of Means for n = 3 and n = 9.
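The central limit theorem can be illustrated by sampling from a population that is not normal at all. The sketch below draws sample means from a strongly right-skewed (exponential) population and shows that their skewness shrinks toward zero, the value for a normal distribution, as n grows; the distribution and sample sizes are illustrative assumptions.

```python
# CLT sketch: means of samples from a skewed (exponential) population
# become less skewed, i.e. more nearly normal, as sample size n increases.
import random
import statistics

random.seed(0)

def skewness(xs):
    """Sample skewness: approximately 0 for a symmetric distribution."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def mean_skew(n, reps=20_000):
    """Skewness of the distribution of sample means for samples of size n."""
    means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
             for _ in range(reps)]
    return skewness(means)

for n in (3, 30):
    print(f"n = {n:2d}: skewness of sample means = {mean_skew(n):+.3f}")
```

For an exponential population the theoretical skewness of sample means is 2/√n, so the printed values fall as n rises, in line with the theorem.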
Sampling With and Without Replacement
There are two sampling techniques for generating a random sample: sampling with
replacement, in which an element may be sampled more than once, and sampling without
replacement, in which an element may be sampled at most once. When sampling with
replacement, every score is selected, recorded, and then returned to the population before the
next score is chosen. If there are 50 tickets in a lottery and we choose one, put it away, draw
another, and so on until five tickets are selected, we are sampling without replacement. Within
this sampling plan, no element may occur more than once in a sample. If, however, we
select a ticket, note its number, and return it before drawing the next, we are sampling with
replacement; under this design, a sample in which an element appears more than once is
possible. Both approaches satisfy the random sampling criterion, but sampling without
replacement precludes a number of sample outcomes that are possible with replacement
sampling, and it results in a slightly lower standard error of the mean. That said, the
difference is negligible if the sample is small relative to
the population (King & Minium, 2018).
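The two schemes correspond directly to two standard-library functions, sketched below with the 50-ticket lottery from the text: `random.choices` samples with replacement (duplicates possible), while `random.sample` samples without replacement (duplicates impossible).

```python
# The two random sampling schemes, using Python's standard library.
import random

random.seed(7)

tickets = list(range(1, 51))               # 50 lottery tickets numbered 1..50

with_repl = random.choices(tickets, k=5)   # with replacement: repeats allowed
without_repl = random.sample(tickets, k=5) # without replacement: no repeats

print("with replacement:   ", with_repl)
print("without replacement:", without_repl)

# Without replacement, duplicates are impossible by construction.
assert len(set(without_repl)) == 5
```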
Testing Hypotheses About Means
Degree of Freedom
The notion of degrees of freedom in statistical analysis refers to the number of values in
a calculation that are free to vary independently. In computing the sample standard deviation
SX for a dataset comprising three scores X1, X2, X3, only two deviation scores can vary freely, because
the deviations from the mean are constrained to sum to zero: Σ(X − X̄) = 0.
Consequently, if the values of any two deviation scores are known, the third is fixed, resulting
in n − 1 degrees of freedom for a sample of size n. This principle is pivotal for accurately estimating
the population variance from sample information. Degrees of freedom are also integral to hypothesis
testing, particularly in determining the critical values of t from statistical tables. For instance, with 20
degrees of freedom and a significance level α of .05, the critical t values delineating the central
95% of the distribution are ±2.086, showcasing the symmetry of the t distribution
around zero (King, Rosopa & Minium, 2018).
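The constraint behind n − 1 degrees of freedom can be made concrete in a few lines: the deviations from the mean always sum to zero, so knowing any two of three deviations fixes the third, and the sample variance accordingly divides by n − 1. The three scores below are arbitrary illustrative values.

```python
# The degrees-of-freedom constraint for a sample of three scores.
import statistics

scores = [4.0, 7.0, 10.0]          # arbitrary example data
mean = statistics.mean(scores)
deviations = [x - mean for x in scores]

# The constraint: deviations from the mean sum to (essentially) zero.
assert abs(sum(deviations)) < 1e-9

# Knowing any two deviations determines the third.
d1, d2, d3 = deviations
assert d3 == -(d1 + d2)

# Hence the sample variance divides by n - 1, not n.
n = len(scores)
sample_var = sum(d ** 2 for d in deviations) / (n - 1)
print(f"sample variance (df = n - 1 = {n - 1}): {sample_var:.2f}")
```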
In statistical hypothesis testing, the significance level (α) represents the probability of
incorrectly rejecting the null hypothesis (H0) when it is actually true. Researchers should determine α
before conducting tests to control the risk of this error. However, many fail to specify α, although it's
commonly assumed to be no higher than 0.05, a widely accepted criterion in behavioral science
journals. P-values indicate the probability of observing sample data as extreme as or more extreme
than what was obtained, given that H0 is true. Researchers often report p-values instead of α, with
lower values suggesting stronger evidence against H0. For instance, Dr. Brown's study found a p-
value of 0.0062, indicating a rare outcome if H0 were true. If the p-value is less than or equal to α, H0
is rejected. Researchers commonly compare p-values to predefined significance levels such as 0.05,
0.01, or 0.001. They report significance based on whether the p-value falls below or above these
thresholds. For example, if a p-value is "significant at the 0.01 level," it means the result is unlikely if
H0 were true. However, the reported significance levels do not necessarily reflect the exact thresholds
used for every result; often the same significance level, usually 0.05, was actually applied across
all of the tests.
In conclusion, while significance levels and p-values are essential in hypothesis testing,
researchers often interchangeably use them in reporting results, which can lead to confusion. The
reported significance levels may not precisely reflect the thresholds used in the analysis (King &
Minium, 2018).
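The decision rule described above, reject H0 when p ≤ α, can be sketched as follows. The z statistic is a hypothetical illustrative value (not Dr. Brown's actual data), and a two-tailed p-value is computed from the standard normal distribution.

```python
# Comparing a two-tailed p-value to a preselected significance level alpha.
from statistics import NormalDist

ALPHA = 0.05          # chosen before conducting the test
z = 2.74              # hypothetical observed test statistic

# p-value: probability, under H0, of a result at least this extreme (two-tailed).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

decision = "reject H0" if p_value <= ALPHA else "retain H0"
print(f"p = {p_value:.4f} -> {decision}")
```

Here the small p-value counts as strong evidence against H0 at the conventional 0.05 level; with a larger α the same statistic could lead to the opposite decision, which is why α should be fixed in advance.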
Interpreting Results of Hypothesis Testing
In essence, while statistical significance is important, researchers must also evaluate the
practical significance of their findings, considering the context of their research questions. The
importance of a difference depends on the specific implications of the research rather than just
statistical calculations (King & Minium, 2018).
If we fail to reject the null hypothesis (H0) when it is false, we commit a Type II error. For
instance, suppose we are testing H0: μ = 150 at the 5% significance level. If we obtain a sample mean of
152 but the true population mean is 154, our sample mean belongs to the true sampling distribution centered on
154. However, evaluated against the sampling distribution centered on 150, as specified by H0, it does not
appear sufficiently deviant to reject H0. We therefore retain H0, unaware that a difference exists.
This incorrect decision constitutes a Type II error: we fail to detect a genuine difference that
exists in reality. The probability of committing a Type II error is denoted by the Greek letter
β (beta).
The value of 1 − β is called the power of the test. Among several ways of conducting a test,
the most powerful is the one offering the greatest probability of rejecting H0 when it should be
rejected. Because β and the power of a test are complementary, any condition that decreases β
increases the power of the test, and vice versa. In the next several sections, we will examine the
factors affecting the power of a test (and β) (King & Minium, 2018).
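Both β and power can be estimated by simulation for the example above: H0: μ = 150 tested two-tailed at α = .05 when the true mean is 154. The population standard deviation and sample size below are illustrative assumptions, since the text does not specify them; a z test is used for simplicity.

```python
# Estimating beta (Type II error rate) and power = 1 - beta by simulation.
import math
import random
import statistics
from statistics import NormalDist

random.seed(1)

MU_H0, MU_TRUE = 150.0, 154.0    # hypothesized vs. true population mean
SIGMA, N = 14.0, 30              # assumed population SD and sample size
ALPHA = 0.05
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-tailed critical value

REPS = 20_000
se = SIGMA / math.sqrt(N)        # standard error of the mean
type2 = 0
for _ in range(REPS):
    xbar = statistics.mean(random.gauss(MU_TRUE, SIGMA) for _ in range(N))
    z = (xbar - MU_H0) / se
    if abs(z) < z_crit:          # fail to reject a false H0: a Type II error
        type2 += 1

beta = type2 / REPS
print(f"estimated beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Raising n (or α, or the true effect size) shrinks β and raises power, previewing the factors discussed in the sections that follow.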