Inferential Statistics Masters Updated 1
Hypothesis Testing
Meaning of Hypothesis
There are two ways of stating a hypothesis. A hypothesis intended for statistical
testing is generally stated in the null form. Being the starting point of the testing
process, it serves as our working hypothesis. A null hypothesis (Ho) expresses the idea
of non-significance of difference or non-significance of relationship between the
variables under study. It is stated for the purpose of being accepted or rejected.
If the null hypothesis is rejected, the alternative hypothesis (Ha) is accepted. This
is the researcher's way of stating the research hypothesis in an operational manner. The
research hypothesis is a statement of the expectation derived from the theory under
study. If the related literature points to findings that a certain teaching technique,
for example, is effective, we make the same prediction. This is our alternative
hypothesis. We cannot do otherwise, since there is no scientific basis for any other prediction.
Reasonable doubt is based on probability sampling distributions and can vary at the
researcher's discretion. Alpha .05 is a common benchmark for reasonable doubt. At
alpha .05 we know from the sampling distribution that a test statistic will only occur by
random chance five times out of 100 (5% probability). Since a test statistic that results in
an alpha of .05 could only occur by random chance 5% of the time, we assume that the
test statistic resulted because there are true differences between the population
parameters, not because we drew an extremely biased random sample.
When conducting statistical tests with computer software, the exact probability of a Type
I error is calculated. It is presented in several formats but is most commonly reported as
"p <" or "Sig." or "Signif." or "Significance." Using "p <" as an example, if a priori you
established a threshold for statistical significance at alpha .05, any test statistic with
significance at or less than .05 would be considered statistically significant and you
would be required to reject the null hypothesis of no difference. The following table links
p values with a benchmark alpha of .05:

p value          Decision
p ≤ .05          Statistically significant: reject Ho
p > .05          Not statistically significant: do not reject Ho
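The decision rule just described can be sketched in a few lines of Python (a minimal illustration; the p-values below are made up for demonstration):

```python
# Compare a reported p-value ("Sig.") against the a priori alpha threshold.
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Hypothetical p-values as statistical software might report them:
for p in (0.001, 0.049, 0.050, 0.051, 0.200):
    print(f"p = {p:.3f} -> {decide(p)}")
```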
General Assumptions
Note: The alternative hypothesis will indicate whether a 1-tailed or a 2-tailed test
is utilized to reject the null hypothesis.
Ha for a 1-tailed test: The __ of __ is greater (or less) than the __ of __.
Let us consider an experiment involving two groups, an experimental group and a control
group. The experimenter wants to test whether the treatment (values clarification lessons)
will improve the self-concept of the experimental group. The same treatment is not given
to the control group. It is presumed that any difference between the two groups after the
treatment can be attributed to the experimental treatment with a certain degree of
confidence.
Ho: There is no significant effect of the values clarification lessons on the self-
concept of the students.
Ha: Values clarification lessons have significant effect on the self-concept of
students.
Ho: The self-concept of the students is not significantly related to the values
clarification lessons they were exposed to.
Parametric Tests
The parametric tests taken up here are the z-test, the t-test, and the F-test (ANOVA).
The z-test is another test under parametric statistics which requires normality of
distribution. It uses two population parameters: the mean µ and the standard deviation σ.
It is used to compare two means, the sample mean, and the perceived population
mean.
It is also used to compare the two sample means taken from the same population.
It is used when the samples are equal to or greater than 30. The z-test can be
applied in two ways: the One-Sample Mean Test and the Two-Sample Mean Test.
The tabular values of the z-test at the .01 and .05 levels of significance are shown below.

Level of significance      .05        .01
One-tailed test          ±1.645     ±2.33
Two-tailed test          ±1.96      ±2.58
The z-test for one sample group is used to compare the perceived
population mean against the sample mean, 𝑋̅
The one-sample group test is used when the sample is being compared with
the perceived population mean. However, if the population standard
deviation is not known, the sample standard deviation can be used as a
substitute.
The z-test is appropriate for a one-sample group because it compares the
perceived population mean µ against the sample mean x̄. We are interested in
whether a significant difference exists between the population mean and the
sample mean. For instance, a certain tire company may claim that the life span
of its product is 25,000 kilometers. To check the claim, sample tires are
tested and the sample mean x̄ is obtained.
The formula is

z = (x̄ − µ) / (σ / √n)

Where:
x̄ = sample mean
µ = hypothesized value of the population mean
σ = population standard deviation
n = sample size
Example 1
Data from a school census show that the mean weight of college students is 45
kilos, with a standard deviation of 3 kilos. A sample of 100 college students was found
to have a mean weight of 47 kilos. Are the 100 college students really heavier than the
rest, at the .05 significance level?
Step 1. Ho: The 100 college students are not really heavier than the rest (µ = 45 kilos).
Step 2. Set .05 level of significance.
Step 3. The standard deviation given is based on the population, and n > 30; therefore
the z-test is to be used.
Step 4. The given values in the problem are:
x̄ = 47 kilos     σ = 3 kilos
µ = 45 kilos     n = 100

z = (x̄ − µ)/(σ/√n) = (47 − 45)/(3/√100) = 2/(3/10) = 2/.3 = 6.67
Step 5. The tabular value for a z – test at .05 level of significance is found in the following
table. Critical values of z for other levels of significance are found in the table of
normal curve areas.
Significance Level
Test Type          .10        .05        .025       .01
One-tailed test   ±1.28     ±1.645     ±1.96     ±2.33
Two-tailed test   ±1.645    ±1.96      ±2.24     ±2.58
Based on the table above, the tabular value of z for a one-tailed test at the .05 level
of significance is ±1.645.
Step 6. The computed value 6.67 is greater than the tabular value 1.645. Therefore, the
null hypothesis is rejected: the 100 college students are really heavier than the rest.
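Example 1 can be verified with a short computation in Python (standard library only, using the figures from the example):

```python
import math

# One-sample z-test: z = (xbar - mu) / (sigma / sqrt(n))
xbar, mu, sigma, n = 47.0, 45.0, 3.0, 100

z = (xbar - mu) / (sigma / math.sqrt(n))
print(round(z, 2))        # 6.67
print(z > 1.645)          # True -> reject Ho (one-tailed, alpha = .05)
```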
What is the z-test for a two-sample mean test?
The z-test for a two-sample mean test is another parametric test used to compare
the means of two independent groups of samples drawn from a normal population if
there are more than 30 samples for every group.
The z-test for two-sample mean is used when we compare the means of samples
of independent groups taken from a normal population.
The z-test is used to find out if there is a significant difference between two
populations by comparing only the sample means drawn from them.
The formula is

z = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
Example 2
The absolute computed value |−1.579| is less than the absolute tabular value 2.58
for a two-tailed test. The null hypothesis is not rejected.
z = (p1 − p2) / √(p1q1/n1 + p2q2/n2)
Example 3

z = .10 / .045 = 2.22
Since the computed z-value (2.22) falls in the rejection region (it is greater than
the tabular value 1.96 for a two-tailed test), the null hypothesis is rejected.
The t-test for a one-sample group compares the sample mean x̄ against the
hypothesized population mean µ. It is used when the sample is small (n < 30)
and the population standard deviation is unknown, so the sample standard
deviation s is used in its place.
The formula is

t = (x̄ − µ) / (s / √(n − 1))
Example 4
A researcher knows that the average height of Filipino women is 1.525 meters. A
random sample of 26 women was taken and found to have a mean height of 1.56
meters, with a standard deviation of .10 meters. Is there reason to believe that the 26
women in the sample are significantly taller than the others at the .05 significance level?
Ho: The sample is not significantly taller than the other Filipino women (µ =
1.525).
𝐱̅ = 1.56 meters
µ = 1.525 meters
n = 26
s = .10 meters
t = (x̄ − µ)/(s/√(n − 1)) = (1.56 − 1.525)/(.10/√25) = .035/.02 = 1.75
The absolute computed value (1.75) is greater than the tabular value (1.708 at df = n −
1 = 25) for a one-tailed test. The Ho is rejected: the sample is significantly taller
than the other Filipino women.
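Example 4 in Python, using the handout's form of the denominator with √(n − 1):

```python
import math

# One-sample t-test: t = (xbar - mu) / (s / sqrt(n - 1))
xbar, mu, s, n = 1.56, 1.525, 0.10, 26

t = (xbar - mu) / (s / math.sqrt(n - 1))
print(round(t, 2))    # 1.75
print(t > 1.708)      # True -> reject Ho (one-tailed, df = 25, alpha = .05)
```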
The t-test is a test of difference between two independent groups. The means are
being compared 𝑥̅1 against 𝑥̅ 2.
The t-test for independent samples is used when we compare means of two
independent groups.
The t-test is used for independent samples because it is a more powerful test
compared with other tests of difference between two independent groups.
t = (x̄1 − x̄2) / √{ [((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)] [1/n1 + 1/n2] }
Where:
t = the t test
𝑥̅1 = the mean of group 1 or sample 1
𝑥̅2 = the mean of group 2 or sample 2
𝑆1 = the standard deviation of group 1 or sample 1
𝑆2 = the standard deviation of group 2 or sample 2
𝑛1 = the number of observations in group 1
𝑛2 = the number of observations in group 2
Comparing two Sample Means or Independent Groups
Example 5.
A teacher wishes to test whether or not the Case Method of teaching is more
effective than the Traditional Method. She picks two classes of approximately equal
intelligence (verified through an administered IQ test). She gathers a sample of 18
students to whom she applies the Case Method. After the experiment, an objective test
revealed that the first sample got a mean score of 28.6 with a standard deviation of 5.9,
while the second group of 14 got a mean score of 21.7 with a standard deviation of 4.6.
Based on the result of the administered test, can we say that the Case Method is more
effective than the Traditional Method?
t = (x̄1 − x̄2) / √{ [((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)] [1/n1 + 1/n2] }

= (28.6 − 21.7) / √{ [((18 − 1)(5.9)² + (14 − 1)(4.6)²) / (18 + 14 − 2)] [1/18 + 1/14] }

= 6.9 / √{ [(17)(34.81) + (13)(21.16)] / 30 × (.06 + .07) }

= 6.9 / √{ [(591.77 + 275.08) / 30] × .13 }

= 6.9 / √(28.9 × .13) = 6.9 / √3.76 = 6.9 / 1.94

t = 3.56
The computed t-value of 3.56 is in the rejection region: it is greater than the tabular
value of 1.697 (df = n1 + n2 − 2 = 30) using the one-tailed test. The null hypothesis is
therefore rejected: the Case Method is more effective than the Traditional Method of
teaching.
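Example 5 can be recomputed at full precision; the handout's 3.56 comes from rounding intermediate quantities (1/18 + 1/14 to .06 + .07), while the unrounded statistic is about 3.60 — the decision is the same either way:

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t-test with pooled variance."""
    pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

t = pooled_t(28.6, 5.9, 18, 21.7, 4.6, 14)
print(round(t, 2))    # 3.6
print(t > 1.697)      # True -> reject Ho (one-tailed, df = 30)
```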
The t-test for correlated samples is another parametric test applied to one group
of samples. It can be used in the evaluation of a certain program or treatment.
Since this is another parametric test, conditions must be met like the normal
distribution and the use of interval or ratio data.
The test for correlated samples is applied when the mean before and the mean
after are being compared. The pretest (mean before) is measured, the treatment
of the intervention is applied and then the posttest (mean after) is likewise
measured. Then the two means (pretest vs. the posttest) are compared.
The t-test for correlated samples is used to find out if a difference exists between
the before and after means. If there is a difference in favor of the posttest then the
treatment or intervention is effective. However, if there is no significant difference
then the treatment is not effective.
This is the appropriate test for evaluation of government programs. This is used in
an experimental design to test the effectiveness of a certain technique or method
or program that had been developed.
The formula is

t = (x̄1 − x̄2) / √[ (nΣD² − (ΣD)²) / (n²(n − 1)) ]

where D is the difference between each pair of scores (pretest − posttest) and n is the
number of pairs.
T – test for Correlated Means
Dependent Samples
Example 6
Prior to pursuing a training program, enrollees should take an aptitude test. Ten students
were given the test before they underwent training under the Dual Training System in
Refrigeration and Air Conditioning. Upon completion of the training program, the
same test was re-administered. It is expected that the students will perform better after
the training. The following were the scores obtained by the students.
t = (x̄1 − x̄2) / √[ (nΣD² − (ΣD)²) / (n²(n − 1)) ]
At α = .05 (two-tailed) and df = 10 − 1 = 9, the tabular value of t is 2.262. Since the
absolute value of the computed t (|−3.376| = 3.376) exceeds the tabular value, we reject
the null hypothesis and conclude that the training significantly improved the scores of
the enrollees.
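The correlated-samples formula can be checked against the usual D̄/(s_D/√n) form of the paired t-test; since the actual scores for Example 6 are not reproduced here, the pre/post scores below are hypothetical:

```python
import math

def paired_t(pre, post):
    """t for correlated samples via the handout's D-based formula."""
    n = len(pre)
    d = [a - b for a, b in zip(pre, post)]          # D = pretest - posttest
    sum_d, sum_d2 = sum(d), sum(x * x for x in d)
    return (sum_d / n) / math.sqrt((n * sum_d2 - sum_d**2) / (n**2 * (n - 1)))

# Hypothetical aptitude scores before and after the training:
pre  = [12, 15, 11, 14, 10, 13, 12, 16, 11, 14]
post = [14, 18, 13, 15, 12, 16, 13, 19, 12, 17]
t = paired_t(pre, post)
print(round(t, 3))   # negative: posttest scores are higher
```

A negative t simply reflects that the posttest mean exceeds the pretest mean; its absolute value is what gets compared with the tabular t.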
The F-test is another parametric test used to compare the means of two or more
groups of independent samples. It is also known as the analysis of variance,
(ANOVA).
The F-test is the analysis of variance (ANOVA). This is used in comparing the
means of two or more independent groups. One-way ANOVA is used when there
is only one variable involved. The two-way ANOVA is used when two variables
are involved: the column and the row variables. The researcher is interested to
know if there are significant differences between and among columns and rows.
This is also used in looking at the interaction effect between the variables being
analyzed.
Like the t-test, the F-test is a parametric test, so certain conditions have to
be met: the data must be normally distributed and expressed as interval or
ratio data. This test is more efficient than other tests of difference.
The F-test is used to find out if there is a significant difference between and
among the means of the two or more independent groups.
TSS, the total sum of squares, is the sum of the squares of all the observations minus
the correction factor CF.
BSS, the between sum of squares, is computed from the squared group totals divided by
the group sizes, minus CF.
WSS, the within sum of squares, is the difference TSS − BSS.
After getting the TSS, BSS and WSS, the ANOVA table should be constructed.
ANOVA Table

Sources of Variation    df                   SS     MS        Computed F   Tabular F
Between groups          K − 1                BSS    BSS/df    MSB/MSW      see the table at .05 (or the
Within groups           (N − 1) − (K − 1)    WSS    WSS/df                 desired level of significance)
Total                   N − 1                TSS                           with the between and within df
The sources of variations are between the groups, within the group
itself and the total variations.
The degrees of freedom for the total is the total number of
observation minus 1.
The degrees of freedom from the between group is the total number
of groups minus 1.
The degrees of freedom for the within group is the total df minus the
between groups df.
If the F-computed value is greater than the F-tabular value, the null
hypothesis is rejected in favour of the research hypothesis, which means
that there is a significant difference between and among the means of the
different groups.
Brand
A B C D
7 9 2 4
3 8 3 5
5 8 4 7
6 7 5 8
9 6 6 3
4 9 4 4
3 10 2 5
Perform the analysis of variance and test the hypothesis at .05 level of significance that
the average sales of the four brands of shampoo are equal.
Solving by the Stepwise Method
II. Hypotheses:
Ho: There is no significant difference in the average sales of the four brands of
shampoo.
Ha: There is a significant difference in the average sales of the four brands of
shampoo.
TSS = Σx² − CF, where CF = (Σx)²/N = (156)²/28 = 869.14
TSS = 225 + 475 + 110 + 204 − 869.14
= 1014 − 869.14
TSS = 144.86

BSS = (Σx1)²/n1 + (Σx2)²/n2 + (Σx3)²/n3 + (Σx4)²/n4 − CF
= (37)²/7 + (57)²/7 + (26)²/7 + (36)²/7 − 869.14
= 941.42 − 869.14
BSS = 72.28

WSS = TSS − BSS
= 144.86 − 72.28
WSS = 72.58
IV. Conclusion: Since the F-computed value of 7.98 is greater than the F –tabular
value of 3.01 at .05 level of significance with 3 and 24 degrees of
freedom, the null hypothesis is rejected in favor of the research
hypothesis which means that there is a significant difference in the
average sales of the 4 brands of shampoo.
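The ANOVA computations for the shampoo data can be verified with a few lines of Python (the tiny differences from the handout's 72.28 and 72.58 are rounding effects):

```python
# Sales of the four brands of shampoo (from the data table above).
groups = {
    "A": [7, 3, 5, 6, 9, 4, 3],
    "B": [9, 8, 8, 7, 6, 9, 10],
    "C": [2, 3, 4, 5, 6, 4, 2],
    "D": [4, 5, 7, 8, 3, 4, 5],
}
data = [x for g in groups.values() for x in g]
N, k = len(data), len(groups)

cf  = sum(data) ** 2 / N                          # correction factor
tss = sum(x * x for x in data) - cf               # total sum of squares
bss = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf  # between SS
wss = tss - bss                                   # within SS

f = (bss / (k - 1)) / (wss / (N - k))
print(round(tss, 2), round(bss, 2), round(wss, 2), round(f, 2))
# 144.86 72.29 72.57 7.97
```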
To find out where the difference lies, another test must be used.
The F-test tells us that there is a significant difference in the average sales
of the 4 brands of shampoo, but to determine where the difference lies, it has to be
tested further with another test, the Scheffé test. The formula is
F′ = (x̄1 − x̄2)² / [ SW²(n1 + n2) / (n1n2) ]

Where:
F′ = Scheffé test value
x̄1 = mean of group 1
x̄2 = mean of group 2
n1 = number of samples in group 1
n2 = number of samples in group 2
SW² = within mean squares
A vs. B
F′ = (5.28 − 8.14)² / [3.02(7 + 7) / (7)(7)]
= 8.1796 / (42.28/49)
= 8.1796 / .86
F′ = 9.51
A vs. C
F′ = (5.28 − 3.71)² / [3.02(7 + 7) / (7)(7)]
= 2.4649 / .86
F′ = 2.87

A vs. D
F′ = (5.28 − 5.14)² / [3.02(7 + 7) / (7)(7)]
= .0196 / .86
F′ = .02
B vs. C
F′ = (8.14 − 3.71)² / [3.02(7 + 7) / (7)(7)]
= 19.6249 / .86
F′ = 22.82

B vs. D
F′ = (8.14 − 5.14)² / [3.02(7 + 7) / (7)(7)]
= 9 / .86
F′ = 10.46
C vs. D
F′ = (3.71 − 5.14)² / [3.02(7 + 7) / (7)(7)]
= 2.0449 / .86
F′ = 2.38
The results above show that there is a significant difference in sales between
brand A and brand B, brand B and brand C, and also brand B and brand D. However,
brands A and C, A and D, and C and D do not differ significantly in their average sales.
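All six Scheffé comparisons can be generated in one loop; using unrounded group means gives values slightly different from the hand computations above (e.g. 9.46 instead of 9.51), but every decision is unchanged:

```python
from itertools import combinations

groups = {
    "A": [7, 3, 5, 6, 9, 4, 3],
    "B": [9, 8, 8, 7, 6, 9, 10],
    "C": [2, 3, 4, 5, 6, 4, 2],
    "D": [4, 5, 7, 8, 3, 4, 5],
}
msw = 3.02  # within mean squares (SW^2) from the ANOVA table

results = {}
for a, b in combinations(groups, 2):
    ga, gb = groups[a], groups[b]
    ma, mb = sum(ga) / len(ga), sum(gb) / len(gb)
    f = (ma - mb) ** 2 / (msw * (len(ga) + len(gb)) / (len(ga) * len(gb)))
    results[f"{a} vs {b}"] = round(f, 2)
print(results)
```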
NON-PARAMETRIC TEST
CHI – SQUARE
DEFINITION OF CHI-SQUARE
Chi-square (χ²) may be defined as the sum of the squared differences between observed
and expected frequencies, each divided by the expected frequency. The definition is
denoted by this formula (Ferguson, 1976):

χ² = Σ (O − E)² / E

where:
χ² = chi-square
O = observed frequency
E = expected frequency
Chi-square is a descriptive measure of the discrepancy between observed
frequencies and expected frequencies. The larger the discrepancies between O and E, the
larger the chi-square value obtained. If the observed and expected frequencies show no
discrepancies at all, the chi-square value is zero.
USES OF CHI-SQUARE
1. It is used to test the goodness of fit of an observed frequency distribution to a
theoretical distribution.
2. It is used to test the independence of two variables, that is, whether two
classifications are related.
3. It is used to test the hypothesis that the variance of a normal population is equal
to a given value.
The subjects are 30 women and 30 men or a total of 60 subjects in all. Of the 30
women, 9 answered yes; and 9, undecided. Of the 30 men, 15 answered yes; 2, no; and
13, undecided.
Table 1

Response     O (W)   O (M)   E (W)   E (M)   (O − E)²/E (W)   (O − E)²/E (M)
Yes            9      15      12      12         0.75             0.75
No            12       2       7       7         3.57             3.57
Undecided      9      13      11      11         0.36             0.36
Total         30      30      30      30             χ² = 9.37

df = (r − 1)(C − 1)
= (3 − 1)(2 − 1)
= (2)(1)
df = 2
df.01 = 9.210
Since the computed χ² of 9.37 is greater than the tabular value of 9.210 at df = 2 and
the .01 level of significance, the responses of the women and the men differ
significantly.
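The chi-square for the 60 respondents can be computed directly in Python; each expected frequency is E = (row total × column total) / grand total:

```python
# Observed frequencies: rows = Yes, No, Undecided; columns = Women, Men.
observed = [[9, 15], [12, 2], [9, 13]]

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand
        chi2 += (o - e) ** 2 / e
print(round(chi2, 2))   # 9.37 -> exceeds 9.210 (df = 2, .01 level)
```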
TABLE 2
Status
Career Success Permanent Temporary Casual Total
Very Successful 60 35 15 110
Successful 55 45 20 120
Unsuccessful 30 40 50 120
Total 145 120 85 350
5. Rejection region. The null hypothesis (Ho) will be rejected if chi-square (χ2) value
obtained is equal to or greater than the tabular value at df 4 and at 1 percent level
of significance.
Each expected frequency is obtained as E = (row total × column total) / grand total.
For example, for the Successful–Temporary cell:

E = (120 × 120) / 350 = 41.143
O       E          O − E       (O − E)²      (O − E)²/E
60      45.571     14.429      208.196        4.568
35      37.714     −2.714        7.366        0.195
15      26.714    −11.714      137.218        5.137
55      49.714      5.286       27.942        0.562
45      41.143      3.857       14.876        0.362
20      29.143     −9.143       83.594        2.868
30      49.714    −19.714      388.642        7.818
40      41.143     −1.143        1.306        0.032
50      29.143     20.857      435.014       14.927
Total   350        0.000                     36.47**
df = (R − 1)(C − 1)
= (3 − 1)(3 − 1)
df = 4
df.01 = 13.28
7. Interpretation. The computed chi-square (χ²) value is 36.47. This value is greater
than the tabular value of 13.28 at df 4 and at the 1 percent level of significance;
hence, it is significant. This means that success in career depends on the position
status of government employees. Therefore, the null hypothesis (Ho) is rejected.
PREPARED BY:
DR. FE C. MONTECALVO
Professor VI