
14-18 Stat Inferential Non-Parametric


Statistics

CATALINO N. MENDOZA, DMS, PHD, FRIEDR, LPT, TESOL


KNOW-WELL
Inferential Statistics: The Nonparametric Tests
The Chi-Square Test

A test of the difference between the observed and expected
frequencies. It takes three distinct forms:
1. The test of goodness-of-fit
2. The test of homogeneity
3. The test of independence
The Goodness-of-Fit

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where:
χ² = the Chi-Square test
O = the observed frequencies
E = the expected frequencies
Sample Scenario
A researcher examined the blood types of 400 randomly selected
people. The observed counts for types A, B, O and AB are,
respectively, 148, 96, 106, and 50. It is believed that the
population proportions are 0.4, 0.2, 0.3 and 0.1, respectively. Test
the hypothesis that these data bear out the stated belief. Use a 5%
level of significance.
Tabulated Value

α = 0.05
df = h-1
= 4-1 = 3
X2 = 7.815
Decision Rule:
If the chi square computed is greater than the tabulated chi square
value, reject the null hypothesis.
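The goodness-of-fit computation for the blood-type scenario can be sketched as follows. This is a minimal illustration in plain Python, not part of the original slides; the variable names are chosen here for readability, and the expected frequencies are assumed to be E = n × p for each believed proportion p.

```python
# Goodness-of-fit sketch for the blood-type scenario (standard library only).
observed = [148, 96, 106, 50]        # counts for types A, B, O, AB
proportions = [0.4, 0.2, 0.3, 0.1]   # believed population proportions
critical_value = 7.815               # tabulated chi-square, alpha = 0.05, df = 3

n = sum(observed)                            # 400 people in total
expected = [n * p for p in proportions]      # E = n * p for each blood type

# chi-square = sum of (O - E)^2 / E over the four categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, reject H0: {reject_null}")
```

Under these assumptions the computed value is about 8.23, which exceeds 7.815, so the decision rule rejects the null hypothesis.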
Sample Scenario 2

The numbers of accidents in a factory during the months of a year are
given in the following table.
Using a 5 % level of significance, test the null hypothesis that the
number of accidents does not depend on the month of the year.
Hint: the probability that an accident will happen to a person in
any month is ½.
Data

Jan Feb Mar Apr May June

25 28 24 18 17 27

The Test of Homogeneity

A test concerning two or more samples, with only one criterion variable
and used to determine if two or more populations are homogenous.
Formula
$$\chi^2 = \frac{N (D_1 - D_2)^2}{T_c T_r}$$
Where:
N = the grand total
D1 = the product of the diagonal values from the north-west
D2 = the product of the diagonal values from the north-east
Tc = the product of the column totals
Tr = the product of the row totals
Sample Scenario

Evaluate the attitude of samples of Lakas and Laban Party members on
the issue of peace and order in Mindanao. A separate random sample of
members of each party is drawn from the nationwide population of
Lakas and Laban. The scores are classified into “Favorable” and
“Unfavorable” categories. Test for a significant difference at the
5 % level of significance.
Data

Party Favorable Unfavorable Total

Lakas 65 35 100

Laban 50 50 100

Total 115 85 200


Tabulated Value

α = 0.05
df = (c-1)(r-1)
= (2-1)(2-1) = 1
X2 = 3.841
Decision Rule:
If the chi square computed is greater than the chi square tabulated
value, reject the null hypothesis.
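The 2x2 shortcut formula can be applied to the Lakas and Laban data as in the sketch below. It is an illustration only: the variable names are assumptions made here, D1 and D2 are taken to be the products of the two diagonals, and Tc and Tr the products of the column totals and row totals.

```python
# Test of homogeneity for a 2x2 table using the shortcut formula
# chi-square = N * (D1 - D2)^2 / (Tc * Tr).
table = [[65, 35],    # Lakas: favorable, unfavorable
         [50, 50]]    # Laban: favorable, unfavorable
critical_value = 3.841   # tabulated chi-square, alpha = 0.05, df = 1

(a, b), (c, d) = table
n = a + b + c + d            # grand total N
d1 = a * d                   # product of one diagonal
d2 = b * c                   # product of the other diagonal
t_r = (a + b) * (c + d)      # product of the row totals
t_c = (a + c) * (b + d)      # product of the column totals

chi_square = n * (d1 - d2) ** 2 / (t_c * t_r)
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, reject H0: {reject_null}")
```

With these figures the computed value is about 4.60, which is greater than 3.841.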
The Test of Independence

It is used to determine whether two criterion variables are
independent of or associated with one another in a given population.
The sample in this test consists of members randomly drawn from the
population.
Sample Scenario

Ninety individuals, male and female, were given a psychomotor test.
Their scores were classified as high or low and are tabulated in the
following table.
Test for a significant difference using a 5 % level of significance.
Data

Gender High Low Total

Male 18 28 46

Female 32 12 44

Total 50 40 90
Tabulated Value

α = 0.05
df = (c-1)(r-1)
= (2-1)(2-1) = 1
X2 = 3.841
Decision Rule:
If the Chi square computed is greater than the chi square tabulated
value, reject the null hypothesis.
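For the gender and psychomotor-score table, the test of independence can be sketched with expected frequencies computed from the margins. This is a hedged illustration: the function name `chi_square_independence` is hypothetical, each expected frequency is assumed to be (row total × column total) / grand total, and no continuity correction is applied.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E from the margins
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)            # (r - 1)(c - 1)
    return chi_square, df


scores = [[18, 28],   # Male: high, low
          [32, 12]]   # Female: high, low
critical_value = 3.841   # tabulated chi-square, alpha = 0.05, df = 1

chi_square, df = chi_square_independence(scores)
reject_null = chi_square > critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_null}")
```

The computed value is about 10.28 with df = 1, so the decision rule rejects the null hypothesis of independence.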
Sample Scenario 2

Two lots of 30 experimental guinea pigs were used in testing the
effectiveness of a new serum in combating a certain disease. Both
were inoculated with the new organism but only one lot was
previously given the preventive serum. Test the serum effectiveness
at 1 % significance.
Data

Criteria Serum No Serum Total

Recovered 12 3 15

Died 2 13 15

Total 14 16 30
Tabulated Value

α = 0.01
df = (c-1)(r-1)
= (2-1)(2-1) = 1
X2 = 6.635
Decision Rule:
If the chi square computed is greater than the chi square tabulated
value, reject the null hypothesis.
Wilcoxon Two Sample Test

A procedure for comparing two populations when independent samples
are drawn from them. It is used when the two populations are not
approximately normally distributed.
Formula
$$U_1 = W_1 - \frac{n_1 (n_1 + 1)}{2} \qquad U_2 = W_2 - \frac{n_2 (n_2 + 1)}{2}$$
Where:
U1 = Wilcoxon Rank-Sum Test 1
W1 = sum of ranks of group 1
n1 = sample size of group 1
U2 = Wilcoxon Rank-Sum Test 2
W2 = sum of ranks of group 2
n2 = sample size of group 2
Sample Scenario

Of the eighteen selected patients who reached an advanced stage
of leukemia, ten were treated with a new serum and eight were not.
The survival time, in years, was reckoned from the time the
experiment was conducted. Use a 5 % level of significance to test
whether the serum is effective.
Data

With Treatment
2.9 3.1 5.3 4.2 4.5 3.9 2.0 3.7 4.1 4.0

No Treatment
1.9 0.5 0.9 2.2 3.1 2.0 1.7 2.5
Tabulated Value

α = 0.05
n1 = 10, n2 = 8
U = 17
Decision Rule:
If the lowest computed U value is less than the tabulated U value,
reject the null hypothesis.
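The rank sums and U values for the leukemia scenario can be sketched as below. This is an illustration under stated assumptions: the helper `average_ranks` is hypothetical and assigns tied observations the mean of the ranks they occupy, and the two samples are ranked together from smallest to largest.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


treated = [2.9, 3.1, 5.3, 4.2, 4.5, 3.9, 2.0, 3.7, 4.1, 4.0]
untreated = [1.9, 0.5, 0.9, 2.2, 3.1, 2.0, 1.7, 2.5]
critical_u = 17                              # tabulated U from the slide

ranks = average_ranks(treated + untreated)   # rank the pooled observations
n1, n2 = len(treated), len(untreated)
w1 = sum(ranks[:n1])                         # rank sum of the treated group
w2 = sum(ranks[n1:])                         # rank sum of the untreated group

u1 = w1 - n1 * (n1 + 1) / 2
u2 = w2 - n2 * (n2 + 1) / 2
u = min(u1, u2)                              # the lowest computed U
reject_null = u < critical_u

print(f"W1 = {w1}, W2 = {w2}, U1 = {u1}, U2 = {u2}, reject H0: {reject_null}")
```

Here W1 = 130 and W2 = 41, giving U1 = 75 and U2 = 5. As a check, U1 + U2 equals n1 × n2 = 80, and the lowest U (5) is below the tabulated 17.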
Sample Scenario 2

The nicotine content of two brands of cigarettes, measured in
milligrams, is given in the following table.
Test the hypothesis that the average nicotine contents of the two
brands are equal at the 5 % level of significance.
Data

Brand X
4.1 0.7 3.1 2.5 4.0 6.2

Brand Y
2.1 4.0 5.4 4.8 3.3 1.6 1.7 5.4
Tabulated Value

α = 0.05
n1 = 6, n2 = 8
U = 8
Decision Rule:
If the lowest computed U value is less than the tabulated U value,
reject the null hypothesis.
The Kruskal-Wallis Test

A test used to compare three (3) or more independent groups.
This test is the nonparametric alternative to the F-test (ANOVA) and
is used when the data are not normally distributed.
Formula
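$$H = \frac{12}{n(n + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n + 1)$$
Where:
H = Kruskal-Wallis test
n = the total number of observations
k = the number of groups
ni = the number of observations in group i
12 = constant
3 = constant
Ri = sum of the ranks in group i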
Sample Scenario

Consider the examination scores of samples of college students who
are taught statistics using three different methods: Method 1
(classroom instruction with laboratory), Method 2 (classroom
instruction only), Method 3 (self study in the laboratory). Test the
hypothesis that their means are equal at the 5 % level of significance.
Data

Method 1
94 88 90 95 92 90
Method 2
85 88 90 80 79 85 80
Method 3
8 9 78 75 65 80
Tabular Value

α = 0.05
df = h-1
= 3-1 = 2
X2 = 5.991
Decision Rule:
If the computed H value is greater than the tabulated H value, reject
the null hypothesis.
Sample Scenario 2

The following are the mileages per gallon that a test driver
obtained for 5 tankfuls each of four kinds of gasoline.
Check the claim that there is a significant difference in the
true average mileage of the four kinds of gasoline at the 5 % level of
significance.
Data

Gasoline C
24 17 21 31 22
Gasoline P
21 31 32 19 17
Gasoline S
28 23 26 31 14
Gasoline T
29 16 18 31 20
Tabular Value

α = 0.05
df = h-1
= 4-1 = 3
X2 = 7.815
Decision Rule:
If the computed H value is greater than the tabulated H value, reject
the null hypothesis.
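The H statistic for the gasoline scenario can be sketched as follows. This is an illustration rather than a definitive procedure: the helper `average_ranks` is hypothetical, tied mileages are assumed to share the mean of their ranks, and no tie correction is applied to H because the formula as given does not include one.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


mileage = {
    "C": [24, 17, 21, 31, 22],
    "P": [21, 31, 32, 19, 17],
    "S": [28, 23, 26, 31, 14],
    "T": [29, 16, 18, 31, 20],
}
critical_value = 7.815                 # tabulated value, alpha = 0.05, df = 3

pooled = [x for values in mileage.values() for x in values]
ranks = average_ranks(pooled)          # rank all 20 observations together
n = len(pooled)

rank_sums = {}
weighted_sum = 0.0                     # running sum of R_i^2 / n_i
start = 0
for name, values in mileage.items():
    r_i = sum(ranks[start:start + len(values)])
    rank_sums[name] = r_i
    weighted_sum += r_i ** 2 / len(values)
    start += len(values)

h = 12 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)
reject_null = h > critical_value

print(rank_sums)
print(f"H = {h:.4f}, reject H0: {reject_null}")
```

The rank sums are 51.5, 55.5, 56.5 and 46.5, and H is about 0.35, far below 7.815, so under these assumptions the null hypothesis is not rejected.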
Spearman Rank Order Correlation, rs
A method used to measure the association between two variables by
ranking the values of each in order of magnitude. It applies to
variables that cannot be expressed by exact measurement but from
which ranked data can be obtained.
Formula
$$r_s = 1 - \frac{6 \sum D^2}{n (n^2 - 1)}$$
Where:
rs = Spearman Rank Order Coefficient
∑D² = sum of the squares of the difference between ranks of the
variables
n = sample size
6 = constant
Sample Scenario

The following are the number of hours which 12 students studied for
midterm examination and the grades they obtained in statistics. Is
there a significant relationship between the number of hours spent
in studying statistics and the corresponding grades in the midterm
examination?
Data

Number of hours spent


5 6 11 20 19 20 10 12 8 15 18 10

Midterm grades
50 60 79 90 85 92 80 82 65 85 94 70
Level of Significance

α = 0.05
df = n-1
= 12-1 = 11
rs = 0.532
Decision:
If the computed r value is greater than the tabulated r value, reject
the null hypothesis.
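The coefficient for the study-hours scenario can be sketched as below. It is an illustration under assumptions made here: the helper `average_ranks` is hypothetical, tied values receive the mean of their ranks, and the simple formula is applied as given even though it is only approximate when ties are present.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


hours = [5, 6, 11, 20, 19, 20, 10, 12, 8, 15, 18, 10]
grades = [50, 60, 79, 90, 85, 92, 80, 82, 65, 85, 94, 70]
critical_r = 0.532                     # tabulated value from the slide

rank_hours = average_ranks(hours)
rank_grades = average_ranks(grades)

# D is the difference between the two ranks of each student
sum_d2 = sum((rh - rg) ** 2 for rh, rg in zip(rank_hours, rank_grades))
n = len(hours)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
reject_null = r_s > critical_r

print(f"sum of D^2 = {sum_d2}, r_s = {r_s:.4f}, reject H0: {reject_null}")
```

With average ranks the sum of squared differences is 17.5 and the coefficient is about 0.94, which is greater than 0.532.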
Sample Scenario 2

The following are the rankings that two judges gave to the work of
eight (8) artists. Determine whether the two judges differ in their
opinions about the artists’ works. Use a 5 % level of significance.
Data

Judge A
5 8 4 2 1 7 3 6

Judge B
8 5 6 4 2 1 3 7
Level of Significance

α = 0.05
df = n-1
= 8-1 = 7
rs = 0.714
Decision:
If the computed r value is greater than the tabulated r value, reject
the null hypothesis.
SIGN TEST
Two Independent Samples
(Median Test Case)
Known as the median test, it is used to compare the medians of two
independent samples.

Formula:
$$\chi^2 = \frac{N (ad - bc)^2}{klmn}$$
Nomenclature

X2 = Chi-square test
a and c = observed (+) frequencies
b and d = observed (-) frequencies
k and l = the row totals
m and n = the column totals
N = the grand total
Sample Scenario 1

Consider the test scores of 12 female and 9 male students on an
algebra test.

Female: 12 26 25 10 10 10 22 20 19 17 17 15
Male: 6 22 19 7 8 12 16 8 19

Determine whether there is a significant difference between the two
groups in their performance in algebra at the 0.05 level of
significance.
Sample Scenario 2

Two groups of experimental animals are given two brands of feed
supplement, with the following weight gains in grams.

Group 1: 260 204 105 275 210 100 110 143 170 189
Group 2: 65 78 140 182 195 48 79 80 69 89

Apply the sign test to test the hypothesis that the two samples
come from populations with the same median at the 0.05 level of
significance.
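The median test for the weight-gain scenario can be sketched as follows. Several things here are assumptions rather than part of the slides: the fused Group 2 digits are read as ten values (65, 78, 140, 182, 195, 48, 79, 80, 69, 89), a "+" is an observation above the combined median and a "-" is one at or below it, and 3.841 is the tabulated chi-square for df = 1 at the 0.05 level.

```python
from statistics import median

group_1 = [260, 204, 105, 275, 210, 100, 110, 143, 170, 189]
group_2 = [65, 78, 140, 182, 195, 48, 79, 80, 69, 89]   # assumed reading
critical_value = 3.841            # chi-square, alpha = 0.05, df = 1

grand_median = median(group_1 + group_2)    # median of the pooled sample

# 2x2 table of signs: rows are groups, columns are (+) and (-)
a = sum(x > grand_median for x in group_1)  # group 1 above the median
b = len(group_1) - a                        # group 1 at or below the median
c = sum(x > grand_median for x in group_2)  # group 2 above the median
d = len(group_2) - c                        # group 2 at or below the median

k, l = a + b, c + d               # row totals
m, n_col = a + c, b + d           # column totals
N = k + l                         # grand total

chi_square = N * (a * d - b * c) ** 2 / (k * l * m * n_col)
reject_null = chi_square > critical_value

print(f"median = {grand_median}, table = [[{a}, {b}], [{c}, {d}]]")
print(f"chi-square = {chi_square:.3f}, reject H0: {reject_null}")
```

Under this reading the combined median is 125, the sign table is [[7, 3], [3, 7]], and the statistic is 3.2, which does not exceed 3.841.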
Two Correlated Samples (Fisher Sign Test)
The test compares two correlated samples and is applicable to data
composed of N paired observations. Under the null hypothesis, half
of the differences between the paired observations are expected to
be positive and the other half negative.
Formula

$$z = \frac{|D| - 1}{\sqrt{N}}$$
Where:
z = the Fisher Sign Test
D = the difference between the number of + and - signs
N = the number of paired observations
Sample Scenario 1

The pretest and posttest results of the implementation of the program
of a certain student organization are as follows:
Pretest: 15 19 31 36 10 11 19 15 10 16
Posttest: 19 30 26 8 10 6 17 13 22 8

Determine if there is a significant difference between the pretest
and posttest results of the evaluation of the 10 students. Use the
0.05 level of significance.
Sample Scenario 2

Ten students were given remedial instruction in statistics. A pretest
and a posttest were also administered, and the following are the results:
Pretest: 25 20 20 8 15 8 20 21 35 20
Posttest: 30 25 20 12 20 8 20 20 38 30

Use the sign test at the 0.05 level of significance to determine
whether there is a significant difference between the pretest and
posttest results.
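The sign test for the remedial-instruction scenario can be sketched as below, taking the statistic to be z = (|D| - 1) / √N. The following are assumptions, not statements from the slides: pairs with no change are dropped so that N counts only the nonzero differences, and the computed z is compared with 1.96, the standard two-tailed normal critical value at the 0.05 level.

```python
from math import sqrt

pretest = [25, 20, 20, 8, 15, 8, 20, 21, 35, 20]
posttest = [30, 25, 20, 12, 20, 8, 20, 20, 38, 30]
critical_z = 1.96                 # two-tailed normal value, alpha = 0.05

differences = [post - pre for pre, post in zip(pretest, posttest)]
plus = sum(diff > 0 for diff in differences)    # number of + signs
minus = sum(diff < 0 for diff in differences)   # number of - signs

n = plus + minus                  # assumption: ties (zero differences) dropped
d = plus - minus                  # difference between the + and - counts
z = (abs(d) - 1) / sqrt(n)
reject_null = z > critical_z

print(f"+ signs = {plus}, - signs = {minus}, N = {n}")
print(f"z = {z:.3f}, reject H0: {reject_null}")
```

There are 6 plus signs, 1 minus sign and 3 ties, so N = 7 and z is about 1.51, which is below 1.96 under these assumptions.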
Independent Samples
(Median Test: Multi-sample case)
The extension of the median test for two independent samples to the
multi-sample case.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where:
χ² = chi-square test
O = observed frequencies
E = expected frequencies
Sample Scenario 1

The acidity of rain was recorded for ten randomly selected rainfalls
at each of three different locations in the province of Batangas:
Lobo, Talisay, and Tingloy. The pH readings for these 30 rainfalls
are shown in the following table.
(Note: pH readings range from 0 to 14; 0 is acid and 14 is alkaline.
Pure water falling through clean air has a pH reading of 5.7.)
Data

Lobo: 4.4 4.0 4.1 3.5 2.4 3.8 4.2 3.9 4.1 4.2
Talisay: 4.6 4.5 4.3 3.8 4.2 .5 4.7 4.3 4.5 4.8
Tingloy: 4.7 4.8 5.0 4.9 3.9 4.5 4.6 4.3 4.0 4.7

Use the median test at the 0.05 level of significance to test the
null hypothesis that there is no significant difference among the
pH readings of the three different municipalities of Batangas.
Sample Scenario 2

Apply the sign test at the 0.05 level of significance to determine
whether there is a significant difference among the four different
treatments given to four groups of animals.
Data

T1: 5 6 16 15 15
T2: 8 15 12 15 10
T3: 17 18 15 12 13
T4: 10 12 20 20 25
McNemar’s Test for Correlated Proportions
A chi-square test for situations in which the samples are matched,
that is, not independent. It can be applied to a research design
that tests whether there is a significant change between the before
and after situations.
Formula

$$\chi^2 = \frac{(b - c)^2}{b + c}$$
Where: χ² = chi-square
b = the first cell of the 2nd column in a 2x2 table
c = the first cell of the 2nd row in a 2x2 table
Sample Scenario 1

The following data were collected on the use of seat belts before
and after involvement in auto accidents for a sample of 100 accident
victims. Is there a significant difference in the use of seat belts
before and after involvement in an auto accident? Use the 0.05 level
of significance.
Data

Wore seat belt regularly      Wore seat belt regularly after the accident
before the accident           Yes      No
Yes                           60       6
No                            19       15
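McNemar's statistic for the seat-belt data uses only the two discordant cells, as the sketch below shows. It is an illustration only: rows are taken as "before", columns as "after", b and c follow the definitions given with the formula, and 3.841 is assumed as the tabulated chi-square for df = 1 at the 0.05 level.

```python
# McNemar's test for the seat-belt scenario.
# Rows: wore seat belt before (Yes, No); columns: wore seat belt after (Yes, No).
table = [[60, 6],
         [19, 15]]
critical_value = 3.841        # chi-square, alpha = 0.05, df = 1

b = table[0][1]               # first cell of the 2nd column: Yes before, No after
c = table[1][0]               # first cell of the 2nd row: No before, Yes after

chi_square = (b - c) ** 2 / (b + c)
reject_null = chi_square > critical_value

print(f"b = {b}, c = {c}, chi-square = {chi_square:.2f}, reject H0: {reject_null}")
```

With b = 6 and c = 19 the statistic is 169 / 25 = 6.76, which exceeds 3.841.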
Sample Scenario 2

The following data were collected on charter change before and
after a televised debate for a sample of 50 registered voters.
Apply McNemar’s test at the 0.05 level of significance.
Data

                      After the Debate
Before the Debate     Yes      No
Yes                   19       11
No                    8        12
The Friedman Test for Randomized Block Design
A test used for comparing the distributions of measurements for k
treatments laid out in b blocks using a randomized block design,
when either the number k of treatments or the number b of blocks
is larger than five.
Formula

$$F_r = \frac{12}{bk(k + 1)} \sum T_i^2 - 3b(k + 1)$$
Where:
Fr = the Friedman’s Test
b = the number of blocks
k = the number of treatments
Ti = rank sum for treatment i
i = index of the treatment
Sample Scenario 1

In a study of the palatability of antibiotics in children, a sample
of 5 healthy children was used to assess their reaction to the taste
of four antibiotics. The children’s response was measured on a
10-centimeter visual analog scale incorporating the use of faces,
from sad (low score) to happy (high score). The minimum score was 0
and the maximum was 10. Test the differences in the reactions of the
five children to the four different antibiotics using Friedman’s Test
under a randomized block design at the 0.05 level of significance.
Data
Antibiotics

Child 1 2 3 4
1 5.8 2.5 6.7 6.2
2 9.0 9.0 6.6 9.5
3 5.0 2.6 3.5 6.6
4 7.9 9.4 5.3 8.4
5 3.9 7.5 2.5 2.5
Sample Scenario 2

Five different subjects are repeatedly measured under 3 different
conditions. Use Friedman’s Test to determine whether there is a
significant difference among the 3 different conditions across the
five subjects at the 0.05 level of significance.
Data

Conditions

Subject 1 2 3
1 30 35 40
2 11 15 12
3 21 29 17
4 3 5 18
5 12 10 15
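The Friedman statistic for the five subjects and three conditions can be sketched as follows. The assumptions here are not from the slides: each subject is treated as a block, scores are ranked within each block from lowest to highest, the helper `average_ranks` is hypothetical, and 5.991 is the tabulated chi-square for df = k - 1 = 2 at the 0.05 level.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


# One row per subject (block); one column per condition (treatment).
scores = [
    [30, 35, 40],
    [11, 15, 12],
    [21, 29, 17],
    [3, 5, 18],
    [12, 10, 15],
]
critical_value = 5.991            # chi-square, alpha = 0.05, df = 2

b = len(scores)                   # number of blocks
k = len(scores[0])                # number of treatments

block_ranks = [average_ranks(row) for row in scores]     # rank within blocks
rank_sums = [sum(col) for col in zip(*block_ranks)]      # T_i per treatment

f_r = 12 / (b * k * (k + 1)) * sum(t ** 2 for t in rank_sums) - 3 * b * (k + 1)
reject_null = f_r > critical_value

print(f"rank sums = {rank_sums}, Fr = {f_r:.2f}, reject H0: {reject_null}")
```

The rank sums are 7, 11 and 12, so Fr = (12 / 60) × 314 - 60 = 2.8, which is below 5.991 under these assumptions.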
