Lecture 6 Part One
14.4 MANN-WHITNEY U TEST
In Sections 14.2 and 14.3, we discussed the paired sign test and the Wilcoxon
matched-pair signed-rank test respectively, which are used when the
observations are dependent, that is, when the observations are paired. When we
are interested in testing the difference between the means of two independent
populations, we use the two-sample t-test. To use the t-test, however, it is
necessary to make a set of assumptions (as described in Unit 11 of this course).
In particular, the two independent samples must be randomly drawn from normal
populations having equal variances, and the data must be measured on at least
an interval scale. But in studies of consumer behaviour, marketing research,
psychological experiments, etc., the data are generally collected on an ordinal
scale and the form of the population is not known. Since the parametric t-test
cannot be used in such situations, an appropriate non-parametric technique is
needed. In such circumstances, a very simple non-parametric test known as the
Mann-Whitney U test may be used.
This test was developed jointly by Mann, Whitney and Wilcoxon. Wilcoxon
considered only the case of equal sample sizes, while Mann and Whitney seem
to have been the first to treat the case of unequal sample sizes. Therefore, it
is sometimes called the Mann-Whitney U test and sometimes the Wilcoxon
rank-sum test. This test may also be viewed as a substitute for the parametric
t-test for the difference between two population means. When the assumptions
of the two-sample t-test are fulfilled, this test is slightly less powerful than
the t-test.
Assumptions
This test works under the following assumptions:
(i) The two samples are randomly and independently drawn from their
respective populations.
(iv) The distributions of two populations differ only with respect to location
parameter.
Non-Parametric Tests
Let us discuss the general procedure of this test:
Let us suppose that we have two independent random samples
$X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ drawn from two
populations having medians $\mu_1$ and $\mu_2$ respectively. Here, we want to
test a hypothesis about the medians of the two populations, so we can take the
null and alternative hypotheses as
$H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \neq \mu_2$ for two-tailed test
$H_0: \mu_1 \leq \mu_2$ and $H_1: \mu_1 > \mu_2$
or for one-tailed test
$H_0: \mu_1 \geq \mu_2$ and $H_1: \mu_1 < \mu_2$
We can also state the null and alternative hypotheses in terms of distribution
functions. Let F1(x) and F2(x) be the distribution functions of the first and
second populations respectively, and suppose we want to test whether the two
independent samples come from populations that are identical with respect to
location, that is, median. Under the location-shift assumption, the population
with the larger median has the smaller distribution function at every x.
Therefore, we can take the null and alternative hypotheses as
$H_0: F_1(x) = F_2(x)$ for all values of x
$H_1: F_1(x) \neq F_2(x)$ for at least one value of x
for two-tailed test
$H_0: F_1(x) \geq F_2(x)$ for all values of x
$H_1: F_1(x) < F_2(x)$ for at least one value of x
or for right-tailed test
$H_0: F_1(x) \leq F_2(x)$ for all values of x
$H_1: F_1(x) > F_2(x)$ for at least one value of x
or for left-tailed test
After setting the null and alternative hypotheses, this test involves the
following steps:
Step 1: First of all, we combine the observations of the two samples.
Step 2: After that, we rank all the combined observations from smallest to
largest, that is, rank 1 is given to the smallest of the combined
observations, rank 2 to the second smallest, and so on up to the largest
observation. If several values are the same (tied), we assign each the
average of the ranks they would have received if there were no
repetition.
Step 3: If the null hypothesis is true, we can expect the average ranks of the
two samples to be nearly equal (so that, for equal sample sizes, the two
rank sums are nearly equal). Now, for convenience, we consider the
smaller sample and calculate the sum of the ranks of its observations.
Let S be the sum of the ranks assigned to the observations of the
smaller sample; then for testing the null hypothesis the test statistic
is given by
$$U = S - \frac{n_1(n_1 + 1)}{2}; \quad \text{if } n_1 \text{ is smaller}$$
$$U = S - \frac{n_2(n_2 + 1)}{2}; \quad \text{if } n_2 \text{ is smaller}$$
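Steps 1 to 3 can be sketched in a few lines of Python. This is only a minimal illustration, not a library routine: the function names `average_ranks` and `mann_whitney_u` and the two small data sets are made up for the example, and when the samples are of equal size the sketch simply uses the first sample.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the block while the next sorted value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2  # mean of the ranks pos+1, ..., end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (S, U) where S is the rank sum of the smaller sample and U = S - n(n+1)/2."""
    combined = list(sample1) + list(sample2)   # Step 1: combine the samples
    ranks = average_ranks(combined)            # Step 2: rank with averaged ties
    n1, n2 = len(sample1), len(sample2)
    if n1 <= n2:                               # Step 3: use the smaller sample
        s, n = sum(ranks[:n1]), n1
    else:
        s, n = sum(ranks[n1:]), n2
    return s, s - n * (n + 1) / 2


# Hypothetical data, chosen only to show a tie (the value 15 occurs three times)
x = [12, 15, 15, 20]
y = [14, 15, 22, 25, 30]
s, u = mann_whitney_u(x, y)
print(s, u)  # 15.0 5.0
```

Here the three tied values would have occupied ranks 3, 4 and 5, so each receives rank 4; the smaller sample then has S = 1 + 4 + 4 + 6 = 15 and U = 15 - 4(5)/2 = 5.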
Step 4: Obtain critical value(s) of test statistic U at given level of significance
under the condition that null hypothesis is true. Table VII in the
Appendix at the end of this block provides the lower and upper
critical values for a given combination of n1 and n2 at α level of
significance for one-tailed and two-tailed test.
Two-Sample Tests
Step 5: Decision rule:
To take the decision about the null hypothesis, the calculated value of
the test statistic U (computed in Step 3) is compared with the critical
(tabulated) value (obtained in Step 4) at a given level of significance
(α) under the condition that the null hypothesis is true. Since the test
may be one-tailed or two-tailed, the following cases arise:
For two-tailed test: When $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \neq \mu_2$
For a two-tailed test, we see the critical values at α/2 for α level of
significance. If the calculated value of the test statistic U is either
less than or equal to the lower critical value $(U_{L,\alpha/2})$ or greater
than or equal to the upper critical value $(U_{U,\alpha/2})$, that is,
$U \leq U_{L,\alpha/2}$ or $U \geq U_{U,\alpha/2}$, then we reject the null
hypothesis at α level of significance. However, if the computed U lies
between these critical values, that is,
$U_{L,\alpha/2} < U < U_{U,\alpha/2}$, then we do not reject the null
hypothesis at α level of significance.
For one-tailed test:
For a one-tailed test, we see the critical value at α for α level of
significance.
Case I: When $H_0: \mu_1 \leq \mu_2$ and $H_1: \mu_1 > \mu_2$ (right-tailed test)
If the calculated value of the test statistic U is greater than or equal
to the upper critical value $(U_{U,\alpha})$, that is, $U \geq U_{U,\alpha}$,
then we reject the null hypothesis at α level of significance. However,
if the computed U is less than the upper critical value $(U_{U,\alpha})$,
that is, $U < U_{U,\alpha}$, then we do not reject the null hypothesis at
α level of significance.
Case II: When $H_0: \mu_1 \geq \mu_2$ and $H_1: \mu_1 < \mu_2$ (left-tailed test)
If the calculated value of the test statistic U is less than or equal to
the lower critical value $(U_{L,\alpha})$, that is, $U \leq U_{L,\alpha}$,
then we reject the null hypothesis at α level of significance. However,
if the computed U is greater than the lower critical value
$(U_{L,\alpha})$, that is, $U > U_{L,\alpha}$, then we do not reject the
null hypothesis at α level of significance.
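The decision rule of Step 5 can be written as a small function. This is a sketch only: the name `reject_null` and its arguments are hypothetical, and the critical values must still be read from the table by the user (at α/2 for a two-tailed test, at α for a one-tailed test).

```python
def reject_null(u, tail, lower=None, upper=None):
    """Return True if H0 is rejected for the computed statistic u.

    tail  : "two", "right" or "left"
    lower : lower critical value read from the table (needed for "two" and "left")
    upper : upper critical value read from the table (needed for "two" and "right")
    """
    if tail == "two":
        return u <= lower or u >= upper
    if tail == "right":
        return u >= upper
    if tail == "left":
        return u <= lower
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Left-tailed case with the figures used in the worked example of this
# section: U = 6.5 and lower critical value 29.
print(reject_null(6.5, "left", lower=29))            # True

# Two-tailed case with made-up critical values 10 and 30 (illustration only).
print(reject_null(15, "two", lower=10, upper=30))    # False
```

Note that equality with a critical value leads to rejection, in line with the "less than or equal to" and "greater than or equal to" wording of the rule.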
For large samples (n1 or n2 > 20):
When either n1 or n2 exceeds 20, the statistic U is approximately
normally distributed with mean
$$E(U) = \frac{n_1 n_2}{2} \qquad \ldots (7)$$
and variance
$$\mathrm{Var}(U) = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \qquad \ldots (8)$$
The proof of the mean and variance of the test statistic (U) is beyond
the scope of this course.
Therefore, in this case, we use the normal test (Z-test) (described in Unit
10 of Block 3 of this course). The test statistic of the Z-test is given by
$$Z = \frac{U - E(U)}{SE(U)} = \frac{U - E(U)}{\sqrt{\mathrm{Var}(U)}} \sim N(0, 1)$$
$$Z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} \sim N(0, 1) \quad [\text{using equations (7) and (8)}] \qquad \ldots (9)$$
After that, we calculate the value of the test statistic Z and compare it
with the critical value(s) given in Table 10.1 at the prefixed level of
significance α. We then take the decision about the null hypothesis as
described in Section 10.2 of Unit 10 of this course.
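The large-sample statistic of equation (9) is easy to evaluate numerically. The sketch below assumes hypothetical values n1 = 25, n2 = 30 and U = 250 (they do not come from this unit), and the helper names `mann_whitney_z` and `normal_cdf` are likewise made up. The standard normal distribution function is obtained from `math.erf`.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Z statistic of equation (9): (U - E(U)) / sqrt(Var(U))."""
    mean_u = n1 * n2 / 2                      # equation (7)
    var_u = n1 * n2 * (n1 + n2 + 1) / 12      # equation (8)
    return (u - mean_u) / math.sqrt(var_u)


def normal_cdf(z):
    """Standard normal distribution function, P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical values, for illustration only
n1, n2, u = 25, 30, 250
z = mann_whitney_z(u, n1, n2)
print(round(z, 3))            # -2.113
print(normal_cdf(z) < 0.05)   # True: a left-tailed test would reject H0 at the 5% level
```

For these values E(U) = 375 and Var(U) = 3500, so Z = (250 - 375)/59.16, which is about -2.113.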
Let us do some examples to become more familiar with the test explained
above:
Example 5: A Statistics professor taught two special sections of a basic course
in which students in each section were considered outstanding. He used a
“traditional” method of instruction (T) in one section and an “experimental”
method of instruction (E) in the other. At the end of the semester, he ranked the
students based on their performance from 1 (worst) to 20 (best).
T 1 2 3 5 8 10 12 13 14 15
E 4 6 7 9 11 16 17 18 19 20
Examine, by the Mann-Whitney U test at 1% level of significance, whether the
median amount to be spent on a birthday gift by students who are unemployed is
lower than that for students who are employed.
Solution: It is a case of two independent populations and the assumption of
normality of both populations is not given, so we cannot use the t-test in this
case. Also, the sample sizes are small, so we cannot use the Z-test. Therefore,
we go for the Mann-Whitney U test.
Here, we want to test that the median amount to be spent on a birthday gift by
students who are unemployed is lower than that for students who are employed.
If $\mu_1$ and $\mu_2$ denote the average (median) amounts to be spent on a
birthday gift by students who are unemployed and employed respectively, then
our claim is $\mu_1 < \mu_2$ and its complement is $\mu_1 \geq \mu_2$. Since the
complement contains the equality sign, we can take the complement as the null
hypothesis and the claim as the alternative hypothesis. Thus,
$H_0: \mu_1 \geq \mu_2$
Here, we consider the first sample, so the test statistic is given by
$$U = S - \frac{n_1(n_1 + 1)}{2} = 27.5 - \frac{6(6 + 1)}{2} = 6.5$$
To decide about the null hypothesis, the calculated value of the test statistic
U is compared with the lower critical (tabulated) value at 1% level of
significance. Since the test is left-tailed, the lower critical value of the
test statistic corresponding to n1 = 6 and n2 = 9 at 1% level of significance is
$U_{L,\alpha} = U_{L,0.01} = 29$.
Since the calculated value of the test statistic U (= 6.5) is less than the
lower critical value (= 29), we reject the null hypothesis and support the
alternative hypothesis, i.e. we support the claim at 1% level of significance.
Thus, we conclude that the samples provide sufficient evidence in support of
the claim, so we may assume that the median amount to be spent on a birthday
gift by students who are unemployed is lower than that for students who are
employed.
Now, you can do following exercises in same manner.
E7) Write one difference between two-sample t-test and Mann-Whitney U
test.
E8) Write one difference between the Wilcoxon matched-pair signed-rank test
and the Mann-Whitney U test.
E9) The senior class in a particular high school had 25 boys. Twelve boys
lived in villages and the other thirteen lived in a town. A test was
conducted to see whether village boys in general were more physically fit
than the town boys. Each boy in the class was given a physical fitness
test in which a low score indicates poor physical condition. The scores
of the village boys (V) and the town boys (T) are as follows:
Test whether the village boys are more fit than the town boys at 5% level
of significance.
E10) The following data represent lifetime (in hours) of batteries for two
different brands A and B:
Brand A 40 30 55 40 40 35 30 40 50 45 40 35
Brand B 45 60 50 60 35 50 55 60 50 50 40 55
14.5 KOLMOGOROV-SMIRNOV TWO-SAMPLE TEST
In the Mann-Whitney U test, we tested the hypothesis that the two independent
samples come from populations that are identical with respect to location,
whereas the Kolmogorov-Smirnov two-sample test is sensitive to differences of
all types that may exist between two distributions, that is, location,
dispersion, skewness, etc. Therefore, it is referred to as a general test.
This test was developed by Smirnov. It also carries the name of Kolmogorov
because of its similarity to the one-sample test developed by Kolmogorov. In
the Kolmogorov-Smirnov one-sample test, the observed (sample or empirical)
cumulative distribution function is compared with the hypothesized cumulative
distribution function, whereas in the two-sample case the comparison is made
between the empirical cumulative distribution functions of the two samples.
Assumptions
The assumptions necessary for this test are:
(i) The two samples are randomly and independently drawn from their
respective populations.
(ii) The variable under study is continuous.
(iii) The measurement scale is at least ordinal.
Let us discuss the general procedure of this test:
Let us suppose that we have two independent random samples
$X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ drawn from the first
and second populations with distribution functions F1(x) and F2(x)
respectively. Also let S1(x) and S2(x) be the sample or empirical cumulative
distribution functions of the samples drawn from the first and second
populations respectively.
Generally, we want to test whether independent random samples come from
populations having the same distribution functions in all respects or whether
the distribution functions of the two populations differ with respect to
location, dispersion, skewness, etc. So we can take the null and alternative
hypotheses as
$H_0: F_1(x) = F_2(x)$ for all values of x
$H_1: F_1(x) \neq F_2(x)$ for at least one value of x
After setting the null and alternative hypotheses, the procedure of the
Kolmogorov-Smirnov two-sample test is summarised in the following steps:
Step 1: This test is based on the comparison of the empirical (sample)
cumulative distribution functions. Therefore, first of all, we compute
the sample cumulative distribution functions S1(x) and S2(x) from the
sample data as the proportion of the number of sample observations
less than or equal to some number x to the total number of
observations, that is,
$$S_1(x) = \frac{\text{number of observations in the first sample less than or equal to } x}{n_1}$$
and
$$S_2(x) = \frac{\text{number of observations in the second sample less than or equal to } x}{n_2}$$
Step 2: After finding the empirical cumulative distribution functions S1(x)
and S2(x) for all possible values of x, we find the deviation between
the empirical cumulative distribution functions for all x, that is,
$$|S_1(x) - S_2(x)| \text{ for all } x$$
Step 3: If the two samples have been drawn from identical populations, then
S1(x) and S2(x) should be fairly close for all values of x. Therefore,
we find the point at which the two functions show the maximum
deviation. So we take as the test statistic the greatest vertical
deviation between S1(x) and S2(x), that is,
$$D = \sup_x |S_1(x) - S_2(x)|$$
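Steps 1 to 3 can be sketched directly from raw samples. The code below is a minimal illustration with made-up function names (`ecdf`, `ks_two_sample`); it uses the battery lifetimes of brands A and B listed earlier in this unit. Because both empirical distribution functions are step functions that change only at observed values, it is enough to evaluate the deviation at the distinct data points.

```python
def ecdf(sample, x):
    """Empirical distribution function: proportion of observations <= x."""
    return sum(1 for value in sample if value <= x) / len(sample)


def ks_two_sample(sample1, sample2):
    """D = sup over x of |S1(x) - S2(x)|, evaluated at every distinct observed value."""
    points = sorted(set(sample1) | set(sample2))
    return max(abs(ecdf(sample1, x) - ecdf(sample2, x)) for x in points)


# Battery lifetimes (in hours) for brands A and B, as listed in this unit
brand_a = [40, 30, 55, 40, 40, 35, 30, 40, 50, 45, 40, 35]
brand_b = [45, 60, 50, 60, 35, 50, 55, 60, 50, 50, 40, 55]

d = ks_two_sample(brand_a, brand_b)
print(round(d, 2))  # 0.58
```

The maximum is reached at x = 40 (and again at x = 45), where S1(x) - S2(x) = 9/12 - 2/12 = 7/12.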
Solution: Here, we want to test that the average life of the batteries of the
two brands is the same. If F1(x) and F2(x) are the cumulative distribution
functions of the life of batteries of brand A and brand B respectively, then
our claim is F1(x) = F2(x) and its complement is F1(x) ≠ F2(x). Since the claim
contains the equality sign, we can take the claim as the null hypothesis and
the complement as the alternative hypothesis. Thus,
$H_0: F_1(x) = F_2(x)$ for all values of x
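Life (x)  Frequency  C.F.       S1(x)   Frequency  C.F.       S2(x)   |S1(x) - S2(x)|
          (Brand A)  (Brand A)          (Brand B)  (Brand B)
30        2          2          2/12    0          0          0       2/12
35        2          4          4/12    1          1          1/12    3/12
40        5          9          9/12    1          2          2/12    7/12
45        1          10         10/12   1          3          3/12    7/12
50        1          11         11/12   4          7          7/12    4/12
55        1          12         12/12   2          9          9/12    3/12
60        0          12         12/12   3          12         12/12   0
Total     n1 = 12                       n2 = 12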
From the above calculation, we have
$$D = \sup_x |S_1(x) - S_2(x)| = \frac{7}{12} = 0.58$$
The critical value of the test statistic for equal sample sizes n1 = n2 = 12 at
5% level of significance is 6/12 = 0.5.
Since the calculated value of the test statistic (= 0.58) is greater than the
critical value (= 0.5), we reject the null hypothesis, i.e. we reject the claim
at 5% level of significance.
Thus, we conclude that the samples provide sufficient evidence against the
claim, so the average life of the batteries of the two brands is different.
Example 8: The following are the marks in Statistics of B.Sc. students taken
randomly from two colleges A and B:
Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100
College (A) 2 2 4 6 3 3 4 8 7 5
College (B) 1 1 2 5 7 3 3 2 6 6
our claim is F1(x) = F2(x) and its complement is F1(x) ≠ F2(x). Since the claim
contains the equality sign, we can take the claim as the null hypothesis and
the complement as the alternative hypothesis. Thus,
$H_0: F_1(x) = F_2(x)$ for all values of x
where S1(x) and S2(x) are the empirical distribution functions of the samples
drawn from college A and college B respectively, which are defined as:
$$S_1(x) = \frac{\text{number of students of college A whose marks} \leq x}{n_1}$$
$$S_2(x) = \frac{\text{number of students of college B whose marks} \leq x}{n_2}$$
Calculation for $|S_1(x) - S_2(x)|$:
Marks  Frequency    C.F.         SA(x)  Frequency    C.F.         SB(x)  |SA(x) - SB(x)|
       (College A)  (College A)         (College B)  (College B)
$D = 0.0732$
Since n1 = 44 > 16, we can calculate the critical value of the test statistic
for unequal sample sizes n1 = 44 and n2 = 36 at 1% level of significance by the
formula
$$D_{n,\alpha} = 1.63\sqrt{\frac{n_1 + n_2}{n_1 n_2}} = 1.63\sqrt{\frac{44 + 36}{44 \times 36}} = 0.366$$
Since the calculated value of the test statistic D (= 0.0732) is less than the
critical value (= 0.366), we do not reject the null hypothesis, i.e. we support
the claim at 1% level of significance.
Thus, we conclude that the samples fail to provide sufficient evidence against
the claim, so we may assume that the distributions of marks in Statistics of
B.Sc. students in college A and college B are the same.
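The figures of Example 8 can be checked from the grouped frequencies. The sketch below is illustrative only; the function names are made up, and the coefficient 1.63 is the one quoted in the formula above for the 1% level. The empirical distribution functions are evaluated at the upper limit of each class interval.

```python
import math
from itertools import accumulate

# Class frequencies for marks 0-10, 10-20, ..., 90-100 (Example 8)
freq_a = [2, 2, 4, 6, 3, 3, 4, 8, 7, 5]   # College A
freq_b = [1, 1, 2, 5, 7, 3, 3, 2, 6, 6]   # College B


def ks_from_frequencies(f1, f2):
    """Return (D, n1, n2) using cumulative relative frequencies at the class upper limits."""
    n1, n2 = sum(f1), sum(f2)
    s1 = [c / n1 for c in accumulate(f1)]   # S_A(x): cumulative frequency / n1
    s2 = [c / n2 for c in accumulate(f2)]   # S_B(x): cumulative frequency / n2
    d = max(abs(a - b) for a, b in zip(s1, s2))
    return d, n1, n2


def large_sample_critical_value(n1, n2, coeff=1.63):
    """Critical value coeff * sqrt((n1 + n2) / (n1 * n2)); coeff = 1.63 as quoted for the 1% level."""
    return coeff * math.sqrt((n1 + n2) / (n1 * n2))


d, n1, n2 = ks_from_frequencies(freq_a, freq_b)
critical = large_sample_critical_value(n1, n2)
print(n1, n2)               # 44 36
print(round(d, 4))          # 0.0732
print(round(critical, 3))   # 0.366
print(d >= critical)        # False: do not reject H0
```

The largest deviation occurs at the class 50-60, where S_A(x) = 20/44 and S_B(x) = 19/36.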
Now, you can try the following exercises.