NOMINALLY SCALED DATA and KAPPA STATISTIC K
GANDAMRA
SE 413-Advanced Educational Statistics
CHAPTER 9: MEASURES OF ASSOCIATION AND THEIR TESTS OF SIGNIFICANCE
                          Category
Object    1       2       …       j       …       m
1         n_11    n_12    …       n_1j    …       n_1m    S_1
2         n_21    n_22    …       n_2j    …       n_2m    S_2
:         :       :               :               :       :
i         n_i1    n_i2    …       n_ij    …       n_im    S_i
:         :       :               :               :       :
N         n_N1    n_N2    …       n_Nj    …       n_Nm    S_N
          C_1     C_2     …       C_j     …       C_m
The kappa coefficient of agreement is the ratio of the proportion of times that the raters
agree (corrected for chance agreement) to the maximum proportion of times that the raters
could agree (corrected for chance agreement):
\[ K = \frac{P(A) - P(E)}{1 - P(E)} \]   Equation 9.27 (p. 285)
P(A) – proportion of times that the k raters agree
P(E) – proportion of times that we would expect the k raters to agree by chance
K = 1 (complete agreement among the raters)
K = 0 (no agreement among the raters beyond that expected by chance)
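As a minimal sketch of Equation 9.27, the following Python function computes K from P(A) and P(E) and checks the two reference points. The function name `kappa` and the value P(E) = .25 are illustrative assumptions, not taken from the text.

```python
def kappa(p_agree: float, p_chance: float) -> float:
    """Kappa per Equation 9.27: chance-corrected agreement over its maximum."""
    if p_chance >= 1.0:
        raise ValueError("P(E) must be less than 1 for K to be defined")
    return (p_agree - p_chance) / (1.0 - p_chance)


# Complete agreement: P(A) = 1 gives K = 1 for any P(E) < 1.
k_complete = kappa(1.0, 0.25)

# Agreement no better than chance: P(A) = P(E) gives K = 0.
k_chance = kappa(0.25, 0.25)

print(k_complete, k_chance)
```

Note that K becomes negative when P(A) falls below P(E), that is, when the raters agree less often than chance alone would predict.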
To find P(E), we note that the proportion of objects assigned to the j-th category is
\[ p_j = \frac{C_j}{Nk}. \]
Total expected agreement across all categories:
\[ P(E) = \sum_{j=1}^{m} p_j^2 \]   Equation 9.28 (p. 286)
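The two steps above (column proportions, then the sum of their squares) can be sketched in Python as follows. The function name `chance_agreement` and the small count table are hypothetical and serve only to illustrate the computation.

```python
from typing import List


def chance_agreement(counts: List[List[int]], k: int) -> float:
    """P(E) per Equation 9.28 from an N x m table of counts n_ij."""
    n_objects = len(counts)
    n_categories = len(counts[0])
    # C_j: number of ratings that fell in category j, summed over all objects.
    col_totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    # p_j = C_j / (N k): share of all N*k ratings that went to category j.
    p = [c / (n_objects * k) for c in col_totals]
    return sum(pj ** 2 for pj in p)


# Hypothetical data: N = 3 objects, m = 2 categories, k = 4 raters.
toy_counts = [[4, 0], [2, 2], [0, 4]]
p_e_toy = chance_agreement(toy_counts, k=4)
print(p_e_toy)
```

In this toy table each category receives half of the 12 ratings, so p_1 = p_2 = .5 and P(E) = .5.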
Example 9.8a
It has been observed by researchers of animal behavior that the male stickleback fish changes
color during the nesting and courtship cycle. When placed in a suitable environment, the male
sticklebacks establish territories, build nests, and engage in courtship and aggression when
stimulus fish are introduced into the environment. To analyze the relation between color and
other behaviors during experimental study, it was necessary to code the fish in terms of their
coloration. Since the fish must be observed from outside their environment, and because of
variation in observational conditions, k = 4 trained raters evaluated the coloration of each fish.
The colorations were divided into m = 5 categories. The first category was for those fish with
minimal color development and the last category represented maximal color development and
coloration; the other three categories involved varying degrees of coloration. In this study, a
group of N = 29 fish was observed. The data are summarized in Table 9.15. Note that the raters
were in complete agreement about the coloration of fish 1 and that they were divided in their
ratings of fish 2. Examination of the rows of the table shows that there was complete
agreement for some fish but low agreement about others.
Table 9.15
Estimates of Nuptial Coloration of Male Sticklebacks
(Rows: Fish. Columns: Coloration Category. The table body is not reproduced here.)
Compute the value of P(E), the proportion of agreement which we could expect by chance, using
equation 9.28:
\[ P(E) = .362^2 + .026^2 + .319^2 + .069^2 + .224^2 = .2884 \]
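The arithmetic for P(E) can be checked with a few lines of Python, using only the five category proportions quoted in the computation above. The variable names are illustrative.

```python
# Category proportions p_j as they appear in the P(E) computation.
p = [0.362, 0.026, 0.319, 0.069, 0.224]

# The proportions should account for all ratings (up to rounding).
total = sum(p)

# Equation 9.28: P(E) is the sum of the squared proportions.
p_e = sum(pj ** 2 for pj in p)

print(f"sum of p_j = {total:.3f}, P(E) = {p_e:.4f}")
```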
Next we must find P( A), the proportion of times that the raters agreed.
\[ P(A) = \frac{1}{N} \sum_{i=1}^{N} S_i = \frac{1 + .333 + 1 + .333 + .50 + \dots + .333 + .167}{29} = .5804 \]
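The handout does not show how each S_i is obtained. A common definition, assumed here rather than quoted from the source, is the proportion of agreeing rater pairs for object i: S_i = Σ_j n_ij (n_ij − 1) / (k (k − 1)). With k = 4 this definition reproduces the values that appear in the sum for P(A): 1, .333, .50, and .167. The rows below are hypothetical rating patterns, not rows of Table 9.15, and the function names are illustrative.

```python
from typing import List, Sequence


def object_agreement(row: Sequence[int], k: int) -> float:
    """S_i (assumed definition): share of the k(k-1) ordered rater pairs that agree."""
    return sum(n * (n - 1) for n in row) / (k * (k - 1))


def mean_agreement(counts: List[Sequence[int]], k: int) -> float:
    """P(A): the average of S_i over all N objects."""
    return sum(object_agreement(row, k) for row in counts) / len(counts)


# Hypothetical rating patterns for k = 4 raters and m = 5 categories.
patterns = {
    "all four agree": [4, 0, 0, 0, 0],
    "three vs. one": [3, 1, 0, 0, 0],
    "two vs. two": [2, 2, 0, 0, 0],
    "two, one, one": [2, 1, 1, 0, 0],
}
for label, row in patterns.items():
    print(f"{label}: S_i = {object_agreement(row, k=4):.3f}")
```

Under this assumption, a fish on which the raters split two against two contributes S_i = .333, which matches the description of fish 2 as one on which the raters were divided.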
\[ K = \frac{P(A) - P(E)}{1 - P(E)} = \frac{.580 - .288}{1 - .288} = .41 \]
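A quick check of this arithmetic in Python, using the four-decimal values of P(A) and P(E) reported earlier in the example, suggests that the result does not depend on the rounding to three decimals. The variable names are illustrative.

```python
p_a = 0.5804  # P(A) as reported in the example
p_e = 0.2884  # P(E) as reported in the example

# Equation 9.27 applied to the example values.
k_stat = (p_a - p_e) / (1 - p_e)

# Same computation with the three-decimal values used in the text.
k_stat_rounded_inputs = (0.580 - 0.288) / (1 - 0.288)

print(f"K = {k_stat:.2f} (four-decimal inputs), {k_stat_rounded_inputs:.2f} (three-decimal inputs)")
```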
Thus, we conclude that there is moderate agreement among the raters.
Example 9.8b
H0: K = 0. The raters exhibit no significant agreement in their ratings.
H1: K > 0. The raters exhibit significant agreement in their ratings.
Statistical Test
Recall that N = 29 (objects rated), m = 5 (rating categories), k = 4 (raters), and P(E) = .288. Since
K = .41 and the ratings are categories (nominal), we next find the variance of K.
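The handout does not reproduce the variance formula. One commonly used large-sample approximation under H0, assumed here and not quoted from the source, is var(K) ≈ [2 / (N k (k − 1))] · [P(E) − (2k − 3) P(E)² + 2(k − 2) Σ p_j³] / [1 − P(E)]², with the test statistic z = K / sqrt(var(K)). A sketch of that computation with the example's values, using illustrative variable names:

```python
import math

N, k = 29, 4                              # objects and raters, from the example
p = [0.362, 0.026, 0.319, 0.069, 0.224]   # category proportions p_j
p_a = 0.5804                              # P(A) reported in Example 9.8a

p_e = sum(pj ** 2 for pj in p)            # Equation 9.28
kappa_hat = (p_a - p_e) / (1 - p_e)       # Equation 9.27

# ASSUMPTION: large-sample variance of K under H0; this formula is not shown
# in the source and is supplied here only for illustration.
sum_p_cubed = sum(pj ** 3 for pj in p)
var_k = (2 / (N * k * (k - 1))) * (
    (p_e - (2 * k - 3) * p_e ** 2 + 2 * (k - 2) * sum_p_cubed) / (1 - p_e) ** 2
)

# Test statistic: K divided by its estimated standard error.
z = kappa_hat / math.sqrt(var_k)
print(f"var(K) = {var_k:.6f}, z = {z:.2f}")
```

If the assumed formula applies, z can be referred to the standard normal distribution for a one-tailed test of H0.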
Significance Level
Set α = .01, with N = 29.
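Assuming that, for large N, the statistic z = K / sqrt(var(K)) is approximately standard normal under H0 (an assumption not stated in the handout), the one-tailed rejection region for α = .01 can be sketched with Python's standard library. The helper name `reject_h0` is hypothetical.

```python
from statistics import NormalDist

alpha = 0.01

# H1 is K > 0, so the test is one-tailed: reject H0 when z exceeds the
# upper-alpha quantile of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha)


def reject_h0(z: float, alpha: float = 0.01) -> bool:
    """True when z falls in the upper-tail rejection region."""
    return z > NormalDist().inv_cdf(1 - alpha)


print(f"critical z for alpha = {alpha}: {z_crit:.3f}")
```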
Therefore, the researcher concludes that the raters exhibit significant agreement in their ratings.