Kappa Test

The kappa test is used to measure agreement between two or more observers for categorical items, corrected for the agreement expected by chance. A kappa of 1 indicates perfect agreement, a kappa of 0 indicates agreement no better than chance, and a negative kappa indicates agreement worse than chance. A kappa between 0.6 and 0.8 is interpreted as substantial agreement and a kappa higher than 0.8 as almost perfect agreement [1].

The xray.rda dataset is used to show how to perform a kappa test in JGR / R. The data frame is called xray and contains two variables, Observer1 and Observer2, each recording one observer's interpretation of the same 20 radiographs as OA or RA.

To show the data frame, type its name (xray) at the console:

   Observer1 Observer2
1         RA        RA
2         OA        OA
3         OA        OA
4         RA        OA
5         OA        OA
6         RA        RA
7         OA        OA
8         OA        OA
9         OA        OA
10        OA        RA
11        OA        OA
12        OA        OA
13        OA        OA
14        OA        OA
15        OA        OA
16        OA        OA
17        RA        RA
18        RA        RA
19        OA        OA
20        OA        OA
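
The listing is easier to read as a cross-tabulation. The following is a minimal sketch that rebuilds the ratings by hand from the listing above, rather than loading xray.rda, so that it runs on its own; the names obs1, obs2, tab and observed_agreement are illustrative and not part of the original dataset.

```r
# Rebuild the 20 ratings from the listing above:
# every radiograph is OA unless its row shows RA.
obs1 <- rep("OA", 20); obs1[c(1, 4, 6, 17, 18)] <- "RA"
obs2 <- rep("OA", 20); obs2[c(1, 6, 10, 17, 18)] <- "RA"
xray <- data.frame(Observer1 = obs1, Observer2 = obs2)

# Cross-tabulate the two observers; the diagonal cells are agreements.
tab <- table(Observer1 = xray$Observer1, Observer2 = xray$Observer2)
print(tab)

# Proportion of radiographs on which the two observers agree.
observed_agreement <- sum(diag(tab)) / sum(tab)
print(observed_agreement)
```

The off-diagonal cells hold the two disagreements (rows 4 and 10 of the listing).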

It can be seen that the observers agree with each other most of the time (on 18 of the 20 radiographs). Performing a kappa test is straightforward:

 Cohen’s Kappa for 2 Raters (Weights: unweighted)

 Subjects = 20
   Raters = 2
    Kappa = 0.733

        z = 3.28
  p-value = 0.00104
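
The command that produced this output is not shown. The header and layout match what kappa2() from the irr package prints, so a minimal sketch, assuming that package is installed and was the one used, might look like this. The data frame is rebuilt by hand from the listing so the example runs without xray.rda; res is an illustrative name.

```r
library(irr)  # assumption: the irr package is installed

# Rebuild the 20 ratings from the listing:
# every radiograph is OA unless its row shows RA.
obs1 <- rep("OA", 20); obs1[c(1, 4, 6, 17, 18)] <- "RA"
obs2 <- rep("OA", 20); obs2[c(1, 6, 10, 17, 18)] <- "RA"
xray <- data.frame(Observer1 = obs1, Observer2 = obs2)

# Unweighted Cohen's kappa for two raters (one column per rater).
res <- kappa2(xray)
print(res)
```

The returned object also stores the individual figures, for example res$value for the kappa, res$statistic for z and res$p.value for the p-value.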

The kappa is 0.733, confirming substantial agreement between the two observers. The p-value tests the null hypothesis that agreement is no better than chance (kappa = 0) and is highly significant (p = 0.001).
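
To see where these figures come from, kappa can be worked out by hand in base R from the 2 x 2 table of counts read off the listing. This is a sketch only. The standard error used for z is a commonly used large-sample form under the null hypothesis; that the original software uses exactly this form is an assumption, although it reproduces the reported values to rounding. All variable names are illustrative.

```r
# Counts read off the listing (rows: Observer1, columns: Observer2).
tab <- matrix(c(14, 1,
                 1, 4),
              nrow = 2, byrow = TRUE,
              dimnames = list(Observer1 = c("OA", "RA"),
                              Observer2 = c("OA", "RA")))

n  <- sum(tab)
p1 <- rowSums(tab) / n     # marginal proportions for Observer1
p2 <- colSums(tab) / n     # marginal proportions for Observer2

po <- sum(diag(tab)) / n   # observed agreement
pe <- sum(p1 * p2)         # agreement expected by chance
kappa_hat <- (po - pe) / (1 - pe)

# Assumed large-sample standard error of kappa under the null (kappa = 0).
se0 <- sqrt((pe + pe^2 - sum(p1 * p2 * (p1 + p2))) / (n * (1 - pe)^2))
z   <- kappa_hat / se0
p   <- 2 * pnorm(-abs(z))  # two-sided p-value

print(round(c(kappa = kappa_hat, z = z, p = p), 5))
```

Here the observed agreement is 0.9 and the agreement expected by chance is 0.625, so kappa = (0.9 - 0.625) / (1 - 0.625) = 0.733.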

1. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005 May;37(5):360–3.