Non Parametric Tests

If data are not Normally distributed, a non parametric test should be used.

Many non parametric tests have been described, and it is impossible to list them all. The following tests are discussed using the example data in skinfold.rda:

Sign test
Chi square test
Fisher Exact test
Mann-Whitney U test / Wilcoxon test

It has been shown that the data do not conform to a Normal distribution. Therefore, parametric statistics can’t be used and a non parametric test should be used instead. The skinfold data will be analysed with each of the four tests mentioned above.

Sign Test

This is the most basic type of non parametric test. As the name implies, it uses a plus or minus sign to differentiate between observations. The data are transformed: if the skin fold is smaller than or equal to 4 mm, we indicate this by a minus sign, and if the skin fold is greater than 4 mm, by a plus sign. The data can then be summarised in a contingency table (rows: skin thickness, columns: cancer or not):

	No Cancer	Cancer	Total
<= 4 mm	10	8	18
> 4 mm	10	1	11
Total	20	9	29

Therefore, only 1 out of 9 patients with cancer has a skin fold thickness of more than 4 mm, whilst 50% (10 out of 20) of the patients without cancer have a skin fold thickness of more than 4 mm. Is this due to chance / random variation or is there a statistically significant difference between the two groups?

If the probabilities were equal, the probability of plus and minus would be 50% (1/2).

If we test two sided, we need to calculate the probability of a result at least as extreme as the one observed, in either direction:

P = P(0 out of 9 patients > 4 mm) + P(1 out of 9 patients > 4 mm) + P(8 out of 9 patients > 4 mm) + P(9 out of 9 patients > 4 mm)

So:

\(P = {9 \choose 0} \cdot (0.5)^0 \cdot (0.5)^9 + {9 \choose 1} \cdot (0.5)^1 \cdot (0.5)^8 + \) \({9 \choose 8} \cdot (0.5)^8 \cdot (0.5)^1 + {9 \choose 9} \cdot (0.5)^9 \cdot (0.5)^0 \)

\(P = \frac{9!}{9! \cdot 0!} \cdot (0.5)^0 \cdot (0.5)^9 + \frac{9!}{8! \cdot 1!} \cdot (0.5)^1 \cdot (0.5)^8 + \) \(\frac{9!}{1! \cdot 8!} \cdot (0.5)^8 \cdot (0.5)^1 + \frac{9!}{0! \cdot 9!} \cdot (0.5)^9 \cdot (0.5)^0 \)

\(P = 1 \cdot 0.001953125 + 9 \cdot 0.001953125 + \) \(9 \cdot 0.001953125 + 1 \cdot 0.001953125 \)

\(P = 0.0390625 \)

P = 0.039 < 5%, therefore the result is statistically significant.
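The hand calculation can be checked in R. This is a minimal sketch, assuming a null probability of 0.5 for a plus sign; the variable names are illustrative only.

```r
# Two-sided sign test: 1 plus sign out of 9, null probability 0.5.
n <- 9

# Outcomes at least as extreme as the observed one, in both tails.
tail_counts <- c(0, 1, 8, 9)

# Sum the binomial probabilities of those outcomes.
p_two_sided <- sum(dbinom(tail_counts, size = n, prob = 0.5))
print(p_two_sided)  # 0.0390625
```

Because the null probability is 0.5, the binomial distribution is symmetric and the two tails contribute equally.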

It is straightforward to perform a binomial test in R:

binom.test(1,9)
    Exact binomial test
data:  1 and 9
number of successes = 1, number of trials = 9, p-value = 0.03906
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.002809137 0.482496515
sample estimates:
probability of success 
             0.1111111

Therefore, the null hypothesis is rejected in favour of the alternative hypothesis and it is concluded that there is a difference in skin thickness between cancer patients and other patients.

Chi Square Test (χ² test)

Instead of the sign test, the Chi Square test can be used. There are prerequisites for the Chi Square test. These are:

Random sample
Sufficient sample size
Independence of the observations
Expected cell count at least 5 in 2 by 2 tables or at least 5 in 80% of larger tables, but no cells with an expected count of 0.

For now, let's assume the prerequisites have been met. Again, we look at the table (these are the observed frequencies):

	No Cancer	Cancer	Total
<= 4 mm	10	8	18
> 4 mm	10	1	11
Total	20	9	29

In total, there are 18 out of 29 patients with a skin fold thickness of <= 4 mm. If there were no difference between the two groups, we would expect 20 × 18 / 29 patients in the ‘No Cancer’ group and 9 × 18 / 29 patients in the ‘Cancer’ group with a skin thickness <= 4 mm. These are called the expected frequencies. Similarly, we would expect 20 × 11 / 29 patients in the ‘No Cancer’ group and 9 × 11 / 29 patients in the ‘Cancer’ group with a skin thickness > 4 mm. Or summarised in a table:

	No Cancer Observed	No Cancer Expected	Cancer Observed	Cancer Expected	Total
<= 4 mm	10	12.4	8	5.6	18
> 4 mm	10	7.6	1	3.4	11
Total	20	20	9	9	29

Please note one of the expected frequencies is below 5!
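The expected frequencies can be derived from the row and column totals in R. The following is a sketch only; the object names (observed, expected) and the dimension labels are chosen here for illustration.

```r
# Observed frequencies: rows are skin fold categories, columns are groups.
observed <- matrix(c(10, 10, 8, 1), nrow = 2,
                   dimnames = list(Skinfold = c("<= 4 mm", "> 4 mm"),
                                   Group = c("No Cancer", "Cancer")))

# Expected frequency of each cell = row total x column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
print(round(expected, 1))

# Count the cells with an expected frequency below 5.
cells_below_5 <- sum(expected < 5)
print(cells_below_5)
```

The outer() call multiplies every row total by every column total, which is the same arithmetic as the 20 × 18 / 29 style calculations above.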

Next, the Chi Square test statistic is calculated:

\(\chi^2 = \sum \left( \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} \right) \)

So:

\(\chi^2 = \sum \left( \frac{(O - E)^2}{E} \right) = \frac{(10 - 12.4)^2}{12.4} + \frac{(8 - 5.6)^2}{5.6} + \frac{(10 - 7.6)^2}{7.6} + \frac{(1 - 3.4)^2}{3.4} \)

\( \chi^2 \approx 3.987093154 \)
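Note that the value 3.987 is obtained with the unrounded expected frequencies; plugging in the one-decimal values shown in the formula gives a slightly smaller number. A short sketch (object names are illustrative) makes the difference visible:

```r
# Observed frequencies and unrounded expected frequencies.
observed <- matrix(c(10, 10, 8, 1), nrow = 2)
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi Square statistic with unrounded expected frequencies (about 3.987).
chi_sq <- sum((observed - expected)^2 / expected)

# The same sum with expected frequencies rounded to one decimal (about 3.945).
expected_rounded <- round(expected, 1)
chi_sq_rounded <- sum((observed - expected_rounded)^2 / expected_rounded)

print(c(unrounded = chi_sq, rounded = chi_sq_rounded))
```

With a test statistic this close to the critical value, it is safer to carry the unrounded expected frequencies through the calculation.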

Next we need to determine the degrees of freedom: In the observed frequencies table, there are two columns (c = 2) and two rows (r = 2). So there are (r – 1)×(c – 1) = 1 degree of freedom.

We can now look in a Chi Square distribution table (statistical table):

Degrees of Freedom	10%	5%	1%
1	2.71	3.84	6.63
2	4.61	5.99	9.21
3	6.25	7.81	11.34
4	7.78	9.49	13.28
5	9.24	11.07	15.09
10	15.99	18.31	23.21

Looking in the first row under one degree of freedom:

The critical value is 3.84 at 5% and 6.63 at 1%.

3.987093154 > 3.84 and therefore the null hypothesis can be rejected in favour of the alternative hypothesis (p < 5%).

Please note that statistical significance is demonstrated at the 5% level, but not at the 1% level (3.987093154 < 6.63).
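Instead of a printed table, the critical values and the p-value can be obtained from the Chi Square distribution functions in R. A minimal sketch, using the test statistic calculated above:

```r
# Test statistic from the manual calculation, with 1 degree of freedom.
chi_sq <- 3.987093154
df <- 1

# Critical values: the 95th and 99th percentiles of the distribution.
critical_5 <- qchisq(0.95, df = df)  # about 3.84
critical_1 <- qchisq(0.99, df = df)  # about 6.63

# p-value: the upper tail area beyond the test statistic.
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

print(c(critical_5 = critical_5, critical_1 = critical_1, p_value = p_value))
```

The p-value lies between 1% and 5%, which agrees with the comparison against the two critical values.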

In the R console:

If the 2 by 2 table has already been constructed, it is easy to perform a Chi Square test. First create a matrix that contains the data in 2 rows:

skindata <- matrix(c(10,10,8,1),nrow=2)
skindata
     [,1] [,2]
[1,]   10    8
[2,]   10    1

Next, do a Chi Square test:

chisq.test(skindata,correct=FALSE)
    Pearson's Chi-squared test
data:  skindata
X-squared = 3.9871, df = 1, p-value = 0.04585
Warning message:
In chisq.test(skindata, correct = FALSE) :
  Chi-squared approximation may be incorrect

The ‘correct=FALSE’ option switches OFF the Yates continuity correction when computing the test statistic. The correction subtracts 0.5 from the absolute difference between each observed and expected frequency before squaring. It is recommended when the numbers in a two by two table are small, so in this case it would be better to use the continuity correction.

Without the continuity correction, the result is the same as with the manual method. Again, we reject the null hypothesis in favour of the alternative hypothesis: the skin thickness is less in patients with cancer than in other patients. However, the numbers are small and it would be better to use the Yates continuity correction:

chisq.test(skindata,correct=TRUE)

	Pearson's Chi-squared test with Yates' continuity correction

data:  skindata
X-squared = 2.5064, df = 1, p-value = 0.1134

Warning message:
In chisq.test(skindata, correct = TRUE) :
  Chi-squared approximation may be incorrect

This shows a non significant result.
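The continuity-corrected statistic can also be reproduced by hand. This is a sketch of the textbook form of the Yates correction, which subtracts 0.5 from each absolute difference; the object names are illustrative.

```r
# Observed and expected frequencies for the two by two table.
observed <- matrix(c(10, 10, 8, 1), nrow = 2)
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Yates correction: reduce each absolute difference by 0.5 before squaring.
chi_sq_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# p-value from the Chi Square distribution with 1 degree of freedom.
p_yates <- pchisq(chi_sq_yates, df = 1, lower.tail = FALSE)

print(c(chi_sq_yates = chi_sq_yates, p_yates = p_yates))
```

The corrected statistic is always smaller than the uncorrected one, so the correction makes the test more conservative.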

To show the expected frequencies:

chisq.test(skindata,correct=FALSE)$expected
          [,1]     [,2]
[1,] 12.413793 5.586207
[2,]  7.586207 3.413793
Warning message:
In chisq.test(skindata, correct = FALSE) :
  Chi-squared approximation may be incorrect

Please note that the prerequisites of the Chi Square test have not been met! One of the expected frequencies is less than 5, hence the warning.

Create matrix:

Above, the matrix was created ‘manually’ by adding up the categories. However, this would be more difficult with large data sets. To recode the Skinfold variable into a new binary variable (Skinfold_cat) and create a two by two table in R:

skinfold$Skinfold_cat <- ifelse(skinfold$Skinfold <= 4, 0, 1)
skinfold
   Patient     Group Skinfold Skinfold_cat
1        1 No Cancer      1.9            0
2        2 No Cancer      2.2            0
3        3 No Cancer      2.3            0
4        4 No Cancer      2.6            0
5        5 No Cancer      2.8            0
6        6 No Cancer      2.9            0
7        7 No Cancer      3.0            0
8        8 No Cancer      3.7            0
9        9 No Cancer      3.8            0
10      10 No Cancer      4.0            0
11      11 No Cancer      4.3            1
12      12 No Cancer      4.4            1
13      13 No Cancer      4.8            1
14      14 No Cancer      5.6            1
15      15 No Cancer      6.0            1
16      16 No Cancer      6.2            1
17      17 No Cancer      6.2            1
18      18 No Cancer      7.0            1
19      19 No Cancer     10.0            1
20      20 No Cancer     10.4            1
21      21    Cancer      1.8            0
22      22    Cancer      2.0            0
23      23    Cancer      2.0            0
24      24    Cancer      2.0            0
25      25    Cancer      3.0            0
26      26    Cancer      3.8            0
27      27    Cancer      3.9            0
28      28    Cancer      4.0            0
29      29    Cancer      4.1            1
table(skinfold$Skinfold_cat, skinfold$Group)
   
    Cancer No Cancer
  0      8        10
  1      1        10
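The 0 / 1 coding works, but labelled categories make the table easier to read. The sketch below is one possible alternative using cut(); it rebuilds the data frame from the listing above so that it runs on its own (with skinfold.rda loaded, the first statement is not needed). The labels and the name skin_table are chosen here for illustration.

```r
# Rebuild the data from the listing above (not needed if skinfold.rda is loaded).
skinfold <- data.frame(
  Group = rep(c("No Cancer", "Cancer"), times = c(20, 9)),
  Skinfold = c(1.9, 2.2, 2.3, 2.6, 2.8, 2.9, 3.0, 3.7, 3.8, 4.0,
               4.3, 4.4, 4.8, 5.6, 6.0, 6.2, 6.2, 7.0, 10.0, 10.4,
               1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 3.9, 4.0, 4.1)
)

# Recode into labelled categories; intervals are closed on the right,
# so a skin fold of exactly 4 mm falls in the "<= 4 mm" category.
skinfold$Skinfold_cat <- cut(skinfold$Skinfold, breaks = c(-Inf, 4, Inf),
                             labels = c("<= 4 mm", "> 4 mm"))

# Fix the column order so the table matches the hand-built matrix.
skinfold$Group <- factor(skinfold$Group, levels = c("No Cancer", "Cancer"))

skin_table <- table(skinfold$Skinfold_cat, skinfold$Group)
print(skin_table)
```

A table built this way has the same cell layout as the skindata matrix and can be passed to chisq.test() or fisher.test() directly.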

Fisher Exact test

The output in the Chi Square test gives a warning that the approximation may be incorrect. This is because not all prerequisites for the Chi Square test have been met. One of the cells has an expected frequency of less than 5 (3.413); therefore the Chi Square test may be incorrect and a Fisher Exact test is more appropriate. To perform a Fisher Exact test:

fisher.test(skindata)
    Fisher's Exact Test for Count Data
data:  skindata
p-value = 0.0959
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.002571797 1.319280078
sample estimates:
odds ratio 
 0.1333924

The Fisher Exact test does not reject the null hypothesis (p-value = 0.0959).
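To see where this p-value comes from, the two-sided Fisher Exact test can be sketched with the hypergeometric distribution: with the row and column totals fixed, the p-value is the summed probability of all tables that are no more likely than the observed one. The names below are illustrative, and the small tolerance factor is an assumption added to guard against floating point ties.

```r
skindata <- matrix(c(10, 10, 8, 1), nrow = 2)

# Fixed margins of the two by two table.
m <- sum(skindata[1, ])  # row 1 total: skin fold <= 4 mm
n <- sum(skindata[2, ])  # row 2 total: skin fold > 4 mm
k <- sum(skindata[, 1])  # column 1 total: no cancer

# All values the top-left cell can take given the margins.
support <- max(0, k - n):min(k, m)
probs <- dhyper(support, m, n, k)

# Probability of the observed table.
p_obs <- dhyper(skindata[1, 1], m, n, k)

# Two-sided p-value: sum over tables no more likely than the observed one.
p_fisher <- sum(probs[probs <= p_obs * (1 + 1e-7)])
print(p_fisher)
```

No large-sample approximation is involved, which is why the test remains valid when expected frequencies are small.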

Mann-Whitney U Test / Wilcoxon Test

This test has several names: Mann-Whitney U test, Mann-Whitney-Wilcoxon test, Wilcoxon rank sum test and Wilcoxon-Mann-Whitney test. It is a non parametric test for two independent samples of continuous data. It is very similar to the t-test, but the data do not have to be Normally distributed. There doesn’t seem to be agreement in the literature regarding the nomenclature of this test. In this book, the name Wilcoxon test is used (as in the R programming language).

Continuing with the same example (skinfolds).

It has been demonstrated that it was not appropriate to use the t-test as the data are not Normally distributed. Subsequently, the data were transformed and a Chi Square test was performed. The Chi Square test without continuity correction rejected the null hypothesis. However, not all prerequisites were met and its use is probably inappropriate. The Chi Square test with Yates’ continuity correction (should be used if expected frequencies are below 10) and the Fisher Exact test were both non significant. What about the Wilcoxon test?

The Wilcoxon test is performed on continuous data, but Normality is not a requirement. The test is more powerful than the Sign and Chi Square tests, but less versatile. In the previous sections, the Sign and Chi Square tests were used after transforming the data. However, it is unnecessary to transform the data to nominal data, and normally the most powerful test whose conditions are met is used. The Wilcoxon test is the most appropriate test to use for the skinfold data.

In the R console:

wilcox.test(Skinfold~Group,data=skinfold,correct=FALSE)
    Wilcoxon rank sum test
data:  skinfold$Skinfold by skinfold$Group
W = 46.5, p-value = 0.04011
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(x = c(1.8, 2, 2, 2, 3, 3.8, 3.9, 4, 4.1),  :
  cannot compute exact p-value with ties

As the data are continuous, the ‘correct=FALSE’ option is selected so that no continuity correction is used in the calculation of the p-value.

R gives a warning message as there are some values that are the same, making ranking and calculation of an exact p value difficult.

The p-value is significant and therefore the null hypothesis is rejected in favour of the alternative hypothesis and it is concluded that there is a difference in the skin thickness of cancer patients and other patients.
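The W statistic reported by R can be reproduced from the ranks. The sketch below types in the skin fold values from the listing earlier in this chapter; the vector names are illustrative. Tied values receive the average of their ranks, which is why W is not a whole number.

```r
# Skin fold values per group, as listed in the skinfold data.
cancer <- c(1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 3.9, 4.0, 4.1)
no_cancer <- c(1.9, 2.2, 2.3, 2.6, 2.8, 2.9, 3.0, 3.7, 3.8, 4.0,
               4.3, 4.4, 4.8, 5.6, 6.0, 6.2, 6.2, 7.0, 10.0, 10.4)

# Rank all 29 values together; ties get the average rank.
ranks <- rank(c(cancer, no_cancer))

# W = rank sum of the first group minus its smallest possible rank sum.
n1 <- length(cancer)
W <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2
print(W)  # 46.5

# Cross-check; exact = FALSE requests the normal approximation explicitly,
# which avoids the warning about ties.
check <- wilcox.test(cancer, no_cancer, exact = FALSE, correct = FALSE)
print(check$p.value)
```

The cancer group is placed first because R orders the levels of Group alphabetically, so ‘Cancer’ comes before ‘No Cancer’ in the formula interface.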