# Statistical Tests

This section discusses parametric tests (based on a Normal distribution) and non-parametric tests. It also covers power analysis, sensitivity / specificity, errors, repeatability / reproducibility, bias / accuracy and multiple observer analysis.

Null Hypothesis: No difference between study groups

Alternate Hypothesis: There is a difference between the study groups

Outcome Measure: Variable used to test the hypotheses

Test Statistic: Calculation used to determine statistical significance

P Value: Probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true

Statistically Significant: p < 0.05 (5 %), by convention

Parametric Test: Assumes Normal distribution

Tests for Normality:

• Quantile-Quantile plots

qqnorm(data)
qqline(data)

• Shapiro-Wilk test

shapiro.test(data)

• Kolmogorov-Smirnov test

ks.test(data,"pnorm",mean=mean(data),sd=sd(data))
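As a rough illustration of how these checks behave, the sketch below applies them to two idealised samples built from theoretical quantiles. The data and the names `normal_like` and `skewed` are assumptions made for this example, not data from the text.

```r
# Idealised samples built from theoretical quantiles (illustrative assumption),
# so the result is identical on every run.
normal_like <- qnorm(ppoints(40), mean = 10, sd = 2)  # Normal-shaped
skewed      <- qexp(ppoints(40), rate = 1)            # right-skewed

# Shapiro-Wilk test: a small p-value is evidence against Normality
sw_normal <- shapiro.test(normal_like)
sw_skewed <- shapiro.test(skewed)

sw_normal$p.value  # large: no evidence against Normality
sw_skewed$p.value  # small: Normality is rejected

# Kolmogorov-Smirnov test against a Normal with the sample mean and SD
ks_skewed <- ks.test(skewed, "pnorm", mean = mean(skewed), sd = sd(skewed))
ks_skewed$p.value
```

The Q-Q plot remains the quickest visual check; the formal tests are most useful alongside it rather than instead of it.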

t-test:

t.test(data~group)
t.test(data1,data2,paired=FALSE)
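The two calling styles above can be compared on simulated data. This is a minimal sketch: the data frame `scores` and its columns are hypothetical, and the means and standard deviations are arbitrary.

```r
# Simulated two-group data (assumption, for illustration only)
set.seed(42)
scores <- data.frame(
  value = c(rnorm(20, mean = 50, sd = 5), rnorm(20, mean = 55, sd = 5)),
  group = factor(rep(c("control", "treatment"), each = 20))
)

# Formula interface: outcome ~ grouping factor
res_formula <- t.test(value ~ group, data = scores)

# Two-vector interface: one vector per group
res_vectors <- t.test(scores$value[scores$group == "control"],
                      scores$value[scores$group == "treatment"],
                      paired = FALSE)

# Both calls run the same unpaired test and give the same p-value
res_formula$p.value
res_vectors$p.value
```

By default `t.test` does not assume equal variances (Welch's test); `var.equal=TRUE` gives the classical Student's t-test.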

Non-parametric Test: Does not assume a Normal distribution

| Data | Sample Size | Test |
|---|---|---|
| Continuous | > 50 | Normal |
| Continuous | < 50, Normal | t-Test |
| Continuous | < 50, Not Normal | Wilcoxon |
| Ordinal | | Wilcoxon |
| Nominal | | Chi Squared |

α: Probability of incorrectly rejecting the null hypothesis (false positive)

β: Probability of failing to reject a false null hypothesis (false negative)

Power: Probability of correctly rejecting the null hypothesis (Power = 1 – β)
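These three quantities can be estimated by simulation. The sketch below is only an illustration under assumed settings (20 observations per group, a true difference of one standard deviation, 2000 repetitions); the helper `simulate_p` is hypothetical.

```r
set.seed(123)
n_sim       <- 2000   # number of simulated studies
n_per_group <- 20
alpha       <- 0.05

# p-value of a two-sample t-test for a given true difference in means
simulate_p <- function(true_diff) {
  x <- rnorm(n_per_group, mean = 0, sd = 1)
  y <- rnorm(n_per_group, mean = true_diff, sd = 1)
  t.test(x, y)$p.value
}

p_null <- replicate(n_sim, simulate_p(0))  # null hypothesis is true
p_alt  <- replicate(n_sim, simulate_p(1))  # true difference of 1 SD

type1_rate <- mean(p_null < alpha)  # estimates alpha (false positive rate)
power_est  <- mean(p_alt < alpha)   # estimates power (1 - beta)
beta_est   <- 1 - power_est         # estimates beta (false negative rate)
```

The estimated false positive rate should settle near the chosen α, while the power depends on the sample size and the size of the true difference.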

Chi-square test:

The prerequisites are:

• Random sample
• Sufficient sample size
• Independence of the observations
• Expected cell count at least 5 in 2 by 2 tables or at least 5 in 80% of larger tables, but no cells with an expected count of 0.

data<-matrix(c(a,b,c,d),nrow=2)
chisq.test(data,correct=TRUE)
chisq.test(data,correct=TRUE)$expected  # expected frequencies

Yates’ continuity correction (correct=TRUE) is recommended if an expected cell count is less than 10. When an expected cell count is less than 5, the Chi-square test should not be used and Fisher's exact test (fisher.test(data)) is recommended instead.
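The rule of thumb above can be written as a short decision. This is a sketch with hypothetical counts; the thresholds of 5 and 10 are the ones quoted in the text.

```r
# Hypothetical 2 x 2 counts (assumption, for illustration only)
counts <- matrix(c(12, 5, 3, 10), nrow = 2,
                 dimnames = list(exposure = c("yes", "no"),
                                 outcome  = c("yes", "no")))

# Expected counts under independence (row total * column total / n)
expected <- suppressWarnings(chisq.test(counts)$expected)

if (any(expected < 5)) {
  # Expected counts too small for the Chi-square approximation
  result <- fisher.test(counts)
} else {
  # Apply Yates' correction only when an expected count is below 10
  result <- chisq.test(counts, correct = any(expected < 10))
}

expected       # 8.5 and 6.5 in each row
result$method
result$p.value
```

Here the smallest expected count is 6.5, so the Chi-square test with Yates' correction is used.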

Wilcoxon test: Non-parametric signed rank test for continuous data

wilcox.test(data)
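For paired measurements, the signed rank test is applied to the within-pair differences. The sketch below uses made-up before / after values (an assumption for illustration) chosen so that there are no tied differences.

```r
# Hypothetical paired measurements (assumption, for illustration only)
before <- c(142, 150, 138, 161, 155, 147, 158, 149)
after  <- c(135, 138, 139, 147, 142, 141, 148, 138)

# Paired signed rank test
signed_rank <- wilcox.test(before, after, paired = TRUE)

# Equivalent one-sample test on the differences (null: median difference = 0)
one_sample <- wilcox.test(before - after)

signed_rank$statistic  # V = 35, the sum of ranks of the positive differences
signed_rank$p.value    # 0.015625 (exact, two-sided)
```

Seven of the eight differences are positive and the single negative difference has the smallest absolute value, so V = 36 – 1 = 35.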

Power analysis

• Difference desired to detect
• Significance level (α)
• Statistical power (1 – β)

power.t.test(sd=s,delta=d,sig.level=0.05,power=0.8)
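A worked example may help here. The planning values below (standard deviation 10, difference of 5) are assumptions chosen for illustration; `power.t.test` solves for whichever of its arguments is left out.

```r
# Assumed planning values (hypothetical)
s <- 10  # expected standard deviation of the outcome
d <- 5   # smallest difference worth detecting

# Leave n out: solve for the sample size per group
plan <- power.t.test(sd = s, delta = d, sig.level = 0.05, power = 0.8)
n_per_group <- ceiling(plan$n)  # round up to whole participants
n_per_group                     # 64

# Leave power out: power achieved with that sample size
achieved <- power.t.test(n = n_per_group, sd = s, delta = d,
                         sig.level = 0.05)$power
achieved                        # just over 0.8
```

The result is the number needed in each group for a two-sample, two-sided test, which are the function's defaults.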

Errors:

Type 1: Null hypothesis is incorrectly rejected (false positive; its probability is the significance level, α)

Type 2: Null hypothesis is incorrectly retained (false negative; its probability is β, where power = 1 – β)

Errors are related to:

• Difference desired to detect
• Significance level (α)
• Statistical power (1 – β)

In general:

| | Arthroscopy Positive | Arthroscopy Negative | Total |
|---|---|---|---|
| MRI Positive | a | b | a + b |
| MRI Negative | c | d | c + d |
| Total | a + c | b + d | a + b + c + d |

With a, b, c and d as the cell counts in the table above:

• True Positive: a
• False Positive: b
• False Negative: c
• True Negative: d
• Positive Predictive Value is the probability that a person who tests positive has the condition: a / (a + b)
• Negative Predictive Value is the probability that a person who tests negative does not have the condition: d / (c + d)
• Sensitivity, or true positive rate, is the probability that a person who has the condition tests positive: a / (a + c)
• Specificity, or true negative rate, is the probability that a person who does not have the condition tests negative: d / (b + d)
• Accuracy is the probability that a test result is correct: (a + d) / (a + b + c + d)

Or in the JGR / R console using the epiR package [1]:

library(epiR)
mat<-matrix(c(a,c,b,d),ncol=2)  # enter values for a, b, c and d
epi.tests(as.table(mat))  # epi.tests is documented as taking a table object
          Disease +    Disease –    Total
Test +        a            b
Test –        c            d
Total

Point estimates and 95 % CIs:
———————————————————
Apparent prevalence
True prevalence
Sensitivity
Specificity
Positive predictive value
Negative predictive value
Positive likelihood ratio
Negative likelihood ratio
———————————————————
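As a cross-check on the epiR point estimates, the same measures can be computed directly in base R from the four cell counts. The counts below are hypothetical, and the variable names `tp`, `fp`, `fn` and `tn` are assumptions standing in for cells a, b, c and d.

```r
# Hypothetical cell counts (assumption, for illustration only)
tp <- 45  # a: test positive, condition present
fp <- 5   # b: test positive, condition absent
fn <- 10  # c: test negative, condition present
tn <- 40  # d: test negative, condition absent
n  <- tp + fp + fn + tn

sensitivity <- tp / (tp + fn)  # 45 / 55
specificity <- tn / (fp + tn)  # 40 / 45
ppv         <- tp / (tp + fp)  # 45 / 50 = 0.9
npv         <- tn / (fn + tn)  # 40 / 50 = 0.8
accuracy    <- (tp + tn) / n   # 85 / 100 = 0.85

round(c(sensitivity = sensitivity, specificity = specificity,
        ppv = ppv, npv = npv, accuracy = accuracy), 3)
```

Unlike `epi.tests`, this gives point estimates only, without confidence intervals.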

Bias

• Selection bias; reduced by randomisation
• Confounding bias; reduced by stratification
• Observational bias; reduced by blinding

 

 1. Stevenson M, Nunes T, Heuer C, Marshall J, Sanchez J, Thornton R, et al. epiR: Tools for the Analysis of Epidemiological Data [Internet]. 2015. (R package). Available from: http://cran.r-project.org/package=epiR