This section discusses parametric tests (which assume a Normal distribution) and non-parametric tests. It also covers power analysis, sensitivity and specificity, errors, repeatability and reproducibility, bias and accuracy, and multiple observer analysis.
Null Hypothesis: No difference between study groups
Alternate Hypothesis: There is a difference between the study groups
Outcome Measure: Variable used to test the hypotheses
Test Statistic: Calculation used to determine statistical significance
P Value: Probability, assuming the null hypothesis is true, that the test statistic takes a value at least as extreme as the one observed
Statistically Significant: The P value is below the chosen significance level (conventionally 5%).
Parametric Test: Assumes data follow a Normal distribution
Tests for Normality:
- Quantile-Quantile plots
qqnorm(data)
qqline(data)
- Shapiro-Wilk test
shapiro.test(data)
- Kolmogorov-Smirnov test
ks.test(data,"pnorm",mean=mean(data),sd=sd(data))
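As an illustration of the calls above, the following sketch runs the two formal tests on simulated data. The vector `x` and its parameters are made up for the example; with real data, the plots from `qqnorm` and `qqline` are usually read alongside the P values rather than replaced by them.

```r
# Simulated example data (hypothetical): 40 draws from a Normal distribution
set.seed(1)
x <- rnorm(40, mean = 10, sd = 2)

# Shapiro-Wilk: the null hypothesis is that the data are Normal
sw <- shapiro.test(x)
sw$p.value   # a small P value (e.g. < 0.05) suggests departure from Normality

# Kolmogorov-Smirnov against a Normal with parameters estimated from the data.
# Note: estimating mean and sd from the same data makes this P value approximate.
ks <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
ks$p.value
```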
t-test:
t.test(data ~ group)
t.test(data1, data2, paired=FALSE)
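A minimal sketch of both call styles on simulated data follows. The data frame `df`, its column names and the group means are assumptions made for the example only.

```r
set.seed(42)
# Hypothetical data: outcome measured in two independent groups of 20
df <- data.frame(
  value = c(rnorm(20, mean = 50, sd = 5), rnorm(20, mean = 54, sd = 5)),
  group = factor(rep(c("control", "treatment"), each = 20))
)

# Formula interface: Welch two-sample t-test (var.equal = FALSE by default)
res <- t.test(value ~ group, data = df)
res$p.value    # P value for the difference in means
res$conf.int   # 95% confidence interval for the difference

# Equivalent call with two separate vectors
res2 <- t.test(df$value[df$group == "control"],
               df$value[df$group == "treatment"], paired = FALSE)
```

Both calls test the same difference (control minus treatment), so they return the same P value.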
Non Parametric Test: Data can’t be modelled with a Normal distribution
Data | Sample Size / Distribution | Test |
---|---|---|
Continuous | > 50 | Normal |
Continuous | < 50, Normal | t-Test |
Continuous | < 50, Not Normal | Wilcoxon |
Ordinal | | Wilcoxon |
Nominal | | Chi Squared |
α: Probability of incorrectly rejecting the null hypothesis (false positive, type 1 error)
β: Probability of incorrectly accepting the null hypothesis (false negative, type 2 error)
Power: Probability of correctly rejecting the null hypothesis (Power = 1 – β)
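These definitions can be checked empirically. The sketch below estimates α and power by simulation; the group size (20), the effect size (1 standard deviation) and the number of simulations are arbitrary choices for illustration.

```r
set.seed(123)
n_sim <- 2000

# Both groups drawn from the same distribution, so the null hypothesis is true
p_null <- replicate(n_sim, t.test(rnorm(20), rnorm(20))$p.value)
type1_rate <- mean(p_null < 0.05)   # estimated type 1 error rate, near alpha

# A true difference of 1 sd between groups, so the null hypothesis is false
p_alt <- replicate(n_sim, t.test(rnorm(20), rnorm(20, mean = 1))$p.value)
est_power <- mean(p_alt < 0.05)     # estimated power
est_beta  <- 1 - est_power          # estimated type 2 error rate
```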
Chi-square test:
The prerequisites are:
- Random sample
- Sufficient sample size
- Independence of the observations
- Expected cell count at least 5 in 2 by 2 tables or at least 5 in 80% of larger tables, but no cells with an expected count of 0.
data <- matrix(c(a,b,c,d), nrow=2)        # a, b, c, d are the four cell counts (filled column by column)
chisq.test(data, correct=TRUE)
chisq.test(data, correct=TRUE)$expected   # expected frequencies
Yates’ continuity correction (correct=TRUE) is recommended if any expected cell count is less than 10. When an expected cell count is less than 5, the Chi-square test should not be used and Fisher’s exact test is recommended instead.
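The rule of thumb above can be sketched as follows. The counts in `tab` are invented for the example, and the threshold of 5 follows the text; `fisher.test` is the base R function for Fisher's exact test.

```r
# Hypothetical 2 x 2 counts (rows: exposure, columns: disease)
tab <- matrix(c(12, 5, 3, 10), nrow = 2,
              dimnames = list(Exposure = c("Yes", "No"),
                              Disease  = c("Yes", "No")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Choose the test from the smallest expected count
if (min(expected) < 5) {
  res <- fisher.test(tab)                 # exact test for small expected counts
} else {
  res <- chisq.test(tab, correct = TRUE)  # Yates' continuity correction
}
res$method
res$p.value
```

The hand-computed `expected` matrix matches what `chisq.test(tab)$expected` reports, which makes it a convenient check before deciding on the test.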
Wilcoxon test: Non-parametric signed rank test for continuous data
wilcox.test(data)
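`wilcox.test` covers several designs depending on its arguments. The sketch below uses simulated, skewed data (all variable names and distributions are assumptions) to show the paired signed rank test, its one-sample equivalent, and the two-sample rank sum (Mann-Whitney) variant, which is an addition to the single call shown above.

```r
set.seed(7)
# Hypothetical paired measurements on 15 subjects (before / after)
before <- rexp(15, rate = 0.1)
after  <- before + rnorm(15, mean = 2, sd = 3)

# Paired signed rank test on the within-subject differences
paired_res <- wilcox.test(after, before, paired = TRUE)

# The same test expressed as a one-sample test on the differences
diff_res <- wilcox.test(after - before, mu = 0)

# Two independent groups: rank sum (Mann-Whitney) test
group_a <- rexp(15, rate = 0.1)
group_b <- rexp(15, rate = 0.05)
ranksum_res <- wilcox.test(group_a, group_b, paired = FALSE)
```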
Power analysis
Required information:
- Difference desired to detect (delta)
- Spread of data (sd)
- Significance level (α)
- Desired power (1 – β)
power.t.test(sd=s, delta=d, sig.level=0.05, power=0.8)
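As a worked example of the call above, with assumed values `sd = 10` and `delta = 5` (chosen only for illustration), `power.t.test` solves for whichever of its parameters is left out:

```r
# Assumed values for illustration: sd = 10, difference to detect = 5
pw <- power.t.test(sd = 10, delta = 5, sig.level = 0.05, power = 0.8)
pw$n                          # required sample size per group (not rounded)
n_required <- ceiling(pw$n)   # round up to whole participants per group

# Leaving out `power` and supplying `n` instead returns the achieved power
pw2 <- power.t.test(n = 30, sd = 10, delta = 5, sig.level = 0.05)
pw2$power
```

Note that the reported `n` is per group for the default two-sample design, so the total number of participants is twice that figure.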
Errors:
Type 1: Null hypothesis is incorrectly rejected (false positive; probability equals the significance level, α)
Type 2: Null hypothesis is incorrectly accepted (false negative; probability β, where power = 1 – β)
Errors are related to:
- Difference desired to detect
- Spread of data
- Significance level (α)
- Desired power (1 – β)
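How these quantities trade off can be sketched with `power.t.test`. The values below (`sd = 10`, `delta = 5`) are assumptions for illustration only.

```r
# Power of a two-sample t-test for several group sizes, with assumed
# sd = 10, delta = 5 and alpha = 0.05 (illustrative values only)
n_per_group <- c(10, 20, 40, 80)
power_by_n <- sapply(n_per_group, function(n) {
  power.t.test(n = n, sd = 10, delta = 5, sig.level = 0.05)$power
})
data.frame(n_per_group, power = round(power_by_n, 2))

# A stricter alpha lowers power at the same sample size
power_strict <- power.t.test(n = 40, sd = 10, delta = 5, sig.level = 0.01)$power
```

Power rises with sample size and falls when α is tightened, so reducing one type of error at a fixed sample size increases the other.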
| | Disease | No Disease |
---|---|---|
Exposure | a | b |
No Exposure | c | d |
True Positive: a
False Positive: b
False Negative: c
True Negative: d
Positive Predictive Value (or precision) is the probability that a person who tests positive has the condition: PPV = a / (a + b)
Negative Predictive Value is the probability that a person who tests negative does not have the condition: NPV = d / (c + d)
Sensitivity (or recall, or true positive rate) is the probability that a person who has the condition tests positive: Sensitivity = a / (a + c)
Specificity (or true negative rate) is the probability that a person who does not have the condition tests negative: Specificity = d / (b + d)
Accuracy is the probability that a test result is correct: Accuracy = (a + d) / (a + b + c + d)
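These measures can be computed directly from the four cells of the table (a = true positives, b = false positives, c = false negatives, d = true negatives). The helper `diagnostic_summary` below is a hypothetical function written for this example, and the counts are invented.

```r
# Hypothetical helper: summary measures from the four cells of a 2 x 2 table
diagnostic_summary <- function(tp, fp, fn, tn) {
  c(
    sensitivity = tp / (tp + fn),                     # a / (a + c)
    specificity = tn / (tn + fp),                     # d / (b + d)
    ppv         = tp / (tp + fp),                     # a / (a + b)
    npv         = tn / (tn + fn),                     # d / (c + d)
    accuracy    = (tp + tn) / (tp + fp + fn + tn)     # (a + d) / total
  )
}

# Illustrative counts only: a = 90, b = 15, c = 10, d = 185
out <- diagnostic_summary(tp = 90, fp = 15, fn = 10, tn = 185)
round(out, 3)
```

Unlike sensitivity and specificity, the predictive values and accuracy depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.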
In the R console using the epiR package:
library(epiR)
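mat <- as.table(matrix(c(a,c,b,d), ncol=2))   # replace a, b, c, d with the cell counts; as.table() is added on the assumption that epi.tests expects a table object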
epi.tests(mat)
Disease + Disease - Total
Test + a b
Test - c d
Total
Point estimates and 95 % CIs:
---------------------------------------------------------
Apparent prevalence
True prevalence
Sensitivity
Specificity
Positive predictive value
Negative predictive value
Positive likelihood ratio
Negative likelihood ratio
---------------------------------------------------------
Bias
- Selection bias; reduced by randomisation
- Confounding bias; reduced by stratification
- Observational bias; reduced by blinding