Statistical Tests – Statsbook

In this section, parametric tests (based on a Normal distribution) and non-parametric tests are discussed. In addition, there are sections on power analysis, sensitivity / specificity, errors, repeatability / reproducibility, bias / accuracy and multiple observer analysis.

Null Hypothesis: No difference between study groups

Alternate Hypothesis: There is a difference between the study groups

Outcome Measure: Variable used to test the hypotheses

Test Statistic: Calculation used to determine statistical significance

P Value: Probability, if the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed

Statistically Significant: The P value is below the chosen significance level α (conventionally 5%).

Parametric Test: Assumes data follow a Normal distribution

Tests for Normality:

Quantile-Quantile plots

qqnorm(data)
qqline(data)

Shapiro-Wilk test

shapiro.test(data)

Kolmogorov-Smirnov test

ks.test(data,"pnorm",mean=mean(data),sd=sd(data))
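As a minimal sketch, the three checks above can be run on simulated data. The variable name x, the sample size and the seed are illustrative assumptions. Because the mean and standard deviation are estimated from the same sample, the Kolmogorov-Smirnov P value here is only approximate.

```r
# Simulated sample (illustrative assumption, not data from the text)
set.seed(1)
x <- rnorm(40, mean = 10, sd = 2)

# Visual check: points lying close to the line suggest approximate Normality
qqnorm(x)
qqline(x)

# Shapiro-Wilk: a small P value is evidence against Normality
sw <- shapiro.test(x)
print(sw$p.value)

# Kolmogorov-Smirnov against a Normal with parameters estimated from x
ks <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
print(ks$p.value)
```

A large P value does not prove Normality; it only means the sample gives no strong evidence against it.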

t-test:

t.test(data ~ group)
t.test(data1, data2, paired=FALSE)
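The two calls above can be tried on simulated data, as in the following sketch. The data frame df and its columns value and group are hypothetical names chosen for illustration. Note that t.test() performs the Welch test (unequal variances) unless var.equal = TRUE is set.

```r
set.seed(2)
# Hypothetical outcome measured in two independent groups
df <- data.frame(
  value = c(rnorm(20, mean = 5, sd = 1), rnorm(20, mean = 6, sd = 1)),
  group = rep(c("A", "B"), each = 20)
)

# Formula interface: Welch two-sample t-test (var.equal = FALSE by default)
res <- t.test(value ~ group, data = df)
print(res$statistic)
print(res$p.value)
print(res$conf.int)

# Equivalent call with two separate vectors
res2 <- t.test(df$value[df$group == "A"],
               df$value[df$group == "B"],
               paired = FALSE)
print(res2$p.value)
```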

Non-Parametric Test: Does not assume a Normal distribution; used when the data can’t be modelled with one

Data	Sample Size / Distribution	Test
Continuous	> 50	Normal
Continuous	< 50, Normal	t-Test
Continuous	Not Normal	Wilcoxon
Ordinal		Wilcoxon
Nominal		Chi Squared

α: Probability of incorrectly rejecting the null hypothesis (false positive, type 1 error)

β: Probability of failing to reject a false null hypothesis (false negative, type 2 error)

Power: Probability of correctly rejecting a false null hypothesis (Power = 1 – β)
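The definitions of α, β and power can be checked by simulation. The sketch below repeatedly runs a two-sample t-test on simulated Normal data. The number of simulations, the group size and the true difference of 1 standard deviation are all illustrative assumptions.

```r
set.seed(3)
n_sim <- 2000   # number of simulated studies (assumption)
n     <- 20     # observations per group (assumption)
alpha <- 0.05   # significance level

# Proportion of simulated two-sample t-tests with P < alpha
# for a given true difference in means (sd fixed at 1)
reject_rate <- function(delta) {
  p <- replicate(n_sim, t.test(rnorm(n), rnorm(n, mean = delta))$p.value)
  mean(p < alpha)
}

type1_est <- reject_rate(0)   # null hypothesis true: estimates alpha
power_est <- reject_rate(1)   # null hypothesis false: estimates power
beta_est  <- 1 - power_est    # estimated type 2 error rate

print(c(type1 = type1_est, power = power_est, beta = beta_est))
```

The estimated type 1 error rate should sit close to the chosen α, while the estimated power depends on the group size and the true difference.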

Chi-square test:

The prerequisites are:

Random sample
Sufficient sample size
Independence of the observations
Expected cell count of at least 5 in every cell of a 2 by 2 table, or at least 5 in 80% of the cells of larger tables, with no cell having an expected count of 0.

data <- matrix(c(a, b, c, d), nrow = 2)  # a, b, c, d are the four cell counts
chisq.test(data, correct = TRUE)
chisq.test(data, correct = TRUE)$expected  # expected frequencies

Yates’ continuity correction (correct=TRUE) is recommended when an expected cell count is less than 10. When an expected cell count is less than 5, the Chi-square test should not be used and Fisher’s exact test is recommended instead.
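The choice between the Chi-square test and Fisher's exact test can be sketched as follows. The counts are hypothetical and the threshold of 5 follows the rule stated above. The expected counts are computed by hand here (row total times column total, divided by the grand total) so that the check does not depend on running chisq.test() first.

```r
# Hypothetical 2 by 2 counts (not from the text)
tab <- matrix(c(12, 5, 7, 9), nrow = 2,
              dimnames = list(Exposure = c("Yes", "No"),
                              Disease  = c("Yes", "No")))

# Expected counts under independence
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
print(expected)

if (any(expected < 5)) {
  res <- fisher.test(tab)                  # small expected counts
} else {
  res <- chisq.test(tab, correct = TRUE)   # Yates' continuity correction
}

print(res$method)
print(res$p.value)
```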

Wilcoxon test: Non-parametric signed rank test for continuous data

wilcox.test(data)
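A short sketch of how wilcox.test() might be used in practice, on simulated skewed data. All variable names and values are illustrative assumptions. The last call is the rank sum (Mann-Whitney) variant for two independent groups, which the same R function performs when given two unpaired samples.

```r
set.seed(5)
# Hypothetical paired measurements on the same 15 subjects
before <- rexp(15, rate = 1 / 10)
after  <- before + rnorm(15, mean = 2, sd = 3)

# Paired signed rank test on the within-subject differences
signed_rank <- wilcox.test(after, before, paired = TRUE)

# The same test written as a one-sample test of the differences against 0
one_sample <- wilcox.test(after - before, mu = 0)

# Rank sum (Mann-Whitney) test for two independent groups
g1 <- rexp(15, rate = 1 / 10)
g2 <- rexp(15, rate = 1 / 14)
rank_sum <- wilcox.test(g1, g2)

print(signed_rank$p.value)
print(rank_sum$p.value)
```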

Power analysis

Required information:

Difference desired to detect (delta)
Spread of data (sd)
Significance level (α)
Desired power (power)

power.t.test(sd=s, delta=d, sig.level=0.05, power=0.8)
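As a sketch of the call above with concrete numbers: the values of s and d below are illustrative planning assumptions. Leaving n unspecified makes power.t.test() solve for the sample size per group; supplying n and omitting power makes it solve for the power instead.

```r
# Illustrative planning values (assumptions, not from the text)
s <- 10   # expected standard deviation
d <- 5    # smallest difference worth detecting

# n is left out, so power.t.test() solves for the sample size per group
res <- power.t.test(sd = s, delta = d, sig.level = 0.05, power = 0.8)
n_per_group <- ceiling(res$n)   # round up to whole subjects
print(n_per_group)

# Conversely, fix n and solve for the achieved power
achieved <- power.t.test(n = n_per_group, sd = s, delta = d, sig.level = 0.05)
print(achieved$power)
```

Rounding n up rather than to the nearest integer keeps the achieved power at or above the target.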

Errors:

Type 1: Null hypothesis is incorrectly rejected (false positive; its probability is the significance level, α)

Type 2: A false null hypothesis is not rejected (false negative; its probability is β = 1 – power)

Errors are related to:

Difference desired to detect
Spread of data
Significance level (α)
Statistical power (power)

	Disease	No Disease
Test Positive (Exposure)	a	b
Test Negative (No Exposure)	c	d

True Positive: a

False Positive: b

False Negative: c

True Negative: d

Positive Predictive Value (or precision) is the probability that a person who tests positive has the condition:

\(ppv = \frac{a}{a+b} \)

Negative Predictive Value is the probability that a person who tests negative does not have the condition:

\(npv = \frac{d}{c+d} \)

Sensitivity (or recall, or true positive rate) is the probability that a person who has the condition tests positive:

\(sensitivity = \frac{a}{a+c} \)

Specificity (or true negative rate) is the probability that a person who does not have the condition tests negative:

\(specificity = \frac{d}{b+d} \)

Accuracy is the probability a test result is correct:

\(accuracy = \frac{a+d}{a+b+c+d} \)
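The five formulas above can be computed directly in base R. In this sketch the function name diag_measures and its argument names are hypothetical; tp, fp, fn and tn stand for the cells a, b, c and d of the table, and the counts in the example call are invented for illustration.

```r
# Diagnostic accuracy measures from the four cell counts
# tp = a, fp = b, fn = c, tn = d in the 2 by 2 table
diag_measures <- function(tp, fp, fn, tn) {
  c(ppv         = tp / (tp + fp),
    npv         = tn / (fn + tn),
    sensitivity = tp / (tp + fn),
    specificity = tn / (fp + tn),
    accuracy    = (tp + tn) / (tp + fp + fn + tn))
}

# Hypothetical counts
m <- diag_measures(tp = 90, fp = 30, fn = 10, tn = 170)
print(round(m, 3))
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the sample, so they should not be carried over to populations with a different prevalence.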

In the R console using the epiR¹ package:

library(epiR)
mat <- as.table(matrix(c(a, c, b, d), ncol = 2))  # enter the cell counts for a, b, c, d
epi.tests(mat)
          Disease +    Disease -      Total
Test +            a            b          
Test -             c           d         
Total          
Point estimates and 95 % CIs:
---------------------------------------------------------
Apparent prevalence               
True prevalence                    
Sensitivity                           
Specificity                        
Positive predictive value            
Negative predictive value              
Positive likelihood ratio          
Negative likelihood ratio             
---------------------------------------------------------

Bias

Selection bias; reduced by randomisation
Confounding bias; reduced by stratification
Observational bias; reduced by blinding

References

1. Stevenson et al., "epiR: Tools for the Analysis of Epidemiological Data", 2025-08-21