As stated, parametric tests are preferred whenever their assumptions are met, because they are more powerful than non-parametric tests. This means that fewer patients are required to demonstrate a statistically significant difference. For example, when analysing the same data, a t-test is more likely to demonstrate a statistically significant difference than a Mann-Whitney U test. Similarly, the Mann-Whitney U test is more powerful than the sign test.
Parametric tests are more powerful than non-parametric tests because they assume more about the distribution of the data. If we know more about the distribution, we can incorporate this knowledge in our statistical test. The mathematics is not discussed here, but it appears logical that the more we know about the distribution, the more powerful a test can be.
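The difference in power between a t-test and a Mann-Whitney U test can be checked informally by simulation. The sketch below is an illustration only: the group size, the size of the true difference, the number of simulated studies and all variable names are assumptions chosen for the example, not values from the text.

```r
# Monte Carlo sketch: estimate the power of a t-test and a Mann-Whitney U
# test on the same simulated, Normally distributed data sets.
set.seed(1)        # fixed seed so the run is reproducible
n_sim   <- 2000    # number of simulated studies (assumed)
n_group <- 20      # patients per group (assumed)
shift   <- 1       # true difference in means, in units of the sd (assumed)

p_values <- replicate(n_sim, {
  x <- rnorm(n_group, mean = 0,     sd = 1)
  y <- rnorm(n_group, mean = shift, sd = 1)
  c(t  = t.test(x, y, var.equal = TRUE)$p.value,   # parametric test
    mw = wilcox.test(x, y)$p.value)                # Mann-Whitney U test
})

# Proportion of simulated studies in which each test reached p < 0.05
power_estimates <- rowMeans(p_values < 0.05)
print(power_estimates)
```

With Normally distributed data the two estimates are usually close, with the t-test typically rejecting the null hypothesis slightly more often.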
If the test we use makes assumptions about the distribution of our data, we also limit its use to specific conditions. For example, a t-test should only be used if the data are (approximately) Normally distributed. So, in general, the more powerful a test, the more restricted its use.
When a statistical test is significant, its power is not in question. However, a non-significant result may mean that the study was underpowered (there may be a difference, but the data have not demonstrated it) or that there really is no difference. In this respect, a retrospective power analysis may be appropriate to demonstrate that the study is not underpowered.
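As a rough illustration of an underpowered study, R's power.t.test function returns the power when the group size is supplied instead. The group size, difference and standard deviation below are hypothetical values chosen for the example.

```r
# Power of a two-sample t-test for a fixed, hypothetical study size.
# n = 10 per group, delta and sd are illustrative assumptions.
res <- power.t.test(n = 10, delta = 17.5, sd = 20, sig.level = 0.05)

achieved_power <- res$power   # probability of detecting the assumed difference
print(achieved_power)
```

A value well below 0.8 indicates that a study of this size would often miss a true difference of the assumed magnitude.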
Power analysis is performed to determine the number of patients required in a study to demonstrate a statistically significant difference. The exact mathematics is beyond the scope of this book and depends on the type of statistical analysis that will be performed. However, to estimate the sample size, estimates are required of:
- Difference to detect
  - Difference in median (non-parametric)
  - Difference in mean (parametric)
- Spread of data
  - Standard deviation or variance (parametric)
  - Range (non-parametric)
- Significance level (α)
- Statistical power required
Power Analysis in R / JGR:
Power analysis is performed to estimate how many patients would be required in a study to demonstrate a significant difference. It is related to the type two error: failing to reject the null hypothesis when it is false, the probability of which is β. Obviously, we would want to make this probability as small as possible. But for a given sample size, the smaller the chance of making a type two error, the greater the chance of making a type one error. Statisticians normally suggest setting β = 0.2. The power of the test is then 1 – β = 0.8, or 80 %.
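The trade-off between the two types of error can be seen by holding the study size fixed and changing only the significance level. This is a minimal sketch; the group size, difference and standard deviation are illustrative assumptions.

```r
# Power of a two-sample t-test at a fixed sample size for two
# significance levels (all input values are illustrative).
power_at_05 <- power.t.test(n = 20, delta = 17.5, sd = 20, sig.level = 0.05)$power
power_at_01 <- power.t.test(n = 20, delta = 17.5, sd = 20, sig.level = 0.01)$power

# beta is the probability of a type two error: 1 - power
beta_at_05 <- 1 - power_at_05
beta_at_01 <- 1 - power_at_01

print(c(alpha_0.05 = beta_at_05, alpha_0.01 = beta_at_01))
```

Lowering the chance of a type one error from 5 % to 1 % raises β, the chance of a type two error, when nothing else changes.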
To calculate an estimated sample size the following information is required:
- Variance (or standard deviation)
- Difference to detect (δ)
- Significance level (α): probability of making a type one error, usually 5%
- Statistical power required: usually set at 80%
- Type of test (t-test: one sample, two sample, paired, unpaired, etc.)
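To give a feel for how these inputs combine, a widely used Normal approximation for a two-sided, two-sample comparison of means is n ≈ 2 (z(1 – α/2) + z(1 – β))² σ² / δ² per group. The sketch below compares it with the t-based calculation in R. The standard deviation and difference are hypothetical values, and the variable names are chosen for this example only.

```r
# Normal-approximation sketch of the per-group sample size for a
# two-sided, two-sample comparison of means. Input values are illustrative.
sd_assumed    <- 10     # standard deviation (assumed)
delta_assumed <- 5      # difference to detect (assumed)
alpha         <- 0.05   # significance level
power_wanted  <- 0.80   # required power, 1 - beta

z_alpha <- qnorm(1 - alpha / 2)   # critical value for the two-sided test
z_beta  <- qnorm(power_wanted)    # quantile matching the required power

n_approx <- 2 * (z_alpha + z_beta)^2 * sd_assumed^2 / delta_assumed^2

# The t-based calculation for the same inputs
n_t_based <- power.t.test(sd = sd_assumed, delta = delta_assumed,
                          sig.level = alpha, power = power_wanted)$n

print(c(approx = n_approx, t_based = n_t_based))
```

The approximation gives a slightly smaller number than the t-based calculation, because it ignores the extra uncertainty from estimating the standard deviation.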
How many patients are required in each of two groups to detect a difference in height of 10 % with a t-test that has a power of 80 % and a significance level of 5 %, given a mean of 175 cm and a standard deviation of 20 cm?
- The standard deviation is 20 cm (sd = 20 cm)
- The difference to detect is 10 % of 175 cm: delta = 17.5
- The significance level is 5 %: sig.level = 0.05
- Power = 80 % = 0.8
So, enter in the JGR command window:
power.t.test(sd=20,delta=17.5,sig.level=0.05,power=0.8)
Two-sample t test power calculation
n = 21.50714
delta = 17.5
sd = 20
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
The power analysis suggests 22 patients in each of the two groups (study size 44). It is normally recommended to err on the side of caution and increase this number to allow for unforeseen circumstances such as loss to follow-up.
The calculated number of patients in each group is normally rounded up. Even if the result of the power analysis had been 21.01 patients in each group, this would have been rounded up to 22.
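The rounding and the allowance for lost patients can also be done in R. This is a sketch only: the 10 % drop-out rate is a hypothetical figure, and dividing by the expected proportion of completers is one common convention, not a rule from the text.

```r
# Round the calculated group size up and allow for drop-outs.
res <- power.t.test(sd = 20, delta = 17.5, sig.level = 0.05, power = 0.8)

n_per_group <- ceiling(res$n)   # always round up, never to the nearest integer

dropout_rate <- 0.10            # hypothetical proportion lost to follow-up
n_to_recruit <- ceiling(n_per_group / (1 - dropout_rate))

print(c(per_group = n_per_group, to_recruit = n_to_recruit))
```

Here ceiling() is used because round() would round 21.01 down to 21, leaving the study slightly short of the required power.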
Similarly, if we wanted to detect a 5 % difference (delta = 8.75):
power.t.test(sd=20,delta=8.75,sig.level=0.05,power=0.8)
Two-sample t test power calculation
n = 82.98415
delta = 8.75
sd = 20
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
Power analysis recommends 83 patients in each group (study size 166).
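Several differences can be compared in one step by wrapping the call in a loop. The sketch below repeats the two calculations above and adds a 20 % difference (delta = 35) as an extra, illustrative case. The variable names are chosen for this example.

```r
# Required group size for several differences to detect
# (20 %, 10 % and 5 % of a mean height of 175 cm).
deltas <- c(35, 17.5, 8.75)

n_needed <- sapply(deltas, function(d) {
  res <- power.t.test(sd = 20, delta = d, sig.level = 0.05, power = 0.8)
  ceiling(res$n)   # round up to whole patients per group
})

print(data.frame(delta = deltas, n_per_group = n_needed))
```

Halving the difference to detect roughly quadruples the number of patients required, which is why the smallest difference worth detecting should be chosen with care.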
Another fantastic program created specifically for power analysis is G*Power, published by the University of Düsseldorf [1]. The program is freely available for Windows and Mac.