Since 2004, Paul Cool’s books ‘Statistics for FRCS(Orth)’ and ‘Medical Statistics’ 1,2, as well as courses run at the Orthopaedic Institute at the Robert Jones and Agnes Hunt Orthopaedic Hospital NHS Foundation Trust in Oswestry 3, have helped trainees across the disciplines understand statistical methods and tests. ‘R applied to Medical Statistics’ 4 followed in 2010 as a companion to ‘Medical Statistics’ 4 and a practical guide to the analysis of large data sets.
This web book 7 is intended to be the next logical step. The book is available free of charge in a format that allows us to respond to changes in software packages and libraries, keeping the book up to date.
We have kept the examples from ‘Medical Statistics’ and added larger example data sets, which are available for download.
Statistical Software – the way it was
Whilst we appreciate the benefits of computer-based analysis, we dislike the use of expensive proprietary software. These packages are ‘black boxes’ into which data is entered and from which come graphs and figures without any explanation as to how they have been derived. The algorithms are concealed within software which is closed to the outside world. It must therefore be taken on trust that the software engineers have done their job perfectly, despite a lack of open scrutiny.
The Open Source Revolution
‘Open source’ is a term used for software which is developed co-operatively by a community of software engineers, both professional and amateur.
Open source software will usually be available for many different platforms including Microsoft Windows, Apple Mac OSX and Linux (for example Ubuntu). This software is free to download, use and distribute (although certain legal caveats usually apply). Since open source programs publish their source code, you can really ‘see’ what is going on ‘under the hood’ and know how the program has transformed your data into graphs and p-values.
Open Source Evolution
The open source statistical programming language ‘R’ (http://www.r-project.org/) is an integrated suite of software facilities for data manipulation, calculation and graphical display. Whilst R is hugely powerful and the ‘Industry Standard’ for statistical analysis, its learning curve has historically been rather steep.
Latterly, a number of Graphical (that is to say windows and icons – mouse driven) User Interfaces (GUIs) have been written to make these powerful tools available whilst avoiding the need to learn and type complex commands. The most popular GUIs for R are:
In this book, we have made extensive use of the GUI ‘Deducer’, the merits of which include a familiar, spreadsheet-like data manager; active development; menu-driven tools; and cross-platform compatibility. However, native R source code is also given where appropriate.
We very much hope you enjoy this book and would appreciate a citation:
Cool P, Ockendon M. Stats Book. http://countcool.com (date of access)
Paul Cool & Matthew Ockendon