{"id":1414,"date":"2015-09-05T17:51:11","date_gmt":"2015-09-05T16:51:11","guid":{"rendered":"http:\/\/pcool.dyndns.org:8080\/statsbook\/?page_id=1414"},"modified":"2025-07-05T11:39:28","modified_gmt":"2025-07-05T10:39:28","slug":"survival-analysis-in-r-jgr","status":"publish","type":"page","link":"https:\/\/pcool.dyndns.org\/index.php\/survival-analysis-in-r-jgr\/","title":{"rendered":"Survival Analysis in R"},"content":{"rendered":"\n<p><a href=\"https:\/\/pcool.dyndns.org\/index.php\/life-table-survival-analysis\/\" data-type=\"page\" data-id=\"522\">Life table analysis<\/a> is mostly obsolete in orthopaedics and <a href=\"https:\/\/pcool.dyndns.org\/index.php\/kaplan-meier-survival-analysis\/\" data-type=\"page\" data-id=\"519\">Kaplan-Meier survival analysis <\/a>is the preferred method as the exact failure time is usually known. Two variables are required:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Follow up time (<a href=\"https:\/\/pcool.dyndns.org\/index.php\/data-types\/\" data-type=\"page\" data-id=\"355\">continuous variable<\/a>)<\/li>\n\n\n\n<li>Censor (end point, <a href=\"https:\/\/pcool.dyndns.org\/index.php\/data-types\/\" data-type=\"page\" data-id=\"355\">nominal dichotomous variable<\/a> \n<ul class=\"wp-block-list\">\n<li>yes \/ no , 0 \/ 1 <\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>An optional grouping variable to compare different groups:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Grouping (<a href=\"https:\/\/pcool.dyndns.org\/index.php\/data-types\/\" data-type=\"page\" data-id=\"355\">nominal grouping variable<\/a>)<\/li>\n<\/ul>\n\n\n\n<p>Often the follow up time and censor need to be calculated from dates. In this respect the script <a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/R_functions\/survivaldates.txt\" target=\"_blank\" rel=\"noreferrer noopener\">survivaldates<\/a> may be useful (this is an example script on how to convert dates to follow up time and how to create the censor variable; use <a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/data_files\/survivaldates.rda\" target=\"_blank\" rel=\"noreferrer noopener\">survivaldates.rda<\/a> with this example). Alternatively, spreadsheet programs can be used to calculate the follow up time from dates.<\/p>\n\n\n\n<p>The censor is a <a href=\"https:\/\/pcool.dyndns.org\/index.php\/data-types\/\" data-type=\"page\" data-id=\"355\">dichotomous variable<\/a> as each patient can only in one of two states: uncensored or censored. Once the event of interest has occurred, the patient is uncensored. All other patients are censored. Patients can be censored due to :<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The event has not occurred before the end of the study<\/li>\n\n\n\n<li>The patient is lost to follow up during the study<\/li>\n\n\n\n<li>The patient withdraws from the study due to another reason (i.e. death if the event of interest is failure of hip replacement)<\/li>\n<\/ul>\n\n\n\n<p>Before survival analysis can be performed, the data needs to be in an appropriate format. The data should be in a table with as many rows as there are patients. There should be at least two columns; a column for the continuous follow up variable and a column for the binary censor variable (0 or 1). A third categorical grouping column is optional, but can be used to compare different groups of patients. <\/p>\n\n\n\n<p>Here, the <a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/data_files\/plotsurvival.rda\" target=\"_blank\" rel=\"noreferrer noopener\">plotsurvival.rda<\/a> dataset is used with several methods on how to perform the analysis similar, but more in depth as described on the <a href=\"https:\/\/pcool.dyndns.org\/index.php\/survival-plot\/\" data-type=\"page\" data-id=\"513\">surival plot page<\/a>. The methods described are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Console method<\/li>\n\n\n\n<li>KM with Numbers function<\/li>\n\n\n\n<li>Prodlim package<\/li>\n<\/ul>\n\n\n\n<p><strong>Console method<\/strong><\/p>\n\n\n\n<p>The survival package<sup class='sup-ref-note' id='note-zotero-ref-p1414-r1-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r1'>1<\/a><\/sup> should be installed and active.<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">library(survival)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>To check the data frame is loaded:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f60808\" class=\"has-inline-color\">plotsurvival\n<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0);color:#3d07f5\" class=\"has-inline-color\">   number   fu group censor\n1       1  5.2 Hip 1      0\n2       2  6.3 Hip 1      0\n3       3  7.0 Hip 1      0\n4       4  8.5 Hip 1      0\n5       5  9.4 Hip 1      0\n6       6 10.1 Hip 1      0\n7       7 11.2 Hip 1      0\n8       8 12.0 Hip 1      0\n9       9 12.5 Hip 1      1\n10     10 12.7 Hip 1      0\n11     11  4.5 Hip 2      1\n12     12  5.8 Hip 2      0\n13     13  6.9 Hip 2      0\n14     14  7.8 Hip 2      1\n15     15  8.2 Hip 2      1\n16     16  8.9 Hip 2      1\n17     17  9.5 Hip 2      1\n18     18 10.2 Hip 2      0\n19     19 11.5 Hip 2      0\n20     20 12.9 Hip 2      0<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>To perform survival analysis for all hips together, first create a survival object and call this mysurvival:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">mysurvival &lt;- Surv(plotsurvival$fu,plotsurvival$censor)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>To show the object:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mysurvival<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&nbsp;&#091;1]&nbsp; 5.2+&nbsp; 6.3+&nbsp; 7.0+&nbsp; 8.5+&nbsp; 9.4+ 10.1+ 11.2+ 12.0+ 12.5&nbsp; 12.7+&nbsp; 4.5&nbsp;&nbsp; 5.8+&nbsp; 6.9+&nbsp; 7.8&nbsp;&nbsp; 8.2&nbsp;&nbsp; 8.9&nbsp;&nbsp; 9.5&nbsp; 10.2+ 11.5+ 12.9+<\/em><\/span><\/code><\/pre>\n\n\n\n<p>This shows the survival times of all patients. The follow up times followed by a &#8216;+&#8217; are censored; the others have failed.<\/p>\n\n\n\n<p>A summary of the object can be useful:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>summary(mysurvival)<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#3a02f8\" class=\"has-inline-color\">      time            status   \n Min.   : 4.500   Min.   :0.0  \n 1st Qu.: 6.975   1st Qu.:0.0  \n Median : 9.150   Median :0.0  \n Mean   : 9.055   Mean   :0.3  \n 3rd Qu.:11.275   3rd Qu.:1.0  \n Max.   :12.900   Max.   :1.0  <\/mark><\/em><\/code><\/pre>\n\n\n\n<p>So the shortest follow time is 4.5 years and the longest 12.9 years, with a median follow up time of 9.15 years.<\/p>\n\n\n\n<p>Next, a Kaplan-Meier (product limit) survival curve, called mycurve, is fitted on the mysurvival object (without confidence interval):<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mycurve &lt;- survfit(mysurvival~1, conf.int=FALSE)<\/em><\/span><\/code><\/pre>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--1\">Please note the tilde 1 (~1). This is a required grouping parameter. If there are no groups, it should be set to 1.<\/p>\n\n\n\n<p>To show the survival estimate:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mycurve<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#4f0cf1\" class=\"has-inline-color\">Call: survfit(formula = mysurvival~1, conf.int=FALSE)\n\n      n events median\n&#091;1,] 20      6   12.5<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>So, there are 20 patients of whom 6 failed (event). The median survival is the time it takes for the cumulative survival to become 50%. Here, the median survival is 12.5 years.<\/p>\n\n\n\n<p>A more detailed summary can be useful:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f50606\" class=\"has-inline-color\">summary(mycurve)\n<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0);color:#4c05f5\" class=\"has-inline-color\">Call: survfit(formula = mysurvival ~ 1, conf.int = FALSE)\n\n time n.risk n.event survival std.err\n  4.5     20       1    0.950  0.0487\n  7.8     14       1    0.882  0.0795\n  8.2     13       1    0.814  0.0982\n  8.9     11       1    0.740  0.1138\n  9.5      9       1    0.658  0.1275\n 12.5      3       1    0.439  0.1982<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>It is easy to obtain a survival probability at a specified time. For example, the survival probability at 5 years is:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>summary(mycurve,time=5)<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>Call: survfit(formula = mysurvival ~ 1, conf.int = FALSE)<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&nbsp;time n.risk n.event survival std.err<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&nbsp;&nbsp;&nbsp; 5&nbsp;&nbsp;&nbsp;&nbsp; 19&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1&nbsp;&nbsp;&nbsp;&nbsp; 0.95&nbsp; 0.0487<\/em><\/span><\/code><\/pre>\n\n\n\n<p>or 49%; and the survival probability at 10 years is:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>summary(mycurve,time=10)<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>Call: survfit(formula = mysurvival ~ 1, conf.int = FALSE)<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&nbsp;time n.risk n.event survival std.err<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&nbsp;&nbsp; 10&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 8&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5&nbsp;&nbsp;&nbsp; 0.658&nbsp;&nbsp; 0.127<\/em><\/span><\/code><\/pre>\n\n\n\n<p>or 13%.<\/p>\n\n\n\n<p>To show the survival curve:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">plot(mycurve, conf.int=FALSE, mark.time=TRUE)<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv6-1024x768.png\" alt=\"\" class=\"wp-image-3702\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv6-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv6-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv6-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv6.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--2\">mark.time=TRUE adds the censor tick marks to the plot (default = FALSE)<\/p>\n\n\n\n<p>and add axes labels:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">title(main='KM curve', xlab='Follow Up &#091;years]',ylab='Probability')<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"621\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated-1024x621.jpeg\" alt=\"\" class=\"wp-image-4950\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated-1024x621.jpeg 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated-300x182.jpeg 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated-768x466.jpeg 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated-1536x931.jpeg 1536w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv7_updated.jpeg 1692w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The censored patients are indicated by a tick mark on the survival curve. To create the plot in a different colour is easy:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">plot(mycurve,col='salmon', conf.int=FALSE, mark.time=TRUE)<\/span><\/em>\n<em><span style=\"color: #ff0000;\">title(main='KM curve', xlab='Follow Up &#091;years]',ylab='Probability')<\/span><\/em>\n<span style=\"color: #99cc00;\"><a href=\"http:\/\/pcool.dyndns.org:8080\/statsbook\/wp-content\/uploads\/surv8.png\"><\/a><\/span><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"621\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated-1024x621.jpeg\" alt=\"\" class=\"wp-image-4951\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated-1024x621.jpeg 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated-300x182.jpeg 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated-768x466.jpeg 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated-1536x931.jpeg 1536w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv8_updated.jpeg 1692w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>For a full list of available colours:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>colors()<\/em> <\/span><\/code><\/pre>\n\n\n\n<p><em>Output omitted for brevity.<\/em><\/p>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--3\">Please note the American spelling of color<\/p>\n\n\n\n<p>To add a confidence interval to the survival curve is also straight forward and should be included in the mycurve object:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mycurve &lt;- survfit(mysurvival~1, conf.int=0.95)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>summary(mycurve)<\/em><\/span>\n<mark style=\"background-color:rgba(0, 0, 0, 0);color:#3c06f4\" class=\"has-inline-color\"><em>Call: survfit(formula = mysurvival ~ 1, conf.int = 0.95)\n\n time n.risk n.event survival std.err\n  4.5     20       1    0.950  0.0487\n  7.8     14       1    0.882  0.0795\n  8.2     13       1    0.814  0.0982\n  8.9     11       1    0.740  0.1138\n  9.5      9       1    0.658  0.1275\n 12.5      3       1    0.439  0.1982\n lower 95% CI upper 95% CI\n        0.859        1.000\n        0.739        1.000\n        0.643        1.000\n        0.548        1.000\n        0.450        0.962\n        0.181        1.000<\/em><\/mark><\/code><\/pre>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--4\">The confidence interval level can be changed by altering the conf.int value.<\/p>\n\n\n\n<p>To plot a &#8216;slategray&#8217; curve with 95% confidence intervals:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>plot(mycurve,col='slategray', mark.time=TRUE)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>title(main='KM curve', xlab='Follow Up &#091;years]',ylab='Probability')<\/em><\/span><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"621\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated-1024x621.jpeg\" alt=\"\" class=\"wp-image-4952\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated-1024x621.jpeg 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated-300x182.jpeg 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated-768x466.jpeg 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated-1536x931.jpeg 1536w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/07\/surv9_updated.jpeg 1692w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><span style=\"color: #99cc00;\"><a href=\"http:\/\/pcool.dyndns.org:8080\/statsbook\/wp-content\/uploads\/surv9.png\"><\/a><\/span><em><strong>Groups<\/strong><\/em><\/p>\n\n\n\n<p>Create a survival object similar as to described above. However, in the Kaplan Meier estimate object, set the grouping variable after the tilde (~) sign (here the categorical &#8216;group&#8217; variable in the &#8216;plotsurvival&#8217; data frame).<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f40606\" class=\"has-inline-color\">library(survival)\n<span style=\"color: #ff0000;\">mysurvival&lt;-Surv(plotsurvival$fu,plotsurvival$censor)<\/span>\n<span style=\"color: #ff0000;\">mysurvival<\/span><\/mark><\/em>\n<span style=\"color: #0000ff;\"><em>&nbsp;&#091;1]&nbsp; 5.2+&nbsp; 6.3+&nbsp; 7.0+&nbsp; 8.5+&nbsp; 9.4+ 10.1+ 11.2+ 12.0+ 12.5&nbsp; 12.7+&nbsp; 4.5&nbsp;&nbsp; 5.8+&nbsp; 6.9+&nbsp; 7.8&nbsp;&nbsp; 8.2&nbsp;&nbsp; 8.9&nbsp;&nbsp; 9.5&nbsp; 10.2+ 11.5+ 12.9+<\/em><\/span>\n\n<span style=\"color: #ff0000;\"><em>mycurve=survfit(mysurvival~<strong>plotsurvival$group<\/strong>, conf.int=FALSE)\nmycurve\n<\/em><\/span><em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#3d07f5\" class=\"has-inline-color\">Call: survfit(formula = mysurvival ~ plotsurvival$group, conf.int = FALSE)\n\n                          n events median\nplotsurvival$group=Hip 1 10      1   12.5\nplotsurvival$group=Hip 2 10      5    9.5<\/mark><\/em>\n\n<span style=\"color: #ff0000;\"><em>summary(mycurve)<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#1b05f6\" class=\"has-inline-color\">Call: survfit(formula = mysurvival ~ plotsurvival$group, conf.int = FALSE)\n\n                plotsurvival$group=Hip 1 \n    time   n.risk  n.event survival  std.err \n  12.500    2.000    1.000    0.500    0.354 \n\n                plotsurvival$group=Hip 2 \n time n.risk n.event survival std.err\n  4.5     10       1    0.900  0.0949\n  7.8      7       1    0.771  0.1442\n  8.2      6       1    0.643  0.1679\n  8.9      5       1    0.514  0.1769\n  9.5      4       1    0.386  0.1732<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>Now plot the curves with labels to the axes and a title:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>plot(mycurve, col=c('wheat','green'), mark.time=TRUE)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>title(main='KM curve', xlab='FU &#091;years]', ylab='Probability')<\/em> <\/span><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv10-1024x768.png\" alt=\"\" class=\"wp-image-3722\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv10-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv10-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv10-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv10.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>This suggests that Hip 1 is superior to Hip 2. However, this will be tested with the logrank test, Cox proportional hazards and parametric tests.<\/p>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--5\">Please note Hip 1 has a survival of 100% until there is a significant drop at the end. Only two patients had a follow up beyond 11 years.<\/p>\n\n\n\n<p><em><strong>Logrank test<\/strong><\/em><\/p>\n\n\n\n<p>The log rank test is a test that compares two survival curves. The null hypothesis is that the curves are identical. The statistical method uses a Chi Square test in its calculation. To perform this test in R is straight forward. The function that does the test is called &#8220;survdiff&#8221; (survival difference). Previously, the survival object was defined as mysurvival, but it is redefined (just in case) and its contents checked:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mysurvival &lt;- Surv(plotsurvival$fu, plotsurvival$censor)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>mysurvival<\/em><\/span>\n<span style=\"color: #0000ff;\"><em>&#091;1]&nbsp; 5.2+&nbsp; 6.3+&nbsp; 7.0+&nbsp; 8.5+&nbsp; 9.4+ 10.1+ 11.2+ 12.0+ 12.5&nbsp; 12.7+&nbsp; 4.5&nbsp;&nbsp; 5.8+&nbsp; 6.9+&nbsp; 7.8&nbsp;&nbsp; 8.2&nbsp;&nbsp; 8.9&nbsp;&nbsp; 9.5&nbsp; 10.2+ 11.5+ 12.9+<\/em><\/span><\/code><\/pre>\n\n\n\n<p>To perform the log rank test:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mylogrank &lt;- survdiff(mysurvival~plotsurvival$group)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>mylogrank<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#1205f1\" class=\"has-inline-color\">Call:\nsurvdiff(formula = mysurvival ~ plotsurvival$group)\n\n                          N Observed Expected (O-E)^2\/E (O-E)^2\/V\nplotsurvival$group=Hip 1 10        1     3.31      1.61      3.63\nplotsurvival$group=Hip 2 10        5     2.69      1.97      3.63\n\n Chisq= 3.6  on 1 degrees of freedom, p= 0.06<\/mark><\/em><\/code><\/pre>\n\n\n\n<p>The p value is larger than 5% and therefore the null hypothesis can&#8217;t be rejected. Although the curve suggests that Hip 2 performs worse than Hip 1, this can&#8217;t be confirmed statistically. However, there were only 10 patients in each group and the test lacks statistical power. Perhaps a more powerful test can show a difference (such as Cox proportional hazards or a parametric test; see below)?<\/p>\n\n\n\n<p><em><strong>Cox proportional hazards<\/strong><\/em><\/p>\n\n\n\n<p>The cox proportional hazards test is a non parametric test that compares two or more survival curves. It is commonly used in medicine and more powerful than the log rank test. The function that performs this test in R is called &#8220;coxph&#8221;. The test is performed on the previously defined mysuvival object and the syntax is very similar to the log rank test:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>mycoxproportionalhazards &lt;- coxph(mysurvival~plotsurvival$group)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>mycoxproportionalhazards<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#4607f6\" class=\"has-inline-color\">Call:\ncoxph(formula = mysurvival ~ plotsurvival$group)\n\n                         coef exp(coef) se(coef)     z      p\nplotsurvival$groupHip 2 1.837     6.275    1.100 1.669 0.0951\n\nLikelihood ratio test=3.84  on 1 df,<strong> p=0.04996<\/strong>\nn= 20, number of events= 6 <\/mark><\/em><\/code><\/pre>\n\n\n\n<p>The Cox proportional hazard test shows a statistical significant difference. Therefore, the Cox proportional hazards test rejects the null hypothesis in favour of the alternate hypothesis and it is concluded that Hip 1 outperforms Hip 2. Please refer to the section on <a href=\"https:\/\/pcool.dyndns.org\/index.php\/cox-proportional-hazards\/\" data-type=\"page\" data-id=\"2128\">Cox proportional hazards<\/a> for more information about this linear regression model.<\/p>\n\n\n\n<p><em><strong>Parametric<\/strong><\/em><\/p>\n\n\n\n<p>Parametric regression analysis can also be used to compare different survival curves. This method is more powerful than the methods describe above. However, it makes assumptions about the shape of the hazard function, which may or may not be justified. Assumptions of different types of hazard curves include (amongst others):<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Constant hazard (&#8220;exponential&#8221;)<\/strong>\n<ul class=\"wp-block-list\">\n<li>a constant hazard gives an exponential distribution.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Increasing hazard<\/strong>\n<ul class=\"wp-block-list\">\n<li>hazard increases as time goes on.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Decreasing hazard<\/strong>\n<ul class=\"wp-block-list\">\n<li>hazard decreases as time goes on.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>The R function &#8216;survreg&#8217;, that performs parametric regression analysis on survival curves, includes attributes for several distributions. These include: &#8216;weibull&#8217;, &#8216;exponential&#8217;, &#8216;gaussian&#8217;, &#8216;logistic&#8217;,&#8217;lognormal&#8217; and &#8216;loglogistic&#8217;. A full description is outside the scope of this book. Further information can be obtained from the build in help function. To perform regression analysis on the example is similar to the other tests and based on the mysurvival object:<\/p>\n\n\n\n<p>Model 1: assume a constant hazard (exponential distribution):<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>myexponential &lt;- survreg(mysurvival~plotsurvival$group,dist='exponential')<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>myexponential<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#2709f5\" class=\"has-inline-color\">Call:\nsurvreg(formula = mysurvival ~ plotsurvival$group, dist = \"exponential\")\n\nCoefficients:\n            (Intercept) plotsurvival$groupHip 2 \n               4.552824               -1.705591 \n\nScale fixed at 1 \n\nLoglik(model)= -24.8   Loglik(intercept only)= -26.4\n\tChisq= 3.31 on 1 degrees of freedom, <strong>p= 0.0689 <\/strong>\nn= 20 <\/mark><\/em><\/code><\/pre>\n\n\n\n<p>So, if a constant hazard is assumed, p&gt;5% and the null hypothesis can&#8217;t be rejected. It must be concluded that there is no difference between the two hips. However, it is unlikely that the hazard remains constant over time. It is much more likely that the hazard for aseptic loosening increases over time. Therefore, it seems more appropriate to use a model such as Weibull:<\/p>\n\n\n\n<p>Model 2: Weibull, increasing hazard over time:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>myweibull &lt;- survreg(mysurvival~plotsurvival$group,dist='weibull')<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>myweibull<\/em><\/span>\n<em><mark style=\"background-color:rgba(0, 0, 0, 0);color:#2e09f0\" class=\"has-inline-color\">Call:\nsurvreg(formula = mysurvival ~ plotsurvival$group, dist = \"weibull\")\n\nCoefficients:\n            (Intercept) plotsurvival$groupHip 2 \n              2.9364589              -0.5055782 \n\nScale= 0.2617896 \n\nLoglik(model)= -20   Loglik(intercept only)= -22.2\n\tChisq= 4.31 on 1 degrees of freedom, p= 0.0378 \nn= 20 <\/mark><\/em><\/code><\/pre>\n\n\n\n<p>The Weibull model is statistical significant and the null hypothesis can be rejected in favour of the alternate hypothesis. It can be concluded that Hip 1 is better than Hip 2 when the Weibull model is used.<\/p>\n\n\n\n<p><strong>Cox proportional hazards or parametric regression analysis?<\/strong><\/p>\n\n\n\n<p>Both, parametric and non parametric analysis may be appropriate to use to demonstrate a difference between two groups. Parametric regression analysis is more powerful, but makes assumptions regarding the shape of the hazard function. This may not always be appropriate. Parametric regression analysis defines the shape of the hazard function and therefore allows for extrapolation. This is not possible with the Cox proportional hazards model.<\/p>\n\n\n\n<p>Traditionally, medical studies tend to use the Cox proportional hazards model whilst engineering uses the parametric regression analysis. However, parametric analysis can easily be applied to medicine and this can be particularly useful if prediction \/ extrapolation is required.<\/p>\n\n\n\n<p><strong>Hazard plot<\/strong><\/p>\n\n\n\n<p>This function requires the muhaz<sup class='sup-ref-note' id='note-zotero-ref-p1414-r2-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r2'>2<\/a><\/sup> <a href=\"https:\/\/pcool.dyndns.org\/index.php\/packages\/\" data-type=\"page\" data-id=\"22\">package to be installed and loaded<\/a>. The hazard function is related to the survival function and can be derived from it. Given the patient has survived up to a certain time, the hazard function gives the instantaneous hazard for the event to happen at that time. Hazard plots can be very useful in estimating the risk of the event of interest occurring, given the patient has survived up to that time. Still using the same example, the muhaz function from the muhaz package does this conveniently. Enter the following in the R command window:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><span style=\"color: #ff0000;\"><em>library(muhaz)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>myhazard &lt;- muhaz(plotsurvival$fu, plotsurvival$censor, max.time=4)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>plot(myhazard)<\/em><\/span>\n<span style=\"color: #ff0000;\"><em>title(main='Hazard Plot')<\/em><\/span><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv11-1024x768.png\" alt=\"\" class=\"wp-image-3727\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv11-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv11-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv11-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/surv11.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><span style=\"color: #99cc00;\"><\/span>The plot shows that the hazard of failure increases over time, suggesting that it would be appropriate to use the parametric Weibull analysis as described above.<\/p>\n\n\n\n<p><strong>Kaplan-Meier plot showing numbers at risk<\/strong><\/p>\n\n\n\n<p>It may be preferable to have the number of patients at risk indicated in the plot. The&nbsp;<a href=\"https:\/\/pcool.dyndns.org:\/wp-content\/R_functions\/kmatrisk.txt\" target=\"_blank\" rel=\"noreferrer noopener\">KMatRisk<\/a>&nbsp;function can be used for this. Copy the function to the clipboard, paste it into the console and hit return. The function should now be available in R (the library survival<sup class='sup-ref-note' id='note-zotero-ref-p1414-r3-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r3'>3<\/a><\/sup> should also be <a href=\"https:\/\/pcool.dyndns.org\/index.php\/packages\/\" data-type=\"page\" data-id=\"22\">installed and loaded<\/a>).<\/p>\n\n\n\n<p>Create the survival curve object&nbsp;directly (in one step) from the data and call this hipfit:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">hipfit &lt;- survfit(Surv(fu,censor)~group, data=plotsurvival)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>Next call the loaded function to create the survival curve with numbers at risk:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">ggkm(hipfit,timeby=1,ystratalabs=c(\"Hip 1\",\"Hip 2\"),main=\"KM plot, Hip&nbsp;by type\",xlabs=\"FU &#091;Years]\")<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot5-1024x768.png\" alt=\"\" class=\"wp-image-3773\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot5-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot5-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot5-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot5.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><strong>Prodlim package<\/strong><\/p>\n\n\n\n<p>The prodlim<sup class='sup-ref-note' id='note-zotero-ref-p1414-r4-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r4'>4<\/a><\/sup> package should be <a href=\"https:\/\/pcool.dyndns.org\/index.php\/packages\/\" data-type=\"page\" data-id=\"22\">installed<\/a>. The package creates nice survival curves. First load the package:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">library(prodlim)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>Create a Kaplan-Meier object (called kmhipfit) using the product limit method:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">kmhipfit &lt;- prodlim(Hist(fu, censor) ~ 1, data = plotsurvival)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>To create a plot that shows numbers at risk at yearly intervals with a shadowed 95% confidence interval:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">plot(kmhipfit, percent=FALSE, axes=TRUE, axis1.at=seq(0,kmhipfit$maxtime+1,1), axis1.lab=seq(0,kmhipfit$maxtime+1,1), marktime=TRUE, xlab=\"Years\", atrisk=TRUE, atrisk.labels=\"At Risk:\", confint=TRUE, confint.citype=\"shadow\", col=4)<\/span><\/em><\/code><\/pre>\n\n\n\n<p>And add a title:<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">title(main= \"Kaplan Meier Survival Analysis - All Hips\")<\/span><\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot6-1024x768.png\" alt=\"\" class=\"wp-image-3774\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot6-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot6-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot6-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot6.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Similarly, create a plot grouped by type of hip replacement (instead of ~1, use ~group):<\/p>\n\n\n\n<pre class=\"wp-block-code has-small-font-size\"><code><em><span style=\"color: #ff0000;\">kmhipfit2 &lt;- prodlim(Hist(fu, censor) ~ group, data = plotsurvival)<\/span><\/em>\n\n<span style=\"color: #ff0000;\"><em>plot(kmhipfit2,percent=FALSE, axes=TRUE, axis1.at=seq(0,kmhipfit2$maxtime+1,1), axis1.lab=seq(0,kmhipfit2$maxtime+1,1), marktime=TRUE,xlab=\"Years\", atrisk=TRUE, atrisk.labels=\"At Risk:\", confint=TRUE, confint.citype=\"shadow\", col=c(4,3), legend=TRUE, legend.x=0, legend.y=0.4, legend.cex=1) <\/em><\/span>\n\n<span style=\"color: #ff0000;\"><em>title(main= \"Kaplan Meier Survival Analysis - By Hip Type\")<\/em><\/span><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot7-1024x768.png\" alt=\"\" class=\"wp-image-3775\" srcset=\"https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot7-1024x768.png 1024w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot7-300x225.png 300w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot7-768x576.png 768w, https:\/\/pcool.dyndns.org\/wp-content\/uploads\/2025\/06\/survivalplot7.png 1355w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"is-style-text-annotation is-style-text-annotation--6\">Please note the syntax to place the legend position.<\/p>\n\n\n\n<p>For more information re prodlim options:<\/p>\n\n\n\n<p><em><span style=\"color: #ff0000;\">??prodlim<\/span><\/em><\/p>\n\n\n\n<p>Other packages that can be of use in creating survival curves are the survminer<sup class='sup-ref-note' id='note-zotero-ref-p1414-r5-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r5'>5<\/a><\/sup> and the ggfortify <sup class='sup-ref-note' id='note-zotero-ref-p1414-r6-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r6'>6<\/a><\/sup> packages (autoplot() function) that are designed to work with ggplot2<sup class='sup-ref-note' id='note-zotero-ref-p1414-r7-o1'><a class='sup-ref-note' href='#zotero-ref-p1414-r7'>7<\/a><\/sup>. For examples please refer to the <a href=\"https:\/\/pcool.dyndns.org\/index.php\/survival-plot\/\" data-type=\"page\" data-id=\"513\">survival plot page<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Life table analysis is mostly obsolete in orthopaedics and Kaplan-Meier survival analysis is the preferred method as the exact failure time is usually known. Two variables are required: An optional grouping variable to compare different groups: Often the follow up time and censor need to be calculated from dates. In this respect the script survivaldates [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"inline_featured_image":false,"footnotes":""},"class_list":["post-1414","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1414","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/comments?post=1414"}],"version-history":[{"count":9,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1414\/revisions"}],"predecessor-version":[{"id":4959,"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/pages\/1414\/revisions\/4959"}],"wp:attachment":[{"href":"https:\/\/pcool.dyndns.org\/index.php\/wp-json\/wp\/v2\/media?parent=1414"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}