Yesterday the Advanced Analytics Centre at AstraZeneca publicly released the InformativeCensoring package for R on GitHub. Standard survival or time-to-event analysis methods assume that censoring is non-informative: that is, the hazard of failure at a given time among the subjects still at risk, who have not yet failed or been censored, is the same as the hazard at that time among those who have already been censored (with regression modelling this assumption is somewhat relaxed, to hold conditionally on the covariates in the model).
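In symbols, writing T for the failure time, C for the censoring time, and h(·) for a hazard, a rough way of expressing the assumption is

h(t | T ≥ t, C ≥ t) = h(t | T ≥ t, C < t) for all t,

so that subjects who have been censored by time t would, had they remained under observation, have gone on to fail at the same rate as those still being followed.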
In some settings, particularly certain randomised trials, there may be a concern that the non-informative censoring assumption is violated, for example because the reason for censoring is related to the failure process. If so, standard methods may give invalid inferences.
The InformativeCensoring package for R implements two multiple imputation approaches which have been proposed for handling the problem of informative censoring.
Jackson method
The approach we have termed the ‘Jackson method’ implements a proposal by Dan Jackson and colleagues in a 2014 Statistics in Medicine paper. The method first fits a Cox model to the observed dataset. For those subjects the user specifies, it then imputes a failure time greater than their censoring time, drawing from the conditional survival distribution implied by their baseline covariates and censoring time. Importantly, the user can increase or decrease the assumed hazard following censoring, so that the imputation can be carried out under a specific informative censoring assumption. For example, we may believe it plausible that the hazard of failure after censoring is a certain amount higher than that estimated under the non-informative censoring assumption. The package also allows the user to specify different changes in the hazard following censoring for different subjects. After the imputation step, the package allows the user to fit Cox models to the imputed datasets and combine the estimates using Rubin’s rules.
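To give a flavour of the idea, here is a hand-rolled sketch using only the survival package, not the InformativeCensoring API; the function and argument names (including gamma) are our own. It draws a failure time beyond a subject's censoring time from a fitted Cox model, with the post-censoring hazard multiplied by exp(gamma):

```r
library(survival)

# Illustrative sketch only (not the package's functions). Given a fitted Cox
# model `fit`, a one-row data frame `newdata` of the subject's covariates and
# their censoring time, draw an imputed failure time beyond the censoring time,
# inflating the post-censoring hazard by exp(gamma); gamma = 0 corresponds to
# non-informative censoring.
impute_beyond <- function(fit, newdata, cens_time, gamma = 0) {
  bh <- basehaz(fit)                                   # cumulative hazard at the mean covariates
  lp <- predict(fit, newdata = newdata, type = "lp")   # centred linear predictor, matching basehaz()
  # cumulative hazard already accrued by the censoring time
  H_c <- approx(bh$time, bh$hazard, xout = cens_time, rule = 2)$y
  after <- bh$time > cens_time
  # conditional survivor function S(t | T > c) with the hazard inflated by exp(gamma) after c
  S_cond <- exp(-(bh$hazard[after] - H_c) * exp(lp + gamma))
  u <- runif(1)
  t_imp <- bh$time[after][which(S_cond <= u)[1]]       # inverse-CDF draw
  if (is.na(t_imp)) max(bh$time) else t_imp            # if S never drops below u, impute at the horizon
}
```

Repeating such draws m times gives m completed datasets; fitting the Cox model of interest to each and pooling the estimates with Rubin’s rules then yields the final inference, which is what the package automates.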
Hsu method
The package also implements an approach developed by Hsu and Taylor in a series of papers (e.g. here). This approach involves fitting separate Cox models for the failure time and censoring processes, with user-specified baseline or time-dependent covariates. Based on these two fitted models, two risk scores are calculated for each subject. To impute a failure time for a subject deemed to have been informatively censored, that subject is first matched to subjects with similar values of the two risk scores. A conditional Kaplan-Meier survival curve is then estimated from the matched subjects and used to randomly impute a failure time for the subject in question.
Their approach relaxes the assumption of non-informative censoring to one of censoring being non-informative conditional on the covariates used in the Cox models. Hsu and Taylor also show that their approach possesses a certain double robustness property: so long as at least one of the two Cox models is correctly specified, and the conditional non-informative censoring assumption is satisfied, one obtains valid inferences. An attractive feature of the approach is that the semi-parametric Cox models are used only as an intermediate step to match subjects; the actual imputation of failure times is then based on the non-parametric Kaplan-Meier estimator.
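Again purely to illustrate the matching idea (our own sketch with hypothetical names and column choices, not the package's functions), one might compute the two risk scores, find a subject's nearest neighbours in the risk-score space, and draw an imputed time from the Kaplan-Meier curve of those neighbours:

```r
library(survival)

# Illustrative sketch only. `dat` is assumed to contain time, status
# (1 = failure, 0 = censored) and covariates x1, x2; `i` indexes the
# informatively censored subject whose failure time is to be imputed.
hsu_style_impute <- function(dat, i, n_neighbours = 10) {
  # working Cox models for the failure and censoring processes
  fit_fail <- coxph(Surv(time, status) ~ x1 + x2, data = dat)
  fit_cens <- coxph(Surv(time, 1 - status) ~ x1 + x2, data = dat)

  # standardised risk scores (linear predictors) from each model
  rs_fail <- as.numeric(scale(predict(fit_fail, type = "lp")))
  rs_cens <- as.numeric(scale(predict(fit_cens, type = "lp")))

  # nearest neighbours of subject i in the two-dimensional risk-score space
  d  <- sqrt((rs_fail - rs_fail[i])^2 + (rs_cens - rs_cens[i])^2)
  nn <- setdiff(order(d), i)[seq_len(n_neighbours)]

  # Kaplan-Meier curve among the matched neighbours
  km  <- survfit(Surv(time, status) ~ 1, data = dat[nn, ])
  c_i <- dat$time[i]
  S_c <- summary(km, times = c_i, extend = TRUE)$surv   # KM survival at the censoring time

  # inverse-CDF draw from the KM curve, conditional on surviving beyond c_i
  u    <- runif(1) * S_c
  cand <- km$time[km$surv <= u & km$time > c_i]
  if (length(cand) == 0) max(km$time) else min(cand)    # tail: impute at the last follow-up time
}
```

As in the Jackson approach, this is repeated to produce multiple imputed datasets, which are analysed and combined with Rubin’s rules.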
R package installation
For the moment the InformativeCensoring package must be installed from GitHub, as described in the README instructions. We hope people find it useful, and look forward to receiving feedback on it.
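For reference, installing an R package directly from GitHub typically looks something like the following; the repository path below is our guess, so please check the README for the exact location:

```r
# install.packages("devtools")  # if devtools is not already installed
# NOTE: the repository path here is an assumption; see the README for the correct one
devtools::install_github("scientific-computing-solutions/InformativeCensoring")
library(InformativeCensoring)
```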