Multiple imputation for missing covariates in the Fine & Gray model for competing risks

Competing risks and the Fine & Gray model

In the setting of competing risks, one approach is to model the effects of covariates on the so-called cause-specific hazard functions for each of the causes. An alternative is to model covariate effects on the cumulative incidence of one or more of the causes. The effects of covariates on the cumulative incidence of one cause (failure type), say cause 1, depend on the covariates’ effects on all the cause-specific hazard functions. This can be seen intuitively: one way a covariate can increase the chance that an individual fails from cause 1 is by reducing the hazard of failure from the other causes, which gives the individual more opportunity to fail from cause 1.
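To make this intuition concrete, here is a minimal sketch in R. It assumes constant cause-specific hazards, purely for illustration; the function name cif1 and the hazard values are hypothetical and not taken from any paper or package. The hazard for cause 1 is held fixed and only the hazard for cause 2 is changed.

```r
# Cumulative incidence of cause 1 under constant cause-specific hazards
# lambda1 and lambda2 (an illustrative assumption):
# F1(t) = lambda1 / (lambda1 + lambda2) * (1 - exp(-(lambda1 + lambda2) * t))
cif1 <- function(t, lambda1, lambda2) {
  total <- lambda1 + lambda2
  lambda1 / total * (1 - exp(-total * t))
}

# Same cause 1 hazard in both scenarios; only the cause 2 hazard differs
cif_high_competing <- cif1(t = 5, lambda1 = 0.1, lambda2 = 0.3)
cif_low_competing  <- cif1(t = 5, lambda1 = 0.1, lambda2 = 0.1)

# Lowering the competing hazard raises the cumulative incidence of cause 1
print(c(high_competing = cif_high_competing, low_competing = cif_low_competing))
```

Even though the cause 1 hazard is identical in the two scenarios, the cumulative incidence of cause 1 is larger when the competing hazard is lower.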

For modelling covariate effects on cumulative incidence, the most popular regression approach is the Fine & Gray model, which assumes a proportional hazards model for the subdistribution hazard for the cause of interest (again, I’ll call this cause/failure type 1). In the (unusual) situation where no event times are censored, this model can be fitted using standard software for fitting Cox proportional hazards models, where individuals who fail from causes other than 1 are kept in the risk set at all observed event times. When there is censoring but the censoring times are known for all individuals, even those who were observed to have an event (a situation termed censoring complete by Fine & Gray, as would be the case when censoring is administrative), the same approach can be used, except that individuals who fail from causes other than 1 remain in the risk set only until their potential censoring time. Often, however, censoring takes other forms, such that the censoring times are not known for individuals who are observed to fail (experience an event).
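The no-censoring case can be sketched in R as follows. This is only an illustration of the mechanics of the modified risk set: the data are simulated from constant cause-specific hazards with made-up values (so they do not follow a Fine & Gray model exactly), and the variable names sub_time and sub_event are hypothetical.

```r
library(survival)

set.seed(2024)
n <- 500
x <- rbinom(n, 1, 0.5)

# Latent failure times from constant cause-specific hazards (illustrative values)
t1 <- rexp(n, rate = 0.10 * exp(0.5 * x))  # cause 1
t2 <- rexp(n, rate = 0.15)                 # cause 2
time  <- pmin(t1, t2)
cause <- ifelse(t1 <= t2, 1, 2)            # no censoring: everyone fails

# Subdistribution time and indicator for cause 1: individuals failing from
# cause 2 stay in the risk set at every observed event time, so give them a
# time beyond the largest observed time and mark them as non-events.
sub_time  <- ifelse(cause == 1, time, max(time) + 1)
sub_event <- as.integer(cause == 1)

# A standard Cox fit on these variables uses the modified risk set
fit <- coxph(Surv(sub_time, sub_event) ~ x)
print(coef(fit))
```

With censoring-complete data the same construction would apply, with each individual’s potential censoring time used in place of max(time) + 1 for those who fail from other causes.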

Missing covariates in the Fine & Gray model

In practice one or more of the covariates we wish to include in the model may have missing values. Suppose one wishes to use multiple imputation to impute these missing covariate values. How should the imputation be performed, given that a Fine & Gray model is of interest? This question is addressed in a paper by Edouard Bonneville and colleagues, a pre-print of which is now available on arXiv. I shall not go into the details of the paper here, except to give a brief overview of one of its main contributions. The idea is to exploit two facts: imputation methods are well developed for time-to-event data with one type of failure (White and Royston 2009, Bartlett et al 2015), and, as described above, when data are censoring complete the Fine & Gray model can be fitted using a Cox model with a modified risk set definition. As such, for the usual situation where censoring times are not available for everyone, Edouard’s paper proposes to 1) impute the missing censoring times using Kaplan-Meier based imputation, and then 2) apply imputation methods for missing covariates developed for the single failure type Cox model setting. The paper proposes approaches for step 2) based on MICE imputation and also on the SMC-FCS approach we developed in earlier work (Bartlett et al 2015).
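Step 1) can be sketched by hand in R as below. This is a rough illustration under stated assumptions, not the implementation used in the paper or in any package: the data are simulated with made-up values, the helper draw_censoring is hypothetical, and individuals whose imputed censoring would fall after the last observed censoring time are simply assigned a time beyond the end of follow-up (horizon). In a multiple imputation analysis the draw would be repeated for each imputed dataset.

```r
library(survival)

set.seed(42)
n <- 300
t1   <- rexp(n, rate = 0.10)         # latent cause 1 time
t2   <- rexp(n, rate = 0.15)         # latent cause 2 time
cens <- runif(n, min = 0, max = 15)  # censoring time, unobserved once someone fails
time <- pmin(t1, t2, cens)
d    <- ifelse(cens < pmin(t1, t2), 0, ifelse(t1 <= t2, 1, 2))

# Kaplan-Meier estimate of the censoring distribution (censoring is the "event")
km <- survfit(Surv(time, d == 0) ~ 1)
has_cens  <- km$n.event > 0
c_times   <- km$time[has_cens]
c_mass    <- -diff(c(1, km$surv[has_cens]))  # probability mass at each censoring time
tail_mass <- min(km$surv)                    # mass beyond the last censoring time
horizon   <- max(time) + 1                   # stand-in for "censored after follow-up"

# Draw a censoring time from the KM estimate, conditional on it exceeding t
draw_censoring <- function(t) {
  later <- which(c_times > t)
  cand  <- c(c_times[later], horizon)
  prob  <- c(c_mass[later], tail_mass)
  if (sum(prob) <= 0) return(horizon)
  cand[sample.int(length(cand), size = 1, prob = prob)]
}

# Subdistribution time: competing failures stay at risk until imputed censoring
newtimes  <- time
competing <- which(d == 2)
newtimes[competing] <- vapply(time[competing], draw_censoring, numeric(1))
newevent  <- as.integer(d == 1)
```

After this step the data are in censoring-complete form, so that newtimes and newevent can be treated as the outcome of a single failure type Cox model when imputing the covariates in step 2).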

smcfcs for the Fine & Gray model in R

The proposal based on the SMC-FCS approach is now available in the smcfcs R package, thanks to Edouard. The following code, taken from the example in the new smcfcs.finegray function, illustrates the relatively simple workflow using this extension:


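library(smcfcs)
library(survival)
library(mitools)

# Impute the missing covariate values, compatibly with a Fine & Gray model for cause 1
imps <- smcfcs.finegray(
  originaldata = ex_finegray,
  smformula = "Surv(times, d) ~ x1 + x2",
  method = c("", "", "logreg", "norm"),
  cause = 1,
  kmi_args = list("formula" = ~ 1)
)

impobj <- imputationList(imps$impDatasets)
# Important: use Surv(newtimes, newevent) ~ ... when pooling
# (respectively: subdistribution time and indicator for cause of interest)
models <- with(impobj, coxph(Surv(newtimes, newevent) ~ x1 + x2))
# Pool the estimates across the imputed datasets using Rubin's rules
summary(MIcombine(models))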

If you’re interested to learn more, please take a look at Edouard’s paper on arXiv.

Variance estimation for reference-based multiple imputation – the debate continues

Reference-based imputation methods have become a popular approach to handling missing data in clinical trials after patients experience what is nowadays referred to as an intercurrent event. Roughly speaking, these approaches impute such missing data in one treatment group (e.g. those in the active treatment group) based to some extent on estimates of parameters from another treatment group (e.g. the control treatment group). The approach was proposed in a paper by my colleague James Carpenter and others in 2013.


Multiple imputation and its application – 2nd edition published

I am delighted to write this blog post announcing the publication of the second edition of the book ‘Multiple Imputation and its Application’, published by Wiley, of which I am a co-author along with colleagues James Carpenter, Tim Morris, Angela Wood, Matteo Quartagno, and Mike Kenward.

Key additions in the second edition are:

  • an in-depth discussion of congeniality and compatibility, and the practical implications of this theory for data analysts
  • an updated chapter on performing imputation with derived variables, such as interactions, non-linear effects, sum scores, splines
  • an expanded chapter on MI with survival data, including imputing missing covariates in Cox models and MI for case-cohort and nested case-control studies
  • new chapters on multiple imputation for / in the context of:
    • prognostic models
    • measurement error and misclassification
    • causal inference
    • using MI in practice
  • practical and theoretical exercises in each chapter

We hope it will be useful for those handling missing data by multiple imputation in their analyses, particularly with regard to thinking about how to use it in a way which accommodates the various complexities that are often present in statistical analyses.

The book should now be available “in all good bookshops”, as they say. You can find it at Amazon (please note I may receive a commission if you subsequently purchase from Amazon after clicking this link).