Multiple imputation for missing covariates in the Fine & Gray model for competing risks

Competing risks and the Fine & Gray model

In the setting of competing risks, one approach is to model the effects of covariates on the so-called cause-specific hazard functions for each of the causes. An alternative is to model covariate effects on the cumulative incidence of one or more of the causes. The effects of covariates on the cumulative incidence of one cause (failure type), say cause 1, depend on the covariates’ effects on all the cause-specific hazard functions. Intuitively, one way a covariate can increase the chance that an individual fails from cause 1 is by reducing the hazard of failure from the other causes, which gives the individual more opportunity to fail from cause 1.
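To make this intuition concrete, here is a small sketch in R. It assumes constant cause-specific hazards, in which case the cumulative incidence of cause 1 has a closed form. The function name `cif1`, the hazard values and the binary covariate are all hypothetical choices for illustration only.

```r
# Cumulative incidence of cause 1 by time t under constant
# cause-specific hazards h1 (cause 1) and h2 (competing cause):
# F1(t) = h1 / (h1 + h2) * (1 - exp(-(h1 + h2) * t))
cif1 <- function(t, h1, h2) {
  h1 / (h1 + h2) * (1 - exp(-(h1 + h2) * t))
}

# Hypothetical binary covariate x that leaves the cause 1 hazard
# unchanged but halves the hazard of the competing cause
h1 <- 0.10
h2_x0 <- 0.20
h2_x1 <- 0.10

cif_x0 <- cif1(5, h1, h2_x0)
cif_x1 <- cif1(5, h1, h2_x1)
print(c(x0 = cif_x0, x1 = cif_x1))
```

Even though the covariate has no effect on the cause 1 hazard here, the cumulative incidence of cause 1 at time 5 is higher when x = 1, purely because the competing hazard is lower.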

For modelling covariate effects on cumulative incidence, the most popular regression approach is the Fine & Gray model, which assumes a proportional hazards model for the subdistribution hazard for the cause of interest (again, I’ll call this cause/failure type 1). In the (unusual) situation where no event times are censored, this model can be fitted using standard software for fitting Cox proportional hazards models, with individuals who fail from causes other than 1 kept in the risk set at all observed event times. Sometimes there is censoring but the censoring times are known for all individuals, even those who were observed to have an event, as would be the case when censoring is administrative. Fine & Gray term such data censoring complete, and in this case individuals who fail from causes other than 1 remain in the risk set until their potential censoring time. Often, however, censoring takes other forms, so that the censoring times are not known for individuals who are observed to fail (experience an event).
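The censoring complete case can be sketched in R as follows. This is a minimal illustration on simulated data, not code from Fine & Gray or from any package: the variable names (`subdist_time`, `subdist_event`) and all parameter values are my own assumptions. The data are generated from cause-specific hazards, so the proportional subdistribution hazards assumption holds only approximately here, and the point is just to show how the risk set modification translates into a standard Cox fit.

```r
library(survival)

set.seed(2024)
n <- 500
x <- rbinom(n, 1, 0.5)

# Latent failure times from constant cause-specific hazards (illustrative values)
t1 <- rexp(n, rate = 0.10 * exp(0.5 * x))  # cause 1
t2 <- rexp(n, rate = 0.15)                 # competing cause
# Administrative censoring: the censoring time is known for everyone
cens <- runif(n, 2, 10)

time <- pmin(t1, t2, cens)
d <- ifelse(cens <= pmin(t1, t2), 0, ifelse(t1 < t2, 1, 2))

# Subdistribution time: individuals failing from cause 2 stay in the risk
# set until their known censoring time; everyone else keeps their observed time
subdist_time <- ifelse(d == 2, cens, time)
subdist_event <- as.numeric(d == 1)

# A standard Cox fit on the modified data estimates the log subdistribution hazard ratio
fit <- coxph(Surv(subdist_time, subdist_event) ~ x)
print(coef(fit))
```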

Missing covariates in the Fine & Gray model

In practice one or more of the covariates we wish to include in the model may have missing values. Suppose one wishes to use multiple imputation to impute these missing covariate values. How should the imputation be performed, given that a Fine & Gray model is of interest? This question is addressed in a paper by Edouard Bonneville and colleagues, a pre-print of which is now available on arXiv. I shall not go into the details of the paper here, except to give a brief overview of one of its main contributions. This exploits two facts: imputation methods are well developed for time-to-event data with one type of failure (White and Royston 2009, Bartlett et al 2015), and, as described above, when data are censoring complete the Fine & Gray model can be fitted using a Cox model with a modified risk set definition. As such, for the usual situation where censoring times are not available for all individuals, Edouard’s paper proposes to 1) impute the missing censoring times using Kaplan-Meier based imputation, and then 2) apply imputation methods for missing covariates developed for the single failure type Cox model setting. The paper proposes approaches for step 2) based on MICE imputation and also the SMC-FCS approach we developed in earlier work (Bartlett et al 2015).
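To give a feel for step 1), here is a rough sketch of Kaplan-Meier based imputation of censoring times using only the survival package. It is a simplified illustration under my own assumptions, not the implementation used in the paper or in any package: the helper `draw_cens`, the placeholder `tail_time` for probability mass beyond the last observed censoring time, and the simulated data are all hypothetical. It also produces only a single imputation, whereas in multiple imputation the draw would be repeated for each imputed dataset.

```r
library(survival)

set.seed(42)
n <- 300
t1 <- rexp(n, 0.10)     # cause 1
t2 <- rexp(n, 0.15)     # competing cause
cens <- rexp(n, 0.08)   # random censoring, unobserved once a failure occurs
time <- pmin(t1, t2, cens)
d <- ifelse(cens <= pmin(t1, t2), 0, ifelse(t1 < t2, 1, 2))

# Kaplan-Meier estimate of the censoring distribution:
# here being censored (d == 0) plays the role of the "event"
km_cens <- survfit(Surv(time, d == 0) ~ 1)
jump <- km_cens$n.event > 0
cens_times <- km_cens$time[jump]
mass <- -diff(c(1, km_cens$surv[jump]))  # probability mass at each censoring time
tail_mass <- min(km_cens$surv)           # mass beyond the last censoring time
tail_time <- max(time) + 1               # hypothetical placeholder for that mass

# Draw a censoring time from the KM estimate, conditional on it exceeding t0
draw_cens <- function(t0) {
  later <- cens_times > t0
  cand <- c(cens_times[later], tail_time)
  prob <- c(mass[later], tail_mass)
  if (sum(prob) <= 0) return(tail_time)
  cand[sample.int(length(cand), 1, prob = prob)]
}

# Only individuals who failed from the competing cause need an imputed censoring time
is_comp <- d == 2
newtimes <- time
newtimes[is_comp] <- vapply(time[is_comp], draw_cens, numeric(1))
newevent <- as.numeric(d == 1)
```

With `newtimes` and `newevent` in hand, the data are in censoring complete form, so covariate imputation methods for a single failure type Cox model can be applied as in step 2).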

smcfcs for the Fine & Gray model in R

The proposal based on the SMC-FCS approach is now available in the smcfcs R package, thanks to Edouard. The following code, taken from the example in the documentation for the new smcfcs.finegray function, illustrates the relatively simple workflow using this extension:


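library(smcfcs)
library(survival)
library(mitools)

# Impute missing covariates compatibly with a Fine & Gray model for cause 1
imps <- smcfcs.finegray(
  originaldata = ex_finegray,
  smformula = "Surv(times, d) ~ x1 + x2",
  method = c("", "", "logreg", "norm"),
  cause = 1,
  kmi_args = list("formula" = ~ 1)
)

impobj <- imputationList(imps$impDatasets)
# Important: use Surv(newtimes, newevent) ~ ... when pooling
# (respectively: subdistribution time and indicator for cause of interest)
models <- with(impobj, coxph(Surv(newtimes, newevent) ~ x1 + x2))
# Pool the estimates across the imputed datasets using Rubin's rules
summary(MIcombine(models))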

If you’re interested in learning more, please take a look at Edouard’s paper on arXiv.
