Missing covariates in competing risks analysis

Today I gave a seminar at the Centre for Biostatistics, University of Manchester, as part of a three-seminar afternoon on missing data. My talk described recent work on methods for handling missing covariates in competing risks analysis, focusing on when complete case analysis is valid and on multiple imputation approaches. For the latter, our substantive model compatible adaptation of fully conditional specification now supports competing risks analysis in both R and Stata (see here).

The slides of my talk are available here.

Update 13th May 2016: the corresponding paper is now available (open access) here, and the supplementary materials for the paper are here.

Multiple imputation followed by deletion of imputed outcomes

In 2007, Paul von Hippel published a nice paper proposing a variant of the conventional multiple imputation (MI) approach to handling missing data, which he called multiple imputation followed by deletion (MID). The context considered is one where we are interested in fitting a regression model for an outcome Y with covariates X, and some Y and X values are missing. The approach consists of running imputation as usual, imputing missing values in both Y and X, but then discarding those records in which the outcome Y was imputed. The reduced datasets, which contain imputed X values but only observed Y values, are then analysed as usual, with results combined using Rubin’s rules.
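As a rough illustration of the steps just described (not code from von Hippel's paper), here is a minimal Python sketch. It assumes a simple linear regression of Y on a single X, simulated data, and a deliberately crude imputer, `draw_from_regression`, which is a hypothetical placeholder: it draws from a fitted normal regression without propagating parameter uncertainty, so it is not proper MI. The function names and the missingness pattern (no record is missing both X and Y) are my assumptions, chosen only to keep the example short.

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool M point estimates and their within-imputation variances."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()
    within = variances.mean()
    between = estimates.var(ddof=1)
    return pooled, within + (1 + 1 / m) * between

def ols_slope(x, y):
    """Slope and its estimated variance from a linear regression of y on x."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1], cov[1, 1]

def mid_analysis(imputed_datasets, y_observed):
    """MID: drop records whose Y was imputed, fit each dataset, then pool."""
    fits = [ols_slope(x[y_observed], y[y_observed]) for x, y in imputed_datasets]
    estimates, variances = zip(*fits)
    return rubins_rules(estimates, variances)

def draw_from_regression(rng, target, predictor, fit_rows, fill_rows):
    """Placeholder imputer (assumption): stochastic regression imputation."""
    design = np.column_stack([np.ones(fit_rows.sum()), predictor[fit_rows]])
    beta, *_ = np.linalg.lstsq(design, target[fit_rows], rcond=None)
    resid = target[fit_rows] - design @ beta
    sd = np.sqrt(resid @ resid / (fit_rows.sum() - 2))
    noise = rng.normal(scale=sd, size=fill_rows.sum())
    return beta[0] + beta[1] * predictor[fill_rows] + noise

# Simulated data: true slope is 2, with disjoint missingness in X and Y.
rng = np.random.default_rng(0)
n = 500
x_full = rng.normal(size=n)
y_full = 1.0 + 2.0 * x_full + rng.normal(size=n)
u = rng.random(n)
x_missing, y_missing = u < 0.2, u > 0.8
x_obs = np.where(x_missing, np.nan, x_full)
y_obs = np.where(y_missing, np.nan, y_full)
complete = ~x_missing & ~y_missing

# Step 1: impute both X and Y, creating 10 completed datasets.
imputed = []
for _ in range(10):
    xi, yi = x_obs.copy(), y_obs.copy()
    xi[x_missing] = draw_from_regression(rng, x_obs, y_obs, complete, x_missing)
    yi[y_missing] = draw_from_regression(rng, y_obs, x_obs, complete, y_missing)
    imputed.append((xi, yi))

# Steps 2 and 3: delete records with imputed Y, analyse, combine.
slope, variance = mid_analysis(imputed, ~y_missing)
print(slope, variance)
```

The only difference from a conventional MI analysis is the mask passed to `mid_analysis`: passing an all-true mask instead of `~y_missing` would keep the records with imputed outcomes.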


Using hazard ratios to estimate causal effects in RCTs

Odd Aalen and colleagues have recently published an interesting paper on the use of Cox models for estimating treatment effects in randomised controlled trials. In a randomised trial we have the treatment assignment variable X, and an often-used primary analysis is to fit a simple Cox model with X as the only covariate. This gives an estimated hazard ratio comparing the hazard in the treatment group with that in the control group, which the model assumes to be constant over time. In any trial, there will almost certainly exist other variables Z that influence the outcome, some of which might be measured and some of which will always be unmeasured. At baseline, X and Z are statistically independent as a result of randomisation. This is of course why randomisation in general allows us to make a causal statement about the treatment effect: we need not worry about confounding.
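For concreteness, the simple Cox model referred to above can be written as follows. The notation is my own shorthand rather than that of the paper: I assume X is coded 1 for treatment and 0 for control, with h_0(t) the baseline (control group) hazard and \beta the log hazard ratio.

```latex
h(t \mid X) = h_0(t)\,\exp(\beta X),
\qquad
\frac{h(t \mid X = 1)}{h(t \mid X = 0)} = \exp(\beta) \quad \text{for all } t,
\qquad
X \perp Z \ \text{at baseline.}
```

The middle expression is the constant hazard ratio the model assumes, and the last statement is the independence that randomisation guarantees at baseline.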
