Reference based multiple imputation – what’s the right variance and how to estimate it?

Reference based multiple imputation methods have become a popular approach for handling missing data in the analysis of randomised trials (Carpenter et al 2013). Very roughly speaking, they impute missing outcomes for patients in the active arm under the assumption that the missing outcomes behave as if the patient had switched onto the control treatment. This is in contrast to what is now the standard approach, based on the missing at random (MAR) assumption, which effectively imputes missing outcomes for patients in a given arm as if they remained on the treatment they were randomised to.
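To make the contrast concrete, here is a deliberately crude toy sketch in Python. All names and numbers here are mine, and real reference based implementations (e.g. jump-to-reference) condition on each patient's observed longitudinal history via a joint multivariate model rather than drawing from marginal arm distributions; this sketch only illustrates the difference in which arm's distribution the imputations are drawn from.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 200
arm = rng.integers(0, 2, n)             # 0 = control, 1 = active
y = rng.normal(loc=arm * 1.0, scale=1.0)  # true treatment effect of 1
missing = rng.random(n) < 0.3           # 30% dropout, MCAR for simplicity

obs = ~missing
ctrl_mean = y[obs & (arm == 0)].mean()
ctrl_sd = y[obs & (arm == 0)].std(ddof=1)
act_mean = y[obs & (arm == 1)].mean()
act_sd = y[obs & (arm == 1)].std(ddof=1)

y_mar = y.copy()   # MAR-style: impute from the patient's randomised arm
y_j2r = y.copy()   # reference-based: active dropouts 'jump' to control
for i in np.where(missing)[0]:
    if arm[i] == 1:
        y_mar[i] = rng.normal(act_mean, act_sd)    # stay on randomised arm
        y_j2r[i] = rng.normal(ctrl_mean, ctrl_sd)  # switch to reference arm
    else:
        y_mar[i] = rng.normal(ctrl_mean, ctrl_sd)
        y_j2r[i] = rng.normal(ctrl_mean, ctrl_sd)

# MAR estimate sits near the full-data effect; the reference-based
# estimate is typically attenuated towards zero
print(y_mar[arm == 1].mean() - y_mar[arm == 0].mean())
print(y_j2r[arm == 1].mean() - y_j2r[arm == 0].mean())
```

The attenuation under the reference-based assumption grows with the proportion of active-arm dropout, which is exactly what drives the inferential behaviour discussed below.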

Soon after reference based MI methods were proposed, people started noticing that Rubin’s rules variance estimator, the standard approach for analysing multiply imputed datasets, overstates the variance of treatment effects compared with the true frequentist variance of the effect estimator (Seaman et al 2014). This means that if Rubin’s rules are used, the type 1 error rate will be below the nominal 5% level, and power will be lower (sometimes substantially) than if the frequentist variance were used for inference.
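For readers less familiar with them, Rubin’s rules are simple to state: the pooled estimate is the mean of the per-imputation estimates, and the total variance combines the average within-imputation variance with the between-imputation variance. A minimal sketch in Python (the function name and example numbers are mine):

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool M complete-data point estimates and variances using Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()             # pooled point estimate
    ubar = variances.mean()             # average within-imputation variance
    b = estimates.var(ddof=1)           # between-imputation variance
    total = ubar + (1 + 1 / m) * b      # Rubin's total variance
    return qbar, total

# hypothetical estimates/variances from M = 3 imputed datasets
est, var = rubins_rules([1.1, 0.9, 1.3], [0.04, 0.05, 0.04])
```

It is this `total` quantity that, for reference based methods, can exceed the true repeated-sampling variance of the pooled estimator.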

In a new pre-print on arXiv I review the congeniality issue and the bias in Rubin’s variance estimator, and summarise some of the arguments made for and against using Rubin’s rules with reference based methods. In the end I personally conclude that the frequentist variance is the ‘right’ one, but that we should scrutinise further whether the reference based assumptions are reasonable in light of the behaviour they imply for inferences. For instance, they lead to a situation where the more data are missing, the more certain we are about the value of the treatment effect, which would ordinarily seem incorrect.

I also review different approaches for estimating the frequentist variance, should one decide it is of interest, including efficiently combining bootstrapping with multiple imputation, as proposed by Paul von Hippel and myself in a paper (in press at Statistical Science) available to view here.
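As a point of reference for what estimating the frequentist variance involves, one can always brute-force it by bootstrapping the entire impute-then-analyse pipeline. The sketch below does this for a toy mean-estimation problem; all names are mine, the imputation step is deliberately simplistic (improper, non-Bayesian draws), and this is the inefficient baseline rather than the efficient bootstrap/MI combination proposed in the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

def impute_then_estimate(y, missing, rng, m=5):
    """Toy MI pipeline: impute a continuous variable from a normal model
    fitted to the observed values, and pool the mean over m imputations."""
    ests = []
    obs = y[~missing]
    for _ in range(m):
        yi = y.copy()
        # improper draws for brevity; a proper pipeline would also draw
        # the imputation-model parameters from their posterior
        yi[missing] = rng.normal(obs.mean(), obs.std(ddof=1), missing.sum())
        ests.append(yi.mean())
    return np.mean(ests)

n = 100
y = rng.normal(0.5, 1.0, n)
missing = rng.random(n) < 0.3

# brute-force frequentist variance: resample, re-impute, re-analyse
boot_ests = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    boot_ests.append(impute_then_estimate(y[idx], missing[idx], rng))
freq_var = np.var(boot_ests, ddof=1)
```

The cost of this naive scheme is that every bootstrap sample needs its own set of imputations, which is what motivates looking for more computationally efficient combinations.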

I hope the paper stimulates further debate as to what the right variance is for reference based methods, and would very much welcome any comments on it.

19th July 2021 – a short talk about this work can be viewed here.

22nd September 2021 – this work has now been published in the journal Statistics in Biopharmaceutical Research, and is available open-access here.

Is MAR dropout classified as MNAR according to Mohan and Pearl?

Mohan and Pearl have just had a paper published, ‘Graphical Models for Processing Missing Data’ (open access pre-print here, journal version here). It’s a great read, and no doubt contains lots of useful developments (I’m still working my way through the paper). But something strikes me as somewhat troubling about their missing at random definition. Years ago, when working with colleagues on using directed acyclic graphs to encode missing data assumptions, we struggled to see how MAR monotone dropout, as might occur in a longitudinal study, could be encoded in a DAG. In this post I will try to see whether MAR monotone dropout is classified as MAR according to the definitions of Mohan and Pearl.
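For concreteness, a standard sequential formulation of MAR under monotone dropout (the notation here is mine) takes longitudinal outcomes $Y_1, \ldots, Y_T$ and lets $D$ denote the last visit at which a patient is observed; MAR then requires the dropout hazard at each visit to depend only on the outcomes observed up to that visit:

```latex
\[
  P(D = t \mid D \geq t, Y_1, \ldots, Y_T)
  = P(D = t \mid D \geq t, Y_1, \ldots, Y_t),
  \qquad t = 1, \ldots, T.
\]
```

The question examined in the post is whether a missingness mechanism satisfying this condition is classified as MAR under Mohan and Pearl’s graphical definitions.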

Read more

Auxiliary variables and congeniality in multiple imputation

Meng’s concept of congeniality in multiple imputation (MI) is, I think, a tricky one (for me anyway!). Loosely speaking, congeniality is about whether the imputation and analysis models make different assumptions about the data. Meng gave a definition in his 1994 paper, but I prefer the one given in a more recent paper by Xie and Meng, which is what Rachael Hughes and I used in our paper this year on different methods of combining bootstrapping with MI. In words (see the papers for the same in equations), it is that there exists a Bayesian model for the data such that:

  • given complete/full data, the posterior mean of the parameter of interest matches the point estimate given by fitting our analysis model of interest to that data, and the posterior variance matches the variance estimator calculated by our analysis model fit.
  • the conditional distribution of the missing data given the observed in this Bayesian model matches that used by our imputation model.
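
In symbols, writing $f$ for the density of such an embedding Bayesian model (the notation below is mine, and the equalities are only required to hold asymptotically; see the Xie and Meng paper for the precise statement), the two conditions are:

```latex
\[
  E_f\!\left(\theta \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}\right)
    = \hat{\theta}\!\left(Y_{\mathrm{obs}}, Y_{\mathrm{mis}}\right),
  \qquad
  \operatorname{Var}_f\!\left(\theta \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}\right)
    = \hat{V}\!\left(Y_{\mathrm{obs}}, Y_{\mathrm{mis}}\right),
\]
\[
  f\!\left(Y_{\mathrm{mis}} \mid Y_{\mathrm{obs}}\right)
    = g\!\left(Y_{\mathrm{mis}} \mid Y_{\mathrm{obs}}\right),
\]
```

where $\hat{\theta}$ and $\hat{V}$ are the analysis model’s point and variance estimators applied to the full data, and $g$ is the predictive distribution used by the imputation model.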

If they are congenial and the models are correctly specified, Rubin’s variance estimator is (asymptotically) unbiased for the true repeated sampling variance of the MI point estimator(s).

One of the potentially useful features of MI is that we can include variables in the imputation stage which we then don’t use in the analysis model. Including such auxiliary variables in the imputation model can increase the plausibility of the MAR assumption when the auxiliary variable is associated with the probability of missingness, and can increase efficiency according to how strongly it is correlated with the variable(s) being imputed. A nice paper (among many) on the potential of including auxiliary variables in MI is Hardt et al 2012. In this post, I’ll consider whether including auxiliary variables in the imputation model leads to uncongeniality. The post was prompted by a discussion earlier in the year with my colleague Paul von Hippel.
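As a toy illustration of the efficiency point (all names here are mine, and this is a deliberately simplified improper-imputation sketch rather than a recommended procedure), the following compares the repeated-sampling variability of an imputed mean with and without a strongly correlated auxiliary variable in the imputation model:

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate_mean(y, a, missing, rng, use_aux, m=10):
    """Toy MI of the mean of y, with or without auxiliary a in the
    imputation model; improper draws for brevity."""
    ests = []
    yo, ao = y[~missing], a[~missing]
    for _ in range(m):
        yi = y.copy()
        if use_aux:
            # linear regression imputation of Y given auxiliary A
            beta = np.cov(yo, ao)[0, 1] / ao.var(ddof=1)
            alpha = yo.mean() - beta * ao.mean()
            resid_sd = (yo - alpha - beta * ao).std(ddof=1)
            yi[missing] = (alpha + beta * a[missing]
                           + rng.normal(0, resid_sd, missing.sum()))
        else:
            # marginal imputation, ignoring the auxiliary variable
            yi[missing] = rng.normal(yo.mean(), yo.std(ddof=1), missing.sum())
        ests.append(yi.mean())
    return np.mean(ests)

# repeated-sampling comparison: auxiliary with corr(Y, A) = 0.9 vs none
res_aux, res_no = [], []
for _ in range(300):
    a = rng.normal(0, 1, 200)
    y = 0.9 * a + rng.normal(0, np.sqrt(1 - 0.81), 200)
    missing = rng.random(200) < 0.4          # 40% missing, MCAR
    res_aux.append(estimate_mean(y, a, missing, rng, use_aux=True))
    res_no.append(estimate_mean(y, a, missing, rng, use_aux=False))
```

Across the replications, the estimator that exploits the auxiliary variable has visibly smaller variance, reflecting the information the auxiliary carries about the missing values; the gain shrinks as the correlation weakens.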

Read more