How many imputations with mice? Assessing Monte-Carlo error after multiple imputation in R

When using multiple imputation to handle missing data, one must at some point decide how many imputations to base inferences on. The validity of inferences does not rely on how many imputations are used, but the statistical efficiency of the inference can be increased by using more imputations. Moreover, we may want our results to be reproducible to a given precision, in the sense that if someone were to re-impute the same data using the same number of imputations but with a different random number seed, they would obtain the same estimates to the desired precision. For an excellent summary of the considerations around how many imputations to use, see the corresponding section of Stef van Buuren’s book.

In this post I provide a small bit of R code which, given a pooled analysis after performing imputation using the mice package in R, calculates the so-called Monte-Carlo standard error of the multiple imputation point estimates. Stata has really nice functionality for doing this built into mi estimate.

For a given parameter, we get an estimate of it from each imputed dataset. Rubin’s rules then take the average of these estimates across the imputations to obtain an overall estimate. If we imagine using an infinite number of imputations and then taking the average, we obtain the ‘true’ MI estimate, so to speak. The estimate we obtain with a finite number M of imputations is an estimate of this ‘true’ value. It’s important to remember that what I am referring to here as the ‘true’ value is not the population value of the parameter: it will differ from the population value because our inferences are based on a finite sample from the population. Since the MI point estimate is a sample mean across M imputations, its standard error is simply the square root of the between-imputation variance divided by M. When applying Rubin’s rules we assume that the point estimates across imputations are normally distributed, so we can also calculate a 95% CI for the ‘true’ estimate. This confidence interval can be useful for judging whether our MI estimates have sufficiently small Monte-Carlo error, given the precision we would like to report the results to.
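In symbols, writing B for the between-imputation variance of the point estimates, \(\bar{\theta}\) for the MI point estimate, and M for the number of imputations (the notation here is mine, just for illustration), the Monte-Carlo standard error and corresponding interval are:

\[
\widehat{\text{SE}}_{MC} = \sqrt{B/M}, \qquad \bar{\theta} \pm t_{M-1,\,0.975}\,\widehat{\text{SE}}_{MC}
\]

where, as discussed below, a t-distribution with M−1 degrees of freedom is used because B is itself estimated from the M imputations.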

Let’s illustrate this with the first example from the mice documentation. We first impute using the default 5 imputations:

library(mice)
# impute the nhanes data using mice's defaults (m=5 imputations)
imp <- mice(nhanes)

For our analysis model, let’s estimate the mean of the cholesterol variable chl:

# estimate the mean of chl in each imputed dataset
fit <- with(data=imp, expr=lm(chl~1))
# pool the estimates using Rubin's rules
pooled <- pool(fit)
summary(pooled, conf.int=TRUE)
         term estimate std.error statistic       df      p.value    2.5 %   97.5 %
1 (Intercept)  192.568  10.08969  19.08563 9.972686 3.518747e-09 170.0784 215.0576

The standard error and 95% CI limits reported here are the Rubin’s rules ones – they represent the overall uncertainty in the parameter estimate, which includes the impact of the Monte-Carlo error. We now define the following function, which calculates the Monte-Carlo standard error of our estimate(s) and a corresponding Monte-Carlo error confidence interval:

miceMCError <- function(pooledRes) {
  # Monte-Carlo SE = sqrt(between-imputation variance / number of imputations)
  monteCarloSE <- sqrt(pooledRes$pooled$b/pooledRes$m)
  # 95% Monte-Carlo CI using the t-distribution with m-1 degrees of freedom
  ciLower <- pooledRes$pooled$estimate - qt(0.975,df=pooledRes$m-1)*monteCarloSE
  ciUpper <- pooledRes$pooled$estimate + qt(0.975,df=pooledRes$m-1)*monteCarloSE
  mcTable <- cbind(pooledRes$pooled$estimate, monteCarloSE, ciLower, ciUpper)
  colnames(mcTable) <- c("Estimate", "Monte Carlo SE", "95% CI lower limit", "95% CI upper limit")
  print(mcTable)
  print("Warning: 95% CI only quantifies Monte-Carlo uncertainty!")
}
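If you want to see where these quantities come from, the mipo object returned by pool() stores the number of imputations and the between-imputation variance directly (column names as in recent versions of mice):

pooled$m                                      # number of imputations
pooled$pooled[, c("term", "estimate", "b")]   # between-imputation variance b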

Note that since the true between-imputation variance is not known but is estimated based on the M imputations, we use the t-distribution (with M-1 degrees of freedom) to construct the 95% confidence interval. Let’s run this on our earlier pooled analysis:

miceMCError(pooled)
     Estimate Monte Carlo SE 95% CI lower limit 95% CI upper limit
[1,]  192.568       2.442045           185.7878           199.3482
[1] "Warning: 95% CI only quantifies Monte-Carlo uncertainty!"

We see an estimate of the Monte-Carlo standard error (2.44) and a 95% CI. Notice that this 95% CI is narrower than the 95% CI we obtained earlier. This is because the Monte-Carlo error CI quantifies only the Monte-Carlo uncertainty. The interpretation of the Monte-Carlo 95% CI is that if we repeatedly generate M imputations and analyse them, 95% of the time the Monte-Carlo CI will include the ‘true’ value – the value we would obtain with an infinite number of imputations.
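As a quick empirical check (a minimal sketch, not part of the original analysis), we can re-impute a number of times with different seeds and look at the spread of the resulting MI point estimates, which should be of a similar magnitude to the Monte-Carlo SE above:

# re-impute repeatedly with different seeds and record the pooled estimate;
# the standard deviation of these estimates approximates the Monte-Carlo SE
reps <- 20
ests <- replicate(reps, {
  impRep <- mice(nhanes, m = 5, seed = sample.int(1e6, 1), printFlag = FALSE)
  pool(with(impRep, lm(chl ~ 1)))$pooled$estimate
})
sd(ests)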

We can see here that the 95% Monte-Carlo CI is pretty wide, which is not surprising given that we only used 5 imputations. Let’s re-run with 100 imputations:

# re-impute, now with 100 imputations
imp <- mice(nhanes, m=100)
fit <- with(data=imp, expr=lm(chl~1))
pooled <- pool(fit)
miceMCError(pooled)

     Estimate Monte Carlo SE 95% CI lower limit 95% CI upper limit
[1,] 193.2388      0.4468063           192.3522           194.1254
[1] "Warning: 95% CI only quantifies Monte-Carlo uncertainty!"

The interval is much narrower, but if we wanted, say, to report our estimate to the nearest integer, we would need yet more imputations. Re-running with M=1,000 we obtain:
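The re-run is the same pipeline as before, just with m=1000 (printFlag=FALSE simply suppresses mice’s iteration log, which would otherwise be very long):

imp <- mice(nhanes, m=1000, printFlag=FALSE)
fit <- with(data=imp, expr=lm(chl~1))
pooled <- pool(fit)
miceMCError(pooled)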

     Estimate Monte Carlo SE 95% CI lower limit 95% CI upper limit
[1,] 193.1072      0.1483916            192.816           193.3984
[1] "Warning: 95% CI only quantifies Monte-Carlo uncertainty!"

In this dataset the Monte-Carlo error in the MI estimate of the mean of chl is large because 10 out of the 25 rows (40%) in the data frame are missing chl. This large fraction of missing values translates into a large between-imputation variance and consequently a larger Monte-Carlo SE. More commonly we might have smaller fractions of missing values, lower Monte-Carlo SEs, and not need so many imputations.
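Relatedly, the pooled results include an estimate of the fraction of missing information for each parameter (the fmi column), which drives the between-imputation variance; one commonly cited rule of thumb, discussed by White et al 2011, is to take M roughly equal to 100 times the fraction of missing information:

# inspect the estimated fraction of missing information for each parameter
pooled$pooled[, c("term", "fmi")]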

In this post I’ve just focused on Monte-Carlo error in the MI point estimates. There is of course also Monte-Carlo error in the other parts of the inferential output (p-values, standard errors, 95% CI limits). For more on these, I suggest taking a look at Section 7 of White et al (2011).

Postscript: in writing this post I initially set R’s random number seed before each call to mice. In playing around with the code, though, I discovered that by default mice now resets the random number generator seed back to the value it had before the call to mice, so that if no seed is set in your code, effectively each call to mice is run using the same seed value (whatever your R session has it set to at the point you call mice).
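Given this behaviour, if you want calls to mice to be reproducible (and distinct from one another), it seems safest to pass an explicit seed via mice’s seed argument, e.g.:

# pass an explicit seed so the imputation is reproducible (seed value arbitrary)
imp <- mice(nhanes, m = 100, seed = 52267)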
