Missing not at random sensitivity analysis with FCS multiple imputation

Daniel Tompsett and colleagues have recently published a paper (open access here) on performing missing not at random (MNAR) sensitivity analyses within the fully conditional specification (FCS) framework for multiple imputation (MI). A number of previous papers had explored versions of the approach. Tompsett et al bring these together to formalise the basis for the approach (which they term NARFCS) and, importantly, to show how to choose values of the sensitivity parameters involved.

The procedure itself is conceptually quite simple (assuming you understand the regular MAR FCS imputation procedure). Each partially observed variable is imputed conditional on the other variables, the missingness indicators of the other partially observed variables, and the missingness indicator of the variable being imputed. The coefficient of the latter indicator cannot be estimated, because in the FCS algorithm the imputation model for a given variable is fitted only to the data from those who have that variable observed. The approach thus consists of fitting the imputation model in this subset, omitting the missingness indicator for the current variable. To impute missing values, the fitted linear predictor then has an offset added, and this offset is the sensitivity parameter for that particular variable. One thus ends up with as many sensitivity parameters as partially observed variables.
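To make the steps above concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it assumes continuous variables with normal linear imputation models, it skips the draw of regression parameters that a proper MI procedure would include, and the function name `narfcs_impute` and the `deltas` argument are hypothetical names chosen for illustration.

```python
import numpy as np

def narfcs_impute(data, deltas, n_cycles=10, seed=0):
    """One delta-adjusted FCS imputation of a 2D float array containing NaNs.

    deltas maps a column index to the offset (sensitivity parameter) added
    to the fitted linear predictor when imputing that column.
    Simplification: regression parameters are not drawn from a posterior.
    """
    rng = np.random.default_rng(seed)
    miss = np.isnan(data)  # missingness indicators, fixed throughout
    n, p = data.shape
    partial = [j for j in range(p) if miss[:, j].any()]
    filled = data.copy()
    # Start by filling missing entries with the observed column means.
    for j in partial:
        filled[miss[:, j], j] = np.nanmean(data[:, j])
    for _ in range(n_cycles):
        for j in partial:
            others = [k for k in range(p) if k != j]
            other_ind = [k for k in partial if k != j]
            # Design matrix: intercept, the other variables, and the
            # missingness indicators of the OTHER partially observed
            # variables. The indicator for variable j itself is omitted.
            X = np.column_stack([
                np.ones(n),
                filled[:, others],
                miss[:, other_ind].astype(float),
            ])
            obs = ~miss[:, j]
            # Fit only among those with variable j observed.
            beta, *_ = np.linalg.lstsq(X[obs], filled[obs, j], rcond=None)
            resid = filled[obs, j] - X[obs] @ beta
            sigma = np.sqrt(resid @ resid / max(obs.sum() - X.shape[1], 1))
            # Impute: fitted linear predictor + offset + residual noise.
            n_mis = int(miss[:, j].sum())
            filled[miss[:, j], j] = (
                X[miss[:, j]] @ beta + deltas[j] + rng.normal(0.0, sigma, n_mis)
            )
    return filled

# Simulated example: x fully observed, y1 and y2 partially observed.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y1 = x + rng.normal(size=n)
y2 = 0.5 * x + 0.5 * y1 + rng.normal(size=n)
data = np.column_stack([x, y1, y2])
data[rng.random(n) < 0.3, 1] = np.nan
data[rng.random(n) < 0.3, 2] = np.nan

imp_mar = narfcs_impute(data, deltas={1: 0.0, 2: 0.0})   # offsets of zero: MAR
imp_mnar = narfcs_impute(data, deltas={1: 5.0, 2: 0.0})  # offset on y1 only
```

Setting every offset to zero gives ordinary MAR FCS imputation (apart from the extra missingness indicators in the models), so the offsets can be read as departures from that reference analysis.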

Having defined the procedure, Tompsett et al emphasise that these sensitivity parameters are conditional parameters, representing differences in the distribution of the observed and missing values of each variable, conditional on the other variables, and conditional on the missingness indicators of the other variables. This is a quantity that is essentially impossible to elicit values for directly from experts, and differs from the marginal difference between the observed and missing values for the variable. This means that choosing values for the conditional sensitivity parameters involved in NARFCS is tricky.

Their proposed strategy for overcoming this difficulty is to first choose values (or ranges of values) for marginal sensitivity parameters, e.g. mean differences between the observed and missing values of each partially observed variable. The data are then repeatedly imputed with NARFCS over a range of conditional sensitivity parameter values, in order to find those values which give imputed data where the marginal differences between imputed and observed values are close to the chosen marginal sensitivity parameter values. Tompsett et al describe a number of detailed approaches for doing this in Section 6. Section 8 is particularly useful: there they demonstrate application of their proposal to data from the ALSPAC study. The authors provide a tutorial on GitHub, which includes installation of a forked version of the mice package that implements NARFCS.
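The calibration idea can be sketched with a simple grid search. The example below is a deliberately simplified illustration, not one of the specific procedures from Section 6 of the paper: it assumes a single partially observed variable `y`, one fully observed covariate `x`, and a normal linear imputation model without parameter draws. The function names are hypothetical.

```python
import numpy as np

def impute_with_offset(x, y, delta, rng):
    """Impute missing y from a linear model in x, adding offset delta."""
    miss = np.isnan(y)
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)
    resid = y[~miss] - X[~miss] @ beta
    sigma = np.sqrt(resid @ resid / ((~miss).sum() - 2))
    y_imp = y.copy()
    y_imp[miss] = X[miss] @ beta + delta + rng.normal(0.0, sigma, int(miss.sum()))
    return y_imp

def marginal_difference(x, y, delta, n_imp=20, seed=0):
    """Mean of imputed minus mean of observed y, averaged over imputations."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(y)
    diffs = [
        impute_with_offset(x, y, delta, rng)[miss].mean() - y[~miss].mean()
        for _ in range(n_imp)
    ]
    return float(np.mean(diffs))

def calibrate_delta(x, y, target, grid, n_imp=20, seed=0):
    """Pick the conditional offset whose marginal difference is closest to target."""
    gaps = [abs(marginal_difference(x, y, d, n_imp, seed) - target) for d in grid]
    return float(grid[int(np.argmin(gaps))])

# Simulated data in which missingness in y depends on x, so the conditional
# and marginal differences do not coincide.
rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[rng.random(n) < 1.0 / (1.0 + np.exp(-x))] = np.nan

target = -1.0                      # chosen marginal mean difference
grid = np.linspace(-4.0, 2.0, 61)  # candidate conditional offsets
delta_hat = calibrate_delta(x, y, target, grid)
print(delta_hat, marginal_difference(x, y, delta_hat))
```

In this simulation those with missing `y` tend to have larger `x`, so even an offset of zero produces imputed values with a higher mean than the observed values. The conditional offset needed to hit a given marginal target is therefore different from the target itself, which is the distinction the authors emphasise.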

Performing sensitivity analyses within FCS imputation, particularly in the non-monotone setting, has proved a difficult problem in the past. Their developments should prove extremely useful for researchers who have been using FCS imputation under the MAR assumption, but who want to explore sensitivity to departures from MAR within the FCS framework.
