Multiple imputation has become an extremely popular approach to handling missing data, for a number of reasons. One is that once the imputed datasets have been generated, they can each be analysed using standard analysis methods, and the results pooled using Rubin’s rules. However, in addition to the missing at random assumption, for multiple imputation to give unbiased point estimates the model(s) used to impute missing data need to be (at least approximately) correctly specified. Because of this, care must be taken when choosing the imputation model.
What constitutes a reasonable imputation model will obviously depend on the dataset and situation at hand. One situation which is commonly encountered, but where it is not obvious what one should do, is where the dataset, or the model(s) which will be fitted after imputation, contains interaction terms or non-linear terms such as squared terms.