This week I’ve been given the opportunity to present some ongoing work with colleagues Camila Olarte Parra and Rhian Daniel on the so-called ‘trimmed means estimand’ in clinical trials at the International Biometric Conference in Riga, Latvia. The slides of my talk are available here for anyone interested. In this post I’ll give a brief overview of my talk.

## Perfect prediction handling in smcfcs for R

One of the things users have often asked me about the substantive model compatible fully conditional specification multiple imputation approach is the problem of perfect prediction. This problem arises when imputing a binary (or more generally a categorical) variable and there is a binary (or categorical) predictor, if within one or more levels of the predictor the outcome is always 0 or always 1. Typically a logistic regression model is specified for the binary variable being imputed, and in the case of perfect prediction, the MLE for one or more parameters (on the log odds scale) is infinite. As described by White, Royston and Daniel (2010), this leads to problems in the imputations. In particular, to make the imputation process proper, new parameter values for the logistic regression imputation model are drawn from a multivariate normal distribution. Under perfect prediction the standard errors are essentially infinite, although in practice the computer will report them as very large finite values. These huge standard errors lead to posterior draws (or what are used in place of posterior draws) which fluctuate between being very large and negative and very large and positive, when in reality they ought to be large in only one direction (see Section 4 of White, Royston and Daniel (2010)).
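To make the problem concrete, here is a minimal base R sketch. It uses a small simulated dataset of my own invention (the variable names, sample size and seed are all assumptions for illustration) in which the outcome is always 1 when the binary predictor equals 1. It then mimics the parameter draw step using a normal approximation for the single log odds ratio, which is the marginal of the multivariate normal draw described above. This is not the code used inside smcfcs, just an illustration of the phenomenon.

```r
# Hypothetical simulated data: y is always 1 whenever x == 1 (perfect prediction)
set.seed(1234)
n <- 200
x <- rbinom(n, 1, 0.5)
y <- ifelse(x == 1, 1, rbinom(n, 1, 0.5))

# glm warns that fitted probabilities numerically 0 or 1 occurred;
# the warning is suppressed here so we can inspect the fit
fit <- suppressWarnings(glm(y ~ x, family = binomial))
est <- unname(coef(fit)["x"])
se <- unname(sqrt(diag(vcov(fit)))["x"])

# Mimic the parameter draw step: normal draws centred at the estimate
draws <- rnorm(1000, mean = est, sd = se)

# The estimate is large and positive, the standard error is huge,
# and a sizeable share of the draws come out negative
c(estimate = est, se = se, prop_negative = mean(draws < 0))
```

Because the standard error dwarfs the estimate, the draws are spread almost symmetrically around zero, even though the data only support a large positive log odds ratio.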

## Multiple imputation with splines in R using smcfcs

Tim Morris and I were recently discussing the topic of multiple imputation (MI) of covariates when one wants to assume the covariate affects the outcome via a spline of some kind. We thought that the Substantive Model Compatible Fully Conditional Specification (smcfcs) approach to MI should be able to handle this, provided we can specify the spline’s basis functions in a way that the smcfcs function (available in R and Stata) can handle. In this post I’ll show that it can be done in R, at least with a simple cubic spline setup.
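As a rough sketch of what "specifying the basis functions" could look like, the base R code below builds a truncated power basis for a cubic spline with a single knot and fits the substantive model to complete simulated data. The data, the knot location and the variable names (`xsq`, `xcu`, `xspline`) are all assumptions made for illustration. The point is that each basis function is a simple deterministic expression in `x`, so it could in principle be recomputed whenever `x` is imputed, in the same spirit as the passive expressions such as `"x^2"` that smcfcs documents for its `method` argument. No call to smcfcs is made here.

```r
# Hypothetical complete data with a covariate x
set.seed(2023)
n <- 500
x <- rnorm(n)
knot <- 0  # assumed knot location, chosen only for illustration

# Truncated power basis for a cubic spline with one knot:
# x, x^2, x^3 and (x - knot)^3 for x above the knot, 0 otherwise
xsq <- x^2
xcu <- x^3
xspline <- ((x - knot)^3) * (x > knot)

# Simulate an outcome that depends on x through the spline terms
y <- x + 0.5 * xspline + rnorm(n)
dat <- data.frame(y, x, xsq, xcu, xspline)

# Substantive model: linear regression on the spline basis functions
fit <- lm(y ~ x + xsq + xcu + xspline, data = dat)
coef(fit)
```

Writing the basis out by hand like this, rather than calling a spline helper inside the model formula, keeps each term as a named column that is an explicit function of `x`.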