I’m pleased to announce the release of an R package, smcfcs, which implements multiple imputation of missing covariates using substantive model compatible fully conditional specification. As described in a previous post, this is a modified version of the popular fully conditional specification, or chained equations, approach to multiple imputation (e.g. as implemented in the excellent MICE package).
smcfcs is an attractive approach when the outcome or substantive model includes interactions or non-linear covariate effects, or is itself a non-linear model, such as Cox’s proportional hazards model. In these cases, it can be difficult, or sometimes impossible, to directly specify an imputation model for partially observed covariates that is compatible with the outcome/substantive model. Such incompatibility means the imputation model is mis-specified, which can lead to biased estimates. smcfcs resolves this potential problem by ensuring that each partially observed covariate is imputed from an imputation model which is compatible with a user-specified outcome/substantive model.
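To make this concrete, here is a minimal sketch of imputing a covariate that enters the substantive model with a quadratic effect. It assumes the example dataset ex_linquad bundled with the package, with columns y, z, x, xsq and v, where x is partially observed and xsq is its square. Pooling with the mitools package is one option among several and is my choice here, not something the package requires.

```r
library(smcfcs)
library(mitools)

set.seed(1234)  # imputation is stochastic, so fix the seed for reproducibility

# Build the method vector by column name rather than by position.
# "" means the column is not imputed, "norm" imputes x from a normal
# linear model, and "x^2" tells smcfcs that xsq is derived as x squared.
meth <- rep("", ncol(ex_linquad))
meth[names(ex_linquad) == "x"] <- "norm"
meth[names(ex_linquad) == "xsq"] <- "x^2"

# Impute x compatibly with a linear substantive model containing x and x^2.
imps <- smcfcs(ex_linquad,
               smtype = "lm",
               smformula = "y~z+x+xsq",
               method = meth)

# Fit the substantive model to each imputed dataset and pool with Rubin's rules.
impobj <- imputationList(imps$impDatasets)
fits <- with(impobj, lm(y ~ z + x + xsq))
summary(MIcombine(fits))
```

Because xsq is declared as a function of x, it is recomputed from the imputed x values rather than imputed as if it were a separate variable.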
smcfcs is available for R on CRAN. It supports linear and logistic regression outcome models, as well as Cox proportional hazards models for censored time to event outcomes. Competing risks outcomes can also be accommodated through specification of Cox models for each cause specific hazard function. A Stata version is also available, and can be installed from within Stata from the SSC archive using: ssc install smcfcs
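For R users, installing from CRAN and running a Cox model imputation might look like the sketch below. It assumes the bundled example dataset ex_coxquad, with columns t (time), d (event indicator), z, x, xsq and v, so treat the variable names as illustrative and check them against the package documentation.

```r
install.packages("smcfcs")  # one-off installation from CRAN

library(smcfcs)
library(survival)  # provides Surv(), used in the substantive model formula

set.seed(5678)

# Method vector built by column name: impute x with a normal linear model
# and treat xsq as the square of x; all other columns are left as they are.
meth <- rep("", ncol(ex_coxquad))
meth[names(ex_coxquad) == "x"] <- "norm"
meth[names(ex_coxquad) == "xsq"] <- "x^2"

# Substantive model: Cox proportional hazards with a quadratic effect of x.
imps <- smcfcs(ex_coxquad,
               smtype = "coxph",
               smformula = "Surv(t,d)~z+x+xsq",
               method = meth)

# imps$impDatasets is a list of completed datasets, one per imputation.
length(imps$impDatasets)
```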