smcfcs imputation in R – now with parallel functionality

Substantive model compatible fully conditional specification multiple imputation can be useful for imputing missing values in covariates in a way which accommodates the form of the substantive/outcome model. One of its drawbacks compared to standard FCS imputation, as implemented in the mice package in R, is its higher computational burden. This is due to the use of rejection sampling when imputing missing values in continuous covariates.

I am happy to announce that, thanks to the efforts of Edouard Bonneville, the smcfcs package in R now supports parallel processing across multiple cores. The package has a new function, smcfcs.parallel, which can be used to call the other smcfcs functions in parallel. Having specified the number of imputations desired, smcfcs.parallel splits these across the number of cores/processors specified by the user in the n_cores argument. Since multiple imputation is ‘embarrassingly parallel’, substantial speed improvements can be achieved. Many thanks to Ed for his continuing contributions to smcfcs in R.
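As a rough illustration of what a call might look like, here is a minimal sketch using the ex_linquad example dataset that ships with smcfcs. The choice of 10 imputations, 2 cores and the seed value are arbitrary assumptions for the example, and the analysis step assumes the mitools package is installed.

```r
library(smcfcs)
library(mitools)

# Impute the example data 10 times, split across 2 cores.
# Arguments other than smcfcs_func, seed, m and n_cores are passed
# on to the smcfcs function named in smcfcs_func.
imps <- smcfcs.parallel(
  smcfcs_func = "smcfcs",
  seed = 2021,
  m = 10,
  n_cores = 2,
  originaldata = ex_linquad,
  smtype = "lm",
  smformula = "y~z+x+xsq",
  method = c("", "", "norm", "x^2", "")
)

# The imputed datasets from all cores are returned together in one list
length(imps$impDatasets)

# Fit the substantive model to each imputed dataset and pool with Rubin's rules
impobj <- imputationList(imps$impDatasets)
models <- with(impobj, lm(y ~ z + x + xsq))
summary(MIcombine(models))
```

Setting seed in the call is what makes the parallel run reproducible. Whether 2 cores is sensible will depend on your machine; parallel::detectCores() reports how many are available.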

Comments on FDA guidance ‘Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products’

The FDA recently published revised guidance on statistical methods for adjusting for baseline covariates in trials. Overall I like the guidance and think it will prove useful. In this post I’ll give a few thoughts on aspects of the revised guidance, organised according to the sections of the guidance document.


Multiple imputation separately by groups in R and Stata

When using multiple imputation to impute missing values, there are often situations where one wants to perform the imputation process completely separately in groups of subjects defined by some fully observed variable (e.g. sex or treatment group). In Stata, this is made very easy through use of the by() option. You simply tell the mi impute command which variable (or variables) you want to stratify the imputation on. Stata will then impute separately in the groups defined by the variable(s), and then assemble the imputations from each stratum back together so you have your desired number of imputed datasets.

Last week someone asked me how to do it in R, ideally with the mice package. Compared to Stata, one has to do a little bit more work. One approach is to use the mice.impute.bygroup function in the miceadds package, a package which extends the functionality of mice in various directions. If you instead want to do it manually, you can do so by making use of the rbind function within the mice package.
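The manual approach might look something like the following sketch. The data are simulated purely for illustration, and the variable names (group, x, y), the missingness mechanism and the seeds are all assumptions of the example, not part of any real analysis. It relies on the rbind function exported by mice dispatching to its method for mids objects, which may behave differently across mice versions.

```r
library(mice)

# Simulate data with a fully observed grouping variable and
# a covariate x whose effect on y differs between groups
set.seed(123)
n <- 200
dat <- data.frame(group = rep(c(0, 1), each = n / 2), x = rnorm(n))
dat$y <- 1 + dat$x + dat$group * dat$x + rnorm(n)
dat$x[runif(n) < 0.3] <- NA  # make about 30% of x values missing

# Impute separately within each group, using the same number of imputations
imp0 <- mice(dat[dat$group == 0, ], m = 5, printFlag = FALSE, seed = 1)
imp1 <- mice(dat[dat$group == 1, ], m = 5, printFlag = FALSE, seed = 2)

# Stack the two sets of imputations back into a single mids object
impCombined <- rbind(imp0, imp1)

# Analyse the combined imputations as usual
fit <- with(impCombined, lm(y ~ x * group))
summary(pool(fit))
```

Because group is constant within each subset, mice drops it as a predictor in the group-specific imputation models and records this as a logged event, which is harmless here. The number of imputations m must be the same in every group for the stacking to make sense.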
