Matching analysis to design: stratified randomization in trials

Yesterday I was re-reading the nice recent articles by Brennan Kahan and Tim Morris on how to analyse trials which use stratified randomization. Stratified randomization is commonly used in trials, and involves constraining the randomization so that the treatments are assigned in a balanced way within strata defined by chosen baseline covariates.
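To make the idea concrete, here is a minimal sketch of one common way of stratifying, namely permuted blocks within each stratum. This is an assumption chosen for illustration, not necessarily the scheme considered in the articles; the class name `StratifiedRandomizer`, the block size of 4 and the stratum labels are all hypothetical.

```python
import random


class StratifiedRandomizer:
    """Permuted-block randomization carried out separately within each stratum.

    Illustrative sketch only: block size, arm labels and stratum keys
    are assumptions, not taken from any particular trial.
    """

    def __init__(self, treatments=("A", "B"), block_size=4, seed=None):
        if block_size % len(treatments) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.treatments = tuple(treatments)
        self.block_size = block_size
        self.rng = random.Random(seed)
        # Maps each stratum to the unused assignments in its current block.
        self.pending = {}

    def _new_block(self):
        # Each arm appears equally often in a block, in random order.
        reps = self.block_size // len(self.treatments)
        block = list(self.treatments) * reps
        self.rng.shuffle(block)
        return block

    def assign(self, stratum):
        """Return the treatment for the next patient in the given stratum."""
        queue = self.pending.get(stratum)
        if not queue:
            # No block yet, or the current block is used up: start a new one.
            queue = self._new_block()
            self.pending[stratum] = queue
        return queue.pop()


randomizer = StratifiedRandomizer(seed=2024)
# Eight patients from one (hypothetical) stratum: two complete blocks.
arms = [randomizer.assign(("site 1", "female")) for _ in range(8)]
```

Because every completed block contains each arm equally often, the arms are exactly balanced within a stratum whenever its number of patients is a multiple of the block size, and nearly balanced otherwise.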

Combining bootstrapping with multiple imputation

Multiple imputation (MI) is a popular approach to handling missing data. In the final part of MI, inferences for parameter estimates are made based on simple rules developed by Rubin. These rules rely on the analyst being able to calculate a standard error for their parameter estimate in each imputed dataset. This is fine for standard analyses, e.g. regression models fitted by maximum likelihood, where standard errors based on asymptotic theory are easily calculated. However, for many analyses analytic standard errors are not available, or are prohibitively difficult to derive. For such analyses, if there were no missing data, an attractive approach for finding standard errors and confidence intervals would be bootstrapping. However, if one is using MI to handle missing data, and would ordinarily use bootstrapping to find standard errors / confidence intervals, how should the two be combined?
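For reference, the pooling step that requires a standard error from each imputed dataset can be sketched as follows. This is a minimal illustration of Rubin's rules for a scalar parameter only; the function name `pool_rubin` and the input numbers are made up for the example, and it does not address how to combine MI with bootstrapping.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates and standard errors using Rubin's rules.

    estimates:  point estimates, one per imputed dataset
    std_errors: the corresponding standard errors
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or len(std_errors) != m:
        raise ValueError("need matching estimates and standard errors, with m >= 2")
    pooled_estimate = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the estimates (divisor m - 1).
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Made-up results from m = 5 imputed datasets, purely for illustration.
est, se = pool_rubin([0.52, 0.48, 0.55, 0.50, 0.47],
                     [0.10, 0.11, 0.10, 0.12, 0.10])
```

The `std_errors` argument is exactly where the difficulty arises: if no analytic standard error exists for the analysis, there is nothing to pass in for each imputed dataset.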

Multiple imputation for missing covariates in Poisson regression

This week I’ve released a new version of the smcfcs package for R on CRAN. SMC-FCS performs multiple imputation for missing covariates in regression models, using an adaptation of the chained equations / fully conditional specification approach to imputation, which we called Substantive Model Compatible Fully Conditional Specification MI.

The new version of smcfcs now supports Poisson regression outcome / substantive models, which are often used for count outcomes. Future versions will add support for negative binomial regression models, which are often used to model overdispersed count outcomes, and also for offsets, which are often needed when fitting count regression models.
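To illustrate what an offset does in a Poisson model (this is not smcfcs code, just a small sketch with made-up data), consider an intercept-only model log E[Y_i] = log(t_i) + beta0, where t_i is the exposure, such as person-years of follow-up. The function name `poisson_loglik` and the numbers below are assumptions for the example.

```python
import math


def poisson_loglik(beta0, counts, exposures):
    """Poisson log-likelihood with log(exposure) entering as an offset.

    Model: log E[Y_i] = log(t_i) + beta0, so exp(beta0) is an event rate
    per unit of exposure rather than a mean count per subject.
    """
    total = 0.0
    for y, t in zip(counts, exposures):
        mu = t * math.exp(beta0)
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total


# Made-up event counts and follow-up times, purely for illustration.
counts = [2, 0, 5, 1]
exposures = [1.0, 0.5, 3.0, 1.5]

# For this intercept-only model the maximum likelihood estimate has a
# closed form: total events divided by total exposure, on the log scale.
beta0_hat = math.log(sum(counts) / sum(exposures))
```

Without the offset, subjects followed for longer would simply appear to have higher counts; with it, the model describes rates, which is usually what is wanted when follow-up times differ.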