The Stats Geek

Matching analysis to design: stratified randomization in trials

April 5, 2016April 5, 2016 by Jonathan Bartlett

Yesterday I was re-reading the recent nice articles by Brennan Kahan and Tim Morris on how to analyse trials which use stratified randomization. Stratified randomization is commonly used in trials, and involves randomizing in a certain way to ensure that the treatments are assigned in a balanced way within strata defined by chosen baseline covariates.

Combining bootstrapping with multiple imputation

September 24, 2020March 12, 2016 by Jonathan Bartlett

Multiple imputation (MI) is a popular approach to handling missing data. In the final part of MI, inferences for parameter estimates are made based on simple rules developed by Rubin. These rules rely on the analyst having a calculable standard error for their parameter estimate for each imputed dataset. This is fine for standard analyses, e.g. regression models fitted by maximum likelihood, where standard errors based on asymptotic theory are easily calculated. However, for many analyses analytic standard errors are not available, or are prohibitive to find by analytical methods. For such methods, if there were no missing data, an attractive approach for finding standard errors and confidence intervals is the method of bootstrapping. However, if one is using MI to handle missing data, and would ordinarily use bootstrapping to find standard errors / confidence intervals, how should these be combined?

Multiple imputation for missing covariates in Poisson regression

February 5, 2016 by Jonathan Bartlett

This week I’ve released a new version of the smcfcs package for R on CRAN. SMC-FCS performs multiple imputation for missing covariates in regression models, using an adaption of the chained equations / fully conditional specification approach to imputation, which we called Substantive Model Compatible Fully Conditional Specification MI.

The new version of smcfcs now supports Poisson regression outcome / substantive models, which are often used for count outcomes. Future additions will add support for negative binomial regression models, which are often used to model over dispersed count outcomes, and also support for offsets, which are often needed when fitting count regression models.