The Stats Geek

Mixed models repeated measures (mmrm) package for R

October 31, 2022 by Jonathan Bartlett

I was recently made aware of the release of the mmrm package in R. It has been developed by a group of programmers and statisticians at a number of pharmaceutical companies, led by Daniel Sabanes Bove at Roche, as part of the ASA Biopharmaceutical Section Software Engineering Working Group. I’ve written previously about fitting mixed models for repeated measures (MMRM) using R, Stata and SAS. In R, this can be done using the gls function in the nlme package, but there are a number of limitations with this approach. For example, it is difficult (or impossible) to fit models where you allow the covariance parameters to be distinct between treatment groups. In this post, I’ll take a very quick look at the new mmrm package in R.

Causal (in)validity of the trimmed means estimand

July 14, 2022 by Jonathan Bartlett

This week I’ve been given the opportunity to present some ongoing work with colleagues Camila Olarte Parra and Rhian Daniel on the so-called ‘trimmed means estimand’ in clinical trials at the International Biometric Conference in Riga, Latvia. The slides of my talk are available here for anyone interested. In this post I’ll give a brief overview of my talk.

Perfect prediction handling in smcfcs for R

May 24, 2022 by Jonathan Bartlett

One of the things users have often asked me about the substantive model compatible fully conditional specification multiple imputation approach is the problem of perfect prediction. This problem arises when imputing a binary (or more generally a categorical) variable using a binary (or categorical) predictor, if within one or more levels of the predictor the outcome is always 0 or always 1. Typically a logistic regression model is specified for the binary variable being imputed, and in the case of perfect prediction, the MLE for one or more parameters (on the log odds scale) is infinite. As described by White, Royston and Daniel (2010), this leads to problems in the imputations. In particular, to make the imputation process proper, new parameters of the logistic regression imputation model are drawn from a multivariate normal distribution. The perfect prediction data configuration leads to standard errors that are in theory infinite, and in practice on the computer are very large. These huge standard errors lead to posterior draws (or what are used in place of posterior draws) that fluctuate between being very large and negative and very large and positive, when in reality they ought to be large in one direction only (see Section 4 of White, Royston and Daniel (2010)).
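
To make the mechanism concrete, here is a minimal sketch in plain Python. It is not the smcfcs code (which is written in R), and the data are made up for illustration: 10 subjects with x = 0, of whom 4 have y = 1, and 10 subjects with x = 1, all of whom have y = 1. The function name `newton_logistic` and the choice of 15 Newton-Raphson iterations are my own assumptions. As a simplification, the last step draws only the slope from its marginal normal approximation, rather than drawing both coefficients from a multivariate normal.

```python
import math
import random


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def newton_logistic(n0, y0, n1, y1, iterations):
    """Newton-Raphson for logit P(Y=1) = b0 + b1*x with binary x.

    n0, y0: subjects and events with x = 0; n1, y1: the same for x = 1.
    Returns a list of (b0, b1, se_b1), one tuple per iteration.
    """
    b0, b1 = 0.0, 0.0
    path = []
    for _ in range(iterations):
        p0, p1 = sigmoid(b0), sigmoid(b0 + b1)
        # Score vector (first derivatives of the log-likelihood)
        u0 = (y0 - n0 * p0) + (y1 - n1 * p1)
        u1 = y1 - n1 * p1
        # Information matrix, built from the binomial variances w0 and w1
        w0, w1 = n0 * p0 * (1 - p0), n1 * p1 * (1 - p1)
        i00, i01, i11 = w0 + w1, w1, w1
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (information) * delta = score
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
        # Standard error of b1 from the inverse information at the new values
        p0, p1 = sigmoid(b0), sigmoid(b0 + b1)
        w0, w1 = n0 * p0 * (1 - p0), n1 * p1 * (1 - p1)
        se_b1 = math.sqrt(1.0 / w0 + 1.0 / w1)
        path.append((b0, b1, se_b1))
    return path


# Hypothetical data: y is always 1 when x = 1 (perfect prediction)
path = newton_logistic(n0=10, y0=4, n1=10, y1=10, iterations=15)
for step, (b0, b1, se_b1) in enumerate(path, start=1):
    print(f"iter {step:2d}: b0 = {b0:7.3f}  b1 = {b1:7.3f}  SE(b1) = {se_b1:10.2f}")

# Normal-approximation draws of the slope, as used in place of posterior draws
random.seed(2022)
_, b1_hat, se_hat = path[-1]
draws = [random.gauss(b1_hat, se_hat) for _ in range(200)]
print(f"smallest draw: {min(draws):.1f}, largest draw: {max(draws):.1f}")
```

The slope estimate keeps growing with every iteration instead of converging, and its standard error grows much faster. Draws from the resulting normal approximation then land far on both sides of zero, even though the data only support a large positive log odds ratio.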