Following my recent post on fitting an MMRM in SAS, R, and Stata, someone asked me when it is preferable to use a Mixed Model Repeated Measures (MMRM) analysis as opposed to a linear mixed effects model (LME) which includes subject level random effects (e.g. random intercepts).
What is the MMRM?
The term MMRM mainly comes from the literature on randomised trials (in particular pharmaceutical industry trials), where it is used to analyse the repeated measures taken over time on patients in the trial. In this context patients are intended to be measured at each of a fixed number of measurement times. The MMRM allows a separate mean parameter at each measurement time in each treatment group. This is a so-called saturated model for the mean, in the sense that it imposes no assumptions on how the mean differs by time or treatment. To account for the correlation between measurements from the same patient, the MMRM specifies an unstructured covariance matrix for the residual errors.
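To make this concrete, below is a minimal sketch of an MMRM fit in R, using simulated data and the gls function from the nlme package (one of several ways of fitting such a model); the variable names and the simulation setup are made up purely for illustration.

```r
# Simulate a two-arm trial with 4 post-baseline visits per patient, then
# fit an MMRM: saturated mean (treatment by time interaction) plus an
# unstructured covariance matrix for the residual errors.
library(MASS)   # for mvrnorm
library(nlme)

set.seed(1234)
n  <- 100   # patients per arm
tp <- 4     # number of measurement times

# true covariance of the repeated measures: variances increase over time
# and correlations decay with lag (i.e. not compound symmetric)
sd_t  <- c(1.0, 1.2, 1.4, 1.6)
corr  <- 0.7 ^ abs(outer(1:tp, 1:tp, "-"))
Sigma <- diag(sd_t) %*% corr %*% diag(sd_t)

# treatment effect that grows over time
y_ctl <- mvrnorm(n, mu = rep(0, tp), Sigma = Sigma)
y_trt <- mvrnorm(n, mu = c(0.2, 0.4, 0.6, 0.8), Sigma = Sigma)

dat <- data.frame(
  id    = factor(rep(1:(2 * n), each = tp)),
  treat = factor(rep(c("control", "active"), each = n * tp)),
  time  = factor(rep(1:tp, times = 2 * n)),
  y     = c(t(rbind(y_ctl, y_trt)))
)
dat$visit <- as.integer(dat$time)   # integer visit index for corSymm/corAR1

# MMRM: unstructured correlation (corSymm) plus a separate variance at
# each visit (varIdent) together give an unstructured covariance matrix
mmrm_fit <- gls(y ~ treat * time,
                data = dat,
                correlation = corSymm(form = ~ visit | id),
                weights = varIdent(form = ~ 1 | time))
summary(mmrm_fit)
```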
Is an MMRM an LME?
At least according to my go-to reference on mixed models (Verbeke and Molenberghs' Linear Mixed Models for Longitudinal Data), a linear mixed model is an extension of standard linear regression which contains some combination of subject/cluster level random effects and correlated residual errors. The MMRM is thus an example of a linear mixed model: it specifies no random effects, but it does specify a correlated residual error structure.
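For contrast, here is a sketch of the random effects alternative, fitted to the same simulated data as above: a subject level random intercept with independent residual errors, which implies a constant-correlation (compound symmetric) covariance matrix for the repeated measures.

```r
# Random intercept model: correlation between a patient's repeated
# measures is induced solely by the shared subject level intercept
ri_fit <- lme(y ~ treat * time,
              random = ~ 1 | id,
              data = dat)
summary(ri_fit)
```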
Random effects vs. MMRM
So, to the question I was asked: when is it preferable to use a model with random subject level effects, and when to use one with correlated residual errors, as in the MMRM?
The first thing to remember is that it generally only makes sense to fit the MMRM if subjects are measured at a common set of times. This is generally the case in designed studies, and randomised trials in particular. If the measurements occur on a more ad-hoc basis, such that the times of measurement vary across subjects, what is typically called the MMRM model can't really be fitted. In this case one can instead model the underlying trajectory of the process being measured using subject level random effects. It is worth noting that, if one uses a relatively simple random effects structure (e.g. intercepts and slopes), one can still specify that the residual errors are serially correlated over time. Here the correlation in the errors accommodates residual correlation between measurements close in time that cannot be captured by the simple random effects structure. See Section 3.3.4 of Verbeke and Molenberghs for more on this.
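As a rough sketch of this kind of model, again using the simulated data from above but now treating time as numeric (with genuinely irregular measurement times one would use the actual times), random intercepts and slopes can be combined with serially correlated residual errors via nlme's continuous-time AR(1) structure:

```r
# Subject level random intercepts and slopes, plus residual errors that
# are serially correlated in continuous time (corCAR1)
dat$time_num <- as.numeric(as.character(dat$time))

traj_fit <- lme(y ~ treat * time_num,
                random = ~ time_num | id,
                correlation = corCAR1(form = ~ time_num | id),
                data = dat)
summary(traj_fit)
```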
Let’s now consider the case where an MMRM is typically used, namely a randomised trial, where patients are intended to be measured at a fixed set of times after baseline. As mentioned earlier, the MMRM specifies a separate mean per combination of measurement time and treatment group. In this case, if there are no missing values, whether you account for correlation in the repeated measures using random effects or correlated residual errors has no impact on the estimation of the treatment group means (the fixed effects) at each time point, which are usually the parameters of primary interest in a randomised trial. It does however have an impact on the estimated standard errors: if the correlation structure of the repeated measures implied by the specified random effects is incorrect, the standard errors may be biased. This potential issue suggests that in this setting the MMRM, which uses an unstructured covariance matrix for the residual errors, is preferable.
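Continuing the sketches above, with complete data the MMRM and the random intercept model give essentially identical estimates of the treatment-by-time means, but their standard errors differ:

```r
# Fixed effect estimates agree (to numerical precision) with complete
# data, but the standard errors depend on the assumed covariance structure
round(cbind(est_mmrm = coef(mmrm_fit),
            est_ri   = fixef(ri_fit),
            se_mmrm  = sqrt(diag(vcov(mmrm_fit))),
            se_ri    = sqrt(diag(vcov(ri_fit)))), 3)
```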
If, as is commonly the case, there are missing values, the choice between random effects, a structured residual correlation structure (e.g. first order autoregressive), and an unstructured residual correlation structure becomes even more important, because it critically affects how the model handles the missing values. With missing values, estimates from mixed models fitted by maximum likelihood are valid under the missing at random assumption, provided the model is correctly specified, and the modelling assumptions now play a critical role in how the model (implicitly) imputes the missing values. If one uses a model with random subject effects and independent residual errors, and the implied correlation structure for the repeated measurements is incorrect, the missing values are (implicitly) imputed from an incorrect imputation distribution, which will lead to biased estimates of the treatment group means. This again indicates why the MMRM model is (in this setting) generally preferable to a model with random subject effects.
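A rough way to see this with the simulated data above is to impose a (made up) missing at random dropout mechanism, in which the chance of dropping out at each visit depends on the previous outcome, and then refit both models:

```r
# dat is ordered by subject and then visit (from its construction above)
y_prev <- ave(dat$y, dat$id, FUN = function(x) c(NA, x[-length(x)]))

# dropout hazard at visits 2-4 depends on the previous outcome, which is
# observed at the point dropout occurs, so the mechanism is MAR
haz     <- ifelse(is.na(y_prev), 0, plogis(-2 + 1.5 * y_prev))
dropped <- as.logical(ave(as.numeric(runif(nrow(dat)) < haz),
                          dat$id, FUN = function(x) cumsum(x) > 0))
dat$y_obs <- ifelse(dropped, NA, dat$y)

mmrm_mar <- gls(y_obs ~ treat * time, data = dat, na.action = na.omit,
                correlation = corSymm(form = ~ visit | id),
                weights = varIdent(form = ~ 1 | time))

ri_mar <- lme(y_obs ~ treat * time, random = ~ 1 | id,
              data = dat, na.action = na.omit)

# unlike the complete data case, the estimated means now differ; over
# repeated simulations the random intercept model is biased here because
# its implied compound symmetric covariance structure is wrong
round(cbind(est_mmrm = coef(mmrm_mar), est_ri = fixef(ri_mar)), 3)
```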
Further justification for using an unstructured covariance structure for the residual errors, rather than a structured one, is given in Section 3.2 of Carpenter and Kenward’s excellent (freely available) ‘Missing data in clinical trials’. They show that even when in truth the data follow a simpler covariance structure (e.g. first order autoregressive), using this (correct) assumption in the mixed model only improves statistical power by a small amount compared to using the more general unstructured covariance, except for very small sample sizes.
If the number of measurement times is moderate or large and there are missing values, one can sometimes encounter convergence problems with the unstructured covariance matrix in the MMRM. If this occurs, trials often pre-specify the use of models which parametrise the residual error covariance matrix using fewer parameters (e.g. first order autoregressive). Following the discussion above, this solution is potentially vulnerable to bias if the simpler covariance structure assumed is incorrect. A possible alternative would be to use doubly robust estimation, which would be (asymptotically) unbiased if either the outcome model (including the simpler parametrisation of the residual covariance matrix) or the model for dropout (missingness) were correctly specified. Lastly, it is of course important to remember that with missing values everything one does relies on the missing data assumption made (i.e. missing at random for mixed models) being at least approximately correct.
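As a sketch of such a fallback, continuing with the simulated incomplete data from above, the unstructured corSymm structure can simply be swapped for a first order autoregressive one:

```r
# Same saturated mean, but residual correlations now follow AR(1),
# i.e. correlation rho^|lag| governed by a single parameter
ar1_fit <- gls(y_obs ~ treat * time, data = dat, na.action = na.omit,
               correlation = corAR1(form = ~ visit | id))
summary(ar1_fit)
```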