Generalised estimating equations (GEEs) and generalised linear mixed models (GLMMs) are two approaches to modelling clustered or longitudinal categorical outcomes. Here I will focus on the common setting of a binary outcome. As is commonly described, the two approaches estimate different effect measures: GEEs target so-called marginal (population-averaged) effects, while GLMMs target conditional or subject-specific effects. Understanding the difference between these can, I think, be quite tricky.
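To make the distinction concrete, here is a small Monte Carlo sketch (my own illustration, with hypothetical parameter values) for a random-intercepts logistic model, logit P(Y=1 | u_i, x) = beta0 + beta1 x + u_i with u_i ~ N(0, sigma^2). The coefficient beta1 is the conditional (subject-specific) log odds ratio; the marginal log odds ratio is obtained by averaging the probabilities over the random intercepts first, and is attenuated towards zero.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = -1.0, 1.0, 2.0        # hypothetical parameter values
u = rng.normal(0.0, sigma, size=1_000_000)  # random intercepts u_i

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

# Marginal probabilities: average the subject-specific probabilities over u
p0 = expit(beta0 + u).mean()            # x = 0
p1 = expit(beta0 + beta1 + u).mean()    # x = 1

marginal_log_or = np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0))
print(f"conditional log OR: {beta1:.3f}")
print(f"marginal log OR:    {marginal_log_or:.3f}")  # smaller in magnitude than beta1
```

The gap between the two grows with the random-intercept variance: the larger sigma is, the more the marginal log odds ratio shrinks relative to the conditional one.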

# Longitudinal and clustered data

## Robustness of linear mixed models

Linear mixed models form an extremely flexible class of models for continuous outcomes where data are collected longitudinally, are clustered, or more generally have some sort of dependency structure between observations. They model outcomes using a combination of so-called fixed effects and random effects. Random effects allow for the possibility that one or more covariates have effects that vary from unit (cluster, subject) to unit. In the context of modelling longitudinal repeated measures data, popular linear mixed models include the random-intercepts and random-slopes models, which respectively allow each unit to have its own intercept, or its own intercept and slope.
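As a minimal sketch (names and parameter values are my own, for illustration), the random-intercepts model for longitudinal data is y_ij = beta0 + beta1 t_j + u_i + e_ij, where u_i ~ N(0, tau^2) is unit i's random intercept and e_ij ~ N(0, sigma^2) is the within-unit residual error. Simulating from it shows how the shared random intercept induces correlation between measurements on the same unit:

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_times = 500, 4
beta0, beta1 = 2.0, 0.5        # fixed effects (hypothetical values)
tau, sigma = 1.0, 0.5          # random-intercept and residual SDs

t = np.arange(n_times)                          # common measurement times
u = rng.normal(0.0, tau, size=(n_units, 1))     # one intercept per unit
e = rng.normal(0.0, sigma, size=(n_units, n_times))
y = beta0 + beta1 * t + u + e                   # n_units x n_times outcomes

# Measurements on the same unit share u_i, so
# Corr(y_ij, y_ik) = tau^2 / (tau^2 + sigma^2)  (= 0.8 here)
print(np.corrcoef(y[:, 0], y[:, 1])[0, 1])
```

This within-unit correlation is exactly the dependency structure that ordinary regression ignores and that mixed models are built to accommodate.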

As implemented in statistical packages, linear mixed models assume that the dependency structure has been modelled correctly, that both the random effects and the within-unit residual errors follow normal distributions, and that these have constant variance. While these assumptions can to some extent be checked through various diagnostics, a natural concern is that if one or more of them do not hold, our inferences may be invalid. Fortunately, it turns out that linear mixed models are robust to violations of some of their assumptions.
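One such robustness property can be sketched with a quick simulation (my own illustration, with assumed values): even when the random intercepts are badly non-normal, the fixed effects can still be estimated without bias. Here the intercepts come from a heavily skewed, centred exponential distribution, violating the normality assumption; with a balanced design, simple pooled least squares already recovers the fixed-effect slope, and a mixed-model fit would behave similarly for the point estimate (standard errors are a separate story).

```python
import numpy as np

rng = np.random.default_rng(2)
n_units, n_times = 2000, 4
beta0, beta1 = 2.0, 0.5                             # true fixed effects
u = rng.exponential(1.0, size=(n_units, 1)) - 1.0   # skewed, mean-zero intercepts
e = rng.normal(0.0, 0.5, size=(n_units, n_times))

t = np.arange(n_times)
y = beta0 + beta1 * t + u + e

# Pooled least squares of y on time, ignoring clustering entirely;
# the slope estimate is still centred on the true value 0.5
slope, intercept = np.polyfit(np.tile(t, n_units), y.ravel(), 1)
print(f"estimated slope: {slope:.3f}")
```

The point being illustrated is about bias in the point estimates; valid standard errors under misspecification are a different question, and one where the choice between model-based and robust variance estimators matters.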