The FDA recently published revised guidance on statistical methods for adjusting for baseline covariates in trials. Overall I like the guidance and think it will prove useful. In this post I’ll give a few thoughts on aspects of the revised guidance, organised according to the sections of the guidance document.

# Covariate adjustment

## Confounding vs. effect modification

A student asked me today about the difference between confounding and effect modification. In this post I’ll try to distinguish the two conceptually and illustrate the differences using some simple, very large simulated datasets in R.
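To make the distinction concrete, here is a minimal sketch of that kind of simulation. It is written in Python with only the standard library rather than R, and everything in it (the function names `simulate`, `mean_diff` and `summarise`, the variables `Z`, `X`, `Y`, and all effect sizes) is a hypothetical choice for illustration, not taken from the original post. `Z` acts as a confounder when it changes the probability of exposure `X`, and as an effect modifier when the effect of `X` on `Y` differs between its strata.

```python
import random


def simulate(n, confounded, modified, seed=2024):
    """Simulate (z, x, y) rows with binary Z, binary exposure X, continuous Y.

    confounded: if True, P(X=1) is 0.8 when Z=1 and 0.2 when Z=0 (else 0.5).
    modified:   if True, the effect of X on Y is 1 when Z=0 and 3 when Z=1
                (else 2 in both strata).
    Z itself always shifts the mean of Y by 2.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        p_x = (0.8 if z else 0.2) if confounded else 0.5
        x = int(rng.random() < p_x)
        effect = (3.0 if z else 1.0) if modified else 2.0
        y = 2.0 * z + effect * x + rng.gauss(0.0, 1.0)
        rows.append((z, x, y))
    return rows


def mean_diff(rows):
    """Mean of Y among X=1 minus mean of Y among X=0."""
    y1 = [y for _, x, y in rows if x == 1]
    y0 = [y for _, x, y in rows if x == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def summarise(rows):
    """Return (crude, Z=0 stratum, Z=1 stratum) differences in mean Y."""
    strata = [mean_diff([r for r in rows if r[0] == z]) for z in (0, 1)]
    return mean_diff(rows), strata[0], strata[1]


if __name__ == "__main__":
    for label, conf, mod in [("confounding only", True, False),
                             ("effect modification only", False, True)]:
        crude, d0, d1 = summarise(simulate(200_000, conf, mod))
        print(f"{label}: crude={crude:.2f}, Z=0: {d0:.2f}, Z=1: {d1:.2f}")
```

By construction, under confounding alone the two stratum-specific differences both target 2 while the crude difference targets 3.2, so the crude comparison is biased for a single common effect. Under effect modification alone there is no bias to remove: the stratum-specific differences target 1 and 3, and the crude difference targets their average of 2.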

## ANCOVA in RCTs – model-based standard errors are valid even under misspecification

A nice paper by Wang and colleagues has just been published in Biometrics, examining the robustness of ANCOVA (i.e. linear regression) for analysing continuous outcomes in randomised trials. By ANCOVA they mean a linear regression of the outcome on randomised treatment group and baseline covariates. Yang and Tsiatis showed in 2001 that this estimator is consistent for the average treatment effect even if the model is misspecified, and as such it can be recommended for general use in the analysis of such trials. They offered a variance estimator for the resulting treatment effect estimate that is valid even if the model is misspecified, and compared it in simulations to the usual model-based variance estimator from ANCOVA (which they refer to as the OLS variance). Yang and Tsiatis reported that the OLS variance estimator performed well even when the linear model was misspecified, but their results suggested that it had some bias as the sample size increased.

In their new paper, Wang and colleagues prove that, so long as the randomisation ratio is 1:1, the standard model-based variance estimator for the adjusted treatment effect from ANCOVA is valid even under model misspecification. This further strengthens the case for using a linear regression model with adjustment for baseline covariates in randomised trials.
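The 1:1 result can be checked informally by simulation. The sketch below is not from Wang and colleagues' paper: the data-generating model, the function names `ancova_fit` and `simulate`, and the 4:1 comparison are all assumptions made for illustration. The outcome depends on the baseline covariate through `x + x**2` and the treatment effect is `1 + x**2`, so the fitted model `y ~ 1 + a + x` is deliberately misspecified. Treatment is assigned by independent coin flips, matching the simple randomisation assumption. The code compares the average model-based standard error with the empirical standard deviation of the estimates across replicates.

```python
import math
import random
import statistics


def ancova_fit(a, x, y):
    """OLS fit of y ~ 1 + a + x; returns (treatment estimate, model-based SE)."""
    n = len(y)
    ma, mx, my = sum(a) / n, sum(x) / n, sum(y) / n
    # Centred sums of squares and cross-products.
    saa = sum((ai - ma) ** 2 for ai in a)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sax = sum((ai - ma) * (xi - mx) for ai, xi in zip(a, x))
    say = sum((ai - ma) * (yi - my) for ai, yi in zip(a, y))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    det = saa * sxx - sax ** 2
    b_a = (sxx * say - sax * sxy) / det
    b_x = (saa * sxy - sax * say) / det
    # Usual residual variance with 3 estimated coefficients.
    s2 = (syy - b_a * say - b_x * sxy) / (n - 3)
    return b_a, math.sqrt(s2 * sxx / det)


def simulate(p_treat, n=200, reps=1500, seed=1):
    """Return (mean estimate, empirical SD, mean model-based SE)."""
    rng = random.Random(seed)
    ests, ses = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        a = [float(rng.random() < p_treat) for _ in range(n)]
        # True average treatment effect is E[1 + x^2] = 2.
        y = [xi + xi ** 2 + ai * (1.0 + xi ** 2) + rng.gauss(0.0, 1.0)
             for ai, xi in zip(a, x)]
        est, se = ancova_fit(a, x, y)
        ests.append(est)
        ses.append(se)
    return statistics.mean(ests), statistics.stdev(ests), statistics.mean(ses)


if __name__ == "__main__":
    for p in (0.5, 0.8):
        mean_est, sd, se = simulate(p)
        print(f"P(treat)={p}: mean estimate={mean_est:.2f}, "
              f"model SE / empirical SD = {se / sd:.2f}")
```

With 1:1 allocation the ratio of model-based SE to empirical SD should be close to 1 despite the misspecification. With 4:1 allocation and this particular data-generating model, a large-sample calculation suggests the model-based SE is roughly a third too large, which illustrates why the 1:1 condition matters.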

An important caveat to the result of Wang and colleagues is that they assumed treatment is assigned completely at random, and in particular independently of the baseline covariates. This rules out certain randomisation schemes, such as stratified randomisation, where the allocation depends on the subject’s baseline covariates. Indeed, we know that in this setting, if we don’t properly model the effects of the variables used to stratify the randomisation, our treatment effect variance estimates are in general not correct.
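The stratified randomisation point can also be illustrated with a small simulation. This is a sketch under assumed settings, not an analysis from the paper: the allocation scheme (balancing the arms within each level of a binary covariate `z`), the outcome model, and the function names `fit`, `stratified_allocation` and `simulate` are all hypothetical. It compares the model-based standard error with the empirical standard deviation of the treatment estimate, first ignoring `z` in the analysis and then adjusting for it.

```python
import math
import random
import statistics


def fit(a, z, y, adjust):
    """OLS treatment estimate and model-based SE for y ~ a (+ z if adjust)."""
    n = len(y)
    ma, mz, my = sum(a) / n, sum(z) / n, sum(y) / n
    saa = sum((ai - ma) ** 2 for ai in a)
    szz = sum((zi - mz) ** 2 for zi in z)
    syy = sum((yi - my) ** 2 for yi in y)
    saz = sum((ai - ma) * (zi - mz) for ai, zi in zip(a, z))
    say = sum((ai - ma) * (yi - my) for ai, yi in zip(a, y))
    szy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    if not adjust:
        b_a = say / saa
        s2 = (syy - b_a * say) / (n - 2)
        return b_a, math.sqrt(s2 / saa)
    det = saa * szz - saz ** 2
    b_a = (szz * say - saz * szy) / det
    b_z = (saa * szy - saz * say) / det
    s2 = (syy - b_a * say - b_z * szy) / (n - 3)
    return b_a, math.sqrt(s2 * szz / det)


def stratified_allocation(z, rng):
    """Within each stratum of z, allocate half to each arm (up to rounding)."""
    a = [0.0] * len(z)
    for level in set(z):
        idx = [i for i, zi in enumerate(z) if zi == level]
        arms = [1.0, 0.0] * (len(idx) // 2)
        if len(idx) % 2:  # odd stratum size: last arm by coin flip
            arms.append(float(rng.random() < 0.5))
        rng.shuffle(arms)
        for i, arm in zip(idx, arms):
            a[i] = arm
    return a


def simulate(adjust, n=200, reps=1500, seed=7):
    """Return (empirical SD of estimates, mean model-based SE)."""
    rng = random.Random(seed)
    ests, ses = [], []
    for _ in range(reps):
        z = [float(rng.random() < 0.5) for _ in range(n)]
        a = stratified_allocation(z, rng)
        # z is strongly prognostic; the treatment effect is 1.
        y = [3.0 * zi + ai + rng.gauss(0.0, 1.0) for ai, zi in zip(a, z)]
        est, se = fit(a, z, y, adjust)
        ests.append(est)
        ses.append(se)
    return statistics.stdev(ests), statistics.mean(ses)


if __name__ == "__main__":
    for adjust in (False, True):
        sd, se = simulate(adjust)
        print(f"adjust for z={adjust}: model SE / empirical SD = {se / sd:.2f}")
```

In this setup the unadjusted analysis should give model-based standard errors that are clearly larger than the true variability of the estimate, because the forced balance on `z` removes variability that the unadjusted model still counts as noise. Adjusting for `z` should bring the ratio back to about 1.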