A couple of months ago I came across the paper “Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: a simulation study”, by Egbewale, Lewis and Sim, published in the open access online journal BMC Medical Research Methodology. As the title says, the authors use simulation to investigate the bias, precision and power of three analysis methods for a randomized trial with a continuous outcome and a baseline measure of the same variable, when the baseline measure is imbalanced between the treatment groups. The three methods considered are ANOVA (here a two-sample t-test), change score analysis (CSA, an analysis of the change from baseline to follow-up), and analysis of covariance (ANCOVA), which corresponds to fitting a linear regression model with the outcome measurement as the dependent variable and randomized treatment and the baseline measure as covariates.
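To make the three methods concrete, here is a minimal simulation sketch. It is not the paper's simulation design: the sample size, true treatment effect, baseline-outcome correlation and the function names `simulate_trial` and `three_estimates` are all assumptions chosen for illustration. Each simulated trial is analysed by ANOVA (difference in mean outcomes), CSA (difference in mean change scores) and ANCOVA (the treatment coefficient from a least squares fit of outcome on treatment and baseline).

```python
import numpy as np


def simulate_trial(rng, n=200, effect=1.0, rho=0.5):
    """Simulate one two-arm trial (all parameter values are illustrative assumptions)."""
    x = rng.normal(size=n)                           # baseline measure
    z = rng.permutation(np.repeat([0, 1], n // 2))   # 1:1 randomized treatment
    noise = rng.normal(scale=np.sqrt(1 - rho**2), size=n)
    y = rho * x + effect * z + noise                 # follow-up outcome
    return x, z, y


def three_estimates(x, z, y):
    """Return the ANOVA, CSA and ANCOVA estimates of the treatment effect."""
    # ANOVA: difference in mean outcome between arms (two-sample t-test estimate)
    anova = y[z == 1].mean() - y[z == 0].mean()
    # CSA: difference between arms in mean change from baseline
    change = y - x
    csa = change[z == 1].mean() - change[z == 0].mean()
    # ANCOVA: regress outcome on intercept, treatment and baseline
    design = np.column_stack([np.ones_like(x), z, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return anova, csa, coef[1]


rng = np.random.default_rng(2024)
results = np.array([three_estimates(*simulate_trial(rng)) for _ in range(2000)])
means = results.mean(axis=0)         # average estimate per method
sds = results.std(axis=0, ddof=1)    # empirical standard error per method

for name, m, s in zip(["ANOVA", "CSA", "ANCOVA"], means, sds):
    print(f"{name:6s} mean estimate = {m:.3f}, empirical SE = {s:.3f}")
```

Averaged over all simulated trials, the three estimates should each sit close to the assumed true effect, with ANCOVA showing the smallest empirical standard error. This sketch does not condition on the observed baseline imbalance, which is the setting the paper focuses on.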
Randomized controlled trials
Robustness to misspecification when adjusting for baseline in RCTs
It is well known that adjusting for one or more baseline covariates can increase statistical power in randomized controlled trials. One reason that adjusted analyses are not used more widely may be that researchers are concerned that results will be biased if the effects of the baseline covariate(s) are not modelled correctly in the regression model for the outcome. For example, a continuous baseline covariate would by default be entered linearly in a regression model, but in truth its effect on outcome may be non-linear. In this post we’ll review an important result which shows that, for continuous outcomes modelled with linear regression, this does not matter in terms of bias: we obtain unbiased estimates of the treatment effect even if we mis-specify a baseline covariate’s effect on outcome.
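The robustness result can be checked informally by simulation. In the sketch below, the outcome depends on the baseline covariate through an exponential (an arbitrary non-linear choice made for illustration), but the analysis model wrongly enters the covariate linearly. The sample size, true effect, number of replications and the function name `one_trial` are likewise assumptions, not taken from the post.

```python
import numpy as np


def one_trial(rng, n=200, effect=1.0):
    """Simulate one trial with a non-linear baseline effect and analyse it two ways."""
    x = rng.normal(size=n)                           # baseline covariate
    z = rng.permutation(np.repeat([0, 1], n // 2))   # 1:1 randomized treatment
    # True model: outcome depends on exp(x), not linearly on x
    y = np.exp(x) + effect * z + rng.normal(size=n)

    # Unadjusted estimate: difference in mean outcome between arms
    unadjusted = y[z == 1].mean() - y[z == 0].mean()

    # Mis-specified adjusted estimate: covariate entered linearly
    design = np.column_stack([np.ones_like(x), z, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return unadjusted, coef[1]


rng = np.random.default_rng(7)
estimates = np.array([one_trial(rng) for _ in range(2000)])
mean_est = estimates.mean(axis=0)        # average estimate per method
se_est = estimates.std(axis=0, ddof=1)   # empirical standard error per method

print(f"unadjusted:        mean = {mean_est[0]:.3f}, SE = {se_est[0]:.3f}")
print(f"linear adjustment: mean = {mean_est[1]:.3f}, SE = {se_est[1]:.3f}")
```

Under these assumptions the mis-specified adjusted estimate should still average close to the true effect, and it should still be somewhat more precise than the unadjusted estimate, because the linear term captures part of the covariate's non-linear effect.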
Clustering in randomized controlled trials
Randomized clinical trials often involve some sort of clustering. The most obvious case is a cluster randomized trial, where clusters form the unit of randomization. It is well known that in this case the clustering must be allowed for in the analysis. But even in the common setting where individuals are randomized, clustering may be present. Perhaps the most common situation is where a trial involves a number of hospitals or centres, and individuals are recruited into the trial when they attend their local centre. Another example is where the intervention is administered to each individual by some professional (e.g. surgeon, therapist), such that outcomes from individuals treated by the same professional may be more similar to each other. In both of these situations, an obvious question is whether we need to allow for the clustering in the analysis.