It is well known that adjusting for one or more baseline covariates can increase statistical power in randomized controlled trials. One reason adjusted analyses are not used more widely may be a concern that results will be biased if the effects of the baseline covariates are not modelled correctly in the regression model for the outcome. For example, a continuous baseline covariate would by default be entered linearly in a regression model, but in truth its effect on the outcome may be non-linear. In this post we'll review an important result which shows that for continuous outcomes modelled with linear regression, this does not matter in terms of bias: we obtain unbiased estimates of the treatment effect even if we mis-specify a baseline covariate's effect on the outcome.
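To make the claim concrete, here is a minimal simulation sketch. It is an illustration under assumptions of my own choosing, not the derivation behind the result: the sample size, the true effect of 1.0, the `exp(x)` form of the covariate effect, and the function names `simulate_trial` and `run_simulation` are all hypothetical. Each simulated trial uses simple 1:1 randomization, generates an outcome that depends non-linearly on the covariate, and then fits a linear regression that (wrongly) enters the covariate linearly.

```python
import math
import random
import statistics


def centered_cross(a, b):
    """Sum of cross-products of a and b about their means."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def simulate_trial(rng, n=200, true_effect=1.0):
    """Simulate one trial; return (adjusted, unadjusted) effect estimates."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]   # baseline covariate
    z = [rng.randint(0, 1) for _ in range(n)]     # simple 1:1 randomization
    # Assumed true outcome model: the covariate acts through exp(x), not linearly.
    y = [true_effect * zi + math.exp(xi) + rng.gauss(0.0, 1.0)
         for zi, xi in zip(z, x)]

    # Unadjusted estimate: difference in group means.
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    unadjusted = statistics.fmean(treated) - statistics.fmean(control)

    # Adjusted estimate: OLS coefficient on z from the mis-specified model
    # y ~ 1 + z + x, obtained by solving the 2x2 centered normal equations.
    szz, sxx, szx = centered_cross(z, z), centered_cross(x, x), centered_cross(z, x)
    szy, sxy = centered_cross(z, y), centered_cross(x, y)
    adjusted = (sxx * szy - szx * sxy) / (szz * sxx - szx ** 2)
    return adjusted, unadjusted


def run_simulation(n_sims=1000, seed=2024):
    """Repeat the trial many times and summarize both estimators."""
    rng = random.Random(seed)
    results = [simulate_trial(rng) for _ in range(n_sims)]
    adjusted = [r[0] for r in results]
    unadjusted = [r[1] for r in results]
    return {
        "adjusted_mean": statistics.fmean(adjusted),
        "unadjusted_mean": statistics.fmean(unadjusted),
        "adjusted_sd": statistics.stdev(adjusted),
        "unadjusted_sd": statistics.stdev(unadjusted),
    }


if __name__ == "__main__":
    for name, value in run_simulation().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, both estimators should average close to the true effect across simulated trials, while the adjusted estimates should vary less from trial to trial, since the linear term still captures part of the covariate's effect on the outcome.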

# Randomized controlled trials

## Clustering in randomized controlled trials

Randomized clinical trials often involve some sort of clustering. The most obvious case is a cluster randomized trial, where clusters form the unit of randomization. It is well known that in this case the clustering must be allowed for in the analysis. But even in the common setting where individuals are randomized, clustering may be present. Perhaps the most common situation is where a trial involves a number of hospitals or centres, and individuals are recruited into the trial when they attend their local centre. Another example is where the intervention is administered to each individual by some professional (e.g. surgeon, therapist), such that outcomes from individuals treated by the same professional may be more similar to each other. In both of these situations, an obvious question is whether we need to allow for the clustering in the analysis.

## Leveraging baseline covariates for improved efficiency in randomized controlled trials

In a previous post I talked about the issue of covariate adjustment in randomized controlled trials, and the potential for improving the precision of treatment effect estimates. In this post I'll look at one of the (fairly) recently developed approaches for improving estimates of marginal treatment effects, based on semiparametric theory.

## Adjusting for baseline covariates in randomized controlled trials

Randomized controlled trials are generally considered to be the gold standard design for evaluating the effects of some intervention or treatment of interest. The fact that participants are randomized to the two (sometimes more) groups ensures that, at least in expectation, the treatment groups are balanced with respect to both measured and, importantly, unmeasured factors which may influence the outcome. As a consequence, differences in outcomes between the groups can be attributed to the effect of being randomized to the treatment rather than the control (which would often be another treatment).
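The phrase "in expectation" is worth a small illustration. The sketch below is a hypothetical simulation, with the trial size, the standard normal factor `u`, and the function names chosen purely for illustration. It repeatedly randomizes participants to two arms and records the between-arm difference in the mean of a factor that the analyst never observes.

```python
import random
import statistics


def arm_imbalance(rng, n=100):
    """Between-arm difference in the mean of an unmeasured factor, one trial."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]   # hypothetical unmeasured factor
    arm = [rng.randint(0, 1) for _ in range(n)]   # simple 1:1 randomization
    treated = [ui for ui, a in zip(u, arm) if a == 1]
    control = [ui for ui, a in zip(u, arm) if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def imbalance_summary(n_trials=2000, seed=7):
    """Mean and SD of the imbalance across many hypothetical trials."""
    rng = random.Random(seed)
    diffs = [arm_imbalance(rng) for _ in range(n_trials)]
    return statistics.fmean(diffs), statistics.stdev(diffs)


if __name__ == "__main__":
    mean_diff, sd_diff = imbalance_summary()
    print(f"mean imbalance: {mean_diff:.3f}, SD of imbalance: {sd_diff:.3f}")
```

Any single trial can show some chance imbalance, which is what the standard deviation reflects, but the average imbalance across repeated randomizations should sit close to zero.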