## Is stratified randomisation in trials (at least large ones) pointless?

I recently wrote a blog post giving some comments about the revised FDA guidance on covariate adjustment in randomised trials. One of the things I commented on was recent work by different authors on the impacts of stratified randomisation on inference. It is well known (or should be!) that if stratified randomisation is used, the statistical analysis should adjust for the variables used in the stratified randomisation in order for the efficiency gain to be realised.

A fact that was new to me concerns the most common situation, in which randomisation to the two arms is 1:1. If the analysis adjusts (in a linear model for a continuous outcome) for dummy variables corresponding to each of the randomisation strata, the true (asymptotic) variance of the resulting treatment effect estimator is the same whether simple (non-stratified) or stratified randomisation is used. This is one of the results of Bugni et al 2018. Wang et al 2019 subsequently showed that it also holds if additional baseline covariates (not used in the stratified randomisation) are adjusted for, and that it holds for the standardisation type estimator of the marginal risk difference based on a logistic regression working model.

These results mean that, at least for large sample sizes, stratifying the randomisation on baseline covariates gains you no additional efficiency over simple randomisation, provided you adjust for the resulting strata as covariates in the analysis. These theoretical results agree with simulation results mentioned on Twitter today by Jack Wilkinson, which prompted this post.

The theoretical results in the above papers are asymptotic. It could therefore well be the case that with small sample sizes stratified randomisation does buy you some additional efficiency.

Moreover, in practice I believe it is more common in the analysis model to adjust for only main effects of the variables used to define the randomisation strata, rather than dummy variables for each of their combinations. My guess is that if an assumption that the outcome only depends on these variables via their main effects is correct, the theoretical results mentioned above that imply no (asymptotic) benefit to using stratified randomisation would also hold true for this type of analysis model.
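To make the distinction between the two analysis models concrete, here is a minimal sketch in R. It assumes two hypothetical binary stratification factors, `x1` and `x2` (names and data-generating model are mine, purely for illustration), and uses simple randomisation just to have some data to fit. It is not taken from the papers above.

```r
# Hypothetical example: two binary stratification factors x1 and x2
set.seed(1)
n  <- 200
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)
z  <- rbinom(n, 1, 0.5)        # simple randomisation, for illustration only
y  <- x1 + x2 + z + rnorm(n)   # outcome depends on x1 and x2 via main effects only

# (a) one dummy variable per randomisation stratum (2 x 2 = 4 strata)
stratum    <- interaction(x1, x2)
fit_strata <- lm(y ~ z + stratum)

# (b) main effects of the stratification factors only
fit_main <- lm(y ~ z + x1 + x2)

# Number of regression parameters in each model
ncol(model.matrix(fit_strata))  # 5: intercept, z, 3 stratum dummies
ncol(model.matrix(fit_main))    # 4: intercept, z, x1, x2
```

With two binary factors the models differ by a single interaction parameter, but with more factors, or factors with more levels, the number of stratum dummies in (a) grows much faster than the number of main effects in (b).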

I don’t have time to actively investigate this myself right now, but if anyone is interested in pursuing it and leading on the work for this, please get in touch.

## Postscript – a small simulation illustration

A small, simple simulation study in R follows. The setup is very simple: there is one binary baseline covariate (X) which influences the outcome. Either X is ignored in the randomisation (simple randomisation), or the randomisation is stratified on it to ensure balance. In both cases, the analysis is a linear regression of the outcome on treatment (Z) and this baseline covariate (X).

```r
library(blockrand)

n <- 250

nSim <- 10000
est <- array(0, dim=nSim)

# simple randomisation
for (sim in 1:nSim) {
  # simulate binary baseline covariate
  x <- 1*(runif(n)<0.5)
  # simulate treatment assignment, using simple randomisation
  z <- 1*(runif(n)<0.5)

  # outcome depends on covariate and treatment (true treatment effect is 1)
  y <- x+z+rnorm(n)
  # analysis model adjusts for the covariate and treatment
  mod <- lm(y~x+z)
  # store the estimated treatment effect (coefficient of z)
  est[sim] <- coef(mod)[3]
}

# look at empirical SE of estimates
sd(est)

# stratified block randomisation
for (sim in 1:nSim) {
  x <- 1*(runif(n)<0.5)
  # randomise separately within each stratum of x using permuted blocks;
  # blockrand can return more rows than requested, so truncate to stratum size
  z <- rep(0,n)
  n0 <- sum(x==0)
  z[x==0] <- as.numeric(blockrand(n=n0, num.levels=2)$treatment)[1:n0]-1
  n1 <- sum(x==1)
  z[x==1] <- as.numeric(blockrand(n=n1, num.levels=2)$treatment)[1:n1]-1

  y <- x+z+rnorm(n)
  mod <- lm(y~x+z)
  est[sim] <- coef(mod)[3]
}

# empirical SE under stratified randomisation
sd(est)
```

The empirical SE from simple randomisation (based on 10,000 simulations) was 0.1259364 and for stratified randomisation was 0.1254624. This shows that, at least in this setup, stratified randomisation does not materially reduce the (true) variability of the treatment effect estimates.
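As a rough sanity check on these numbers, one can compare them with the usual large-sample approximation for the variance of the treatment coefficient in a correctly specified linear model with 1:1 allocation, namely the residual variance divided by n × p × (1 − p). This approximation is my addition rather than something taken from the papers above. It assumes treatment is (approximately) balanced within levels of X, and the variable names are illustrative.

```r
n      <- 250   # sample size used in the simulation
sigma2 <- 1     # residual variance, since the errors are rnorm(n)
p      <- 0.5   # allocation probability under 1:1 randomisation

# Large-sample approximation: Var(beta_z) ~ sigma^2 / (n * p * (1 - p))
se_approx <- sqrt(sigma2 / (n * p * (1 - p)))
se_approx
```

This gives approximately 0.1265, close to both empirical SEs.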

These results are in accordance with a 1982 paper I just came across: ‘A note on stratifying versus complete random assignment in clinical trials’ by Grizzle.