Today I was lucky enough to listen to Prof William Rosenberger present the 15th Armitage lecture in Cambridge. Prof Rosenberger has worked extensively on randomisation in trials in various respects (see his book), and he delivered a really interesting talk. The talk can now be viewed online here.
Randomisation based inference in trials
One of Prof Rosenberger’s main points was that the so called randomisation based approach to inference from trials is largely neglected in biostatistics teaching nowadays, something which I can personally attest to. The idea is that rather than imagining the patients recruited into a given trial as some probability sample from a larger population, we consider the recruited patients as a fixed finite population, about which we wish to make some inferences.
We assume that for each of the recruited individuals there are two outcome values, corresponding to the outcome they would experience should they be randomised to each of the two treatments (with obvious extension to more than two treatments). Treatment allocation is then randomised somehow (another of Prof Rosenberger’s points was the lack of detail given in publications regarding exactly how randomisation was performed). We then choose a test statistic, such as the difference in mean outcomes between the randomised groups, and consider its distribution under the sharp null hypothesis that the treatment has no effect on outcomes (at all). Under this null the outcomes are fixed, so the distribution of the test statistic is induced entirely by the randomisation process used, and the p-value is the probability under this distribution of seeing a test statistic as extreme as the one observed.
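To make this concrete, here is a minimal sketch in Python of such a randomisation test for the difference in means. The data are entirely hypothetical, and for simplicity I take the re-randomisation scheme to be a random permutation of a 1:1 allocation; in practice the reference distribution should be generated using whatever randomisation procedure the trial actually used.

```python
# Minimal sketch of a randomisation test (hypothetical data, 1:1 allocation
# re-randomised by permutation purely for illustration).
import numpy as np

rng = np.random.default_rng(2023)

# Observed outcomes and treatment indicators for the recruited patients
y = np.array([5.1, 6.3, 4.8, 7.2, 5.9, 6.8, 4.5, 7.5])  # hypothetical outcomes
z = np.array([0,   1,   0,   1,   0,   1,   0,   1])    # 1 = active, 0 = control

def diff_in_means(y, z):
    return y[z == 1].mean() - y[z == 0].mean()

observed = diff_in_means(y, z)

# Under the sharp null the outcomes are fixed, so we simply re-run the
# randomisation and recompute the test statistic each time.
n_rerandomisations = 10000
stats = np.array([diff_in_means(y, rng.permutation(z))
                  for _ in range(n_rerandomisations)])

# Two-sided p-value: probability of a statistic at least as extreme as observed
p_value = np.mean(np.abs(stats) >= np.abs(observed))
print(f"observed difference = {observed:.2f}, randomisation p-value = {p_value:.3f}")
```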
Such randomisation based inference allows one to judge evidence against the null of there being no treatment effect in the specific patients recruited into the trial. Of itself, it makes no broader claims about the efficacy of the treatments in patients not involved in the trial. Prof Rosenberger highlighted that many (eminent) statisticians in the past have argued that this approach is the basis for inference from randomised trials.
Population model based inference
The vast majority of statistical analyses of randomised trials are based on the so called population model approach. Prof Rosenberger’s description of this model is that we assume the patients randomised to the two treatments are random samples from infinite populations of patients taking the two treatments. As Prof Rosenberger noted in his talk and his book, this characterisation typically does not concur with the reality of how trials are conducted: as he points out in the book, if one of the treatments is novel, there may be no patient currently receiving this treatment in the population. As such, he advocates the randomisation based inference approach as preferable, since its assumptions match the reality of how trials are actually conducted.
One thought in response to this is that a different characterisation of the population model approach has been given in the causal inference world, which I don’t believe requires the notion of there being two super populations of patients taking the two treatments under consideration. Specifically (and omitting some important details and assumptions), we assume that the patients recruited into the trial are randomly sampled from some (super) population. Each individual in the population (and thereby the sample) has potential outcome values Y0 and Y1, which correspond to the outcome values they would experience if given control or active treatment. The randomisation indicator Z then determines which of these two potential outcomes is observed and which is unobserved. The causal effect for an individual is then some contrast of Y0 and Y1, and if Z is randomly assigned, we are able to make inferences about certain population parameters, such as the mean of Y1-Y0. Here there is no notion that there exists a population of patients already taking each of the two treatments. Thus the point that no patients may currently be taking the novel treatment in the population doesn’t seem quite right to me.
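As a small illustration of this characterisation, the following simulated sketch (the data-generating values are invented purely for illustration) gives each sampled individual both potential outcomes, lets the randomisation indicator Z determine which one is observed, and shows that the difference in observed means estimates the population mean of Y1-Y0.

```python
# Simulated sketch of the potential outcomes characterisation described above
# (hypothetical data-generating values, purely for illustration).
import numpy as np

rng = np.random.default_rng(42)
n = 100000  # large sample drawn from the (super) population

# Each individual carries both potential outcomes Y0 and Y1
y0 = rng.normal(loc=5.0, scale=1.0, size=n)
y1 = y0 + rng.normal(loc=1.0, scale=0.5, size=n)  # true mean of Y1 - Y0 is 1.0

# The randomisation indicator Z determines which potential outcome is observed
z = rng.binomial(1, 0.5, size=n)
y_obs = np.where(z == 1, y1, y0)

# Because Z is randomly assigned, the difference in observed group means
# estimates the population mean of Y1 - Y0
estimate = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(f"true mean of Y1 - Y0: {(y1 - y0).mean():.3f}, estimate: {estimate:.3f}")
```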
Moreover, the notion of the two potential outcomes would seem to be present within the randomisation based inference approach too. From Prof Rosenberger’s book:
The essential feature of a randomisation test is that, under the randomization null hypothesis, the set of observed responses is assumed to be a set of deterministic values that are unaffected by treatment.
Here what is meant by ‘unaffected by treatment’ is presumably that Y0=Y1 for each individual.
Generalising inferences from trials to populations
One argument sometimes put forward (including, I believe, by Stephen Senn) is that population model based inference for trials may often be inappropriate, because the notion that the recruited patients are a random sample does not accord at all with the reality of how patients are recruited in most trials. In contrast, because randomisation based inference only attempts to make inferences about the particular patients recruited into the trial, it does not suffer from this issue.
A logical next question, however, is this: if one uses randomisation based inference for a trial because one has doubts about the representativeness of the trial’s patients, what use are those inferences and conclusions as regards treatment effects in patients not included in the trial? After all, the whole purpose of randomised trials is to learn about the efficacy of one treatment compared to another, estimated in such a way that the results can be used to make decisions about treatment choice for future patients not in the trial.
Towards the end of his talk, Prof Rosenberger offered answers to a series of commonly asked questions about randomisation based inference, one of which was the question raised in the preceding paragraph:
Q: If the analysis of a clinical trial is based on a randomization model that does not in any way involve the notion of a population, how can results of the trial be generalized to determine the best care for future patients?
A: Berger (2000) argues that the difficulty in generalizing to a target population is a weakness not of the randomization test, but of the study design. If it was suspected by investigators that patient experience in a particular clinical trial could not be generalized, there would be no reason to conduct the clinical trial in the first place. Thus we hope that the results of a randomised clinical trial will apply to the general population as well as to the patients in the trial. However, the study design only provides a formal assessment of the latter, not the former. By ensuring validity of the treatment comparison within the trial conducted, by limiting bias and ensuring strict adherence to the protocol, it is more likely that a generalization beyond the trial can be attained.
This position seems somewhat curious to me, as I raised in a question at the end of the lecture. I would agree that the difficulty of generalising to a target population is not a weakness of the randomisation test. However, if the user of the randomisation inference approach then makes the case that they hope the results of the trial will apply to the general population, then why not use the population model approach in the first place?
Thus far we have been talking about hypothesis testing. Prof Rosenberger touched briefly on the question of estimation, but suggested that randomisation inference methods could not offer much here. This would seem a major issue, since the magnitude of effects is obviously of critical importance.
Having said all this, I really enjoyed Prof Rosenberger’s extremely thought provoking lecture, and this post has focused on the few things that I didn’t quite agree with or understand. Moreover, there were many aspects of the talk I’ve not mentioned here, such as the improved robustness of randomisation tests compared to population model based tests when the models used in the latter are misspecified.