Clustering in randomized controlled trials

Randomized clinical trials often involve some sort of clustering. The most obvious is in a cluster randomized trial, where clusters form the unit of randomization. It is well known that in this case the clustering must be allowed for in the analysis. But even in the common setting where individuals are randomized, clustering may be present. Perhaps the most common situation is where a trial involves a number of hospitals or centres, and individuals are recruited into the trial when they attend their local centre. Another example is where the intervention is administered to each individual by some professional (e.g. surgeon, therapist), such that outcomes from individuals treated by the same professional may be more similar to each other. In both of these situations, an obvious question is whether we need to allow for the clustering in the analysis.


Fixed versus random-effects meta-analysis – efficiency and confidence interval coverage

Meta-analysis is a critical tool for synthesizing existing evidence. It is commonly used within medical and clinical settings to evaluate the existing evidence regarding the effect of a treatment or exposure on an outcome of interest. The essential idea is that the estimates of the effect of interest from previous studies are pooled together. A choice which has to be made when conducting a meta-analysis is between a fixed-effects and a random-effects analysis. In this post we’ll look at some of the consequences of this choice, when in truth the studies are measuring different effects.
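To make the pooling idea concrete, here is a minimal Python sketch of the two approaches. It assumes each study supplies an effect estimate and its variance, uses inverse-variance weighting for the fixed-effects estimate, and uses the DerSimonian-Laird moment estimator for the between-study variance in the random-effects estimate. That estimator is one common choice, picked here for illustration rather than taken from the post. The function names and the example numbers are hypothetical.

```python
def fixed_effect(estimates, variances):
    """Inverse-variance weighted pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    return pooled, 1.0 / total


def dersimonian_laird_tau2(estimates, variances):
    """Moment estimate of the between-study variance, truncated at zero."""
    k = len(estimates)
    if k < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled, _ = fixed_effect(estimates, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (k - 1)) / c)


def random_effects(estimates, variances):
    """Pooled estimate and variance after adding tau^2 to each study variance."""
    tau2 = dersimonian_laird_tau2(estimates, variances)
    pooled, var = fixed_effect(estimates, [v + tau2 for v in variances])
    return pooled, var, tau2


# Hypothetical study-level estimates and variances, for illustration only
example_estimates = [0.10, 0.30, 0.55, 0.80]
example_variances = [0.01, 0.02, 0.015, 0.03]

fe_est, fe_var = fixed_effect(example_estimates, example_variances)
re_est, re_var, tau2 = random_effects(example_estimates, example_variances)
print("fixed effects:  estimate", fe_est, "variance", fe_var)
print("random effects: estimate", re_est, "variance", re_var, "tau^2", tau2)
```

Because the estimated between-study variance is added to every study's variance, the random-effects weights are more equal across studies and the pooled variance is never smaller than the fixed-effects one.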


Multiple imputation with interactions and non-linear terms

Multiple imputation has become an extremely popular approach to handling missing data, for a number of reasons. One is that once the imputed datasets have been generated, they can each be analysed using standard analysis methods, and the results pooled using Rubin’s rules. However, in addition to the missing at random assumption, for multiple imputation to give unbiased point estimates the model(s) used to impute missing data need to be (at least approximately) correctly specified. Because of this, care must be taken when choosing the imputation model.
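As an illustration of the pooling step, the following Python sketch applies Rubin's rules to a set of per-imputation results. It assumes each imputed dataset has already been analysed and has yielded a point estimate and its estimated variance for a single parameter. The function name and the input numbers are hypothetical.

```python
def rubin_pool(estimates, variances):
    """Pool per-imputation estimates and variances using Rubin's rules.

    Returns the pooled estimate, total variance, within-imputation
    variance and between-imputation variance.
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two imputed datasets")
    # Pooled point estimate: simple average across imputations
    pooled = sum(estimates) / m
    # Within-imputation variance: average of the estimated variances
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    # Total variance, inflated by (1 + 1/m) for the finite number of imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total, within, between


# Hypothetical results from five imputed datasets, for illustration only
example_estimates = [1.02, 0.95, 1.10, 0.99, 1.05]
example_variances = [0.040, 0.038, 0.043, 0.041, 0.039]

pooled, total, within, between = rubin_pool(example_estimates, example_variances)
print("pooled estimate", pooled, "total variance", total)
```

The between-imputation term is what carries the extra uncertainty due to the missing data. These formulas only do their job when the imputation model itself is adequate, which is the concern raised above.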

What constitutes a reasonable imputation model will obviously depend on the dataset and situation at hand. One situation which is commonly encountered, but where it is not obvious what one should do, is where the dataset, or the model(s) which will be fitted after imputation, contains interaction terms or non-linear terms such as squared terms.
