Together with Camila Olarte Parra (LSHTM), Rhian Daniel (Cardiff) and David Wright (AstraZeneca), I recently put on arXiv a new paper that explores the use of estimators from both the causal inference and missing data literatures for estimating a so-called hypothetical estimand in a previously conducted clinical trial in diabetes.

# g-formula

## G-formula for causal inference via multiple imputation

G-formula (sometimes known as G-computation) is an approach for estimating the causal effects of treatments or exposures which can vary over time and which are subject to time-varying confounding. It is one of the so-called G-methods developed by Jamie Robins and co-workers. For a nice overview of these, I recommend this open access paper by Naimi et al 2017, and for more details, the What If book by Hernán and Robins. In this post, I’ll describe some recent work with Camila Olarte Parra and Rhian Daniel in which we have explored the use of multiple imputation methods and software as a route to implementing G-formula estimators.
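To make the idea concrete, here is a minimal sketch of the nonparametric G-formula for two treatment decisions (A0, A1) with a single discrete time-varying confounder L1 measured between them. It is a plain plug-in calculation, not the multiple imputation implementation described in the paper. The function name `g_formula_mean`, the record layout and the toy data are all made up for illustration, and the sketch assumes there are no baseline confounders of A0.

```python
def g_formula_mean(records, a0, a1):
    """Plug-in G-formula estimate of the mean outcome had everyone received (a0, a1).

    records: list of (A0, L1, A1, Y) tuples, with L1 discrete (hypothetical layout).
    Computes sum over l of E[Y | A0=a0, L1=l, A1=a1] * P(L1=l | A0=a0).
    """
    # Restrict to those who received a0 at the first time point.
    at_a0 = [r for r in records if r[0] == a0]
    if not at_a0:
        raise ValueError("no records with A0 == a0")

    total = 0.0
    for l1 in sorted({r[1] for r in at_a0}):
        # Empirical P(L1 = l1 | A0 = a0)
        p_l1 = sum(1 for r in at_a0 if r[1] == l1) / len(at_a0)
        # Outcomes among those with A0 = a0, L1 = l1 who then received a1
        ys = [r[3] for r in at_a0 if r[1] == l1 and r[2] == a1]
        if not ys:
            # Positivity fails in this stratum, so the estimate is undefined.
            raise ValueError(f"no records with A0={a0}, L1={l1}, A1={a1}")
        total += p_l1 * (sum(ys) / len(ys))
    return total


# Toy data, invented purely to exercise the function.
records = [
    (1, 0, 1, 10.0), (1, 0, 1, 12.0), (1, 0, 0, 8.0), (1, 1, 1, 20.0),
    (0, 0, 0, 5.0), (0, 1, 0, 9.0), (0, 1, 1, 15.0), (0, 0, 0, 7.0),
]

always_treated = g_formula_mean(records, 1, 1)  # 0.75 * 11 + 0.25 * 20 = 13.25
never_treated = g_formula_mean(records, 0, 0)   # 0.5 * 6 + 0.5 * 9 = 7.5
print(always_treated - never_treated)           # 5.75
```

With continuous or high-dimensional confounders the stratum means and probabilities are replaced by fitted models, and the sum is typically approximated by simulation.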

## Hypothetical estimands – a unification of causal inference and missing data methods

Camila Olarte Parra, Rhian Daniel and I have just released a pre-print on arXiv (now published in Statistics in Biopharmaceutical Research) detailing recent work on statistical methods targeting so-called hypothetical estimands in clinical trials. The ICH E9 addendum on estimands is having a widespread impact on the way clinical trials are planned and analysed. One of the *strategies* described by the addendum for handling so-called intercurrent events is the hypothetical strategy. Here one hypothesizes a way in which the trial could be modified so that the intercurrent event in question would not take place. For example, in trials where patients may receive rescue medication, we could conceive of a trial in which such medication was not made available. The goal of inference is then the treatment effect we would have seen in such a modified trial.

In the paper, building on work by others (e.g. Lipkovich et al 2020), we show how causal inference concepts and methods can be used to define and estimate hypothetical estimands. Currently, estimation of estimands which use the hypothetical strategy is predominantly carried out using missing data methods such as mixed models and multiple imputation. To do so, any outcome measurements taken after the intercurrent event that is being handled with the hypothetical strategy are deleted or ignored, and the analysis is then performed using these methods, assuming the resulting missing data are missing at random (MAR). We set out to see how estimation of hypothetical estimands would proceed using the language and machinery from causal inference.
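The deletion step in the missing data approach can be sketched in a few lines. This is only an illustration of the data preparation, not code from the paper: the function name `delete_post_event_outcomes`, the visit indexing and the outcome values are all hypothetical.

```python
def delete_post_event_outcomes(outcomes, first_affected_visit):
    """Return a copy of one patient's outcomes with post-event values set to None.

    outcomes: list of outcome measurements, one per scheduled visit.
    first_affected_visit: index of the first visit measured after the
        intercurrent event, or None if the patient never had the event.
    """
    if first_affected_visit is None:
        # No intercurrent event: keep every observed outcome.
        return list(outcomes)
    # Keep visits before the event; treat later ones as missing.
    return [y if visit < first_affected_visit else None
            for visit, y in enumerate(outcomes)]


# Made-up outcome values at four visits for two patients.
no_rescue = delete_post_event_outcomes([8.1, 7.8, 7.6, 7.4], None)
rescued = delete_post_event_outcomes([7.9, 7.5, 7.2, 6.8], 2)
print(no_rescue)  # [8.1, 7.8, 7.6, 7.4]
print(rescued)    # [7.9, 7.5, None, None]
```

The resulting dataset, with `None` marking the deleted values, is what would then be passed to a mixed model or multiple imputation procedure under the MAR assumption.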

In this post I’ll highlight a few of the things the paper covers.