Multiple imputation for coarsened (grouped) factor covariates

Missing data are a common problem in statistical analyses. A closely related but slightly different problem arises when, for an individual in a dataset, we do not know the exact value of a particular variable but do have some partial information about it. Specifically, we know the value belongs to a subset of the sample space. Such data are said to be coarsened. An example is a factor variable that takes values a, b, or c, where for some individuals we know the value is a or c, and for other individuals we know the value is b or c, but we are not sure which.

In such a setting, we could try to use multiple imputation (MI) to impute the missing values. This would involve setting the 'a or c' values and 'b or c' values to missing, and imputing. An obvious issue with this approach is that for individuals with 'a or c', some values could be imputed as b – the imputation has not respected the known information about the true value. Ideally we want our imputations to respect and utilise this partial information about the missing value.
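The core idea of respecting partial information can be sketched as follows: instead of drawing an imputed category from the full conditional distribution over a, b, and c, we restrict the distribution to the subset the value is known to lie in and renormalise before drawing. This is only an illustrative sketch, not the smcfcs algorithm (which fits and draws from substantive-model-compatible conditionals); the function name `impute_coarsened` and the probabilities below are hypothetical.

```python
import random

def impute_coarsened(probs, known_subset, rng=random):
    """Draw an imputed category from probs restricted to known_subset.

    probs: dict mapping category -> conditional probability (from some model)
    known_subset: set of categories the true value is known to lie in
    """
    restricted = {k: probs[k] for k in known_subset}
    total = sum(restricted.values())  # renormalising constant
    r = rng.random() * total
    cum = 0.0
    for k, p in restricted.items():
        cum += p
        if r < cum:
            return k
    return k  # guard against floating-point rounding at the boundary

# Hypothetical example: a model gives P(a)=0.5, P(b)=0.3, P(c)=0.2,
# but for this individual we know the true value is 'a' or 'c'
random.seed(2024)
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
draws = [impute_coarsened(probs, {"a", "c"}) for _ in range(10000)]

# The partial information is respected: 'b' is never imputed, and 'a'
# is drawn with renormalised probability 0.5/0.7 (roughly 0.71)
assert "b" not in draws
```

A naive MI approach would instead sample from the full distribution, so roughly 30% of the 'a or c' individuals would wrongly be imputed as b.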

Thanks to the work of Lars van der Burg, the smcfcs package in R for MI of missing covariates now incorporates functionality for imputing factor covariates that are missing but for which such partial information is available for some individuals. To see how the new functionality works, please see the accompanying vignette. For further details of the methodology, including simulations and an illustrative example, see van der Burg et al 2025, available open-access in Statistics in Medicine.

What is meant by a ‘while on treatment’ estimand?

The ICH E9 R1 addendum on estimands in clinical trials has made big waves in the clinical trial world in the last few years. It aims to provide a framework to think about and define more precisely what exactly the treatment effect(s) of interest is in a clinical trial, in light of what the addendum calls ‘intercurrent events’ (ICEs):

Events occurring after treatment initiation that affect either the interpretation or the existence of the measurements associated with the clinical question of interest. It is necessary to address intercurrent events when describing the clinical question of interest in order to precisely define the treatment effect that is to be estimated.

A couple of weeks ago a really nice paper was published by Harrison and Brummel in The American Statistician which explored the five different 'strategies' described in the E9 addendum for handling ICEs in a simple example using potential outcomes. For each strategy they gave an example of an estimand defined using the strategy and a simple estimator for estimating the estimand from the data. In this post, I want to focus on the while on treatment strategy, as I think it's one area where there is some debate as to what exactly the E9 addendum meant. I of course do not claim to have the definitive answer, but the following is my view.


Does a Bernoulli/binomial model really assume everyone has the same probability p?

When you estimate a proportion and want to calculate a standard error for the estimate, you would normally do so by assuming that the number of 'successes' in the sample is a draw from a binomial distribution, which counts the number of successes in a series of n independent Bernoulli 0/1 draws, where each draw has a probability p of 'success'. Does the model rely on or assume that for each of these binary observations the success probability is the same? In the third paragraph of this blog post, Frank Harrell seems to argue that it does. In this post I'll delve into this a bit further, using the same numerical example Frank gives.

Suppose we have a random sample of n individuals on whom we observe a binary outcome indicating presence or absence of disease. Suppose that in a sample of n=100, 40 have the disease, and so our estimate of the proportion with disease in the population (which I will denote p) from which the sample was drawn is \hat{p}=40/100=0.4.
