The FDA recently published revised guidance on statistical methods for adjusting for baseline covariates in trials. Overall I like the guidance and think it will prove useful. In this post I’ll give a few thoughts on aspects of the revised guidance, organised according to the sections of the guidance document.
Jonathan Bartlett
Multiple imputation separately by groups in R and Stata
When using multiple imputation to impute missing values there are often situations where one wants to perform the imputation process completely separately in groups of subjects defined by some fully observed variable (e.g. sex or treatment group). In Stata, this is made very easy through use of the by() option. You simply tell the mi impute command which variable (or variables) you want to stratify the imputation on. Stata will then impute separately in the groups defined by these variable(s), and then assemble the imputations from each stratum back together so you have your desired number of imputed datasets.
Last week someone asked me how to do it in R, ideally with the mice package. Compared to Stata, one has to do a little bit more work. One approach is to use the mice.impute.bygroup function in the miceadds package, a package which extends the functionality of mice in various directions. If you instead want to do it manually, you can do so by making use of the rbind function within the mice package.
Does the log rank test assume proportional hazards?
A student asked me recently whether the log rank test for time to event data assumes that the hazard ratio between the two groups is constant over time, as is assumed in Cox’s famous proportional hazards model. The BMJ ‘Statistics at square one’ Survival Analysis article for example says the test assumes:
That the risk of an event in one group relative to the other does not change with time. Thus if linoleic acid reduces the risk of death in patients with colorectal cancer, then this risk reduction does not change with time (the so called proportional hazards assumption).
https://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/12-survival-analysis
Personally I would not say the log rank test assumes proportional hazards. Under the null hypothesis that the (true) survival curves in the two groups are the same, or equivalently that the hazard functions are identical in the two groups, the log rank test performed at the 5% significance level would wrongly reject only 5% of the time (at least in large samples). Of course under this null the hazards are proportional (indeed identical).
When this null does not hold, if the hazard ratio is constant over time, the log rank test is the most powerful test. When the hazard ratio is not constant over time, the log rank test is no longer optimal in terms of power, but the non-constant hazard ratio does not invalidate the test per se. It just means that there may be alternative methods of analysis that might be preferable (see my recent PSI event slides here).
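To make the type I error and power claims concrete, here is a minimal simulation sketch using only the Python standard library. It is not code from the original post: the function names logrank_chisq and simulate_rejection_rate, the exponential event times, the sample size, the administrative censoring time and the seed are all assumptions chosen for illustration. It computes the usual two-group log rank statistic by hand and estimates how often the test rejects at the 5% level, first when the hazards are identical and then when the hazard ratio is constant at 2.

```python
import math
import random


def logrank_chisq(times, events, groups):
    """Two-group log rank chi-squared statistic (1 degree of freedom).

    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    """
    data = sorted(zip(times, events, groups))
    at_risk = len(data)
    at_risk_1 = sum(g for _, _, g in data)
    obs_minus_exp = 0.0
    variance = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        # Gather everyone whose follow-up ends at time t (events and censorings).
        d_all = d_1 = leaving = leaving_1 = 0
        j = i
        while j < len(data) and data[j][0] == t:
            _, e, g = data[j]
            d_all += e
            d_1 += e * g
            leaving += 1
            leaving_1 += g
            j += 1
        if d_all > 0:
            frac_1 = at_risk_1 / at_risk
            # Observed minus expected events in group 1 at this time.
            obs_minus_exp += d_1 - d_all * frac_1
            if at_risk > 1:
                # Hypergeometric variance of the group 1 event count.
                variance += (d_all * frac_1 * (1 - frac_1)
                             * (at_risk - d_all) / (at_risk - 1))
        at_risk -= leaving
        at_risk_1 -= leaving_1
        i = j
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def simulate_rejection_rate(hazard_ratio, n_per_group=100, n_sims=1000,
                            censor_time=2.0, alpha=0.05, seed=2024):
    """Proportion of simulated trials in which the log rank test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        times, events, groups = [], [], []
        for g in (0, 1):
            # Assumed setup: hazard 1 in group 0, hazard_ratio in group 1.
            rate = hazard_ratio if g == 1 else 1.0
            for _ in range(n_per_group):
                t = rng.expovariate(rate)
                times.append(min(t, censor_time))
                events.append(1 if t <= censor_time else 0)
                groups.append(g)
        chisq = logrank_chisq(times, events, groups)
        # Upper tail probability of a chi-squared variable with 1 df.
        p_value = math.erfc(math.sqrt(chisq / 2))
        rejections += p_value < alpha
    return rejections / n_sims


null_rate = simulate_rejection_rate(hazard_ratio=1.0)  # identical hazards
ph_power = simulate_rejection_rate(hazard_ratio=2.0)   # constant hazard ratio
print(f"Rejection rate under the null: {null_rate:.3f}")
print(f"Rejection rate with hazard ratio 2: {ph_power:.3f}")
```

Under these assumptions the first rate should sit close to the nominal 5%, up to simulation error, and the second should be much higher. Swapping the exponential draws for a pair of distributions with non-proportional hazards would let one explore how the power changes when the hazard ratio varies over time.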