In a previous post I talked about the issue of covariate adjustment in randomized controlled trials, and the potential for improving the precision of treatment effect estimates. In this post I’ll look at one of the (fairly) recently developed approaches for improving estimates of marginal treatment effects, based on semiparametric theory.
The Hosmer-Lemeshow goodness of fit test for logistic regression
Before a model is relied upon to draw conclusions or predict future outcomes, we should check, as far as possible, that the model we have assumed is correctly specified, that is, that the data do not conflict with the assumptions made by the model. For binary outcomes, logistic regression is the most popular modelling approach. In this post we’ll look at the widely used, but sometimes criticized, Hosmer-Lemeshow goodness of fit test for logistic regression.
A/B testing – confidence interval for the difference in proportions using R
In a previous post we looked at how Pearson’s chi-squared test (or Fisher’s exact test) can be used to test whether the ‘success’ proportions are equal under two conditions. In biostatistics this setting arises (for example) when patients are randomized to receive one or other of two treatments, and for each patient we observe either a ‘success’ (of course this could be a bad outcome, such as death) or a ‘failure’. In web design, people may have data where web site visitors are sent at random to one of two versions of a page, and for each visit a success is defined as some outcome such as the purchase of a product. In both cases, we may be interested in testing the hypothesis that the true proportions of successes in the two populations are equal, which is what the earlier post covered. Note that the randomization described in these two examples is not necessary for the statistical procedures described in this post, but of course randomization affects our interpretation of the differences between the groups.
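To make the setting concrete, here is a minimal sketch of a confidence interval for the difference between two success proportions. It is written in Python rather than R, and it uses the simple Wald (normal approximation) interval, which is one of several possible choices and not necessarily the method the full post uses. The function name `diff_in_proportions_ci` and the visitor counts are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def diff_in_proportions_ci(successes_a, n_a, successes_b, n_b, level=0.95):
    """Wald (normal approximation) interval for p_a - p_b.

    Assumes both samples are reasonably large and independent;
    the approximation can be poor for small counts.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    diff = p_a - p_b
    # Standard error of the difference of two independent sample proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical A/B test: 45 purchases from 500 visits to page A,
# 60 purchases from 500 visits to page B.
diff, lower, upper = diff_in_proportions_ci(45, 500, 60, 500)
print(f"difference = {diff:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

With these made-up counts the interval includes zero, so the data would be compatible with equal success proportions at the 5% level. In R, the built-in `prop.test` function reports a comparable interval for two proportions.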