Jonathan Bartlett

A/B testing – confidence interval for the difference in proportions using R

February 23, 2014 / February 15, 2014 by Jonathan Bartlett

In a previous post we looked at how Pearson’s chi-squared test (or Fisher’s exact test) can be used to test whether the ‘success’ proportions are equal under two conditions. In biostatistics this setting arises (for example) when patients are randomized to receive one or other of two treatments, and for each patient we observe either a ‘success’ (of course this could be a bad outcome, such as death) or a ‘failure’. In web design, people may have data where web site visitors are sent to one of two versions of a page at random, and for each visit a success is defined as some outcome such as the purchase of a product. In both cases, we may be interested in testing the hypothesis that the true success proportions under the two conditions are equal, which is what the earlier post covered. Note that the randomization described in these two examples is not necessary for the statistical procedures described in this post, but of course randomization affects our interpretation of the differences between the groups.
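As a rough illustration of the setting described above, here is a minimal sketch in R. The counts are made up purely for illustration (they are not from the post), and the object names are hypothetical. It uses base R's `prop.test()` without the continuity correction, and then reproduces the same Wald-type interval for the difference in proportions by hand.

```r
# Hypothetical A/B data: successes and total visits for each page version
successes <- c(A = 120, B = 150)
visitors  <- c(A = 1000, B = 1000)

# Test of equal proportions, with a confidence interval for p_A - p_B.
# correct = FALSE switches off the continuity correction.
res <- prop.test(successes, visitors, correct = FALSE)
res$p.value    # p-value for H0: p_A = p_B
res$conf.int   # 95% confidence interval for p_A - p_B

# The same interval computed by hand (Wald interval for a difference)
p    <- successes / visitors
diff <- p[["A"]] - p[["B"]]
se   <- sqrt(sum(p * (1 - p) / visitors))
ci   <- diff + c(-1, 1) * qnorm(0.975) * se
ci
```

With `correct = FALSE`, the interval reported by `prop.test()` should agree with the hand calculation, which makes it easy to see where the numbers come from.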

The robust sandwich variance estimator for linear regression (using R)

May 10, 2014 / February 14, 2014 by Jonathan Bartlett

In a previous post we looked at the (robust) sandwich variance estimator for linear regression. This method allows us to estimate valid standard errors for our coefficients in linear regression, without requiring the usual assumption that the residual errors have constant variance.

In this post we’ll look at how this can be done in practice using R, with the sandwich package (I’ll assume below that you’ve installed this package). To illustrate, we’ll first simulate some simple data from a linear regression model where the residual variance increases sharply with the covariate:
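The post's own simulation code is not reproduced in this excerpt, so what follows is only a sketch under assumed settings: the seed, sample size, coefficients, and the form of the heteroscedasticity (residual standard deviation equal to `exp(x)`) are all illustrative choices, not the author's. To keep the sketch runnable without extra packages, the simplest (HC0) sandwich estimator is computed by hand in base R; if the sandwich package is available, `sandwich::vcovHC(fit, type = "HC0")` is used as a cross-check.

```r
set.seed(1234)                        # hypothetical seed
n <- 1000                             # hypothetical sample size
x <- rnorm(n)
# Assumed data-generating model: residual SD grows sharply with x
y <- 1 + 2 * x + exp(x) * rnorm(n)

fit <- lm(y ~ x)

# Model-based standard errors (assume constant residual variance)
se_model <- sqrt(diag(vcov(fit)))

# HC0 sandwich estimator by hand: (X'X)^-1 X' diag(e^2) X (X'X)^-1
X     <- model.matrix(fit)
e     <- residuals(fit)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)             # each row of X scaled by its residual
vc_hc0    <- bread %*% meat %*% bread
se_robust <- sqrt(diag(vc_hc0))

rbind(model = se_model, robust = se_robust)

# Optional cross-check against the sandwich package, if installed
if (requireNamespace("sandwich", quietly = TRUE)) {
  se_pkg <- sqrt(diag(sandwich::vcovHC(fit, type = "HC0")))
  print(all.equal(unname(se_pkg), unname(se_robust)))
}
```

Under this assumed model the robust standard error for the slope should come out noticeably larger than the model-based one, since the constant-variance assumption is badly violated.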

Wald vs likelihood ratio test

June 28, 2016 / February 8, 2014 by Jonathan Bartlett

When taking a course on likelihood-based inference, one of the key topics is that of testing and confidence interval construction based on the likelihood function. Usually the Wald, likelihood ratio, and score tests are covered. In this post I’m going to review the advantages and disadvantages of the Wald and likelihood ratio tests. I will focus on confidence intervals rather than tests, because the deficiencies of the Wald approach are more transparently seen here.
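To give a flavour of the kind of difference involved, here is a small sketch for a binomial proportion. The data (2 successes out of 20) and all object names are hypothetical, chosen only because a small count makes the contrast easy to see; this is not necessarily the example used in the full post. The Wald interval is built on the probability scale, and the likelihood ratio interval is found by locating the values of p at which twice the drop in log-likelihood equals the 95% chi-squared critical value.

```r
y <- 2; n <- 20                       # hypothetical data
phat <- y / n                         # maximum likelihood estimate

# Wald interval: estimate +/- z * standard error, on the probability scale
se   <- sqrt(phat * (1 - phat) / n)
wald <- phat + c(-1, 1) * qnorm(0.975) * se

# Likelihood ratio interval: all p with 2 * (l(phat) - l(p)) <= chi-squared cutoff
loglik <- function(p) dbinom(y, n, p, log = TRUE)
dev    <- function(p) 2 * (loglik(phat) - loglik(p)) - qchisq(0.95, df = 1)
lr <- c(uniroot(dev, c(1e-8, phat))$root,
        uniroot(dev, c(phat, 1 - 1e-8))$root)

rbind(wald = wald, lr = lr)
```

With these made-up numbers the Wald lower limit falls below zero, which is not a possible value for a proportion, whereas the likelihood ratio limits stay inside (0, 1) and are asymmetric around the estimate.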