The robust sandwich variance estimator for linear regression (using R)

In a previous post we looked at the (robust) sandwich variance estimator for linear regression. This method allowed us to estimate valid standard errors for our coefficients in linear regression, without requiring the usual assumption that the residual errors have constant variance.

In this post we’ll look at how this can be done in practice using R, with the sandwich package (I’ll assume below that you’ve installed this package). To illustrate, we’ll first simulate some simple data from a linear regression model where the residual variance increases sharply with the covariate.
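As a rough sketch of what such a simulation and analysis might look like (the seed, sample size, true coefficients and the exp(x) form for the residual standard deviation are all illustrative assumptions of mine, not values taken from the full post), we can fit the model with lm and compare the usual model-based standard errors with the sandwich ones from vcovHC:

```r
library(sandwich)  # assumed to be installed already

set.seed(194812)   # arbitrary seed, only for reproducibility
n <- 100           # hypothetical sample size
x <- rnorm(n)

# Residual SD is exp(x), so the residual variance increases sharply with x
# and the constant-variance assumption is violated (illustrative choice)
y <- x + exp(x) * rnorm(n)

mod <- lm(y ~ x)

# Model-based standard errors, which rely on constant residual variance
model_se <- sqrt(diag(vcov(mod)))

# Robust sandwich standard errors; type = "HC0" requests the basic
# sandwich form without a small-sample correction
robust_se <- sqrt(diag(vcovHC(mod, type = "HC0")))

# Side-by-side comparison for the intercept and slope
cbind(model_se, robust_se)
```

With data generated like this, the two columns will generally differ, and it is the robust column that remains valid when the residual variance is not constant. Note that vcovHC uses a different type by default ("HC3"), so the type argument here is a deliberate choice rather than a requirement.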


R squared and goodness of fit in linear regression

I’ve been teaching a modelling course recently, and have been reading and thinking about the notion of goodness of fit. R squared, the proportion of variation in the outcome Y explained by the covariates X, is commonly described as a measure of goodness of fit. This seems very reasonable at first sight, since R squared measures how close the observed Y values are to the predicted (fitted) values from the model.


R squared and adjusted R squared

One quantity people often report when fitting linear regression models is the R squared value. This measures what proportion of the variation in the outcome Y can be explained by the covariates/predictors. If R squared is close to 1 (unusual in my line of work), the covariates jointly explain nearly all of the variation in the outcome Y, so Y can be accurately predicted (in some sense) using the covariates. Conversely, a low R squared means Y is poorly predicted by the covariates. Of course, an effect can be substantively important without explaining a large amount of variance. Blood pressure affects the risk of cardiovascular disease, but it is not a strong enough predictor to explain a large amount of variation in outcomes. Put another way, knowing someone’s blood pressure can’t tell you with much certainty whether that particular individual will suffer from cardiovascular disease.
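To make the two quantities concrete, here is a minimal sketch in R that computes R squared and adjusted R squared by hand and checks them against what summary reports for an lm fit. The simulated data, coefficients and variable names are purely illustrative assumptions; the formulas are the standard ones, R squared = 1 - RSS/TSS and adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - p - 1), where p is the number of covariates.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n  <- 200          # hypothetical sample size
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.2 * x2 + rnorm(n)  # illustrative true model

mod <- lm(y ~ x1 + x2)

rss <- sum(residuals(mod)^2)    # residual sum of squares
tss <- sum((y - mean(y))^2)     # total sum of squares around the mean

# Proportion of the variation in y explained by the covariates
r2 <- 1 - rss / tss

# Adjusted version, penalising for the number of covariates p
p      <- length(coef(mod)) - 1  # exclude the intercept
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Both should agree with the values reported by summary()
all.equal(r2, summary(mod)$r.squared)
all.equal(adj_r2, summary(mod)$adj.r.squared)
```

Because the penalty factor (n - 1)/(n - p - 1) is greater than 1 whenever p is at least 1, the adjusted value is always below the unadjusted one unless R squared is exactly 1.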
