In an introductory course on linear regression one learns about various diagnostics which can be used to assess whether the model is correctly specified. One of the assumptions of linear regression is that the errors have mean zero, conditional on the covariates. This implies that the unconditional, or marginal, mean of the errors is also zero.
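To make this step explicit, by the law of total expectation,

E(\epsilon) = E\{E(\epsilon \mid X)\} = E(0) = 0,

so the conditional mean-zero assumption implies the marginal one.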
R squared/correlation depends on variance of predictor
I’ve written about R squared a few times before. In a discussion I was involved in today, the question was raised of whether, and how, the R squared in a linear regression model with a single continuous predictor depends on the variance of the predictor variable. The answer is of course yes.
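To see why, suppose Y = \beta_0 + \beta_1 X + \epsilon with Var(\epsilon) = \sigma^2 independent of X. The population R squared is then

R^2 = \frac{\beta_1^2 \, \text{Var}(X)}{\beta_1^2 \, \text{Var}(X) + \sigma^2},

so that, holding \beta_1 and \sigma^2 fixed, R squared increases towards 1 as Var(X) increases.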
Interval regression with heteroskedastic errors
Interval regression allows one to fit a linear model of an outcome on covariates when the outcome is subject to censoring. In Stata an interval regression can be fitted using the intreg command. Each outcome value is either observed exactly, interval censored (we only know it lies in a certain range), left censored (we only know the outcome is less than some value), or right censored (we only know the outcome is greater than some value). In Stata’s implementation the robust option is available; with regular linear regression this option can be used when the residual variance is not constant. Using the robust option doesn’t change the parameter estimates, but the standard errors (SEs) are calculated using the sandwich variance estimator. In this post I’ll briefly look at the rationale for using robust with interval regression, and highlight the fact that, unlike for regular linear regression, the interval regression estimates are biased if the residual variance is not constant.
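As a rough illustration of that bias, here is a minimal simulation sketch. It is written in Python rather than Stata, and it is not Stata’s intreg implementation: the function negloglik below, the error model with standard deviation exp(x), and the unit-width censoring intervals are all illustrative assumptions. The sketch fits, by maximum likelihood, a normal interval regression that (wrongly) assumes constant error variance, and compares the average slope estimate across simulations with the true value.

```python
# Minimal sketch: interval regression assuming constant error variance,
# fitted by maximum likelihood to data whose errors are heteroskedastic.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n, reps = 1000, 100
b0, b1 = 0.0, 1.0  # illustrative true intercept and slope

def negloglik(theta, x, lo, hi):
    # Negative log likelihood for interval-censored outcomes,
    # assuming errors ~ N(0, sigma^2) with constant sigma.
    a, b, log_s = theta
    mu, s = a + b * x, np.exp(log_s)
    p = norm.cdf((hi - mu) / s) - norm.cdf((lo - mu) / s)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))

slopes = []
for _ in range(reps):
    x = rng.uniform(0, 2, n)
    y = b0 + b1 * x + rng.normal(0.0, np.exp(x))  # error SD exp(x): heteroskedastic
    lo = np.floor(y)  # interval-censor y into unit-width bins
    hi = lo + 1.0
    fit = minimize(negloglik, x0=[0.0, 0.0, 0.0],
                   args=(x, lo, hi), method="Nelder-Mead")
    slopes.append(fit.x[1])

print("true slope:", b1, "mean ML estimate:", np.mean(slopes))
```

The intuition is that with censored outcomes the likelihood depends on the whole error distribution, not just its mean, so misspecifying the variance contaminates the estimates of the mean parameters; this is why robust standard errors cannot rescue the point estimates in the way they do for ordinary linear regression.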