The Stats Geek

The t-test and robustness to non-normality

May 1, 2017September 28, 2013 by Jonathan Bartlett

The t-test is one of the most commonly used tests in statistics. The two-sample t-test allows us to test the null hypothesis that the population means of two groups are equal, based on samples from each of the two groups. In its simplest form, it assumes that in the population, the variable/quantity of interest X follows a normal distribution $N(\mu_{1},\sigma^{2})$ in the first group and is $N(\mu_{2},\sigma^{2})$ in the second group. That is, the variance is assumed to be the same in both groups, and the variable is normally distributed around the group mean. The null hypothesis is then that $\mu_{1}=\mu_{2}$ .

Linear regression with random regressors, part 2

March 23, 2020September 8, 2013 by Jonathan Bartlett

Previously I wrote about how when linear regression is introduced and derived, it is almost always done assuming the covariates/regressors/independent variables are fixed quantities. As I wrote, in many studies such an assumption does not match reality, in that both the regressors and outcome in the regression are realised values of random variables. I showed that the usual ordinary least squares (OLS) estimators are unbiased with random covariates, and that the usual standard error estimator, derived assuming fixed covariates, is unbiased with random covariates. This gives us some understand of the behaviour of these estimators in the random covariate setting.

Regression inference assuming predictors are fixed

February 25, 2020August 30, 2013 by Jonathan Bartlett

Linear regression is one the work horses of statistical analysis, permitting us to model how the expectation of an outcome Y depends on one or more predictors (or covariates, regressors, independent variables) X. Previously I wrote about the assumptions required for validity of ordinary linear regression estimates and their inferential procedures (tests, confidence intervals) assuming (as we often do) that the residuals are normally distributed with constant variance.