Independence of sample mean and sample variance

A student asked me a good question today about whether it is really the case that the sample mean and sample variance are independent random variables, given that both are functions of the same data. That this is true when the observations come from a normal distribution can be shown relatively easily – see here for example.
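As a quick empirical sketch (not part of the original argument, and with variable names of my own choosing), one can simulate repeated samples from a standard normal distribution and look at the correlation between the resulting sample means and sample variances; a value close to zero is consistent with the independence result under normality:

#sketch: correlation of sample mean and variance under normality
set.seed(1234)
nSim <- 10000
n <- 10
normEsts <- array(0, dim=c(nSim,2))
for (i in 1:nSim) {
  #sample of size n from a standard normal
  x <- rnorm(n, mean=0, sd=1)
  normEsts[i,] <- c(mean(x), var(x))
}
#should be close to zero, consistent with (though not proof of) independence
cor(normEsts[,1], normEsts[,2])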

Is this independence true more generally? The answer is no. As a simple example, suppose that each observation comes from a chi-squared distribution on one degree of freedom, i.e. each observation is the square of an independent standard normal random variable. As such, the values are all positive. To see why the sample mean and sample variance are now dependent, suppose the sample mean is small and close to zero. Since the observations are all positive, the only way the sample mean can be close to zero is if all of the observations are themselves close to zero, in which case their variability around the sample mean must also be small. That is, if the sample mean is small we should expect the sample variance to be small too, and hence the two are positively correlated random variables.

A quick simulation in R confirms this intuition:

set.seed(623423)
nSim <- 1000
n <- 10
ests <- array(0, dim=c(nSim,2))

for (i in 1:nSim) {

  #create sample from chi-squared on 1 d.f.
  x <- rnorm(n,mean=0,sd=1)^2
  #store sample mean and variance
  ests[i,] <- c(mean(x), var(x))

}

plot(ests[,1],ests[,2], xlab="Sample mean", ylab="Sample variance")
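Beyond eyeballing the plot, one could also compute the sample correlation between the simulated means and variances (this check isn't in the original code, but uses the same ests object):

#correlation between the simulated sample means and variances
#a clearly positive value is consistent with the argument above
cor(ests[,1], ests[,2])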

Frequentists should more often consider using Bayesian methods

Recently my colleague Ruth Keogh and I had a paper published: ‘Bayesian correction for covariate measurement error: a frequentist evaluation and comparison with regression calibration’ (open access here). The paper compares the popular regression calibration approach for handling covariate measurement error in regression models with a Bayesian approach. The two methods are compared from the frequentist perspective, and one of the arguments we make is that frequentists should more often consider using Bayesian methods.

Read more

On “The fallacy of placing confidence in confidence intervals”

Note: if you read this post, make sure to read the comments/discussion below it with Richard Morey, author of the paper in question, who put me straight on a number of points.

Thanks to Twitter I came across the latest draft of a very nicely written and thought-provoking paper, “The fallacy of placing confidence in confidence intervals”, by Morey, Rouder, Hoekstra, Lee and Wagenmakers. The paper aims to show why frequentist confidence intervals do not possess a number of properties that researchers often believe they do. In contrast, the authors show that Bayesian credible intervals do possess these desired properties, and they advocate replacing confidence intervals with Bayesian credible intervals.

Read more