The Bland-Altman plot is a very popular approach for analysing data from a method agreement study, a topic I was teaching students about today. We have measurements of a sample of subjects using one measurement technique or method, and a second measurement on each subject, taken using a new technique or method. The objective is to see how closely the measurements from the two methods agree. If they agree closely, we could use the new method, which may be cheaper, easier, or less invasive, in place of the old one. The Bland-Altman plot shows the pairwise differences between the measurements plotted against their averages. Sometimes one sees a correlation between the pairwise differences and averages. What is the interpretation of such a correlation?

The plot below shows the Bland-Altman plot for some simulated data (the R code follows later). The difference variable is the value of the new method’s measurement (X2) minus the existing method’s measurement (X1); the avg variable is their average.

We see that as the pairwise average increases, the pairwise differences tend to increase on average. Taking the pairwise average as an estimate of the underlying (true) value of the subject/sample being measured, this suggests interpreting the correlation as meaning that for large underlying true values the new method overestimates (on average) relative to the existing method, whereas for lower true values it underestimates. One way of expressing this would be to say that the bias of the new method varies with the underlying true value being measured.

The R code used to generate this plot is:

```
set.seed(7176)
n <- 1000
# simulate true values
true <- rnorm(n, mean = 120, sd = 20)
# simulate method 1 measurements, with it being unbiased
x1 <- true + rnorm(n, mean = 0, sd = 5)
# simulate method 2 measurements, with bias changing with true value
# specifically, we will have method 2 overestimate for larger values
# and underestimate for smaller values
x2 <- 120 + 1.2 * (true - 120) + rnorm(n, mean = 0, sd = 5)
diff <- x2 - x1
avg <- (x1 + x2) / 2
plot(avg, diff)
```

This code first simulates the underlying true values of 1,000 samples from a normal distribution with mean 120 and standard deviation (SD) 20. It then generates the measurements from the existing method as the true value plus an independent mean-zero normal error term with SD 5. The new method’s measurement (X2) is generated with an independent error term with the same distribution. However, the line generating X2 shows that rather than the true value plus mean-zero error, it is generated as 120+1.2*(true-120) plus mean-zero error. This means the second method is unbiased when the true value being measured is 120, but for true values above 120 it is positively biased, while for true values below 120 it is negatively biased. This accords with our earlier interpretation of the Bland-Altman plot.
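To put a number on the visual trend, a quick follow-up check (my own addition, not part of the standard Bland-Altman recipe) is to compute the correlation and the slope of a regression of the differences on the averages. The sketch below re-runs the simulation so it is self-contained:

```
set.seed(7176)
n <- 1000
true <- rnorm(n, mean = 120, sd = 20)
x1 <- true + rnorm(n, mean = 0, sd = 5)
x2 <- 120 + 1.2 * (true - 120) + rnorm(n, mean = 0, sd = 5)
diff <- x2 - x1
avg <- (x1 + x2) / 2
# correlation and regression slope of differences on averages:
# both positive, matching the upward trend in the plot
r <- cor(avg, diff)
slope <- unname(coef(lm(diff ~ avg))[2])
c(correlation = r, slope = slope)
```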

Let us now generate a third measurement X3, from an additional measurement method, and re-draw the Bland-Altman plot:

```
# now introduce method 3 without bias, but larger error variance
x3 <- true + rnorm(n, mean = 0, sd = 20)
diff13 <- x3 - x1
avg13 <- (x1 + x3) / 2
plot(avg13, diff13)
```

Thus here X3 is unbiased for the true value, but its error SD is much larger (20, rather than 5). The Bland-Altman plot for (X1,X3) is:

This Bland-Altman plot looks qualitatively the same as the earlier one: we see a correlation between the pairwise differences (X3-X1) and the pairwise averages (X1+X3)/2. But because we simulated the data, we know that X3 is in fact unbiased for the true value, so an interpretation that X3 is positively biased when the true value is large and negatively biased when the true value is low would be incorrect. We have shown that a correlation in the Bland-Altman plot can be caused by two quite distinct things, and therefore that if we see such a correlation, we should be cautious about interpreting its cause.

What explains the correlation in the second example? As Chris Frost and I described in a review paper back in 2008 (we were not the first!), such a correlation will occur even when the new method is unbiased if its error variance differs from that of the first method. To see this, suppose that the two measurements X1 and X3 are generated as follows (as per the R code above):

X1=T+E1

X3=T+E3

where T denotes a random variable for the true value being measured, E(E1)=E(E3)=0, and T, E1 and E3 are independent of each other. To determine whether the pairwise differences and averages are correlated, we evaluate the covariance between the pairwise difference and average:

Cov(X3-X1,0.5X1+0.5X3)

=Cov(E3-E1, T+0.5E1+0.5E3)

=Cov(E3,T) + 0.5Cov(E3,E1) + 0.5Cov(E3,E3) - Cov(E1,T) - 0.5Cov(E1,E1) - 0.5Cov(E1,E3)

Since the errors are assumed independent of each other and of T, Cov(E3,T) = Cov(E3,E1) = Cov(E1,T) = 0, and we have

Cov(X3-X1,0.5X1+0.5X3) = 0.5Cov(E3,E3) - 0.5Cov(E1,E1) = 0.5(Var(E3) - Var(E1))

Thus if the measurement errors from each method have the same variance, this covariance will be zero. But if the error variances differ, we will see a non-zero covariance (and correlation), even when the methods are unbiased (or, more precisely, when the new method is unbiased for the same ‘true’ value that the first method is unbiased for). We see from this expression that we will get a positive correlation if the new method has a higher error variance than the existing method, and a negative correlation if it has a smaller error variance.
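We can check this expression numerically against the simulation (a self-contained re-run; with error SDs of 5 and 20, the theoretical covariance is 0.5*(400 - 25) = 187.5):

```
set.seed(7176)
n <- 1000
true <- rnorm(n, mean = 120, sd = 20)
x1 <- true + rnorm(n, mean = 0, sd = 5)   # error variance 25
x3 <- true + rnorm(n, mean = 0, sd = 20)  # error variance 400
# sample covariance of the pairwise differences and averages;
# the theory says this should be close to 0.5 * (400 - 25) = 187.5
cov_da <- cov(x3 - x1, (x1 + x3) / 2)
cov_da
```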

Given all this, what can we do if we see a correlation in the Bland-Altman plot? As we and others have described previously, you can either (i) assume the error variances are equal and attribute any correlation to changing bias, (ii) assume the new method is unbiased across the range of measurement and that any correlation is due to differences in error variances, or (iii) ideally, make two measurements using each method. If you do the last of these, you can determine which explanation is correct. We describe how you might go about analysing the data from such a study in the review paper mentioned previously.
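As a sketch of the duplicate-measurement idea (the design and variable names here are my own illustration, not taken from the review paper): with two replicate measurements per method on each subject, the variance of the within-subject replicate differences equals twice that method’s error variance, so each error variance can be estimated directly.

```
set.seed(7176)
n <- 1000
true <- rnorm(n, mean = 120, sd = 20)
# two replicate measurements per subject for each method
x1a <- true + rnorm(n, mean = 0, sd = 5)
x1b <- true + rnorm(n, mean = 0, sd = 5)
x3a <- true + rnorm(n, mean = 0, sd = 20)
x3b <- true + rnorm(n, mean = 0, sd = 20)
# Var(x1a - x1b) = 2 * Var(E1), so halve the replicate-difference variance
var_e1 <- var(x1a - x1b) / 2  # should be near 5^2 = 25
var_e3 <- var(x3a - x3b) / 2  # should be near 20^2 = 400
c(var_e1 = var_e1, var_e3 = var_e3)
```

With these error-variance estimates in hand, one can judge whether the observed correlation in the Bland-Altman plot is consistent with unequal error variances alone, or whether changing bias is also needed to explain it.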

Lastly, using our simulated data, let’s plot the third measurement’s error (X3 minus the true value) against the true value, which we have here because we simulated the data (in reality we wouldn’t typically know it):

`plot(true,x3-true)`

As expected, this shows no correlation between the error (X3-true) and the true value.
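For completeness, here is a self-contained numerical check of that claim, re-running the relevant part of the simulation:

```
set.seed(7176)
n <- 1000
true <- rnorm(n, mean = 120, sd = 20)
x3 <- true + rnorm(n, mean = 0, sd = 20)
# correlation of the error with the true value: zero in expectation,
# since the error is drawn independently of the true value
r_err <- cor(true, x3 - true)
r_err
```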