Christopher Partlett and Richard Riley have just published an interesting paper in Statistics in Medicine (open access here). They examine the performance of 95% confidence intervals for the mean effect and 95% prediction intervals for a new effect in random-effects meta-analysis.

They simulate meta-analysis datasets from a wide range of scenarios, and assess the performance of a variety of methods for constructing a 95% confidence interval for the mean effect and a 95% prediction interval for the distribution of true effects. The latter is intended to be an interval within which 95% of true population effects would lie.

For the 95% confidence interval for the mean effect, they find that the Hartung-Knapp method generally performs well across the scenarios considered, although its performance is degraded when the study sizes vary considerably and the between-study heterogeneity is low.

For the 95% prediction intervals, they find that all the methods investigated perform well only when there is large heterogeneity and the study sizes are similar. In other settings, with low heterogeneity and particularly with varied study sizes, the methods perform poorly. Consequently, they recommend caution in using such prediction intervals, and advocate further research to examine whether improved methods could be developed.
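To make the two kinds of interval concrete, here is a minimal Python sketch of how they might be computed from study estimates and their within-study variances. It is not the authors' code, and several choices are my own assumptions for illustration: the DerSimonian-Laird estimator of the between-study variance (the paper compares several estimators), the prediction interval of the form mean ± t(k-2) × sqrt(tau² + var(mean)) with the conventional variance of the mean, and the function names `t_quantile` and `random_effects_intervals`, which are hypothetical. The t quantile is found numerically so that the sketch needs only the standard library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the numerical CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def random_effects_intervals(y, v, level=0.95):
    """y: study effect estimates; v: their within-study variances (k >= 3)."""
    k = len(y)
    if k < 3:
        raise ValueError("need at least 3 studies for the prediction interval")
    p = 1 - (1 - level) / 2

    # DerSimonian-Laird estimate of the between-study variance (an assumption;
    # other estimators such as REML could be substituted here).
    w = [1 / vi for vi in v]
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q_stat = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q_stat - (k - 1)) / c)

    # Random-effects weights and pooled mean.
    ws = [1 / (vi + tau2) for vi in v]
    sws = sum(ws)
    mu = sum(wi * yi for wi, yi in zip(ws, y)) / sws

    # Hartung-Knapp: rescale the variance of the mean and use t with k-1 df.
    var_conv = 1 / sws
    hk_scale = sum(wi * (yi - mu) ** 2 for wi, yi in zip(ws, y)) / (k - 1)
    se_hk = math.sqrt(hk_scale * var_conv)
    t_ci = t_quantile(p, k - 1)
    ci = (mu - t_ci * se_hk, mu + t_ci * se_hk)

    # Prediction interval for a new true effect, t with k-2 df, using the
    # conventional variance of the mean (an assumption of this sketch).
    t_pi = t_quantile(p, k - 2)
    half = t_pi * math.sqrt(tau2 + var_conv)
    pi = (mu - half, mu + half)

    return {"mu": mu, "tau2": tau2, "ci": ci, "pi": pi}
```

Calling `random_effects_intervals([0.1, 0.3, 0.5, 0.2, 0.8], [0.04, 0.05, 0.03, 0.06, 0.05])` on these made-up inputs returns the pooled mean, the heterogeneity estimate and both intervals. When the estimated tau² is zero, the prediction interval in this sketch is driven entirely by the uncertainty in the mean, which gives some intuition for why coverage can suffer when heterogeneity is low.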

I am glad the authors have finally conceded the lack of utility of this approach. I did point this out right after they proposed this:

http://www.bmj.com/rapid-response/2011/11/03/case-kings-new-clothes-there-no-such-thing-random-effects-meta-analysis

They didn't conclude it had a lack of utility! They concluded that a range of estimators for prediction intervals of a new effect had sub-standard frequentist properties in certain scenarios, and therefore that more work needs to be done to find estimators of prediction intervals with better performance.