This post was prompted by a tweet by Frank Harrell yesterday asking:
In this post I’ll say a little bit about trying to answer Frank’s question, and then a little bit about an alternative question which I posed in response, namely, how does the interpretation change if the interval is a Bayesian credible interval, rather than a frequentist confidence interval.
Frequentist confidence intervals
A frequentist 95% confidence interval is constructed such that, if the model assumptions are correct and you were to (hypothetically) repeat the experiment or sampling many times, 95% of the intervals constructed would contain the true value of the parameter. I made a short video last year which performs a simulation in R to demonstrate this definition/idea:
Frank asked how one interprets a particular realised confidence interval (0.72,0.91). The difficulty (as he is well aware!) is that all a frequentist can say is that this particular realised interval either does or does not contain the true parameter value, and they cannot tell you if it does or doesn’t. All they can say is that if modelling assumptions are correct, in hypothetical repetitions 95% of the intervals constructed using this procedure contain the true value.
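The repeated-sampling definition above can be sketched with a short simulation. This is a minimal Python sketch rather than the R code from the video, and everything in it is an assumption chosen for illustration: a normal model with known standard deviation, a made-up true mean of 0.8, and the hypothetical function names z_interval and coverage.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Confidence interval for a normal mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = fmean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(true_mean=0.8, sigma=0.3, n=40, repetitions=10_000, seed=2022):
    """Proportion of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # One hypothetical repetition of the study.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= true_mean <= upper
    return hits / repetitions


if __name__ == "__main__":
    print(f"Proportion of intervals containing the true mean: {coverage():.3f}")
```

The proportion printed should be close to 0.95, but each individual interval inside the loop either contains 0.8 or it does not, which is exactly the difficulty with interpreting a single realised interval.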
Bayesian credible intervals
Let’s now suppose that we’ve done a Bayesian analysis. We’ve specified a prior distribution for the parameter, based on prior evidence, our subjective beliefs about the value of the parameter, or perhaps we used a default ‘non-informative’ prior built into our software package. We use the same model as before, and Bayes theorem gives us the posterior distribution. A Bayesian posterior credible interval is constructed, and suppose it gives us some values. For the sake of simplicity, I’ll assume the interval is again 0.72 to 0.91, but this is not done to suggest a Bayesian analysis credible interval will generally be identical to the frequentist’s confidence interval.
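As a concrete sketch of the steps just described, here is a minimal Python example for one simple case: a normal mean with known standard deviation and a normal prior, where the posterior is available in closed form. The model, the prior settings, the data values and the function names (posterior_for_mean, credible_interval) are all assumptions made up for illustration, not the analysis behind the 0.72 to 0.91 interval.

```python
from statistics import NormalDist, fmean


def posterior_for_mean(sample, sigma, prior_mean, prior_sd):
    """Posterior for a normal mean with known sigma and a normal prior."""
    n = len(sample)
    prior_precision = 1 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * fmean(sample))
    return NormalDist(post_mean, post_var ** 0.5)


def credible_interval(posterior, level=0.95):
    """Equal-tailed credible interval from the posterior quantiles."""
    tail = (1 - level) / 2
    return posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail)


if __name__ == "__main__":
    data = [0.70, 0.85, 0.91, 0.78, 0.83]  # hypothetical observations
    post = posterior_for_mean(data, sigma=0.1, prior_mean=0.5, prior_sd=0.2)
    print(credible_interval(post))
```

With a very diffuse prior (large prior_sd) this interval approaches the known-sigma frequentist interval for the same data, but with an informative prior the two will generally differ.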
How should we interpret this credible interval? A Bayesian would, I think, say something like: it is an interval within which there is a 95% chance or probability that the true parameter lies. At this point we must ask, what do they mean by 95% probability?
We could interpret it as a classical long run frequentist probability, but this means interpreting it like a confidence interval. Bayesian procedures do often have good frequentist properties. For example see Wang and Robins 1998 for an analysis of the frequentist properties of multiple imputation for missing data, or Bartlett and Keogh 2018 for a simulation investigation of the frequentist properties of Bayesian approaches for handling covariate measurement error. In fact, under certain conditions, Bayesian procedures achieve the same frequentist properties as maximum likelihood methods when the sample size gets large – see Chapter 4 of Gelman et al's excellent Bayesian Data Analysis book.
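To make the idea of checking the frequentist properties of a Bayesian procedure concrete, here is a minimal Python sketch that estimates the repeated-sampling coverage of a 95% credible interval. It is not taken from any of the papers cited above: the conjugate normal model with known standard deviation, the true mean of 0.8, the prior settings and the function names are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean


def credible_interval_covers(sample, sigma, prior_mean, prior_sd, truth, level=0.95):
    """True if the normal-normal credible interval contains the true mean."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = len(sample) / sigma ** 2
    total_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * fmean(sample)) / total_precision
    post_sd = (1 / total_precision) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return post_mean - z * post_sd <= truth <= post_mean + z * post_sd


def credible_interval_coverage(truth=0.8, sigma=0.3, n=40, prior_mean=0.5,
                               prior_sd=0.5, repetitions=10_000, seed=1998):
    """Proportion of repeated samples whose credible interval contains the truth."""
    rng = random.Random(seed)
    hits = sum(
        credible_interval_covers(
            [rng.gauss(truth, sigma) for _ in range(n)],
            sigma, prior_mean, prior_sd, truth,
        )
        for _ in range(repetitions)
    )
    return hits / repetitions


if __name__ == "__main__":
    print(f"Weak prior: {credible_interval_coverage():.3f}")
    print(f"Strong, badly centred prior: "
          f"{credible_interval_coverage(prior_mean=0.0, prior_sd=0.02):.3f}")
```

In this toy setting the weakly informative prior gives coverage close to 95%, while a strong prior centred far from the true value gives very poor coverage, which illustrates why the good frequentist behaviour only holds under certain conditions.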
But conceptually we do not choose to do a Bayesian analysis simply as a means to performing frequentist inference. We choose it because it (hopefully) answers more directly what we are interested in (see Frank Harrell’s ‘My Journey From Frequentist to Bayesian Statistics’ post). Namely, it enables us to make probability statements about the unknown parameter given our model, the prior, and the data we have observed. So what is the interpretation of the 95% chance or probability for a credible interval?
I don’t know the no doubt large literature on this topic well at all, but the Bayesian’s interpretation or definition of probability isn’t that clear to me. The Wikipedia entry on Bayesian probability says:
Broadly speaking, there are two interpretations of Bayesian probability. For objectivists, who interpret probability as an extension of logic, probability quantifies the reasonable expectation that everyone (even a “robot”) who shares the same knowledge should share in accordance with the rules of Bayesian statistics, which can be justified by Cox’s theorem. For subjectivists, probability corresponds to a personal belief. Rationality and coherence allow for substantial variation within the constraints they pose; the constraints are justified by the Dutch book argument or by decision theory and de Finetti’s theorem. The objective and subjective variants of Bayesian probability differ mainly in their interpretation and construction of the prior probability.
(Wikipedia entry on Bayesian probability)
Section 1.5 ‘Probability as a measure of uncertainty’ of Bayesian Data Analysis talks about the way Bayesian analysis uses probability as a measure of uncertainty, but to my mind it doesn’t really define the concept. This is not a criticism. As Gelman et al say earlier in their book:
Rather than argue the foundations of statistics—see the bibliographic note at the end of this chapter for references to foundational debates—we prefer to concentrate on the pragmatic advantages of the Bayesian framework, whose flexibility and generality allow it to cope with complex problems.
If part of the appeal of Bayesian inference is that it answers the question we really want (i.e. conditional on what we’ve seen, what do we know / believe about the parameter(s)), it seems to me that the interpretation or definition of prior/posterior probabilities should be relatively straightforward and clear to us. But for me at least, it isn’t. I am quite confident (whatever I mean by that!) that this reflects my ignorance on the topic. Part of my motivation for writing this post is the hope that people will help me understand better how to unambiguously define what is meant by a Bayesian prior/posterior probability. Please write a comment if you can help in this regard.
The above does not mean I don’t like Bayesian methods. Indeed, for much of the last 10 years I have been working with and using methods like multiple imputation for missing data, whose development took place in the Bayesian paradigm. For me this is fine because I know that methods like multiple imputation have good frequentist properties, and while there are definitely interpretational issues with things like confidence intervals, I at least think I understand what they claim to do/be.
22nd November 2022 postscript
This evening there was further discussion of this topic on Twitter. As part of this Frank Harrell offered an interpretation for the Bayesian credible interval as follows:
I followed up with him as to the nature of the probability being referred to here, since it is clear that the probability notion being invoked is broader than, or distinct from, the relative frequency notion of probability. Frank helpfully pointed me in the direction of the entry for ‘probability’ in his course glossary of terms, which can be accessed here. Part of this entry says:
The meaning attached to the metric known as a probability is up to the user; it can represent long-run relative frequency of repeatable observations, a degree of belief, or a measure of veracity or plausibility.
(https://hbiostat.org/doc/glossary.pdf)
There are other schools of probability that do not require the notion of replication at all. For example, the school of subjective probability (associated with the Bayesian school) “considers probability as a measure of the degree of belief of a given subject in the occurrence of an event or, more generally, in the veracity of a given assertion” (see P. 55 of [5]). de Finetti defined subjective probability in terms of wagers and odds in betting. A risk-neutral individual would be willing to wager $P that an event will occur when the payoff is $1 and her subjective probability is P for the event.
(https://hbiostat.org/doc/glossary.pdf)
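The betting definition in the quote can be written out as a small piece of arithmetic. The following Python sketch is my own illustration rather than anything from the glossary, and the function name expected_gain is hypothetical: it computes the expected net gain, under the individual's own degree of belief, from paying a stake to receive a payoff if the event occurs.

```python
def expected_gain(stake, belief, payoff=1.0):
    """Expected net gain from paying `stake` to win `payoff` if the event occurs.

    `belief` is the bettor's subjective probability that the event occurs.
    """
    return belief * payoff - stake


if __name__ == "__main__":
    belief = 0.95  # e.g. degree of belief that the parameter lies in the interval
    for stake in (0.90, 0.95, 0.99):
        print(f"stake ${stake:.2f}: expected gain {expected_gain(stake, belief):+.2f}")
```

The expected gain is zero exactly when the stake equals the subjective probability, so for a risk-neutral individual $P is the price at which they are indifferent between taking and refusing the bet. Applied to a 95% credible interval, and assuming the posterior really does represent the individual's beliefs, they would regard $0.95 as the fair price for a $1 bet that the parameter lies in the interval.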
As I wrote in the original post, I do not know the extensive literature on this topic well at all. But what Frank has summarised here is really useful in aiding my understanding of what the interpretation should be of the 95% probability statement attached to a 95% credible interval. It seems to me the notion of probability being invoked when interpreting a 95% credible interval has to be the subjective probability / degree of belief one described in the previous quote. Whether this definition/interpretation meets the criterion of being ‘exact’, as required by Frank in his tweet at the top of this post when asking for the exact interpretation of a particular realised frequentist confidence interval, I leave readers to decide.