Should we be using credible intervals more routinely?
Jason Oke, Luke Allen, José M. Ordóñez-Mena
23 July 2019
Research reviews & expert opinions
A colleague recently remarked on the increasing use of credible intervals. This made him reflect on his understanding of confidence intervals, and wonder whether we as researchers should be using credible intervals more routinely.
Credible intervals are the name given to uncertainty intervals constructed from a Bayesian analysis. Confidence intervals come from a frequentist analysis (this video discusses both). The Bayesian interval has what appears to be a considerable advantage in that it is a probability statement about the parameter of interest: with 95% probability the true value lies within the interval. In contrast, the frequentist version is a statement about repeated intervals: 95% of all intervals we could calculate will contain the true value. However, the frequentist interval is arguably more objective, depending only on the observed data. Even the interpretation of what probability is differs between Bayesians and frequentists, which led one leading proponent of the Bayesian approach to claim that “Probability does not exist”, except subjectively within the minds of individuals (1).
But the methods are closer than all that suggests. Both calculate the probability of the data given the model (the likelihood), but the Bayesian interval requires you to define a prior probability and then combine the two via Bayes' theorem. The posterior probability is said to be proportional to the prior times the likelihood, i.e.
posterior ∝ prior × likelihood
Different forms of evidence can be used for the prior and my prior does not need to be the same as your prior, nor does it have to come before the data (2). When the prior carries little information, credible and confidence intervals look very similar and can lead to similar conclusions. When the prior is informative, the credible interval will be narrower than the confidence interval, with its centre shifted towards the prior. We illustrate how this happens with an example adapted from (3).
Suppose, as part of a routine health check, a woman has her blood pressure taken and, at 145 mmHg, is suspected of being hypertensive. The test is repeated and this time it is much lower (135 mmHg). We are interested in knowing her usual blood pressure. These two data points (m = 2) form the likelihood, which can be thought of in terms of a normal distribution with mean $\bar{y}_m$ (the mean of the m readings) and variance $\sigma^2/m$, i.e.
$$\text{likelihood} \propto N\!\left(\bar{y}_m,\ \frac{\sigma^2}{m}\right)$$
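Assuming we know the within-person standard deviation of blood pressure measurement to be σ = 6 mmHg (a variance of σ² = 6² = 36), the likelihood has mean (145 + 135)/2 = 140 and variance 6²/m = 36/2 = 18, and the 95% confidence interval is then
$$140 \pm 1.96 \times \sqrt{18} = (131.7,\ 148.3)\ \text{mmHg}$$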
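The arithmetic for the frequentist interval can be checked with a few lines of Python. This is only a minimal sketch: the function name `confidence_interval` is ours, and it assumes, as in the example, that the within-person standard deviation (6 mmHg) is known, so a normal quantile is used in place of a t quantile.

```python
import math
from statistics import NormalDist


def confidence_interval(readings, sigma, level=0.95):
    """Normal-theory interval for a mean when the measurement SD (sigma) is known."""
    m = len(readings)
    mean = sum(readings) / m
    standard_error = sigma / math.sqrt(m)       # sqrt(sigma^2 / m)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return mean - z * standard_error, mean + z * standard_error


# Two readings from the example, with an assumed known SD of 6 mmHg
ci_lower, ci_upper = confidence_interval([145, 135], sigma=6)
print(f"95% confidence interval: {ci_lower:.1f} to {ci_upper:.1f} mmHg")
```

With the two readings above this gives an interval of roughly 131.7 to 148.3 mmHg, centred on the sample mean of 140.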
But we know a lot about blood pressure in women of this age, and we have an expectation of what it should be. Suppose our experience suggests that on average blood pressure is 120 mmHg with variance 12² = 144. Although we don’t have to, assuming a normal distribution for the prior with mean µ = 120 and variance 144 = σ²/n0 makes the posterior also normal and easy to derive. Choosing to parameterise the prior variance this way gives us a way to define the “effective” sample size of the prior (n0). A little algebra shows us that n0 = 6²/144 = 0.25. This may be surprising because our prior could have come from observations of thousands of people, but it is worth only one quarter of an observation in this calculation. Multiplying a normal prior with a normal likelihood gives a normal posterior:
$$\text{posterior} = N\!\left(\frac{n_0\,\mu + m\,\bar{y}_m}{n_0 + m},\ \frac{\sigma^2}{n_0 + m}\right)$$
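This shows that the posterior has a mean which is the average of the prior and likelihood means, weighted by the “sample sizes” of the prior (n0) and likelihood (m). The posterior blood pressure then has mean and variance
$$\frac{0.25 \times 120 + 2 \times 140}{0.25 + 2} = 137.8, \qquad \frac{6^2}{0.25 + 2} = 16$$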
The credible interval can then be found in the same way as we did for the frequentist interval, but substituting the posterior mean and variance:
$$137.8 \pm 1.96 \times \sqrt{16} = (129.9,\ 145.6)\ \text{mmHg}$$
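The same calculation can be sketched in Python. The function names `normal_posterior` and `credible_interval` are ours, for illustration only, and the sketch assumes the setup described above: a normal prior with mean 120 and standard deviation 12, a known measurement standard deviation of 6 mmHg, and the two readings of 145 and 135.

```python
import math
from statistics import NormalDist


def normal_posterior(readings, sigma, prior_mean, prior_sd):
    """Conjugate normal update with known measurement SD (sigma)."""
    m = len(readings)
    y_bar = sum(readings) / m
    n0 = sigma**2 / prior_sd**2                 # effective sample size of the prior
    post_mean = (n0 * prior_mean + m * y_bar) / (n0 + m)
    post_var = sigma**2 / (n0 + m)
    return n0, post_mean, post_var


def credible_interval(post_mean, post_var, level=0.95):
    """Central interval of a normal posterior."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(post_var)
    return post_mean - half_width, post_mean + half_width


n0, post_mean, post_var = normal_posterior(
    [145, 135], sigma=6, prior_mean=120, prior_sd=12
)
cri_lower, cri_upper = credible_interval(post_mean, post_var)
print(f"n0 = {n0}, posterior mean = {post_mean:.1f}, posterior variance = {post_var:.1f}")
print(f"95% credible interval: {cri_lower:.1f} to {cri_upper:.1f} mmHg")
```

Under these assumptions the prior counts for a quarter of an observation, the posterior mean is about 137.8 mmHg, and the interval runs from roughly 129.9 to 145.6 mmHg.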
As expected, this interval is narrower than the confidence interval and shifted towards the prior. In general, the credible interval will be more precise because we are adding information or data. This can be seen more clearly if we compare the widths of the confidence and credible intervals: the former is based only on m, while the latter divides by a larger number (m + n0). This implies that for any σ² > 0 and n0 > 0, the confidence interval will be wider than the credible interval because
$$2 \times 1.96\,\frac{\sigma}{\sqrt{m}} \;>\; 2 \times 1.96\,\frac{\sigma}{\sqrt{m + n_0}}$$
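To see how the width depends on the prior, the short Python sketch below tabulates the interval half-width for several values of n0, using σ = 6 and m = 2 from the example. The helper `half_width` and the chosen n0 values are ours, for illustration; setting n0 = 0 corresponds to a prior carrying no information and reproduces the confidence interval.

```python
import math


def half_width(sigma, m, n0=0.0, z=1.96):
    """Half-width of the 95% interval; n0 = 0 gives the confidence interval."""
    return z * sigma / math.sqrt(m + n0)


sigma, m = 6, 2
prior_sizes = [0.0, 0.25, 1.0, 4.0]   # illustrative effective prior sample sizes
widths = [half_width(sigma, m, n0) for n0 in prior_sizes]

for n0, w in zip(prior_sizes, widths):
    print(f"n0 = {n0:<4}  half-width = {w:.2f} mmHg")
```

The half-width shrinks steadily as n0 grows, from about 8.32 mmHg with no prior information to 7.84 mmHg at n0 = 0.25, the value used in the example.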
So, should researchers be using credible intervals more routinely? Strict interpretation aside, we can think of confidence intervals as a special case of a credible interval when the prior carries little or no information. The question should then perhaps be, should we be using priors more routinely in our analyses?
We would like to thank Richard Stevens for his helpful comments.
1. Nau RF. De Finetti was Right: Probability Does Not Exist. Theory and Decision. 2001;51(2):89-124.
2. Lindsey JK. Some Statistical Heresies. Journal of the Royal Statistical Society Series D (The Statistician). 1999;48(1):1-40.
3. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Statistics in Practice. John Wiley & Sons Ltd.; 2004.