Learning Objectives
By the end of this section, you will be able to:
📚 Core Knowledge
- Define credible intervals and their probability interpretation
- Distinguish between equal-tailed and HPD intervals
- Explain why HPD intervals are always shortest
- Contrast credible intervals with frequentist confidence intervals
🔧 Practical Skills
- Compute credible intervals for common conjugate posteriors
- Calculate intervals from MCMC samples
- Choose between HPD and equal-tailed based on context
- Communicate Bayesian uncertainty to stakeholders
🧠 Deep Learning Connections
- Prediction Uncertainty: Credible intervals for neural network predictions via MC Dropout or Bayesian layers
- Calibrated Uncertainty: Unlike raw softmax probabilities, which are often overconfident, Bayesian credible intervals can give calibrated uncertainty estimates when the model and prior are well specified
- Safe AI: Credible intervals enable "I don't know" responses when models are uncertain
- Hyperparameter Uncertainty: Bayesian optimization uses credible intervals to balance exploration and exploitation
Where You'll Apply This: A/B testing (determining if a change is truly better), medical trials (efficacy bounds with honest probability statements), autonomous systems (safe decision-making under uncertainty), recommendation systems (Thompson Sampling), and any application where you need to communicate "how confident are we?"
The Big Picture: Honest Uncertainty
A credible interval is a Bayesian concept that directly answers the question most practitioners actually want to ask: "Given my data, what range of values is the parameter likely to take?"
Unlike frequentist confidence intervals, credible intervals allow us to make direct probability statements about parameters. We can legitimately say: "There is a 95% probability that the true parameter lies between 0.42 and 0.68." This is the interpretation that people intuitively (but incorrectly) apply to confidence intervals.
The Bayesian Promise
Given the observed data and our prior beliefs, a 95% credible interval $[a, b]$ satisfies:
$$P(a \le \theta \le b \mid \text{data}) = 0.95$$
The probability that the parameter lies in this interval is literally 95%.
Historical Motivation
The concept of quantifying uncertainty about parameters dates to Thomas Bayes and Pierre-Simon Laplace in the 18th century. However, for much of the 20th century, the frequentist paradigm dominated due to computational constraints. Modern computing power has revived Bayesian methods, and credible intervals are now standard in:
- Pharmaceutical trials: FDA accepts Bayesian analyses for drug approval
- Tech industry: Bayesian A/B testing at Google, Microsoft, Netflix
- Machine learning: Bayesian neural networks, Gaussian processes, uncertainty quantification
- Climate science: IPCC reports use Bayesian credible intervals for predictions
Mathematical Definition
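A 100(1-α)% credible interval for parameter $\theta$ is any interval $[a, b]$ such that:
$$P(a \le \theta \le b \mid \text{data}) = \int_a^b p(\theta \mid \text{data})\, d\theta = 1 - \alpha$$
The interval contains (1-α) of the posterior probability mass.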
There are infinitely many intervals satisfying this condition. In practice, we use two main types:
Equal-Tailed Intervals
The equal-tailed interval excludes equal probability mass (α/2) from each tail of the posterior:
$$\left[\, F^{-1}(\alpha/2),\; F^{-1}(1 - \alpha/2) \,\right]$$
where $F^{-1}$ is the inverse CDF (quantile function) of the posterior.
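As a minimal sketch of the equal-tailed definition, the snippet below reads both endpoints off the posterior's quantile function. It assumes SciPy is available and uses a Beta(3, 10) posterior purely as an illustrative right-skewed example; the helper name `equal_tailed_interval` is hypothetical.

```python
from scipy import stats

def equal_tailed_interval(posterior, level=0.95):
    """Return (lower, upper) cutting (1 - level) / 2 from each tail.

    `posterior` is any frozen scipy.stats distribution exposing `.ppf`,
    SciPy's name for the inverse CDF (quantile function).
    """
    alpha = 1.0 - level
    return posterior.ppf(alpha / 2), posterior.ppf(1 - alpha / 2)

# Assumed example posterior: Beta(3, 10), right-skewed on [0, 1].
posterior = stats.beta(3, 10)
lower, upper = equal_tailed_interval(posterior, level=0.95)
print(f"95% equal-tailed interval: [{lower:.4f}, {upper:.4f}]")
```

Because the endpoints are plain quantiles, the same helper should work for any posterior whose quantile function is available in closed form or numerically.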
Highest Posterior Density (HPD) Intervals
The HPD interval is the shortest interval containing the specified probability mass (for a unimodal posterior; with several modes the HPD region can be a union of disjoint intervals). It has a remarkable property:
HPD Property
Every point inside the HPD has higher posterior density than every point outside
| Property | Equal-Tailed | HPD |
|---|---|---|
| Tail probability | Equal (α/2 each) | Unequal (varies) |
| Width | Not necessarily shortest | Always shortest |
| Symmetry | Symmetric tails | Adapts to skewness |
| Computation | Simple (quantiles) | Optimization needed |
| For symmetric posteriors | Same as HPD | Same as equal-tailed |
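The "optimization needed" entry can be made concrete with a small sketch. One possible approach, assumed here rather than taken from the original implementation, is to search over how much mass is left in the lower tail and keep the candidate with the smallest width. The function name `hpd_interval` and the Beta(3, 10) posterior are illustrative assumptions, and the search is only valid for unimodal posteriors.

```python
from scipy import optimize, stats

def hpd_interval(posterior, level=0.95):
    """Shortest interval holding `level` mass, for a unimodal posterior.

    Searches over the mass p left in the lower tail; every candidate
    interval [ppf(p), ppf(p + level)] holds the same mass by construction.
    """
    alpha = 1.0 - level

    def width(p):
        return posterior.ppf(min(p + level, 1.0)) - posterior.ppf(p)

    result = optimize.minimize_scalar(width, bounds=(0.0, alpha), method="bounded")
    p_best = result.x
    return posterior.ppf(p_best), posterior.ppf(min(p_best + level, 1.0))

# Assumed example posterior: Beta(3, 10), right-skewed.
posterior = stats.beta(3, 10)
hpd_lo, hpd_hi = hpd_interval(posterior)
et_lo, et_hi = posterior.ppf(0.025), posterior.ppf(0.975)

print(f"HPD:          [{hpd_lo:.4f}, {hpd_hi:.4f}]  width {hpd_hi - hpd_lo:.4f}")
print(f"Equal-tailed: [{et_lo:.4f}, {et_hi:.4f}]  width {et_hi - et_lo:.4f}")
```

At the optimum the posterior density is approximately equal at both endpoints, which is the HPD property stated above expressed as a first-order condition.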
Interactive: HPD vs Equal-Tailed
Explore how HPD and equal-tailed intervals differ for various posterior shapes. Try the presets or adjust the parameters manually to see when HPD provides significant width savings.
[Interactive demo: HPD vs equal-tailed credible intervals. Compares the shortest interval (HPD) with the equal-tailed interval, which excludes 2.5% of the probability from each tail, and reports each interval's width and the width savings of HPD. The captured example posterior is right-skewed (skewness 0.64) with posterior mean E[θ|data] = 0.2308 and posterior standard deviation 0.1126.]
Key Insight: For symmetric unimodal posteriors, HPD and equal-tailed intervals coincide, and when skewness is near 0 they are nearly identical. For skewed posteriors, HPD provides a shorter interval by shifting toward the mode. The width savings measures how much narrower the HPD interval is for a given posterior.
Confidence vs Credible Intervals
The distinction between frequentist confidence intervals and Bayesian credible intervals is philosophically profound yet often confused. Here's the core difference:
Frequentist Confidence Interval
The parameter is fixed (but unknown). The interval is random (varies with sample).
Interpretation: "If we repeated this experiment many times, 95% of the computed intervals would contain the true θ."
Cannot make probability statements about THIS specific interval containing θ.
Bayesian Credible Interval
The parameter is random (has a distribution). The interval is fixed once computed.
Interpretation: "Given my data and prior, there is a 95% probability that θ lies in [a, b]."
CAN make direct probability statements about the parameter.
Interactive: The Two Paradigms
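[Interactive demo: confidence intervals vs credible intervals, two fundamentally different interpretations of interval estimates. The frequentist panel draws repeated samples and reports the coverage rate of the resulting confidence intervals (in the captured run, 24 of 25 intervals contain the true θ, a 96% coverage rate). The Bayesian panel reports a single 95% credible interval ([44.2, 59.8] in the captured run, width 15.68).]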
Key Philosophical Differences
| Aspect | Confidence Interval | Credible Interval |
|---|---|---|
| Parameter Status | Fixed but unknown | Random variable |
| Interval Status | Random (varies by sample) | Fixed once computed |
| Probability Statement | About the procedure | About the parameter |
| Requires Prior? | No | Yes |
| Interpretation | "95% of CIs will contain θ" | "95% probability θ is here" |
The Practical Reality: With uninformative priors and large samples, credible intervals and confidence intervals often give nearly identical numerical results. The philosophical difference matters most when making decisions about specific intervals or when incorporating prior knowledge is important.
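The numerical agreement is easy to check on a toy case. The sketch below compares a 95% Wald confidence interval for a proportion with a 95% equal-tailed credible interval under a uniform Beta(1, 1) prior. The data (350 successes in 1000 trials) are hypothetical, and the Wald interval is just one of several frequentist intervals that could be used here.

```python
import math
from scipy import stats

n, successes = 1000, 350          # hypothetical large-sample data
p_hat = successes / n

# Frequentist 95% Wald confidence interval for a proportion.
z = stats.norm.ppf(0.975)
half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
confidence = (p_hat - half_width, p_hat + half_width)

# Bayesian 95% equal-tailed credible interval under a uniform Beta(1, 1) prior.
posterior = stats.beta(1 + successes, 1 + n - successes)
credible = (posterior.ppf(0.025), posterior.ppf(0.975))

print(f"Confidence interval: [{confidence[0]:.4f}, {confidence[1]:.4f}]")
print(f"Credible interval:   [{credible[0]:.4f}, {credible[1]:.4f}]")
```

The two intervals are numerically close in this setting, but the statements they license remain different: one is about the procedure, the other about θ given this dataset.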
Computing Credible Intervals
Step-by-Step Workflow
- Specify the prior: Choose a prior distribution reflecting your beliefs before seeing data. For minimal influence, use an uninformative prior.
- Collect data: Observe the outcomes of your experiment or study.
- Compute the posterior: Use Bayes' theorem to update your beliefs. For conjugate priors, this is closed-form; otherwise, use MCMC.
- Extract the interval: Compute equal-tailed quantiles or search for the HPD interval from the posterior distribution or MCMC samples.
- Interpret and communicate: State the result as a direct probability statement about the parameter.
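The five steps above can be sketched end to end for the simplest conjugate case, a Beta prior with binomial data. The prior Beta(2, 2) and the data (11 successes in 30 trials) are assumptions chosen for illustration, not values prescribed by the workflow.

```python
from scipy import stats

# Step 1: prior. Beta(2, 2) is an assumed, weakly informative choice.
prior_a, prior_b = 2.0, 2.0

# Step 2: data. Hypothetical outcome: 11 successes in 30 trials.
successes, trials = 11, 30

# Step 3: posterior. Beta prior + binomial likelihood is conjugate,
# so the update is closed-form: add successes to a, failures to b.
post_a = prior_a + successes
post_b = prior_b + (trials - successes)
posterior = stats.beta(post_a, post_b)

# Step 4: interval. Equal-tailed quantiles of the posterior.
level = 0.95
lower = posterior.ppf((1 - level) / 2)
upper = posterior.ppf((1 + level) / 2)

# Step 5: interpret as a direct probability statement about theta.
print(f"Posterior: Beta({post_a:g}, {post_b:g}), mean {posterior.mean():.4f}")
print(f"Given the data and prior, there is a {level:.0%} probability "
      f"that theta lies in [{lower:.3f}, {upper:.3f}].")
```

For a non-conjugate model, step 3 would instead produce MCMC draws and step 4 would take empirical quantiles of those draws.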
Interactive: Bayesian Workflow
Walk through the complete workflow from prior to credible interval. Use the step navigation to see how each component contributes to the final result.
[Interactive demo: step-by-step computation of a credible interval, following the Bayesian workflow from prior to posterior to credible interval. Step 1 chooses a prior with parameters α and β that encodes beliefs about the parameter before seeing data: a higher α pushes the prior toward 1, and a higher β pushes it toward 0.]
What Determines Interval Width?
The width of a credible interval (and thus the precision of our inference) depends on three main factors:
Sample Size (n)
Width scales as $1/\sqrt{n}$. Doubling the sample size multiplies the width by $1/\sqrt{2} \approx 0.71$, a reduction of roughly 30%.
Prior Strength
Strong informative priors can narrow intervals if they match the data, or widen/shift them if they conflict.
Credible Level
A higher credible level requires a wider interval. A 99% credible interval is always wider than a 95% credible interval of the same type for the same posterior.
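These effects can be checked numerically. The sketch below is an illustration under stated assumptions: a Beta posterior with a uniform prior, and an observed success count fixed at exactly `rate * n` so that the effect of sample size is not blurred by sampling noise. The function name `interval_width` and the rate 0.35 are hypothetical.

```python
from scipy import stats

def interval_width(n, rate=0.35, level=0.95, prior_a=1.0, prior_b=1.0):
    """Width of the equal-tailed interval for a Beta posterior.

    Assumes the observed success count is exactly round(rate * n).
    """
    successes = round(rate * n)
    posterior = stats.beta(prior_a + successes, prior_b + n - successes)
    tail = (1 - level) / 2
    return posterior.ppf(1 - tail) - posterior.ppf(tail)

for n in (100, 200, 400, 800):
    print(f"n = {n:4d}  width = {interval_width(n):.4f}")

# Doubling n should shrink the width by a factor near 1/sqrt(2).
ratio = interval_width(800) / interval_width(400)
print(f"width(800) / width(400) = {ratio:.3f}")

# Raising the credible level widens the interval for the same posterior.
wider = interval_width(400, level=0.99) > interval_width(400, level=0.95)
print(f"99% wider than 95%: {wider}")
```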
Interactive: Width Dynamics
[Interactive demo: what affects credible interval width. Shows how sample size, prior strength, and credible level determine interval precision, with plots of interval width vs sample size and of posterior distributions at different sample sizes. More data gives narrower intervals (sqrt(n) scaling). In the captured run the posterior mean is 0.3824 against a true θ of 0.35, with an effective n of 34 = data (30) + prior (4).]
Key Insight: Width ~ 1/sqrt(n)
Doubling sample size reduces interval width by approximately 30%. This is the fundamental sqrt(n) convergence rate of statistical inference.
Real-World Applications
Deep Learning Applications
Credible intervals are increasingly important in modern deep learning for uncertainty quantification. Here are key applications:
🎲 MC Dropout Uncertainty
Run inference multiple times with dropout enabled. The distribution of the resulting predictions approximates a posterior predictive distribution over outputs. A 95% credible interval computed from these predictions quantifies epistemic uncertainty - "what the model doesn't know."
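The mechanics can be sketched in plain NumPy without a deep learning framework. Everything below is a stand-in: the "network" is a single hidden layer with random, untrained weights, and the names `forward_with_dropout`, `W1`, and `W2` are hypothetical. The point is only to show dropout kept on at inference time and an interval read off the resulting draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network: one hidden layer, random weights.
W1 = rng.normal(size=(1, 64))
b1 = rng.normal(size=64)
W2 = rng.normal(size=(64, 1)) / 8.0

def forward_with_dropout(x, drop_prob=0.5):
    """One stochastic forward pass with dropout left on at inference time."""
    hidden = np.maximum(0.0, x @ W1 + b1)            # ReLU layer
    mask = rng.random(hidden.shape) >= drop_prob     # keep each unit w.p. 1 - drop_prob
    hidden = hidden * mask / (1.0 - drop_prob)       # inverted-dropout rescaling
    return (hidden @ W2).squeeze(-1)

x = np.array([[0.3]])                                # a single input
draws = np.array([forward_with_dropout(x) for _ in range(1000)]).ravel()

# Equal-tailed 95% interval over the stochastic predictions.
lower, upper = np.quantile(draws, [0.025, 0.975])
print(f"mean prediction {draws.mean():.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

In a real framework the same idea amounts to leaving the dropout layers in training mode during prediction and collecting the outputs of repeated forward passes.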
🔄 Bayesian Neural Networks
Maintain full posteriors over weights instead of point estimates. Predictions integrate over weight uncertainty, naturally producing credible intervals. Crucial for safety-critical applications like medical diagnosis and autonomous vehicles.
🎯 Bayesian Optimization
Hyperparameter tuning with Gaussian processes produces credible intervals for the objective function at each point. The acquisition function balances exploitation (go where mean is good) and exploration (go where uncertainty is high).
🛡️ Uncertainty-Aware Predictions
Credible intervals enable "I don't know" responses. When the 95% credible interval is too wide, the system can defer to human judgment instead of making overconfident wrong predictions.
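A deferral rule of this kind can be sketched in a few lines. The function name `predict_or_defer` and the threshold `max_width` are hypothetical; in practice the threshold would be set from the cost of a wrong prediction in the application at hand.

```python
import numpy as np

def predict_or_defer(draws, max_width, level=0.95):
    """Return (prediction, interval); prediction is None when deferring.

    `draws` are samples from the predictive distribution, for example
    repeated stochastic forward passes or posterior predictive samples.
    """
    tail = (1 - level) / 2
    lower, upper = np.quantile(draws, [tail, 1 - tail])
    if upper - lower > max_width:
        return None, (float(lower), float(upper))    # too uncertain: defer to a human
    return float(np.mean(draws)), (float(lower), float(upper))

rng = np.random.default_rng(1)
confident = predict_or_defer(rng.normal(1.0, 0.01, size=5000), max_width=1.0)
uncertain = predict_or_defer(rng.normal(1.0, 5.0, size=5000), max_width=1.0)

print("confident case:", confident)
print("uncertain case:", uncertain)
```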
Python Implementation
Here's a comprehensive implementation of credible interval computation for both closed-form posteriors and Monte Carlo samples.
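For the Monte Carlo case, a minimal sketch might look like the following. It is not the original listing: the function names are hypothetical, the draws come from `numpy.random.Generator.beta` as a stand-in for MCMC output, and the HPD routine uses the common sorted-window approach, which assumes a unimodal posterior.

```python
import numpy as np

def equal_tailed_from_samples(samples, level=0.95):
    """Equal-tailed interval from empirical quantiles of the draws."""
    tail = (1 - level) / 2
    lower, upper = np.quantile(samples, [tail, 1 - tail])
    return float(lower), float(upper)

def hpd_from_samples(samples, level=0.95):
    """Shortest window covering `level` of the sorted draws (unimodal case)."""
    sorted_draws = np.sort(np.asarray(samples))
    n = sorted_draws.size
    window = int(np.ceil(level * n))                 # draws inside the interval
    # Width of every contiguous block of `window` sorted draws.
    widths = sorted_draws[window - 1:] - sorted_draws[: n - window + 1]
    start = int(np.argmin(widths))
    return float(sorted_draws[start]), float(sorted_draws[start + window - 1])

# Stand-in for MCMC output: independent draws from a Beta(3, 10) posterior.
rng = np.random.default_rng(42)
samples = rng.beta(3, 10, size=20_000)

et = equal_tailed_from_samples(samples)
hpd = hpd_from_samples(samples)
print(f"Equal-tailed: [{et[0]:.4f}, {et[1]:.4f}]  width {et[1] - et[0]:.4f}")
print(f"HPD:          [{hpd[0]:.4f}, {hpd[1]:.4f}]  width {hpd[1] - hpd[0]:.4f}")
```

With real MCMC output the draws are autocorrelated, so the effective sample size, not the raw number of draws, governs how stable the interval endpoints are.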
Common Pitfalls and Misconceptions
"Credible intervals and confidence intervals are the same thing"
They have fundamentally different interpretations. Credible intervals make probability statements about the parameter; confidence intervals make statements about the procedure. With uninformative priors and large samples they may be numerically similar, but the interpretation always differs.
"HPD intervals are always better than equal-tailed"
HPD is shorter, yes, but the equal-tailed interval has a simpler interpretation ("2.5% in each tail"), is easier to compute, and is preserved under monotone transformations of the parameter (such as taking logs), which HPD is not. For symmetric posteriors they're identical. For skewed posteriors, choose based on what you're communicating - sometimes the equal-tail interpretation is what stakeholders need.
"The prior doesn't matter for credible intervals"
The prior always matters - it's part of the Bayesian model. With lots of data, the prior influence diminishes (Bernstein-von Mises theorem), but with small samples the prior can substantially affect both the center and width of the credible interval. Always report your prior.
"95% credible means 95% coverage in repeated experiments"
Credible intervals make probability statements conditional on the data you observed. They don't guarantee 95% frequentist coverage. A well-calibrated Bayesian model often has good coverage properties, but this isn't guaranteed and isn't the point.
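A small simulation illustrates the gap between credibility and coverage. The sketch below assumes a Beta-binomial model with a fixed true θ = 0.7 and 20 trials per experiment, then counts how often the 95% equal-tailed credible interval contains the true value under two priors: a uniform Beta(1, 1) and a strong Beta(20, 80) prior centred far from the truth. All of these values, and the function name `coverage`, are hypothetical.

```python
import numpy as np
from scipy import stats

def coverage(true_theta, prior_a, prior_b, n=20, reps=5000, level=0.95, seed=0):
    """Fraction of simulated datasets whose credible interval contains true_theta."""
    rng = np.random.default_rng(seed)
    k = rng.binomial(n, true_theta, size=reps)       # simulated success counts
    tail = (1 - level) / 2
    lower = stats.beta.ppf(tail, prior_a + k, prior_b + n - k)
    upper = stats.beta.ppf(1 - tail, prior_a + k, prior_b + n - k)
    return float(np.mean((lower <= true_theta) & (true_theta <= upper)))

flat = coverage(0.7, 1, 1)               # uniform prior
strong_wrong = coverage(0.7, 20, 80)     # strong prior centred near 0.2

print(f"Coverage with Beta(1, 1) prior:   {flat:.3f}")
print(f"Coverage with Beta(20, 80) prior: {strong_wrong:.3f}")
```

Each interval is still a valid 95% credible interval given its prior; the simulation only shows that frequentist coverage is a separate property that depends on how well the prior matches reality.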
Knowledge Check
Test your understanding of Bayesian credible intervals with this interactive quiz.
Question 1 of 8: What is the fundamental difference between how frequentist and Bayesian frameworks treat the parameter θ?
Summary
Key Takeaways
- Credible intervals give the interpretation people want: "95% probability the parameter lies in this range" - a direct statement about where the parameter is, not about long-run procedure performance.
- Two main types exist: Equal-tailed intervals exclude α/2 from each tail (simple, symmetric interpretation). HPD intervals are shortest (optimal, adapt to skewness). For symmetric posteriors, they're identical.
- Width depends on sample size, prior, and credible level: More data narrows intervals (~1/sqrt(n)). Stronger priors can narrow or shift intervals. Higher credible levels require wider intervals.
- MCMC samples enable credible intervals for any posterior: For complex models without closed-form posteriors, compute intervals directly from MCMC samples using empirical quantiles or sorted-block methods.
- Deep learning needs credible intervals: Uncertainty quantification via MC Dropout, Bayesian NNs, and Gaussian processes all produce posterior distributions. Credible intervals translate these into actionable uncertainty statements for safe AI.
Looking Ahead: In the next section, we'll explore Bayes Factors and Model Comparison - how to use Bayesian methods not just for parameter estimation, but for deciding between competing models of the world.