Beta-Binomial Sample Size for Credible Interval Width

Statistics · Medium · Free problem

You are estimating an unknown conversion probability $p$ with a $\text{Beta}(a, b)$ prior. After observing $H$ heads and $T$ tails, the posterior is $\text{Beta}(a + H, b + T)$. You keep flipping until the 95% equal-tailed credible interval for $p$ has half-width at most $\varepsilon$.

(i) Express the stopping rule in terms of the observed counts $H$ and $T$ (and the prior parameters $a$, $b$).

(ii) For $a = b = 1$ (uniform prior) and $\varepsilon = 0.01$, give an accurate approximation for the required sample size $n = H + T$.

(iii) Relate your Bayesian stopping rule to the classical fixed-width confidence interval for a Bernoulli mean using normal approximations. When do the two approaches agree?

Hints

For large sample sizes, the $\text{Beta}(\alpha, \beta)$ distribution is well approximated by a Gaussian with mean $\hat{p} = \alpha/(\alpha + \beta)$ and variance $\hat{p}(1-\hat{p})/(\alpha + \beta + 1)$.
The worst case for sample size is $\hat{p} = 0.5$, since that maximizes the factor $\hat{p}(1-\hat{p})$ in the posterior variance. Plug in $\hat{p} = 0.5$ to get an upper bound on $n$.
For part (iii), compare the Bayesian posterior variance $\hat{p}(1-\hat{p})/(n + a + b + 1)$ with the frequentist sampling variance $\hat{p}(1-\hat{p})/n$. The denominators differ by $a + b + 1$, which is essentially the prior's effective sample size $a + b$.
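
The variance used in the first hint is just the exact Beta variance rewritten in terms of the mean. As a quick check, writing $\hat{p} = \alpha/(\alpha+\beta)$ so that $1 - \hat{p} = \beta/(\alpha+\beta)$:

$\operatorname{Var}[\text{Beta}(\alpha, \beta)] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} = \frac{\alpha}{\alpha+\beta} \cdot \frac{\beta}{\alpha+\beta} \cdot \frac{1}{\alpha+\beta+1} = \frac{\hat{p}(1-\hat{p})}{\alpha+\beta+1}$

Only the Gaussian shape is an approximation; the mean and variance are exact.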

Worked Solution

How to Think About It: You want to keep collecting data until you have pinned down $p$ to within $\pm 0.01$. The posterior $\text{Beta}(a+H, b+T)$ concentrates as you get more data, and the credible interval width shrinks roughly like $1/\sqrt{n}$. The question is: exactly when is it narrow enough? The practical challenge is that the posterior variance depends on the observed $H/n$, so the sample size you need depends on where $p$ actually is; the worst case is $p = 0.5$.

Quick Estimate: For large $n$ with a uniform prior, the posterior is approximately $N(\hat{p}, \hat{p}(1-\hat{p})/(n+3))$ where $\hat{p} = (H+1)/(n+2)$. The 95% credible interval half-width is $1.96\sqrt{\hat{p}(1-\hat{p})/(n+3)}$. Setting this to $0.01$ and using the worst case $\hat{p} = 0.5$: $n + 3 \approx 1.96^2 \times 0.25 / 0.01^2 = 9604$, so $n \approx 9601$. With the actual Beta quantiles you get essentially the same number, roughly $n \approx 9600$.

Formal Solution:

(i) Stopping rule:

The 95% equal-tailed credible interval is $[q_{0.025}, q_{0.975}]$ where $q_\alpha$ is the $\alpha$-quantile of $\text{Beta}(a+H, b+T)$. The stopping rule is:

$\frac{q_{0.975} - q_{0.025}}{2} \le \varepsilon$

For large $n = H + T$, the Beta posterior is well-approximated by a Gaussian. Let $\alpha' = a + H$, $\beta' = b + T$, and $n' = \alpha' + \beta'$. The posterior mean is $\hat{p} = \alpha'/n'$ and variance is $\hat{p}(1-\hat{p})/(n'+1)$. The normal approximation gives the stopping rule:

$1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n'+1}} \le \varepsilon$

Squaring and rearranging:

$n' \ge \frac{1.96^2 \hat{p}(1-\hat{p})}{\varepsilon^2} - 1$

Since $\hat{p}$ depends on the data, the stopping criterion must be checked after each observation.
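
As a rough illustration of rule (i), the sketch below checks the normal-approximation version of the criterion rather than the exact Beta quantiles. The function names `half_width` and `should_stop` and their default arguments are illustrative choices, not part of the problem statement.

```python
from statistics import NormalDist


def half_width(H, T, a=1.0, b=1.0, level=0.95):
    """Approximate half-width of the equal-tailed credible interval.

    Uses a Gaussian with the exact mean and variance of Beta(a + H, b + T).
    """
    alpha_post = a + H
    beta_post = b + T
    n_post = alpha_post + beta_post            # n' in the text
    p_hat = alpha_post / n_post                # posterior mean
    var = p_hat * (1 - p_hat) / (n_post + 1)   # exact Beta variance
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    return z * var ** 0.5


def should_stop(H, T, eps, a=1.0, b=1.0, level=0.95):
    """True once the approximate half-width is at most eps."""
    return half_width(H, T, a, b, level) <= eps


# Near p = 0.5 the rule needs about 9600 flips; near p = 0.1 far fewer.
print(should_stop(4000, 4000, eps=0.01))
print(should_stop(4801, 4801, eps=0.01))
print(should_stop(350, 3150, eps=0.01))
```

Because $\hat{p}$ changes with every flip, `should_stop` would be called after each new observation. An exact version would swap the Gaussian half-width for the posterior's Beta quantiles, for example via `scipy.stats.beta.ppf`.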

(ii) Sample size for $a = b = 1$, $\varepsilon = 0.01$:

With $a = b = 1$, we have $n' = n + 2$, so the bound from (i) reads $n + 3 \ge 1.96^2 \hat{p}(1-\hat{p})/\varepsilon^2$. The worst case is $\hat{p} = 0.5$, giving:

$n + 3 \ge \frac{1.96^2 \times 0.25}{0.0001} = \frac{0.9604}{0.0001} = 9604$

$n \ge 9601$

For other values of $\hat{p}$, less data is needed. If $\hat{p} = 0.1$, you need $n + 3 \ge 1.96^2 \times 0.09 / 0.0001 \approx 3457.4$, so $n \ge 3455$.

A good approximation valid for all $\hat{p}$:

$n \approx \frac{1.96^2 \hat{p}(1-\hat{p})}{\varepsilon^2} - (a + b + 1)$

The maximum sample size (worst case) is approximately $n \approx 9601$, i.e. about 9,600 flips.
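
The arithmetic in (ii) can be tabulated with a short sketch. It keeps the $-1$ from the bound $n' \ge 1.96^2 \hat{p}(1-\hat{p})/\varepsilon^2 - 1$ in (i), and it uses the unrounded normal quantile instead of 1.96, so results can differ from hand calculations by an observation or so. The helper name `required_n` is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def required_n(p_hat, eps, a=1.0, b=1.0, level=0.95):
    """Smallest n satisfying the normal-approximation bound from (i).

    Solves n + a + b >= z^2 * p_hat * (1 - p_hat) / eps^2 - 1 for n,
    treating p_hat as a fixed guess for the posterior mean.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    n_post = z ** 2 * p_hat * (1 - p_hat) / eps ** 2 - 1  # bound on n'
    return max(0, math.ceil(n_post - (a + b)))


# Required flips for a few assumed values of the posterior mean.
for p in (0.5, 0.3, 0.1, 0.01):
    print(f"p_hat = {p:<4}  n >= {required_n(p, eps=0.01)}")
```

Since the true $p$ is unknown in advance, the $\hat{p} = 0.5$ row is the number to budget for; the other rows show how much earlier the sequential rule can stop when $p$ is far from one half.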

(iii) Connection to frequentist fixed-width CI:

The classical approach uses the CLT: the MLE $\hat{p} = H/n$ has approximate distribution $N(p, p(1-p)/n)$. A 95% confidence interval has half-width $1.96\sqrt{\hat{p}(1-\hat{p})/n}$. Setting this to at most $\varepsilon$:

$n \ge \frac{1.96^2 \hat{p}(1-\hat{p})}{\varepsilon^2}$

Compare with the Bayesian version: $n + a + b \ge 1.96^2 \hat{p}(1-\hat{p})/\varepsilon^2 - 1$, where $\hat{p}$ is the posterior mean rather than the MLE. The two agree when $a + b$ is negligible relative to $n$, i.e., when the prior is "washed out" by data. For $a = b = 1$ and $n \approx 9600$, the difference is tiny: the two required sample sizes differ by only $a + b + 1 = 3$ observations.
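
To see how quickly the prior washes out, the sketch below compares the two normal-approximation bounds for priors of different strength. It assumes the same value of $\hat{p}$ is plugged into both formulas, which is a simplification since the posterior mean and the MLE differ slightly; the function names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile, about 1.96


def frequentist_n(p_hat, eps):
    """Classical bound: n >= z^2 p(1-p) / eps^2 (real-valued, not rounded)."""
    return Z ** 2 * p_hat * (1 - p_hat) / eps ** 2


def bayesian_n(p_hat, eps, a, b):
    """Bayesian bound from (i): n + a + b >= z^2 p(1-p) / eps^2 - 1."""
    return frequentist_n(p_hat, eps) - (a + b + 1)


def relative_gap(p_hat, eps, a, b):
    """Fraction of the classical sample size saved by the prior."""
    freq = frequentist_n(p_hat, eps)
    return (freq - bayesian_n(p_hat, eps, a, b)) / freq


for a, b in [(1, 1), (50, 50), (500, 500)]:
    gap = relative_gap(0.5, 0.01, a, b)
    print(f"Beta({a},{b}) prior: relative gap = {gap:.4f}")
```

The gap is $(a + b + 1)/n$ in relative terms, so a uniform prior is negligible at $n \approx 9600$, while a prior worth a thousand pseudo-observations is not.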
