Bayesian Coin: Fair vs. Double-Headed

Probability · Easy · Free problem

A coin is either fair ($P(H) = 0.5$) or double-headed ($P(H) = 1$), each with prior probability $0.5$. You flip it $n$ times and observe $k$ heads.

Derive the posterior probability that the coin is fair, $P(\text{fair} \mid k \text{ heads in } n \text{ flips})$, as a function of $n$ and $k$. Consider both cases: $k < n$ (at least one tail observed) and $k = n$ (all heads).

Hints

  1. This is a two-hypothesis Bayesian update. Write down the likelihood of observing $k$ heads in $n$ flips under each hypothesis before applying Bayes' theorem.
  2. The double-headed coin assigns probability 0 to any outcome with at least one tail. What does this imply immediately about the posterior when you observe a tail?
  3. For the all-heads case, apply Bayes: the numerator is $(1/2)^n \times 0.5$ (fair coin, all heads) and the denominator adds
    \times 0.5$ (double-headed, all heads). Simplify to get $\frac{1}{1 + 2^n}$.

Worked Solution

How to Think About It: This is a two-hypothesis Bayes problem. One hypothesis (double-headed) is deterministic -- it can only produce heads. The other (fair) produces both. A single tail observation is a smoking gun: a double-headed coin cannot produce a tail, so observing one tail instantly proves the coin is fair. The interesting case is all-heads, where you have to weigh the likelihood that a fair coin happened to land heads $n$ times in a row against the certainty that a double-headed coin would.

Quick Estimate: For $k = n = 5$ (all heads, 5 flips), the fair coin would produce this with probability $(1/2)^5 = 1/32 \approx 3\%$, while the double-headed coin does it with certainty. The prior odds are 50-50, so the posterior odds are

:32$ in favor of double-headed. That gives $P(\text{fair} \mid 5 \text{ heads}) = 1/33 \approx 3\%$. Even 5 heads in a row is pretty damning.

Formal Solution:

Let $F$ = fair, $D$ = double-headed. Prior: $P(F) = P(D) = 0.5$.

Likelihoods: $P(k \text{ heads} \mid F) = \binom{n}{k} \left(\frac{1}{2}\right)^n$ $P(k \text{ heads} \mid D) = \begin{cases} 1 & k = n \\ 0 & k < n \end{cases}$

Case 1: $k < n$ (at least one tail observed)

The double-headed hypothesis assigns probability 0 to any outcome with a tail. By Bayes' theorem: $P(F \mid k < n \text{ heads}) = \frac{P(k \text{ heads} \mid F) \cdot P(F)}{P(k \text{ heads} \mid F) \cdot P(F) + 0 \cdot P(D)} = 1$

One tail is sufficient to conclude with certainty that the coin is fair.

Case 2: $k = n$ (all heads)

Applying Bayes' theorem: $P(F \mid n \text{ heads}) = \frac{P(n \text{ heads} \mid F) \cdot P(F)}{P(n \text{ heads} \mid F) \cdot P(F) + P(n \text{ heads} \mid D) \cdot P(D)}$ $= \frac{\left(\frac{1}{2}\right)^n \cdot 0.5}{\left(\frac{1}{2}\right)^n \cdot 0.5 + 1 \cdot 0.5} = \frac{\left(\frac{1}{2}\right)^n}{\left(\frac{1}{2}\right)^n + 1} = \frac{1}{1 + 2^n}$

Numerical examples:

| $n$ | $P(\text{fair} \mid n \text{ heads})$ | Approx | |---|---|---| | 1 |

/3$ | $33\%$ | | 5 |
/33$ | $3\%$ | | 10 |
/1025$ | $0.1\%$ |

The posterior probability of fairness collapses exponentially with consecutive heads.

Answer: $P(\text{fair} \mid k \text{ heads in } n \text{ flips}) = \begin{cases} 1 & k < n \\ \dfrac{1}{1 + 2^n} & k = n \end{cases}$

Intuition

The asymmetry between the two cases -- one tail gives certainty, all heads give doubt that shrinks exponentially -- is the key lesson here. The double-headed hypothesis is falsifiable: it makes a definite prediction (no tails ever), and one counterexample rules it out completely. The fair hypothesis cannot be falsified by heads-only data, only made less likely. This is a clean example of how Bayesian updating treats falsifiable vs. unfalsifiable hypotheses differently.

The formula $P(\text{fair} \mid n \text{ heads}) = 1/(1 + 2^n)$ is worth memorizing. The posterior odds against fairness are

^n : 1$ -- doubling with each additional head. After 10 heads, a rational observer should be $99.9\%$ confident the coin is rigged. This is relevant on trading desks in any context where you are trying to distinguish a skilled trader from a lucky one: a track record of $n$ consecutive winning periods, even with symmetric priors, has posterior odds of ^n$ in favor of skill (if a lucky trader would win 50% of periods and a skilled one would win them all). The math is exactly the same.

Open the full interactive solver →