Generating Arbitrary Probabilities from a Fair Coin

Expectation · Medium · Free problem

You have a fair coin. Design an algorithm that simulates an event occurring with probability exactly $p = 1/3$. Your algorithm must terminate with probability 1 (no infinite loops).

What is the expected number of coin flips your algorithm uses?

Hints

  1. You cannot get exactly
    /3$ with a fixed number of flips -- you need a scheme that sometimes says "try again."
  2. Two flips give four equally likely outcomes. Can you assign them to "yes," "no," and "retry" so that the yes-to-no ratio is
    :2$?
  3. Once the retry probability is
    /4$ per round and each round costs 2 flips, the expected flip count follows from the geometric distribution.

Worked Solution

How to Think About It: The core challenge is that a single fair coin gives you probabilities in multiples of

/2$, and
/3$ is not a dyadic rational. So you cannot get exactly
/3$ with any fixed number of flips -- you need a randomized stopping rule. The classic trick is rejection sampling: flip a small batch, map some outcomes to "yes," some to "no," and if you land in an ambiguous region, start over. The art is choosing the mapping so the conditional probability of "yes" given that you did not retry is exactly
/3$. Once you have a valid scheme, the expected flip count falls out of the geometric distribution for the number of rounds.

Quick Estimate: We need to split outcomes into a 1:2 ratio (yes vs. no). Two flips give four equally likely outcomes. Assign one to "yes," two to "no," and one to "retry." Then $P(\text{yes} \mid \text{not retry}) = 1/3$. Each round uses 2 flips and the probability of resolving (not retrying) is $3/4$, so the expected number of rounds is $4/3$. That gives roughly

\times 4/3 \approx 2.67$ flips. The information-theoretic lower bound is about $0.918$ bits $/$
$ bit per flip $\approx 0.92$ flips, so
.67$ is reasonable -- we are paying a factor of about .9\times$ the entropy bound, which is typical for simple rejection schemes.

Approach: Use a two-flip rejection sampler and compute the expected flip count via the geometric distribution.

Formal Solution:

Flip the coin twice. There are four equally likely outcomes, each with probability

/4$:

| Outcome | Action | |---------|--------| | HH | Output "yes" | | HT | Output "no" | | TH | Output "no" | | TT | Retry (start over) |

When we do not retry (probability $3/4$), the conditional probabilities are:

$P(\text{yes} \mid \text{resolved}) = \frac{1/4}{3/4} = \frac{1}{3}, \quad P(\text{no} \mid \text{resolved}) = \frac{2/4}{3/4} = \frac{2}{3}$

Exactly what we want.

Termination: The probability of retrying $k$ times in a row is $(1/4)^k \to 0$, so the algorithm terminates with probability 1.

Expected number of flips: Let $R$ be the number of rounds (each round = 2 flips). Each round resolves with probability $3/4$, so $R \sim \text{Geometric}(3/4)$ and:

$E[R] = \frac{1}{3/4} = \frac{4}{3}$

$E[\text{flips}] = 2 \cdot E[R] = 2 \cdot \frac{4}{3} = \frac{8}{3} \approx 2.67$

A different (faster) algorithm -- the binary-expansion / bit-comparison method: For arbitrary $p$, write $p$ in binary, $p = 0.b_1 b_2 b_3 \ldots$ Generate a uniform $U = 0.c_1 c_2 c_3 \ldots$ by flipping coins (heads $= 0$, tails $= 1$). Compare $U$ to $p$ bit by bit; as soon as $c_i \neq b_i$ you can decide: if $c_i < b_i$ then $U < p$, output "yes"; if $c_i > b_i$ then $U > p$, output "no." For $p = 1/3 = 0.\overline{01}_2$ the target bits are $b_1=0, b_2=1, b_3=0, b_4=1, \ldots$

This method is exact, but note its expected flip count is not $8/3$ -- it is smaller. The algorithm stops at the *first* index $i$ where the fair draw $c_i$ differs from the fixed target bit $b_i$. Regardless of whether $b_i$ is $0$ or

$, a fair $c_i$ matches it with probability
/2$ and mismatches with probability
/2$. Hence the stopping index is $\text{Geometric}(1/2)$ and

$E[\text{flips}_{\text{bit-compare}}] = \sum_{i \ge 1} i \left(\tfrac{1}{2}\right)^{i} = \frac{1}{1/2} = 2,$

independent of $p$. So the bit-comparison method uses only

$ flips on average for $p = 1/3$, beating the $8/3 \approx 2.67$ of the two-flip rejection sampler. (The rejection sampler "wastes" the
/4$ TT outcomes, restarting from scratch; the bit-comparison method never discards information, which is why it is more efficient.)

Information-theoretic lower bound: Any algorithm must use at least $H(p)/H(1/2)$ flips on average, where $H$ is binary entropy. For $p = 1/3$:

$H(1/3) = -\frac{1}{3}\log_2 \frac{1}{3} - \frac{2}{3}\log_2 \frac{2}{3} \approx 0.918 \text{ bits}$

So any algorithm needs at least $\approx 0.918$ flips on average. The rejection sampler uses $8/3 \approx 2.67$ (about

.9\times$ the bound); the bit-comparison method uses
$ (about
.2\times$ the bound). More sophisticated algorithms (Knuth-Yao) can get even closer to $0.918$, but the two-flip rejection sampler is by far the easiest to explain in an interview.

Answer: Use a two-flip rejection sampler: HH $\to$ yes, HT or TH $\to$ no, TT $\to$ retry. This produces probability exactly

/3$ and uses $E[\text{flips}] = 8/3 \approx 2.67$ coin flips on average. (The alternative bit-comparison method is also exact and uses only
$ flips on average.)

Intuition

The key principle here is rejection sampling: when you cannot directly generate the distribution you want, generate a slightly larger one and throw away the cases that do not fit. With a fair coin, every fixed number of flips produces probabilities that are multiples of powers of

/2$. Since
/3$ is not a dyadic rational, no finite fixed-length scheme can work. But by allowing a random number of flips (via retries), you can hit any probability exactly.

This idea shows up constantly in practice. Monte Carlo simulations, accept-reject methods for generating non-standard distributions, and even the fundamental design of randomized algorithms all rely on this pattern. The trade-off is always efficiency vs. simplicity: our two-flip scheme uses about

.9\times$ the information-theoretic minimum, but it is trivial to implement and explain. In an interview, showing that you understand why retries are necessary (non-dyadic target) and can compute the expected cost (geometric series) is far more impressive than memorizing the optimal Knuth-Yao tree.

Open the full interactive solver →