Expected Rolls with a Fair Die

Expectation · Medium · Free problem

You have a fair six-sided die. Answer the following three questions:

  1. What is the expected number of rolls to get the first 6?
  1. What is the expected number of rolls until every face (1 through 6) has appeared at least once?
  1. What is the expected number of rolls until every face has appeared at least twice? (You may leave your answer as a sum or provide a numerical approximation.)

Hints

  1. Each part is about the waiting time until a collection criterion is met. What distribution describes the time until a single success?
  2. For part 2, decompose the process into phases: once you have $k$ distinct faces, the chance of seeing a new one is $(6 - k)/6$. Each phase is a geometric waiting time.
  3. For part 3, a simple phase decomposition does not work because progress has two dimensions (new faces vs. duplicates). Set up a Markov chain with state $(a, b)$ tracking how many faces have been seen exactly once versus at least twice.

Worked Solution

How to Think About It: All three parts are about waiting times -- how long until some collection criterion is met. Part 1 is pure geometric. Part 2 is the classic coupon collector problem. Part 3 is the generalized coupon collector with multiplicity 2. The key mental model is the same throughout: break the waiting time into phases where the probability of "making progress" is constant within each phase, then add up the expected phase lengths.

Quick Estimate: Part 1 is obviously 6. Part 2: you collect the first face instantly (1 roll), the second face takes $6/5$ rolls on average, the third $6/4$, etc. The harmonic sum

+ 1/2 + 1/3 + 1/4 + 1/5 + 1/6 \approx 2.45$, so $6 \times 2.45 = 14.7$. Part 3: once you have all six faces, you still need a second copy of each. But you probably already have some duplicates from the first collection phase. Intuitively, the second collection should take less than another 14.7 rolls -- maybe around 9-10 extra. So total is roughly
4.7 + 9 \approx 24$. The exact answer turns out to be about 24.13.

Approach: Geometric waiting times for part 1, coupon collector decomposition for part 2, Markov chain for part 3.

Formal Solution:

Part 1: First 6

Each roll lands on 6 with probability

/6$. The number of rolls until the first 6 follows a geometric distribution:

$E[T_1] = \frac{1}{1/6} = 6$

Part 2: Every face at least once (Coupon Collector)

After collecting $k$ distinct faces, the probability of seeing a new face on the next roll is $(6 - k)/6$. The waiting time in phase $k$ (going from $k$ to $k+1$ distinct faces) is geometric with parameter $(6-k)/6$, so it takes $6/(6-k)$ rolls in expectation.

Summing over all phases:

$E[T_2] = \sum_{k=0}^{5} \frac{6}{6-k} = 6\left(\frac{1}{6} + \frac{1}{5} + \frac{1}{4} + \frac{1}{3} + \frac{1}{2} + 1\right) = 6 \cdot H_6$

where $H_6 = 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 = 49/20$. So:

$E[T_2] = 6 \times \frac{49}{20} = \frac{147}{10} = 14.7$

Part 3: Every face at least twice (Generalized Coupon Collector)

This is harder because the phases no longer have a clean decomposition -- when you collect a new face for the first time versus promoting a face from "seen once" to "seen twice" both represent different kinds of progress, and the transition probabilities depend on the full state.

Define the state as $(a, b)$, where $a$ is the number of faces seen exactly once and $b$ is the number of faces seen at least twice. The number of unseen faces is $c = 6 - a - b$. We are done when $b = 6$.

From state $(a, b)$, each roll produces: - An unseen face (prob $c/6$): move to $(a+1, b)$ - A face seen exactly once (prob $a/6$): move to $(a-1, b+1)$ - A face seen $\geq 2$ times (prob $b/6$): stay at $(a, b)$ -- wasted roll

Let $E(a,b)$ be the expected number of rolls to reach $(0, 6)$. Then:

$E(a,b) = \frac{6}{6-b} + \frac{c}{6-b}\, E(a+1, b) + \frac{a}{6-b}\, E(a-1, b+1)$

with base case $E(0, 6) = 0$. Solving this system recursively starting from $E(0, 0)$ gives:

$E[T_3] = E(0,0) \approx 24.13$

To build intuition for this number, note that $E[T_3] - E[T_2] \approx 24.13 - 14.70 = 9.43$. The "second collection" takes fewer rolls than the first because many faces already have duplicates by the time the first collection completes.

Answer:

  • First 6: $E[T_1] = 6$
  • Every face once: $E[T_2] = 14.7$
  • Every face twice: $E[T_3] \approx 24.13$

Intuition

The coupon collector problem is one of the most frequently tested probability results in quant interviews because it captures a pattern that shows up everywhere: diminishing returns from random sampling. The first few coupons come quickly, but the last one takes forever -- the expected time for the final coupon alone is $n$ rolls (equal to the time for the very first one). This is why the harmonic series appears: you are summing geometric waiting times with shrinking success probabilities.

The generalized version (collecting each coupon $m$ times) is a natural extension and tests whether you can set up a proper Markov chain when the simple phase decomposition breaks down. The key insight is that once simple phases fail, you need to track the full state -- here, the pair $(a, b)$ counting faces by their multiplicity. This "state space thinking" is a core skill for any stochastic modeling problem in quant finance, from inventory management to order book dynamics.

Open the full interactive solver →