You roll a fair six-sided die repeatedly, keeping a running sum $S$. If $S$ ever lands exactly on a perfect square ($4, 9, 16, 25, 36, 49, \ldots$), the game ends immediately with payoff $\$0$. At any point before rolling, you may choose to stop and collect $S$ as your payoff. 1. Formulate the Bell…

How to Think About It: At $s = 35$, you are one pip away from $36 = 6^2$, an absorbing state that pays zero. Rolling a 1 kills you. That sounds dangerous -- but look at the other side. Rolls of 2 through 6 land you at $37$-$41$, which sit comfortably in the gap between the perfect squares $36$ and $49$ (a gap of 13, well beyond die range). From those safe states, you have room to accumulate more before the next threat at $49$. So the question is really: does the $5/6$ chance of landing in a safe zone worth roughly $45$ each outweigh the $1/6$ chance of instant death? Gut estimate: $(5/6) \times 45 \approx 37.5 > 35$. Rolling looks better than stopping. Quick Estimate: Let us put numbers on this. From $s = 35$, the six outcomes are $\{36, 37, 38, 39, 40, 41\}$. We know $V(36) = 0$ (perfect square). For states $37$-$41$, these are in the safe zone between squares $36$ and $49$, meaning no single roll from any of them can hit a perfect square. From these safe states, you can roll again risk-free, accumulating toward the $43$-$48$ range where the values stabilize around $43$-$46$. A rou…

Optimal Stopping with Absorbing Perfect Squares · Game Theory Quant Interview Question

How to Think About It: At $s = 35$, you are one pip away from $36 = 6^2$, an absorbing state that pays zero. Rolling a 1 kills you. That sounds dangerous -- but look at the other side. Rolls of 2 through 6 land you at $37$-$41$, which sit comfortably in the gap between the perfect squares $36$ and $49$ (a gap of 13, well beyond die range). From those safe states, you have room to accumulate more before the next threat at $49$. So the question is really: does the $5/6$ chance of landing in a safe zone worth roughly $45$ each outweigh the

/6$ chance of instant death? Gut estimate: $(5/6) \times 45 \approx 37.5 > 35$. Rolling looks better than stopping.

Quick Estimate: Let us put numbers on this. From $s = 35$, the six outcomes are $\{36, 37, 38, 39, 40, 41\}$. We know $V(36) = 0$ (perfect square). For states $37$-$41$, these are in the safe zone between squares $36$ and $49$, meaning no single roll from any of them can hit a perfect square. From these safe states, you can roll again risk-free, accumulating toward the $43$-$48$ range where the values stabilize around $43$-$46$. A rough estimate: each of $V(37)$ through $V(41)$ is worth about $44$-$45$ (you are around $37$-$41$ but can safely roll up by an expected $3.5$ per roll, and the states near $43$-$45$ stop at their face value). Using $\bar{V} \approx 45$ for the five safe outcomes:

$\text{Roll value} \approx \frac{1}{6}\bigl(0 + 5 \times 45\bigr) = \frac{225}{6} = 37.5$

Since $37.5 > 35$, the quick estimate says roll. The exact calculation confirms $V(35) \approx 37.41$, remarkably close to this back-of-envelope figure.

Approach: Formulate the Bellman equation for the MDP, then solve by backward induction starting from states near the next few perfect squares and working back to $s = 35$.

Formal Solution

Part 1: Bellman Equation

Define $V(s)$ as the optimal expected payoff when the current sum is $s$ and it is your turn to decide.

If $s$ is a perfect square, the game is already over: $V(s) = 0$.
Otherwise, you choose between stopping (payoff $s$) and rolling (expected continuation value):

$V(s) = \begin{cases} 0 & \text{if } s \text{ is a perfect square} \\\\ \displaystyle\max\!\left(s, \;\frac{1}{6}\sum_{k=1}^{6} V(s+k)\right) & \text{otherwise} \end{cases}$

This is the standard Bellman optimality equation for a Markov decision process with absorbing states. The optimal policy is a state-dependent threshold: at each state, compare the certain payoff $s$ against the expected value of continuing.

Part 2: Backward Induction for $V(35)$

From $s = 35$, rolling gives outcomes $\{36, 37, 38, 39, 40, 41\}$. Since $36 = 6^2$, we have $V(36) = 0$. We need $V(37)$ through $V(41)$, which depend on further states. The relevant perfect squares are $36, 49, 64, 81, \ldots$, with gaps of

3, 15, 17, \ldots$ (all exceeding 6), so there is always a safe zone in each gap where no single roll can reach either bounding square.

We solve layer by layer, starting far enough out that stopping is clearly optimal and working backward.

Layer 1 -- States near $49 = 7^2$.

States $43$-$48$ can reach the absorbing state $49$ in one roll:

$s = 43$: roll value $\approx 38.83 < 43$. Stop. The $V(49) = 0$ drags the average down too much.
$s = 44$: roll value $\approx 41.46 < 44$. Stop.
$s = 45$: roll value $\approx 43.94 < 45$. Stop.
$s = 46$: roll value $\approx 46.18 > 46$. Roll. Five of six outcomes avoid $49$, and states beyond $49$ have high value.
$s = 47$: roll value $\approx 48.08 > 47$. Roll.
$s = 48$: roll value $\approx 49.73 > 48$. Roll.

So $V(43) = 43$, $V(44) = 44$, $V(45) = 45$. States $43$-$45$ stop because they are too close to $49$ with insufficient upside, while $46$-$48$ roll because the probability of clearing $49$ is high enough and the beyond-$49$ values are large.

Layer 2 -- States $37$-$42$ (safe zone between $36$ and $49$).

From any state in $\{37, \ldots, 42\}$, the reachable states are $\{38, \ldots, 48\}$. No perfect square is reachable in one step from this zone (the nearest are $36$ below and $49$ above, both out of range). Rolling is therefore risk-free -- no immediate absorption can occur. The continuation values are:

| $s$ | Roll Value | $V(s)$ | Action | |-----|-----------|---------|--------| | 42 | 46.00 | 46.00 | Roll | | 41 | 45.38 | 45.38 | Roll | | 40 | 44.92 | 44.92 | Roll | | 39 | 44.72 | 44.72 | Roll | | 38 | 44.67 | 44.67 | Roll | | 37 | 44.78 | 44.78 | Roll |

All of these roll, since the continuation value (around $44$-$46$) far exceeds the stop value ($37$-$42$).

Layer 3 -- $V(35)$.

Now we assemble the final calculation:

$V(35) = \max\!\left(35, \;\frac{1}{6}\bigl(V(36) + V(37) + V(38) + V(39) + V(40) + V(41)\bigr)\right)$

$= \max\!\left(35, \;\frac{1}{6}(0 + 44.78 + 44.67 + 44.72 + 44.92 + 45.38)\right)$

$= \max\!\left(35, \;\frac{224.47}{6}\right) = \max(35,\; 37.41)$

The exact value, computed with rational arithmetic through full backward induction, is:

$V(35) = \frac{459{,}477{,}755{,}144{,}647{,}312{,}610{,}215}{12{,}281{,}884{,}428{,}929{,}630{,}994{,}432} \approx 37.411$

Since $37.41 > 35$, the expected value of rolling exceeds the stop payoff.

Answer: $V(35) \approx 37.41$. The optimal action at $s = 35$ is to roll. Despite the

/6$ probability of immediately hitting $36$ and losing everything, the five remaining outcomes land in states worth approximately $44$-$45$ each, giving an expected rolling value roughly

.41$ above the stop payoff of $35$.

Optimal Stopping with Absorbing Perfect Squares

Hints

Worked Solution

Intuition