Covariance Between Counts of Two Face Values in Dice Rolls
A fair six-sided die is rolled 5 times. Let $X$ be the total number of times a 3 appears, and let $Y$ be the total number of times a 5 appears. Find $\text{Cov}(X, Y)$.
Hints
- Define indicator variables $X_i$ and $Y_i$ for each roll. Can a single roll produce both a 3 and a 5?
- Use bilinearity: $\text{Cov}(\sum X_i, \sum Y_j) = \sum_i \sum_j \text{Cov}(X_i, Y_j)$. Which terms in the double sum are nonzero?
- Only the $i = j$ terms survive (cross-roll independence). For each, $E[X_i Y_i] = 0$ since the events are mutually exclusive, so $\text{Cov}(X_i, Y_i) = 0 - (1/6)(1/6) = -1/36$.
Worked Solution
How to Think About It: Before computing anything, think about the sign. If you roll a lot of 3s, those rolls are "used up" and cannot be 5s, so $X$ and $Y$ should be negatively correlated. The magnitude should be small because each individual roll only has a 1/6 chance of showing either value.
Quick Estimate: Each roll contributes $\text{Cov}(X_i, Y_i)$ to the total (cross-roll terms are zero by independence). For a single roll, $X_i Y_i = 0$ always (a roll cannot be both 3 and 5), so $\text{Cov}(X_i, Y_i) = E[X_i Y_i] - E[X_i]E[Y_i] = 0 - (1/6)(1/6) = -1/36$. With 5 independent rolls: $5 \times (-1/36) = -5/36 \approx -0.139$.
Approach: Use indicator variables and the bilinearity of covariance.
Formal Solution:
Define indicator variables for each roll $i = 1, \ldots, 5$:
$X_i = \mathbf{1}[\text{roll } i = 3], \quad Y_i = \mathbf{1}[\text{roll } i = 5]$
Then $X = \sum_{i=1}^5 X_i$ and $Y = \sum_{i=1}^5 Y_i$.
By bilinearity of covariance:
$\text{Cov}(X, Y) = \sum_{i=1}^5 \sum_{j=1}^5 \text{Cov}(X_i, Y_j)$
Case $i \ne j$: Rolls $i$ and $j$ are independent, so $X_i$ and $Y_j$ are independent, giving $\text{Cov}(X_i, Y_j) = 0$.
Case $i = j$: On a single roll, $X_i = 1$ and $Y_i = 1$ cannot both occur (a roll shows a 3, a 5, or some other face, but never two faces at once), so $X_i Y_i = 0$ with probability 1. Therefore:
$\text{Cov}(X_i, Y_i) = E[X_i Y_i] - E[X_i] E[Y_i] = 0 - \frac{1}{6} \cdot \frac{1}{6} = -\frac{1}{36}$
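Both cases can be checked by brute force. The following is a minimal sketch, assuming a fair die and using only two rolls (labelled 1 and 2) to stand in for any pair $i, j$; the helper name `cov` and the lambdas are illustrative, not part of the original solution.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)
P = Fraction(1, 36)  # probability of each ordered pair of two fair rolls

def cov(f, g):
    """Exact Cov(f, g) over all 36 equally likely (roll1, roll2) pairs."""
    outcomes = list(product(FACES, repeat=2))
    e_f = sum(P * f(o) for o in outcomes)
    e_g = sum(P * g(o) for o in outcomes)
    e_fg = sum(P * f(o) * g(o) for o in outcomes)
    return e_fg - e_f * e_g

x1 = lambda o: int(o[0] == 3)  # X_1: first roll is a 3
y1 = lambda o: int(o[0] == 5)  # Y_1: first roll is a 5
y2 = lambda o: int(o[1] == 5)  # Y_2: second roll is a 5

same_roll = cov(x1, y1)   # the i = j case
cross_roll = cov(x1, y2)  # the i != j case
print(same_roll, cross_roll)  # -1/36 0
```

Exact fractions are used instead of floats so the result can be compared directly with $-\frac{1}{36}$.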
Summing over the 5 diagonal terms:
$\text{Cov}(X, Y) = 5 \times \left(-\frac{1}{36}\right) = -\frac{5}{36}$
Answer: $\text{Cov}(X, Y) = -\dfrac{5}{36}$.
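As a sanity check on the final answer, the covariance can also be computed without any indicator argument by enumerating all $6^5 = 7776$ equally likely roll sequences. This is a sketch under the problem's assumptions (fair die, 5 independent rolls); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

N_ROLLS = 5
outcomes = list(product(range(1, 7), repeat=N_ROLLS))  # all 6**5 sequences
p = Fraction(1, len(outcomes))  # each sequence is equally likely

# X = number of 3s, Y = number of 5s in a sequence
e_x = sum(p * seq.count(3) for seq in outcomes)
e_y = sum(p * seq.count(5) for seq in outcomes)
e_xy = sum(p * seq.count(3) * seq.count(5) for seq in outcomes)

cov_xy = e_xy - e_x * e_y
print(e_x, e_y, e_xy, cov_xy)  # 5/6 5/6 5/9 -5/36
```

The enumeration gives $E[XY] = \frac{5}{9}$ and $E[X]E[Y] = \frac{25}{36}$, so the difference matches $-\frac{5}{36}$.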
Intuition
This problem illustrates a general principle: on a single trial, two mutually exclusive outcomes that each have positive probability are always negatively correlated. If you observe one, you cannot observe the other, so the product of their indicators is always 0 and their covariance is $0 - p_1 p_2 = -p_1 p_2$, the negative product of the marginal probabilities. This is a special case of the multinomial covariance structure: over $n$ independent trials, the counts of two distinct categories have covariance $-n p_1 p_2$, which gives $-5 \cdot \frac{1}{6} \cdot \frac{1}{6} = -\frac{5}{36}$ here.
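The same argument does not depend on the outcomes being equally likely. Below is a minimal sketch that checks the count covariance against $-n p_a p_b$ for a hypothetical loaded three-outcome trial; the probabilities, the number of trials, and the function name `count_cov` are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def count_cov(probs, a, b, n):
    """Exact Cov(#a, #b) over n independent trials with category probabilities probs."""
    e_a = e_b = e_ab = Fraction(0)
    for seq in product(range(len(probs)), repeat=n):
        w = prod(probs[k] for k in seq)  # probability of this exact sequence
        na, nb = seq.count(a), seq.count(b)
        e_a += w * na
        e_b += w * nb
        e_ab += w * na * nb
    return e_ab - e_a * e_b

# Hypothetical loaded trial with three outcomes (not from the original problem).
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 4

result = count_cov(probs, 0, 1, n)
expected = -n * probs[0] * probs[1]
print(result, expected)  # -2/3 -2/3
```

The enumeration grows as the number of outcomes raised to the power $n$, so this check is only practical for small cases.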
The indicator variable decomposition is the single most useful trick for computing covariances of counts. It reduces the problem from reasoning about sums to reasoning about individual trials, where the analysis is trivial. In quant interviews, this technique shows up whenever you have counts of different event types across independent trials, such as different order types in a trading session or different return regimes across days. The same decomposition also handles counting different card types in a deal, but cards are dealt without replacement, so the draws are dependent and the cross terms $\text{Cov}(X_i, Y_j)$ for $i \ne j$ no longer vanish and must be computed as well.