Expected Number and Variance of Records in a Random Permutation
Draw a uniformly random permutation of $\{1, 2, \ldots, n\}$. Call position $k$ a record if the value at position $k$ is strictly larger than all values at positions
- Compute $E[R_n]$ exactly.
- Compute $\text{Var}(R_n)$ exactly.
- For large $n$, give simple approximations for both and explain the connection to harmonic numbers.
Hints
- Define an indicator variable $I_k$ for whether position $k$ is a record. What is the probability that the $k$-th element is the largest among the first $k$?
- By symmetry of random permutations, $P(I_k = 1) = 1/k$. For the variance, check whether the record indicators at different positions are correlated -- consider what happens when you condition on the relative ranking of the first $k$ values.
- The indicators $I_1, I_2, \ldots, I_n$ are mutually independent. So $\text{Var}(R_n) = \sum_{k=1}^{n} \text{Var}(I_k) = \sum 1/k - \sum 1/k^2$. Use $H_n \approx \ln n + \gamma$ for the asymptotics.
Worked Solution
How to Think About It: The trick with records is to avoid thinking about the permutation globally. Instead, ask a local question: what is the probability that position $k$ holds a record? That only depends on the relative ranking of the first $k$ entries, and by symmetry of random permutations, each of those $k$ values is equally likely to be the largest. So the probability of a record at position $k$ is
Quick Estimate: For $n = 10$: $E[R_{10}] = H_{10} = 1 + 1/2 + 1/3 + \cdots + 1/10 \approx 2.93$. So out of 10 positions, you expect roughly 3 records. The variance is $H_{10} - \sum_{k=1}^{10} 1/k^2 \approx 2.93 - 1.55 \approx 1.38$, giving a standard deviation around
Approach: Define indicator random variables and exploit symmetry of random permutations, then verify independence of the indicators.
Formal Solution:
Define $I_k = \mathbf{1}\{\text{position } k \text{ is a record}\}$ for $k = 1, 2, \ldots, n$. Then $R_n = \sum_{k=1}^{n} I_k$.
Step 1: Compute $P(I_k = 1)$.
Position $k$ is a record if and only if the value at position $k$ is the maximum of the first $k$ values. In a uniformly random permutation, the first $k$ values are an exchangeable set -- any of the $k$ values is equally likely to be in position $k$. So:
$P(I_k = 1) = \frac{1}{k}$
Step 2: Compute $E[R_n]$.
By linearity of expectation:
$E[R_n] = \sum_{k=1}^{n} E[I_k] = \sum_{k=1}^{n} \frac{1}{k} = H_n$
where $H_n$ is the $n$-th harmonic number.
Step 3: Show independence of $I_j$ and $I_k$.
For $j < k$, we need $P(I_j = 1, I_k = 1)$. Position $j$ is a record if the value at $j$ is the max of positions
$P(I_j = 1, I_k = 1) = \frac{1}{j} \cdot \frac{1}{k} = P(I_j = 1) \cdot P(I_k = 1)$
So $I_j$ and $I_k$ are independent for all $j \neq k$.
Step 4: Compute $\text{Var}(R_n)$.
Since the indicators are independent:
$\text{Var}(R_n) = \sum_{k=1}^{n} \text{Var}(I_k) = \sum_{k=1}^{n} \frac{1}{k}\left(1 - \frac{1}{k}\right) = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k^2}$
$\text{Var}(R_n) = H_n - H_n^{(2)}$
where $H_n^{(2)} = \sum_{k=1}^{n} 1/k^2$ is the $n$-th generalized harmonic number of order 2.
Step 5: Large-$n$ asymptotics.
The harmonic number satisfies:
$H_n = \ln n + \gamma + O(1/n)$
where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant. The generalized harmonic number converges:
$H_n^{(2)} \to \frac{\pi^2}{6} \approx 1.6449$
So for large $n$:
$E[R_n] \approx \ln n + \gamma$
$\text{Var}(R_n) \approx \ln n + \gamma - \frac{\pi^2}{6}$
Since both the mean and the variance grow like $\ln n$, and the record indicators are independent Bernoullis, the CLT gives:
$\frac{R_n - \ln n}{\sqrt{\ln n}} \xrightarrow{d} N(0, 1)$
Answer:
$E[R_n] = H_n = \sum_{k=1}^{n} \frac{1}{k}$
$\text{Var}(R_n) = H_n - H_n^{(2)} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k^2}$
For large $n$: $E[R_n] \approx \ln n + 0.5772$ and $\text{Var}(R_n) \approx \ln n - 1.07$. Records grow logarithmically -- in a million-entry permutation, you expect only about 14 records.
Intuition
This problem showcases the power of indicator variables and symmetry. Instead of reasoning about the global structure of a random permutation -- which is combinatorially nightmarish -- you zoom in on each position and ask a simple local question: is this spot a record? The symmetry of random permutations makes the answer trivially
The connection to harmonic numbers runs deep in combinatorics and computer science. The expected number of records $H_n \approx \ln n$ appears in the analysis of algorithms (quicksort comparisons, skip list levels, coupon collector variants) and in extreme value theory. In a trading context, if you track daily all-time highs of a stock price modeled as an exchangeable sequence, you would see about $\ln n$ new records in $n$ days -- far fewer than people intuitively expect. This mismatch between intuition and reality is why record-breaking events (new market highs) feel significant but are actually statistically inevitable even in a trendless market.