Expected Maximum and Second Maximum of Standard Normals
Let $X_1, X_2, \ldots, X_n$ be i.i.d. $N(0,1)$ random variables. Write $X_{(n)}$ for the maximum and $X_{(n-1)}$ for the second largest.
- Give an exact integral expression for $E[X_{(n)}]$ in terms of the standard normal pdf $\phi$ and cdf $\Phi$.
- Give an exact integral expression for $E[X_{(n-1)}]$ in the same terms.
- For large $n$, the maximum is well-approximated by the Gumbel distribution. State the leading-order asymptotic formula for $E[X_{(n)}]$ as $n \to \infty$, and quantify the $O(1/\log n)$ correction term.
Hints
- The CDF of the maximum of $n$ i.i.d. draws is just the $n$-th power of the individual CDF. Differentiate to get the density and write $E[X_{(n)}]$ as an integral.
- For the second maximum, use the general order-statistic density formula: the $k$-th order statistic has density proportional to $\Phi(x)^{k-1}(1-\Phi(x))^{n-k}\phi(x)$.
- For the Gumbel asymptotics, find $a_n$ by solving $n(1 - \Phi(a_n)) \approx 1$ using Mill's ratio - \Phi(x) \sim \phi(x)/x$, and note the scale parameter is $b_n = 1/a_n$.
Worked Solution
How to Think About It: The maximum of $n$ i.i.d. draws is the classic order statistics problem. The CDF of the max is just $\Phi(x)^n$, so you can write down its density immediately. The expected value of the max grows like $\sqrt{2 \log n}$ -- slowly, because the normal tail is so thin. The second maximum trails behind by a gap of order
/\sqrt{\log n}$. If you know the Gumbel extreme value theory, you can state the approximation quickly; if not, you can derive it from a saddle-point / Mill's ratio argument.Quick Estimate: For $n = 100$: $\sqrt{2 \log 100} = \sqrt{2 \times 4.605} \approx \sqrt{9.21} \approx 3.03$. The exact value is $E[X_{(100)}] \approx 2.508$. The Gumbel approximation gives $a_{100} + \gamma / a_{100}$ where $a_{100} \approx 3.03$ and $\gamma \approx 0.5772$, so the estimate is roughly $3.03 + 0.5772/3.03 \approx 3.22$. The overestimate shows the correction terms matter -- the leading-order Gumbel approximation is not tight for moderate $n$, which is exactly why the problem asks about the correction.
Approach: Use the standard order-statistics density formulas, then apply extreme value theory with the normalizing constants from Mill's ratio.
Formal Solution:
Part 1: Exact expression for $E[X_{(n)}]$.
The CDF of $X_{(n)} = \max(X_1, \ldots, X_n)$ is
$F_{X_{(n)}}(x) = \Phi(x)^n$
so its density is
$f_{X_{(n)}}(x) = n\,\Phi(x)^{n-1}\,\phi(x).$
Therefore
$E[X_{(n)}] = \int_{-\infty}^{\infty} x\,n\,\Phi(x)^{n-1}\,\phi(x)\,dx.$
An equivalent tail-integral form (often more convenient for asymptotics) is
$E[X_{(n)}] = \int_0^{\infty} \bigl[1 - \Phi(x)^n - (1 - \Phi(-x))^n\bigr]\,dx = \int_0^{\infty} \bigl[1 - \Phi(x)^n\bigr]\,dx - \int_0^{\infty} \bigl[1 - (1-\Phi(x))^n\bigr]\,dx.$
But the density form above is the standard answer.
Part 2: Exact expression for $E[X_{(n-1)}]$.
The density of the $(n-1)$-th order statistic (second largest) from $n$ i.i.d. draws is
$f_{X_{(n-1)}}(x) = n(n-1)\,\Phi(x)^{n-2}\,\bigl[1 - \Phi(x)\bigr]\,\phi(x).$
This follows from the general order-statistic density $f_{X_{(k)}}(x) = \frac{n!}{(k-1)!(n-k)!}\,\Phi(x)^{k-1}\,(1-\Phi(x))^{n-k}\,\phi(x)$ with $k = n-1$. Therefore
$E[X_{(n-1)}] = \int_{-\infty}^{\infty} x\,n(n-1)\,\Phi(x)^{n-2}\,\bigl[1-\Phi(x)\bigr]\,\phi(x)\,dx.$
Note the useful identity: summing all order-statistic expectations gives $n \cdot E[X_1] = 0$, and the symmetry $E[X_{(k)}] = -E[X_{(n+1-k)}]$ lets you relate different order statistics.
Part 3: Gumbel approximation and correction.
By extreme value theory, the maximum of $n$ i.i.d. standard normals, after centering and scaling, converges to a Gumbel distribution:
$\frac{X_{(n)} - a_n}{b_n} \xrightarrow{d} G, \quad P(G \leq x) = e^{-e^{-x}},$
where the normalizing constants are
$a_n = \sqrt{2 \log n} - \frac{\log(4\pi \log n)}{2\sqrt{2 \log n}}, \qquad b_n = \frac{1}{\sqrt{2 \log n}}.$
These come from inverting $n(1 - \Phi(a_n)) \approx 1$ using Mill's ratio
- \Phi(x) \sim \phi(x)/x$ for large $x$.Since $E[G] = \gamma$ (the Euler-Mascheroni constant, $\gamma \approx 0.5772$), we get
$E[X_{(n)}] \approx a_n + \gamma\,b_n = \sqrt{2 \log n} - \frac{\log(4\pi \log n)}{2\sqrt{2 \log n}} + \frac{\gamma}{\sqrt{2 \log n}}.$
Collecting the
/\sqrt{2\log n}$ terms:$E[X_{(n)}] = \sqrt{2 \log n} + \frac{2\gamma - \log(4\pi \log n)}{2\sqrt{2 \log n}} + O\!\left(\frac{1}{\log n}\right).$
The leading term is $\sqrt{2 \log n}$, which grows very slowly. The correction is $O(1/\sqrt{\log n})$, and higher-order corrections are $O(1/\log n)$. The convergence to the Gumbel limit is notoriously slow -- you need $n$ in the thousands before the approximation is reasonably tight.
Answer:
$E[X_{(n)}] = \int_{-\infty}^{\infty} x\,n\,\Phi(x)^{n-1}\,\phi(x)\,dx$
$E[X_{(n-1)}] = \int_{-\infty}^{\infty} x\,n(n-1)\,\Phi(x)^{n-2}\,(1-\Phi(x))\,\phi(x)\,dx$
For large $n$:
$E[X_{(n)}] = \sqrt{2\log n} + \frac{2\gamma - \log(4\pi\log n)}{2\sqrt{2\log n}} + O\!\left(\frac{1}{\log n}\right)$
where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant.
Intuition
The expected maximum of $n$ standard normals grows like $\sqrt{2 \log n}$ -- remarkably slowly. Even going from $n = 100$ to $n = 10{,}000$ only pushes the expected max from about 2.5 to about 3.5. This is because the normal tail decays so fast (like $e^{-x^2/2}$) that you need exponentially more draws to push the max up by a constant amount. This slow growth is the fundamental reason why the normal distribution is "well-behaved" compared to heavy-tailed distributions, where the maximum can grow polynomially or even linearly in $n$.
In practice, this matters whenever you are looking at the best (or worst) outcome across many bets, strategies, or signals. If you run $n$ uncorrelated strategies and pick the best performer, the expected Sharpe of the winner grows only as $\sqrt{2 \log n}$ -- this is the core of the multiple-testing / overfitting problem. The Gumbel approximation's slow convergence also means you should be cautious about using it for moderate sample sizes; numerical integration of the exact formula is often more reliable for $n < 1000$.