VaR vs. Expected Shortfall
Two risk measures dominate the industry: Value-at-Risk (VaR) and Expected Shortfall (ES, also called Conditional VaR or CVaR).
- Define both measures precisely at tail probability $\alpha$ (confidence level $1-\alpha$). What does each one actually tell you about your loss distribution?
- Explain why ES is considered a more coherent risk measure than VaR. Focus on the property that matters most for portfolio aggregation.
- Give a concrete example -- with numbers -- where two portfolios have identical VaR but very different risk profiles. What does ES reveal that VaR hides?
Hints
- VaR is a quantile -- what information about the tail distribution does a single quantile discard?
- Think about what 'coherent risk measure' means, specifically the subadditivity axiom: can combining two portfolios ever increase VaR above the sum of their individual VaRs?
- For the concrete example, try two portfolios where the 5% tail is identical in probability but very different in severity -- both will have the same VaR$_{5\%}$ but their ES$_{5\%}$ can differ by an order of magnitude or more.
Worked Solution
How to Think About It: These two measures answer different questions. VaR answers: "What is the minimum loss I suffer in the worst $\alpha$ fraction of scenarios?" ES answers: "Conditional on being in that worst $\alpha$ fraction, how bad is it on average?" VaR is a threshold -- it tells you nothing about the severity of losses beyond it. This is not a minor technical quibble. In the 2008 crisis, banks held positions with catastrophic tail losses that looked fine by VaR, because VaR only cares whether losses exceed the threshold, not by how much. If you are managing a book and your risk limit is VaR-based, a trader can construct portfolios that exploit this blind spot.
Key Insight: VaR fails the subadditivity property -- combining two portfolios can produce a VaR higher than the sum of their individual VaRs. This means VaR can penalize diversification, which is incoherent as a risk measure. ES does not have this problem.
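The subadditivity failure can be checked with a small numerical sketch. The setup below is a hypothetical illustration, not taken from the text above: two independent positions, each losing \$100 with probability 4% and nothing otherwise, evaluated at $\alpha = 5\%$. The helper names `var` and `combine` are likewise assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

def var(outcomes, alpha):
    """VaR as inf{x : P(L > x) <= alpha} for a discrete loss distribution.

    outcomes is a list of (loss, probability) pairs with distinct losses.
    """
    for loss, _ in sorted(outcomes):
        # Probability mass strictly above the candidate threshold
        if sum(p for l, p in outcomes if l > loss) <= alpha:
            return loss

def combine(a, b):
    """Distribution of the sum of two independent discrete losses."""
    total = {}
    for (la, pa), (lb, pb) in product(a, b):
        total[la + lb] = total.get(la + lb, 0) + pa * pb
    return list(total.items())

alpha = Fraction(5, 100)
# Hypothetical position: 4% chance of losing 100, otherwise no loss
bond = [(0, Fraction(96, 100)), (100, Fraction(4, 100))]
both = combine(bond, bond)

print("VaR of one position: ", var(bond, alpha))
print("VaR of both together:", var(both, alpha))
```

Each position alone has a 4% loss probability, which sits below the 5% cutoff, so its VaR is 0. Held together, the chance of at least one loss is $1 - 0.96^2 = 7.84\%$, which crosses the cutoff and pushes the combined VaR to 100. Exact fractions are used so the comparison against $\alpha$ is not affected by floating-point rounding.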
The Method:
1. Define VaR. For tail probability $\alpha \in (0,1)$ (confidence level $1-\alpha$) and loss variable $L$, VaR is the $(1-\alpha)$-quantile of the loss distribution: $\text{VaR}_{\alpha} = \inf\{x : P(L > x) \leq \alpha\}$ In plain language: with probability at least $1-\alpha$, the loss does not exceed $\text{VaR}_{\alpha}$.
2. Define ES. Expected Shortfall is the expected loss conditional on exceeding VaR: $\text{ES}_{\alpha} = E[L \mid L > \text{VaR}_{\alpha}]$ This is the average loss in the worst $\alpha$ fraction of scenarios. It integrates the entire tail rather than just reading off a single quantile.
3. Why ES is better -- subadditivity. A coherent risk measure must satisfy four axioms (Artzner et al. 1999). The one VaR violates is subadditivity: $\rho(A + B) \leq \rho(A) + \rho(B)$ This says that combining two portfolios should not increase risk beyond the sum of parts -- diversification should help, never hurt. VaR can violate this for non-normal distributions. ES always satisfies it. Practically, this means ES-based risk limits are aggregable: a desk's ES is bounded by the sum of its traders' ES limits. VaR limits do not have this property.
4. The misleading VaR example.
Consider two portfolios, each evaluated at the 95% VaR (i.e., $\alpha = 5\%$):
- Portfolio A: 95% probability of \$0 loss, 5% probability of \$10 loss.
- Portfolio B: 95% probability of \$0 loss, 4% probability of \$10 loss, 1% probability of \$1,000 loss.
Both have $\text{VaR}_{5\%} = \$0$. By VaR, they look identical.
Now compute ES:
$\text{ES}_{5\%}(A) = E[L \mid L > 0] = \$10$
$\text{ES}_{5\%}(B) = \frac{4 \times 10 + 1 \times 1{,}000}{5} = \frac{1{,}040}{5} = \$208$
Portfolio B is 20 times riskier in the tail, and VaR cannot see it. Any risk framework relying solely on VaR would treat these identically.
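The arithmetic in this example can be reproduced with a short sketch that applies the two definitions from steps 1 and 2 directly to a discrete loss distribution. The function name `var_es` and the list-of-pairs representation are assumptions made for illustration. The loss amounts are the \$10 and \$1,000 figures used in the ES calculation.

```python
from fractions import Fraction

def var_es(outcomes, alpha):
    """Return (VaR, ES) for a discrete loss distribution.

    outcomes: list of (loss, probability) pairs with distinct losses.
    alpha: tail probability, e.g. Fraction(5, 100) for the 95% level.
    VaR follows inf{x : P(L > x) <= alpha}; ES is E[L | L > VaR].
    """
    var = None
    for loss, _ in sorted(outcomes):
        if sum(p for l, p in outcomes if l > loss) <= alpha:
            var = loss
            break
    tail = [(l, p) for l, p in outcomes if l > var]
    tail_mass = sum(p for _, p in tail)
    if tail_mass == 0:
        # No mass beyond VaR: the conditional mean is undefined, so fall
        # back to VaR itself (a convention chosen for this sketch only).
        return var, var
    es = sum(l * p for l, p in tail) / tail_mass
    return var, es

alpha = Fraction(5, 100)
portfolio_a = [(0, Fraction(95, 100)), (10, Fraction(5, 100))]
portfolio_b = [(0, Fraction(95, 100)), (10, Fraction(4, 100)),
               (1000, Fraction(1, 100))]

for name, pf in (("A", portfolio_a), ("B", portfolio_b)):
    var, es = var_es(pf, alpha)
    print(f"Portfolio {name}: VaR = {var}, ES = {es}")
```

Both portfolios come out with a VaR of 0, while the ES values are 10 and 208. Exact fractions avoid floating-point surprises when the tail mass equals $\alpha$ exactly, as it does here.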
Practical Considerations:
- Estimation difficulty. VaR requires estimating a single quantile. ES requires estimating the conditional mean beyond that quantile -- harder to do accurately, especially in low-data regimes (fat tails, rare events). This is why some firms still use VaR in practice despite its theoretical weaknesses.
- Regulatory context. Basel III moved toward ES (at the 97.5% level) for internal models. The rationale was exactly the example above -- VaR was gamed during the financial crisis.
- Backtesting. VaR is easy to backtest (count exceedances). ES is harder to backtest, though methods based on the Acerbi-Szekely framework exist.
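To make "count exceedances" concrete, here is a minimal sketch of the counting step in a VaR backtest. The loss series, the constant VaR forecast, and the function name `count_exceedances` are all hypothetical. A real backtest would use a much longer history and compare the count against its binomial distribution rather than eyeballing it.

```python
def count_exceedances(losses, var_forecasts):
    """Count periods where the realized loss exceeded that period's VaR forecast."""
    if len(losses) != len(var_forecasts):
        raise ValueError("losses and var_forecasts must have the same length")
    return sum(1 for loss, var in zip(losses, var_forecasts) if loss > var)

# Hypothetical daily losses and a constant 95% VaR forecast of 4.0
losses = [1.2, 0.3, 5.1, 0.0, 2.2, 7.4, 0.9, 1.1, 0.4, 3.0]
var_forecasts = [4.0] * len(losses)
alpha = 0.05

observed = count_exceedances(losses, var_forecasts)
expected = alpha * len(losses)
print(f"observed exceedances: {observed}, expected under the model: {expected:.1f}")
```

If the VaR model is well calibrated, exceedances should occur in roughly an $\alpha$ fraction of periods. No comparably simple count exists for ES, because checking it requires judging the size of the losses beyond VaR, not just their frequency.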
Answer: VaR is the $(1-\alpha)$-quantile of the loss distribution -- a threshold that tells you nothing about what happens beyond it. ES is the expected loss given you are in the tail -- it integrates tail severity. ES dominates VaR theoretically because it is subadditive (VaR is not) and averages over the whole tail instead of reading off a single point. The canonical counterexample: two portfolios with identical VaR but ES of \$10 vs. \$208 -- VaR is blind to catastrophic concentrations that ES catches immediately.
Intuition
The deeper lesson here is that a single number can never capture risk fully -- the question is which single number loses the least information. VaR discards everything about tail shape beyond the threshold. In normal markets, where tails are thin and losses scale predictably, this does not matter much. But financial losses are fat-tailed and lumpy -- rare catastrophic events coexist with routine small losses. VaR was designed in an era when normal approximations seemed reasonable; ES is the risk measure you would design knowing that the next crisis will look like 2008, not a textbook Gaussian.
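The claim that VaR loses little when tails are thin can be illustrated under an explicit Gaussian assumption. For a loss $L \sim N(\mu, \sigma^2)$, the closed forms are $\text{VaR}_{\alpha} = \mu + \sigma z_{1-\alpha}$ and $\text{ES}_{\alpha} = \mu + \sigma\,\varphi(z_{1-\alpha})/\alpha$, where $z_{1-\alpha}$ is the standard normal quantile and $\varphi$ the standard normal density. The sketch below evaluates them for a standard normal loss. The function name `normal_var_es` and the choice $\mu = 0$, $\sigma = 1$ are assumptions for illustration.

```python
from statistics import NormalDist

def normal_var_es(alpha, mu=0.0, sigma=1.0):
    """Closed-form VaR and ES for a normally distributed loss."""
    std = NormalDist()                # standard normal
    z = std.inv_cdf(1 - alpha)        # (1 - alpha)-quantile
    var = mu + sigma * z
    es = mu + sigma * std.pdf(z) / alpha
    return var, es

for alpha in (0.05, 0.01):
    var, es = normal_var_es(alpha)
    print(f"alpha={alpha}: VaR={var:.3f}, ES={es:.3f}, ES/VaR={es / var:.3f}")
```

With $\mu = 0$, ES is a fixed multiple of VaR at any given $\alpha$ (about 1.25 at the 95% level), regardless of $\sigma$. In that world, knowing VaR is enough to recover ES. Compare this with the two-portfolio example, where the same VaR coexists with ES values of \$10 and \$208.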
The subadditivity failure of VaR has a practical consequence that is easy to miss: a firm's total VaR can exceed the sum of its desks' VaRs, so adding up desk-level numbers can understate firm-level risk. This creates a perverse incentive -- you can appear to reduce risk on paper by restructuring how positions are bucketed, without actually reducing exposure. ES removes this game because it is subadditive: the sum of the buckets' ES always bounds the ES of the whole. If you are ever designing a risk framework or sitting in a risk committee, insist on ES as the primary measure and use VaR only as a secondary reference for regulatory reporting where required.