GARCH(1,1) with Student-t Innovations: VaR and Tail Risk

Options Pricing · Hard · Free problem

Consider a GARCH(1,1) model for a return series $r_t$:

$r_t = \sigma_t z_t, \quad \sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2$

where the innovations $z_t$ are i.i.d. Student-$t$ with $\nu$ degrees of freedom (zero mean, unit variance).

Write the conditional log-likelihood for one observation $r_t$ given the past.

Derive the one-step-ahead Value-at-Risk at confidence level $(1 - \alpha_{\text{VaR}})$ as a function of $\hat{\sigma}_t$, the GARCH parameters $(\omega, \alpha, \beta)$, and the Student-$t$ quantile.

Explain how the estimated degrees of freedom $\nu$ influences tail risk, and why $\nu \leq 4$ creates problems for Expected Shortfall (ES) estimation.

Hints

The Student-$t$ with $\nu$ degrees of freedom must be scaled to have unit variance -- the scaling factor involves $\nu - 2$ in the denominator. Write the density with this scaling, then change variables from $z_t$ to $r_t = \sigma_t z_t$.
VaR is just a quantile: $\text{VaR} = \hat{\sigma}_{t+1} \times |\text{quantile of the standardized } t_\nu|$. The GARCH part gives you $\hat{\sigma}_{t+1}$; the distribution part gives you the quantile.
The Student-$t$ has a finite $k$-th moment only when $k < \nu$. The ES point value needs only $\nu > 1$, but estimating it runs through the fitted GARCH volatility, whose standard errors depend on the fourth moment of $z_t$, which is infinite when $\nu \leq 4$.

Worked Solution

How to Think About It: GARCH(1,1) captures volatility clustering -- big moves beget big moves. Using Student-$t$ innovations instead of Gaussian ones adds heavier tails, which is crucial for risk management because real return distributions have fatter tails than the normal. The key objects are: (1) the conditional variance $\sigma_t^2$ (which updates with each new return), (2) the likelihood (needed to estimate parameters), and (3) the VaR (a direct function of $\sigma_t$ and the $t$-quantile).

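Quick Sanity Checks: The VaR should scale linearly with $\sigma_t$ (if volatility doubles, VaR doubles). As $\nu \to \infty$, the Student-$t$ converges to the normal, so the VaR should approach the Gaussian VaR. For small $\nu$ the tails get fatter, so at far-tail levels such as 99% the VaR exceeds the Gaussian VaR at the same $\sigma_t$. At 95% the comparison can reverse, because rescaling to unit variance shrinks the quantile (for $\nu = 5$, about 1.561 vs. the Gaussian 1.645).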

Derivation:

Part 1: Conditional log-likelihood.

The standardized innovation is $z_t = r_t / \sigma_t$. The Student-$t$ density with $\nu$ degrees of freedom (scaled to unit variance) is:

$f(z; \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \sqrt{\pi(\nu - 2)}} \left(1 + \frac{z^2}{\nu - 2}\right)^{-(\nu+1)/2}$

The conditional density of $r_t$ given the past is $f(r_t/\sigma_t; \nu) / \sigma_t$. The conditional log-likelihood for observation $t$ is:

$\ell_t = \ln \Gamma\!\left(\frac{\nu+1}{2}\right) - \ln \Gamma\!\left(\frac{\nu}{2}\right) - \frac{1}{2}\ln[\pi(\nu-2)] - \ln \sigma_t - \frac{\nu+1}{2} \ln\!\left(1 + \frac{r_t^2}{(\nu-2)\sigma_t^2}\right)$

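The total log-likelihood is $\mathcal{L} = \sum_{t=1}^{T} \ell_t$, maximized over $(\omega, \alpha, \beta, \nu)$ subject to $\omega > 0$, $\alpha, \beta \geq 0$, $\alpha + \beta < 1$, and $\nu > 2$. The variances $\sigma_t^2$ are built recursively from an initial value $\sigma_1^2$ (the sample variance is a common choice).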
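The log-likelihood above can be sketched in a few lines of Python. This is a minimal illustration, not a production estimator: the function name `garch_t_loglik`, the `sigma2_init` argument, and the default of starting the recursion at the sample variance are assumptions made for this example.

```python
import numpy as np
from scipy.special import gammaln


def garch_t_loglik(params, r, sigma2_init=None):
    """Total conditional log-likelihood of GARCH(1,1) with unit-variance Student-t innovations."""
    omega, alpha, beta, nu = params
    r = np.asarray(r, dtype=float)
    # Assumption: start the recursion at the sample variance unless a value is supplied.
    sigma2 = np.var(r) if sigma2_init is None else sigma2_init
    # Terms of l_t that do not depend on t.
    const = gammaln((nu + 1) / 2) - gammaln(nu / 2) - 0.5 * np.log(np.pi * (nu - 2))
    total = 0.0
    for r_t in r:
        # -ln(sigma_t) is written as -0.5 * ln(sigma_t^2).
        total += (const - 0.5 * np.log(sigma2)
                  - 0.5 * (nu + 1) * np.log1p(r_t**2 / ((nu - 2) * sigma2)))
        # GARCH update: conditional variance for the next observation.
        sigma2 = omega + alpha * r_t**2 + beta * sigma2
    return total
```

To estimate the parameters one would typically pass the negative of this function to a numerical optimizer such as `scipy.optimize.minimize`, with bounds enforcing $\omega > 0$, $\alpha, \beta \geq 0$, $\alpha + \beta < 1$, and $\nu > 2$.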

Part 2: One-step-ahead VaR.

The one-step-ahead forecast of $\sigma_{t+1}^2$ uses the GARCH recursion:

$\hat{\sigma}_{t+1}^2 = \omega + \alpha r_t^2 + \beta \sigma_t^2$
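As a rough sketch of how this forecast is produced in practice, the recursion can be run through the whole sample so that the last update yields $\hat{\sigma}_{T+1}$. The function name `garch_forecast_sigma` and the sample-variance starting value are assumptions for illustration only.

```python
import numpy as np


def garch_forecast_sigma(r, omega, alpha, beta, sigma2_init=None):
    """Run the GARCH(1,1) recursion over returns r and return the next-period volatility."""
    r = np.asarray(r, dtype=float)
    # Assumption: sigma_1^2 defaults to the sample variance of the returns.
    sigma2 = np.var(r) if sigma2_init is None else sigma2_init
    for r_t in r:
        # After processing r_t, sigma2 holds the conditional variance for time t+1.
        sigma2 = omega + alpha * r_t**2 + beta * sigma2
    return np.sqrt(sigma2)
```

Because $\sigma_{t+1}^2$ depends only on information available at time $t$, no simulation is needed for the one-step forecast; multi-step forecasts would require iterating on expected squared returns instead.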

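Since $r_{t+1} = \sigma_{t+1} z_{t+1}$ and $z_{t+1} \sim t_\nu$ (standardized), the $(1-\alpha_{\text{VaR}})$-VaR is the negative of the $\alpha_{\text{VaR}}$-quantile of the conditional return distribution (equivalently, the $(1-\alpha_{\text{VaR}})$-quantile of the loss $-r_{t+1}$):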

$\text{VaR}_{1-\alpha_{\text{VaR}}} = -\hat{\sigma}_{t+1} \cdot t_{\nu, \alpha_{\text{VaR}}}^{-1}$

where $t_{\nu, \alpha_{\text{VaR}}}^{-1}$ is the $\alpha_{\text{VaR}}$-quantile of the standardized Student-$t$ distribution (this is negative for $\alpha_{\text{VaR}} < 0.5$, so the VaR is positive).

Equivalently:

$\text{VaR}_{1-\alpha_{\text{VaR}}} = \hat{\sigma}_{t+1} \cdot |t_{\nu, \alpha_{\text{VaR}}}^{-1}|$

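For example, for a 99% VaR ($\alpha_{\text{VaR}} = 0.01$) with $\nu = 5$, the classical (unscaled) $t_5$ quantile is about $-3.365$. Rescaling to unit variance multiplies it by $\sqrt{(\nu-2)/\nu} = \sqrt{3/5} \approx 0.775$, so $|t_{5, 0.01}^{-1}| \approx 2.606$ vs. the Gaussian $|z_{0.01}| \approx 2.326$. At the same $\hat{\sigma}_{t+1}$ the Student-$t$ VaR is about 12% larger. Plugging in the unscaled 3.365 directly would overstate the VaR by roughly 29%.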
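Because the innovations are scaled to unit variance, the quantile returned by a statistics library for the classical $t_\nu$ has to be multiplied by $\sqrt{(\nu-2)/\nu}$ before it is combined with $\hat{\sigma}_{t+1}$. The sketch below shows one way to do this with `scipy.stats.t.ppf`; the helper names `std_t_quantile` and `garch_t_var` are hypothetical.

```python
import numpy as np
from scipy.stats import norm, t as student_t


def std_t_quantile(p, nu):
    """p-quantile of a Student-t rescaled to unit variance (requires nu > 2)."""
    return student_t.ppf(p, df=nu) * np.sqrt((nu - 2) / nu)


def garch_t_var(sigma_next, nu, alpha_var=0.01):
    """One-step VaR, reported as a positive loss, given the volatility forecast."""
    return -sigma_next * std_t_quantile(alpha_var, nu)


# Compare the unit-variance t multiplier with the Gaussian one at 99%.
t_mult = garch_t_var(1.0, nu=5, alpha_var=0.01)
gauss_mult = -norm.ppf(0.01)
print(t_mult, gauss_mult, t_mult / gauss_mult)
```

With $\hat{\sigma}_{t+1} = 1$ the function returns the quantile multiplier itself, which makes it easy to see how the gap to the Gaussian multiplier changes with $\nu$ and with the confidence level.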

Part 3: Tail risk and the $\nu \leq 4$ problem.

The Student-$t$ distribution with $\nu$ degrees of freedom has the $k$-th moment finite only if $k < \nu$. Specifically:
- $\nu > 2$: variance exists (we need this for the GARCH model to make sense).
- $\nu > 3$: skewness exists.
- $\nu > 4$: kurtosis (and thus the fourth moment) exists.
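
As a concrete illustration of the last condition (a standard result for the Student-$t$, added here as a worked example rather than taken from the problem statement), the kurtosis of $z_t$ is:

$\text{Kurt}(z_t) = \frac{E[z_t^4]}{(E[z_t^2])^2} = 3 + \frac{6}{\nu - 4}, \quad \nu > 4$

so $\nu = 8$ gives 4.5, $\nu = 5$ gives 9, and the value diverges as $\nu \downarrow 4$.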

How $\nu$ influences tail risk: Smaller $\nu$ means heavier tails and, at high confidence levels, higher VaR and ES. The density decays as $|r|^{-(\nu+1)}$ and the tail probability $P(|r| > x)$ as $x^{-\nu}$ (power law), compared to the much faster $e^{-r^2/2}$ decay of the normal. In practice, equity returns often give $\hat{\nu}$ in the range 4-8, meaning tails are much heavier than Gaussian.

Why $\nu \leq 4$ breaks ES: Expected Shortfall (Conditional VaR) is defined as:

$\text{ES}_{1-\alpha_{\text{VaR}}} = E[-r_{t+1} \mid r_{t+1} \leq -\text{VaR}_{1-\alpha_{\text{VaR}}}]$

This requires the conditional expectation of the tail to be finite. For a Student-$t$ that only needs $E[|z|] < \infty$, i.e., $\nu > 1$, and the unit-variance standardization already imposes $\nu > 2$, so the ES point value always exists in this model. Writing $q = t_{\nu,\alpha_{\text{VaR}}}^{-1}$ for the quantile and $f(\cdot; \nu)$ for the density of the standardized (unit-variance) Student-$t$ from Part 1, the closed form is:

$\text{ES}_{1-\alpha_{\text{VaR}}} = \hat{\sigma}_{t+1} \cdot \frac{f(q; \nu)}{\alpha_{\text{VaR}}} \cdot \frac{(\nu - 2) + q^2}{\nu - 1}$

If the classical (unscaled) $t_\nu$ density and quantile are used instead, the last factor becomes $(\nu + q^2)/(\nu - 1)$ and the whole expression must be multiplied by $\sqrt{(\nu-2)/\nu}$.

The formula is well-defined, but the deeper issue is that when $\nu \leq 4$, the fourth moment of returns does not exist, which means:
- Standard errors of ES estimates diverge (the variance of the plug-in ES estimator inherits the fourth moment of $z_t$ through the estimated volatility).
- Backtesting ES becomes unreliable because tail averages converge slowly and their sampling variability is hard to estimate when so few moments exist.
- Convergence of MLE for the $(\alpha, \beta)$ parameters can be poor since the GARCH information matrix involves fourth moments of $z_t$.

In practice, when $\hat{\nu} \leq 4$, the ES point estimate may still be computable, but it is statistically unreliable -- you cannot meaningfully put confidence intervals on it.
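The ES point value for the unit-variance Student-$t$ can be checked numerically. The sketch below computes it from the classical $t_\nu$ closed form, rescaled by $\sqrt{(\nu-2)/\nu}$, and compares it with direct integration of the tail. The function names `std_t_es` and `std_t_es_numeric` are assumptions for this example, and the result still has to be multiplied by $\hat{\sigma}_{t+1}$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import t as student_t


def std_t_es(alpha_var, nu):
    """ES of a unit-variance Student-t at tail probability alpha_var, as a positive loss."""
    c = np.sqrt((nu - 2) / nu)            # maps the classical t to unit variance
    q = student_t.ppf(alpha_var, df=nu)   # classical-t quantile (negative)
    # Closed form for the classical t, then rescaled to unit variance.
    return c * student_t.pdf(q, df=nu) / alpha_var * (nu + q**2) / (nu - 1)


def std_t_es_numeric(alpha_var, nu):
    """Same quantity by integrating x * pdf(x) over the left tail."""
    c = np.sqrt((nu - 2) / nu)
    q = student_t.ppf(alpha_var, df=nu)
    integral, _ = quad(lambda x: x * student_t.pdf(x, df=nu), -np.inf, q)
    return -c * integral / alpha_var


print(std_t_es(0.01, 5), std_t_es_numeric(0.01, 5))
```

Agreement between the two numbers only confirms the point formula. It says nothing about the sampling error of an ES estimate built from fitted parameters, which is where small $\hat{\nu}$ causes trouble.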

Practical Interpretation: A trader using GARCH-$t$ for risk management should:
- Always check $\hat{\nu}$. If $\hat{\nu}$ is close to 4, VaR estimates are still usable but ES estimates should be treated with skepticism.
- Consider using a distribution with more controlled tail behavior (e.g., GED or skewed-$t$) if $\hat{\nu}$ is very low.
- Remember that VaR scales as $\sigma_t \times |\text{quantile}|$, so the dynamic part is all in the GARCH volatility forecast.

Answer: The conditional log-likelihood is $\ell_t = \ln\Gamma(\frac{\nu+1}{2}) - \ln\Gamma(\frac{\nu}{2}) - \frac{1}{2}\ln[\pi(\nu-2)] - \ln\sigma_t - \frac{\nu+1}{2}\ln(1 + r_t^2/((\nu-2)\sigma_t^2))$. The one-step VaR is $\text{VaR}_{1-\alpha_{\text{VaR}}} = \hat{\sigma}_{t+1} \cdot |t_{\nu,\alpha_{\text{VaR}}}^{-1}|$, where the quantile is that of the unit-variance Student-$t$. Small $\nu$ fattens tails and inflates VaR at high confidence levels; when $\nu \leq 4$, the fourth moment of returns is infinite, making ES statistically unreliable because its standard errors and backtesting statistics depend on moments that do not exist.

Intuition

GARCH with Student-$t$ innovations is the workhorse model for risk management because it captures two empirical facts about financial returns: volatility clusters (GARCH) and fat tails ($t$ distribution). The VaR formula is beautifully simple -- it is just conditional volatility times a quantile -- which makes it easy to implement in production. The degree-of-freedom parameter $\nu$ acts as a dial between normality ($\nu \to \infty$) and heavy tails ($\nu$ small).

The $\nu \leq 4$ issue is subtle and important. The VaR itself is always well-defined (it is just a quantile), but ES requires computing a conditional tail expectation, and the statistical reliability of that estimate depends on moments that may not exist. This is why regulators' shift from VaR to ES (Basel III) created genuine practical challenges: ES is a better risk measure in theory, but harder to estimate reliably when tails are very heavy. Quants need to understand that a finite-sample ES estimate can look perfectly reasonable even when the underlying estimator has infinite variance -- a dangerous situation for risk management.
