Engle's ARCH LM Test

Time Series · Hard · Free problem

Consider a mean-zero return series $r_t$ with conditional variance following an ARCH($p$) specification:

$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \cdots + \alpha_p \varepsilon_{t-p}^2$

where $\varepsilon_t = r_t$ because the series is assumed to have zero mean.

Derive Engle's ARCH LM test for $H_0\colon \alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$ (no ARCH effects) using an OLS regression of $\hat{\varepsilon}_t^2$ on its $p$ lags.

State the test statistic and its asymptotic distribution under $H_0$.

Explain what steps are needed to ensure the test remains valid when the conditional mean model may be misspecified.

Hints

Under the null of no ARCH effects, the squared residuals should be unpredictable from their own lags. The test checks whether an OLS regression of $\hat{\varepsilon}_t^2$ on its lags has significant explanatory power.
The test statistic is $(N - p) \cdot R^2$ from the auxiliary regression, often written $N \cdot R^2$ because the two are asymptotically equivalent. Under $H_0$, this is asymptotically $\chi^2(p)$ by the general LM test principle.
Mean misspecification makes the residuals $\hat{\varepsilon}_t$ serially correlated, which can spuriously inflate the ARCH LM statistic. Pre-test for serial correlation in $\hat{\varepsilon}_t$ and over-fit the mean model to absorb dynamics.

Worked Solution

How to Think About It: You have a return series and you want to check whether its volatility is time-varying. Under the null, the squared residuals are just white noise -- they do not predict their own future values. The LM test checks this by running a simple regression of $\hat{\varepsilon}_t^2$ on its own lags and testing whether the regression has explanatory power. If the $R^2$ of this auxiliary regression is high, there is evidence of ARCH effects. This is appealing because you never actually need to estimate the ARCH model -- you only estimate the (simpler) null model.

Key Insight: The LM test exploits the fact that, under $H_0$ (no ARCH), $\varepsilon_t^2$ has constant conditional expectation $\omega$. Any predictability of $\varepsilon_t^2$ from its own lags is evidence against $H_0$. The test statistic is just $N \cdot R^2$ from an auxiliary OLS regression.

The Method:

Step 1 -- Fit the null model. Under $H_0$, the conditional variance is constant ($\sigma_t^2 = \omega$). Estimate the conditional mean model (e.g., $r_t = \mu + \varepsilon_t$ or an AR model) by OLS. Obtain the residuals $\hat{\varepsilon}_t$.

Step 2 -- Run the auxiliary regression. Regress the squared residuals on their $p$ lags:

$\hat{\varepsilon}_t^2 = c_0 + c_1 \hat{\varepsilon}_{t-1}^2 + c_2 \hat{\varepsilon}_{t-2}^2 + \cdots + c_p \hat{\varepsilon}_{t-p}^2 + u_t$

Compute the $R^2$ of this regression using observations $t = p+1, \ldots, N$.

Step 3 -- Compute the test statistic. The LM statistic is:

$T_{LM} = (N - p) \cdot R^2$

where $N - p$ is the effective sample size of the auxiliary regression.
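Steps 1 to 3 can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `arch_lm_stat`, the constant-mean null model, the simulated series, and the ARCH(1) parameters ($\omega = 0.5$, $\alpha_1 = 0.5$) are all assumptions chosen for the example.

```python
import numpy as np

def arch_lm_stat(returns, p):
    """Return ((N - p) * R^2, R^2) from regressing squared residuals on p lags."""
    r = np.asarray(returns, dtype=float)
    # Step 1: constant-mean null model, so residuals are demeaned returns.
    e2 = (r - r.mean()) ** 2
    n = e2.size
    # Step 2: auxiliary regression using observations t = p+1, ..., N.
    y = e2[p:]
    lags = [e2[p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    # Step 3: LM statistic with the effective sample size N - p.
    return (n - p) * r2, r2

rng = np.random.default_rng(0)
n, p = 2000, 5

# Series with no ARCH effects (assumed i.i.d. standard normal).
iid = rng.standard_normal(n)

# Hypothetical ARCH(1) series: sigma_t^2 = 0.5 + 0.5 * eps_{t-1}^2.
z = rng.standard_normal(n)
arch = np.empty(n)
prev = 0.0
for t in range(n):
    arch[t] = np.sqrt(0.5 + 0.5 * prev**2) * z[t]
    prev = arch[t]

stat_iid, r2_iid = arch_lm_stat(iid, p)
stat_arch, r2_arch = arch_lm_stat(arch, p)
crit = 11.07  # approximate 5% critical value of chi-squared with 5 d.o.f.
print(f"i.i.d. series: LM = {stat_iid:.2f}, R^2 = {r2_iid:.4f}")
print(f"ARCH(1) series: LM = {stat_arch:.2f}, R^2 = {r2_arch:.4f}")
```

Comparing each statistic with `crit` gives the 5% test for $p = 5$. For a mean model other than a constant, one would replace the demeaning step with that model's residuals.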

Step 4 -- Derivation of the asymptotic distribution.

Under $H_0\colon \alpha_1 = \cdots = \alpha_p = 0$, the conditional variance is constant at $\omega$. Assume in addition that the $\varepsilon_t$ are i.i.d. with mean zero and a finite fourth moment (conditional normality, as in Engle's original derivation, is a special case). Then $\varepsilon_t^2$ is also i.i.d. with mean $\omega$ and finite variance $\text{Var}(\varepsilon_t^2) = E[\varepsilon_t^4] - \omega^2$.

The auxiliary regression of $\hat{\varepsilon}_t^2$ on its lags is an OLS regression of i.i.d. variables on their own lags. Under $H_0$, the true coefficients $c_1 = \cdots = c_p = 0$, and standard OLS asymptotic theory gives:

$T_{LM} = (N-p) \cdot R^2 \xrightarrow{d} \chi^2(p)$

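This follows from the general LM test principle. The score of the ARCH log-likelihood with respect to $(\alpha_1, \ldots, \alpha_p)$, evaluated at the null estimates, is proportional to the sample covariances between $\hat{\varepsilon}_t^2 - \hat{\omega}$ and the lagged squared residuals. Under $H_0$ this score is asymptotically normal, and the quadratic form in the score, weighted by the inverse of its estimated variance, is asymptotically chi-squared with degrees of freedom equal to the number of restrictions being tested, $p$. That quadratic form is asymptotically equivalent to the sample size times the $R^2$ of the auxiliary regression.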

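Reject $H_0$ at significance level $a$ (written $a$ rather than $\alpha$ to avoid confusion with the ARCH coefficients $\alpha_i$) if $T_{LM} > \chi^2_{1-a}(p)$, the $(1-a)$ quantile of the $\chi^2(p)$ distribution. For example, $\chi^2_{0.95}(5) \approx 11.07$. Large values indicate ARCH effects.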
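The $\chi^2(p)$ limit can be checked numerically with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes i.i.d. standard normal data with known zero mean, and the sample size, replication count, and helper name `arch_lm_stat` are arbitrary choices made for this example.

```python
import numpy as np

def arch_lm_stat(e, p):
    """(N - p) * R^2 from regressing e_t^2 on a constant and its p lags."""
    e2 = np.asarray(e, dtype=float) ** 2
    n = e2.size
    y = e2[p:]
    lags = [e2[p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    return (n - p) * (1.0 - (resid @ resid) / (centered @ centered))

rng = np.random.default_rng(1)
n, p, reps = 500, 5, 1000

# Simulate the statistic repeatedly under H0 (no ARCH effects).
stats = np.array([arch_lm_stat(rng.standard_normal(n), p) for _ in range(reps)])

crit_5pct = 11.07  # approximate 0.95 quantile of chi-squared with 5 d.o.f.
reject_rate = float(np.mean(stats > crit_5pct))
mean_stat = float(stats.mean())
print(f"empirical rejection rate at the 5% level: {reject_rate:.3f}")
print(f"mean of simulated statistics: {mean_stat:.2f} (chi-squared mean is p = {p})")
```

If the asymptotic approximation is adequate at this sample size, the rejection rate should sit near 0.05 and the mean of the simulated statistics near $p$.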

Step 5 -- Controlling for conditional mean misspecification.

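If the conditional mean model is wrong (e.g., you fit a constant mean when the true model is AR(1)), then $\hat{\varepsilon}_t = r_t - \hat{\mu}$ still contains the omitted mean dynamics and is serially correlated. That correlation carries over to $\hat{\varepsilon}_t^2$: for a Gaussian series, the autocorrelation of the squares at a given lag equals the square of the autocorrelation of the levels. The result is spurious autocorrelation in $\hat{\varepsilon}_t^2$, which inflates the LM statistic.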

To guard against this:

Over-fit the mean model: Include enough AR/MA lags to absorb any mean dynamics before computing residuals. Use AIC/BIC to select the lag order.

Pre-test the residuals: Apply a Ljung-Box test to $\hat{\varepsilon}_t$ (not $\hat{\varepsilon}_t^2$) first. If there is serial correlation in the levels, the mean model is misspecified and should be corrected before running the ARCH LM test.

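Use robust variants: Wooldridge (1991) proposed regression-based diagnostics for conditional mean and variance models that remain valid under weaker auxiliary assumptions (for example, without conditional normality) than the standard LM test. They complement, rather than replace, a well-specified mean model.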

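Include mean regressors in the auxiliary regression: Add the original mean-model regressors (e.g., lagged returns) to the auxiliary regression of $\hat{\varepsilon}_t^2$ on its lags. This "partials out" the effect of mean misspecification. In that case, test the joint significance of the $p$ lagged squared residuals rather than relying on the overall $R^2$, which now also reflects the added regressors.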
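The effect of mean misspecification, and the Ljung-Box pre-test, can be illustrated on simulated data. The sketch below assumes a homoskedastic Gaussian AR(1) return series with coefficient 0.5, so any apparent ARCH effect is spurious by construction. The helper names `arch_lm_stat` and `ljung_box_q` are hypothetical, and the Ljung-Box statistic is coded directly from its textbook formula $Q = N(N+2) \sum_{k=1}^{m} \hat{\rho}_k^2 / (N-k)$.

```python
import numpy as np

def arch_lm_stat(e, p):
    """(N - p) * R^2 from regressing e_t^2 on a constant and its p lags."""
    e2 = np.asarray(e, dtype=float) ** 2
    n = e2.size
    y = e2[p:]
    lags = [e2[p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    return (n - p) * (1.0 - (resid @ resid) / (centered @ centered))

def ljung_box_q(e, m):
    """Ljung-Box Q statistic on the levels of e, using m autocorrelations."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    n = e.size
    denom = e @ e
    rho = np.array([(e[k:] @ e[:-k]) / denom for k in range(1, m + 1)])
    ks = np.arange(1, m + 1)
    return n * (n + 2) * np.sum(rho**2 / (n - ks))

# Homoskedastic AR(1) returns: r_t = 0.5 * r_{t-1} + shock_t (no ARCH).
rng = np.random.default_rng(2)
n, p, phi = 2000, 5, 0.5
shocks = rng.standard_normal(n)
r = np.empty(n)
r[0] = shocks[0]
for t in range(1, n):
    r[t] = phi * r[t - 1] + shocks[t]

# Misspecified mean: constant only.
e_naive = r - r.mean()

# Corrected mean: AR(1) fitted by OLS.
X = np.column_stack([np.ones(n - 1), r[:-1]])
beta, *_ = np.linalg.lstsq(X, r[1:], rcond=None)
e_ar = r[1:] - X @ beta

q_naive, q_ar = ljung_box_q(e_naive, p), ljung_box_q(e_ar, p)
lm_naive, lm_ar = arch_lm_stat(e_naive, p), arch_lm_stat(e_ar, p)
print(f"constant mean: Ljung-Box Q = {q_naive:.1f}, ARCH LM = {lm_naive:.1f}")
print(f"AR(1) mean:    Ljung-Box Q = {q_ar:.1f}, ARCH LM = {lm_ar:.1f}")
```

Under the constant-mean fit, both statistics should be far above the $\chi^2(5)$ 5% critical value of about 11.07, even though the series has constant variance. Once the AR(1) term is included, both should fall back to values typical of the null.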

Practical Considerations:

The choice of $p$ matters. Too small and you miss higher-order ARCH effects; too large and you lose power. Common defaults: $p = 5$ for daily data, $p = 12$ for monthly.
The test assumes finite fourth moments. For heavy-tailed returns (e.g., $t$-distributed with low degrees of freedom), the chi-squared approximation can be poor. Bootstrap the test statistic in such cases.
The ARCH LM test detects ARCH but not GARCH directly. However, since GARCH($p$,$q$) implies ARCH($\infty$), the LM test with moderate $p$ typically has good power against GARCH alternatives.
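For the heavy-tailed case, one simple way to bootstrap the statistic is to resample the residuals with replacement, which imposes the i.i.d. null, and recompute the statistic on each resample. The sketch below is one possible scheme under that assumption, not the only one. The function names, the 199 resamples, the Student-$t$ data with 5 degrees of freedom, and the ARCH(1) parameters are illustrative choices.

```python
import numpy as np

def arch_lm_stat(e, p):
    """(N - p) * R^2 from regressing e_t^2 on a constant and its p lags."""
    e2 = np.asarray(e, dtype=float) ** 2
    n = e2.size
    y = e2[p:]
    lags = [e2[p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    centered = y - y.mean()
    return (n - p) * (1.0 - (resid @ resid) / (centered @ centered))

def bootstrap_pvalue(e, p, n_boot=199, seed=0):
    """Bootstrap p-value: resampling with replacement destroys any time dependence."""
    rng = np.random.default_rng(seed)
    e = np.asarray(e, dtype=float)
    observed = arch_lm_stat(e, p)
    boot = np.array([
        arch_lm_stat(rng.choice(e, size=e.size, replace=True), p)
        for _ in range(n_boot)
    ])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(boot >= observed)) / (n_boot + 1)

rng = np.random.default_rng(3)
n, p = 1000, 5

# Heavy-tailed but i.i.d. residuals (no ARCH): Student-t with 5 d.o.f.
heavy_iid = rng.standard_t(df=5, size=n)

# Hypothetical ARCH(1) residuals: sigma_t^2 = 0.5 + 0.5 * eps_{t-1}^2.
z = rng.standard_normal(n)
arch = np.empty(n)
prev = 0.0
for t in range(n):
    arch[t] = np.sqrt(0.5 + 0.5 * prev**2) * z[t]
    prev = arch[t]

pval_iid = bootstrap_pvalue(heavy_iid, p)
pval_arch = bootstrap_pvalue(arch, p)
print(f"bootstrap p-value, heavy-tailed i.i.d.: {pval_iid:.3f}")
print(f"bootstrap p-value, ARCH(1):             {pval_arch:.3f}")
```

The bootstrap reference distribution is built from the data's own marginal distribution, so it does not lean on the chi-squared approximation. With 199 resamples the smallest attainable p-value is 1/200.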

Answer: The ARCH LM test statistic is $T_{LM} = (N-p) \cdot R^2$ from the regression of $\hat{\varepsilon}_t^2$ on $p$ lags of itself. Under $H_0$ (no ARCH), $T_{LM} \xrightarrow{d} \chi^2(p)$. To control for mean misspecification, over-fit the conditional mean model before extracting residuals, and verify that $\hat{\varepsilon}_t$ itself shows no serial correlation.

Intuition

The beauty of the LM test is that you only need to estimate the model under the null -- you never fit a GARCH model. You just ask: "Can I predict future squared residuals from past squared residuals?" If yes, volatility is time-varying. The $N \cdot R^2$ formulation makes implementation trivial: run one OLS regression and read off the $R^2$.

In practice, the ARCH LM test is the first thing you run when exploring a new return series. It tells you whether a constant-volatility model is adequate or whether you need GARCH/stochastic vol. The mean-misspecification caveat is real -- if you have an asset with strong momentum (serial correlation in returns), the squared residuals from a naive constant-mean model will show spurious ARCH-like patterns. Always check the mean model first.
