Johansen Cointegration Test and Walk-Forward Spread Validation

Time Series · Hard · Free problem

You have three price series $p_1(t)$, $p_2(t)$, $p_3(t)$, each integrated of order 1 (i.e., $I(1)$). You suspect there is exactly one cointegrating relationship among them.

  1. Describe how you would use the Johansen test to determine the cointegration rank, including the null hypotheses tested at each step.
  2. If the rank is 1, explain how you would estimate the cointegrating vector $\boldsymbol{\beta}$ and construct a stationary spread.
  3. Propose a walk-forward procedure to test whether the spread remains stationary out of sample, without look-ahead bias.

Hints

  1. The Johansen test works within a vector error correction model (VECM) -- the cointegration rank equals the rank of the long-run impact matrix $\Pi = \alpha \beta'$.
  2. The trace and max-eigenvalue tests sequentially test $r = 0, 1, 2, \ldots$ using critical values from Johansen's (non-standard) distribution -- not chi-squared.
  3. For walk-forward validation, the key principle is that the cointegrating vector must be estimated using only past data at each step -- never allow information from the evaluation window to leak into the estimation.

Worked Solution

How to Think About It: Cointegration is the time-series version of "these things move together in the long run." Three $I(1)$ series individually wander like random walks, but if there is a linear combination that is stationary, that combination is mean-reverting -- and potentially tradeable. The Johansen test is the multivariate tool for detecting how many such independent linear combinations exist (the cointegration rank). The hard part is not finding cointegration in-sample (overfitting is easy) but confirming that it holds out of sample.

Key Insight: The Johansen test works by examining the rank of a matrix derived from a vector error correction model (VECM). Each eigenvalue corresponds to a potential cointegrating relationship, and sequential hypothesis tests determine how many eigenvalues are significantly nonzero.

The Method:

Part (i): Johansen Test for Cointegration Rank

Fit a VECM to the vector $\mathbf{p}(t) = (p_1(t), p_2(t), p_3(t))'$:

$\Delta \mathbf{p}(t) = \Pi \mathbf{p}(t-1) + \sum_{j=1}^{k-1} \Gamma_j \Delta \mathbf{p}(t-j) + \boldsymbol{\mu} + \boldsymbol{\varepsilon}(t)$

where $\Pi = \alpha \beta'$ is the long-run impact matrix. The cointegration rank $r$ equals $\text{rank}(\Pi)$.
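
For reference, the VECM is a reparameterization of a VAR($k$) in levels. Writing the level coefficient matrices as $A_1, \ldots, A_k$ (notation assumed here; the problem statement does not name them):

$\mathbf{p}(t) = \sum_{j=1}^{k} A_j \mathbf{p}(t-j) + \boldsymbol{\mu} + \boldsymbol{\varepsilon}(t) \quad \Longrightarrow \quad \Pi = \sum_{j=1}^{k} A_j - I, \qquad \Gamma_j = -\sum_{i=j+1}^{k} A_i$

With three series, $r = 0$ means $\Pi = 0$ (no cointegration, so a VAR in differences suffices), $r = 3$ means $\Pi$ has full rank (the levels are already stationary, contradicting the $I(1)$ assumption), and $r = 1$ or $r = 2$ is the cointegrated case, with $\alpha$ and $\beta$ both $3 \times r$.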

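The test uses the eigenvalues $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \hat{\lambda}_3 \geq 0$ from the reduced-rank regression. These are the squared sample canonical correlations between $\Delta \mathbf{p}(t)$ and $\mathbf{p}(t-1)$ after both have been corrected for the lagged differences and deterministic terms; they are not the eigenvalues of $\hat{\Pi}$ itself. Each $\hat{\lambda}_i$ lies in $[0, 1)$, so $\ln(1 - \hat{\lambda}_i)$ is well defined.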
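
The reduced-rank regression behind these eigenvalues can be sketched in a few lines of NumPy. This is a minimal illustration, not a replacement for a tested library: the function name `johansen_eigen`, the choice of an unrestricted constant, and the simulated data (two series sharing a random-walk trend, so the true cointegrating vector is proportional to $(1, -2, 0)'$) are all assumptions made for the example.

```python
import numpy as np

def johansen_eigen(p, k=2):
    """Return (eigenvalues, eigenvectors, n_obs) of the reduced-rank regression.

    p is a (T, n) array of price levels; k is the lag order of the VAR in
    levels, so k - 1 lagged differences enter the VECM. An unrestricted
    constant is included (an assumption; other deterministic cases differ).
    """
    p = np.asarray(p, dtype=float)
    dp = np.diff(p, axis=0)                      # dp[i] = p[i+1] - p[i]
    z0 = dp[k - 1:]                              # Delta p(t)
    z1 = p[k - 1:-1]                             # p(t-1)
    lagged = [dp[k - 1 - j:len(dp) - j] for j in range(1, k)]  # Delta p(t-j)
    z2 = np.hstack(lagged + [np.ones((len(z0), 1))])

    def residuals(y, x):
        # Partial out the short-run terms and the constant by OLS.
        coef, *_ = np.linalg.lstsq(x, y, rcond=None)
        return y - x @ coef

    r0, r1 = residuals(z0, z2), residuals(z1, z2)
    n_obs = len(r0)
    s00 = r0.T @ r0 / n_obs
    s01 = r0.T @ r1 / n_obs
    s11 = r1.T @ r1 / n_obs

    # Solve |lambda * S11 - S10 S00^{-1} S01| = 0 as a symmetric eigenproblem.
    chol = np.linalg.cholesky(s11)               # S11 = chol @ chol.T
    a = np.linalg.solve(chol, s01.T)             # chol^{-1} S10
    m = a @ np.linalg.solve(s00, a.T)            # symmetric positive semidefinite
    lam, u = np.linalg.eigh(m)                   # eigh returns ascending order
    order = np.argsort(lam)[::-1]
    beta = np.linalg.solve(chol.T, u[:, order])  # columns: candidate beta vectors
    return lam[order], beta, n_obs

# Simulated example: p1 and p2 share one stochastic trend, p3 is unrelated.
rng = np.random.default_rng(0)
T = 2000
w = np.cumsum(rng.normal(size=T))
p = np.column_stack([
    w + rng.normal(size=T),
    0.5 * w + rng.normal(size=T),
    np.cumsum(rng.normal(size=T)),
])
lam, beta, n_obs = johansen_eigen(p, k=2)
print(lam, beta[:, 0] / beta[0, 0])
```

The Cholesky step turns the generalized eigenproblem into a symmetric one, which keeps the eigenvalues real and sorted without relying on a general-purpose solver.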

Trace test (sequential):

  • $H_0: r = 0$ vs. $H_1: r \geq 1$. Test statistic: $\lambda_{\text{trace}}(0) = -T \sum_{i=1}^{3} \ln(1 - \hat{\lambda}_i)$.
  • If rejected: $H_0: r \leq 1$ vs. $H_1: r \geq 2$. Test statistic: $\lambda_{\text{trace}}(1) = -T \sum_{i=2}^{3} \ln(1 - \hat{\lambda}_i)$.
  • If rejected: $H_0: r \leq 2$ vs. $H_1: r = 3$. Test statistic: $\lambda_{\text{trace}}(2) = -T \ln(1 - \hat{\lambda}_3)$.

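Here $T$ is the number of observations used in the regression. Critical values come from Johansen's tabulated distributions (not standard chi-squared). They depend on the deterministic specification (constant, trend, etc.) and on the number of common stochastic trends under the null, $n - r$, where $n = 3$ is the number of series.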

Max-eigenvalue test (alternative):

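  • $H_0: r = 0$ vs. $H_1: r = 1$. Statistic: $\lambda_{\max}(0) = -T \ln(1 - \hat{\lambda}_1)$.
  • If rejected: $H_0: r = 1$ vs. $H_1: r = 2$. Statistic: $\lambda_{\max}(1) = -T \ln(1 - \hat{\lambda}_2)$.
  • If rejected: $H_0: r = 2$ vs. $H_1: r = 3$. Statistic: $\lambda_{\max}(2) = -T \ln(1 - \hat{\lambda}_3)$.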

For our case: we expect to reject $r = 0$ but fail to reject $r \leq 1$, which is consistent with exactly one cointegrating relationship. (Failing to reject is evidence for $r = 1$, not proof of it.)
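
The sequential logic of both tests can be sketched as follows. The eigenvalues, sample size, and critical values below are hypothetical inputs chosen for illustration; in particular, the critical values are placeholders of roughly the right magnitude for a three-variable system with a constant, and the real ones should be taken from published tables or from your software.

```python
import math

def johansen_statistics(eigenvalues, n_obs):
    """Trace and max-eigenvalue statistics for r0 = 0, 1, ..., n - 1.

    eigenvalues must be sorted in descending order.
    """
    logs = [math.log(1.0 - lam) for lam in eigenvalues]
    trace = [-n_obs * sum(logs[r0:]) for r0 in range(len(logs))]
    max_eig = [-n_obs * logs[r0] for r0 in range(len(logs))]
    return trace, max_eig

def select_rank(statistics, critical_values):
    """Smallest r0 whose null is not rejected; n if every null is rejected."""
    for r0, (stat, crit) in enumerate(zip(statistics, critical_values)):
        if stat <= crit:
            return r0
    return len(statistics)

# Hypothetical inputs, for illustration only.
eigs = [0.10, 0.02, 0.005]
trace, max_eig = johansen_statistics(eigs, n_obs=500)
crit_trace = [29.80, 15.49, 3.84]   # placeholder 5% values; look up the real table
rank = select_rank(trace, crit_trace)
print(trace, max_eig, rank)
```

With these inputs the first null ($r = 0$) is rejected and the second ($r \leq 1$) is not, so the selected rank is 1.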

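Practical notes: Choose the lag order $k$ by information criteria (AIC/BIC) on the underlying VAR in levels; the VECM then carries $k - 1$ lagged differences. Include a constant in the cointegrating equation if the spread has a nonzero mean. Check the VECM residuals for serial correlation and increase $k$ if it is present, since the test assumes serially uncorrelated errors.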

Part (ii): Estimating the Cointegrating Vector and Spread

With $r = 1$, the matrix $\Pi = \alpha \beta'$ has rank 1. The cointegrating vector $\boldsymbol{\beta} = (\beta_1, \beta_2, \beta_3)'$ is the eigenvector corresponding to the largest eigenvalue $\hat{\lambda}_1$.

The stationary spread is:

$s(t) = \beta_1 p_1(t) + \beta_2 p_2(t) + \beta_3 p_3(t)$

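Normalization: $\boldsymbol{\beta}$ is identified only up to scale, so typically fix $\beta_1 = 1$ (or normalize on the asset with the largest loading). The spread is then expressed per unit of asset 1, with $\beta_2$ and $\beta_3$ acting as hedge ratios. This choice sets the scale only; it does not by itself make the portfolio dollar-neutral.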
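
A minimal sketch of the normalization and spread construction, assuming prices are stored as a (T, 3) NumPy array. The helper names and the numbers in the example are hypothetical.

```python
import numpy as np

def normalize_beta(beta, anchor=0):
    """Rescale beta so that the loading on the anchor asset equals 1."""
    beta = np.asarray(beta, dtype=float)
    if beta[anchor] == 0:
        raise ValueError("cannot normalize on a zero loading")
    return beta / beta[anchor]

def spread(prices, beta):
    """s(t) = beta' p(t) for every row of a (T, n) price array."""
    return np.asarray(prices, dtype=float) @ np.asarray(beta, dtype=float)

# Hypothetical eigenvector and prices, for illustration only.
beta_hat = normalize_beta([0.4, -0.8, 0.2])
prices = np.array([[100.0, 50.0, 20.0],
                   [101.0, 50.5, 19.0]])
s = spread(prices, beta_hat)
print(beta_hat, s)
```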

The vector $\alpha = (\alpha_1, \alpha_2, \alpha_3)'$ gives the speed-of-adjustment coefficients -- how quickly each price series responds to deviations in the spread. For a tradeable spread, at least one $\alpha_i$ should be economically meaningful (the spread mean-reverts within a reasonable time horizon).

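Validation checks:

  • The estimated spread $s(t)$ should pass an ADF test for stationarity. Because $\hat{\boldsymbol{\beta}}$ was fitted on the same data, standard ADF critical values are too lenient in-sample, so treat this as a sanity check and not as independent confirmation.
  • The half-life of mean reversion (from the AR(1) coefficient of $s(t)$) should be in a tradeable range (days to weeks, not years).
  • The loadings $\beta_i$ should be economically interpretable.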
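
For the half-life check, if the spread follows $s(t) = c + \phi\, s(t-1) + \varepsilon(t)$ with $0 < \phi < 1$, a deviation decays by half after $-\ln 2 / \ln \phi$ periods. The sketch below estimates $\phi$ by OLS and applies that formula; the function names and the simulated AR(1) series with $\phi = 0.9$ are assumptions for illustration.

```python
import math
import numpy as np

def ar1_coefficient(s):
    """OLS slope of s(t) on s(t-1), with an intercept."""
    s = np.asarray(s, dtype=float)
    x = np.column_stack([np.ones(len(s) - 1), s[:-1]])
    coef, *_ = np.linalg.lstsq(x, s[1:], rcond=None)
    return float(coef[1])

def half_life(phi):
    """Periods for a deviation to halve under an AR(1) with coefficient phi."""
    if phi >= 1.0:
        return math.inf     # unit root or explosive: no mean reversion
    if phi <= 0.0:
        return math.nan     # formula is not meaningful for non-positive phi
    return -math.log(2.0) / math.log(phi)

# Simulated spread with a known coefficient, for illustration only.
rng = np.random.default_rng(1)
phi_true = 0.9
shocks = rng.normal(size=20000)
s = np.zeros(len(shocks))
for t in range(1, len(s)):
    s[t] = phi_true * s[t - 1] + shocks[t]

phi_hat = ar1_coefficient(s)
hl_hat = half_life(phi_hat)
print(phi_hat, hl_hat)
```

On daily data, a half-life of a few days to a few weeks would fall in the tradeable range mentioned above.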

Part (iii): Walk-Forward Out-of-Sample Procedure

  1. Initial training window: Use the first $W$ observations (e.g., 2 years of daily data) to estimate the cointegrating vector $\hat{\boldsymbol{\beta}}$ via Johansen.
  2. Out-of-sample evaluation window: Apply the estimated $\hat{\boldsymbol{\beta}}$ to the next $H$ observations (e.g., 3 months) to construct the spread $s(t) = \hat{\beta}_1 p_1(t) + \hat{\beta}_2 p_2(t) + \hat{\beta}_3 p_3(t)$.
  3. Stationarity test on OOS spread: Run an ADF test on the out-of-sample spread. Record the test statistic and $p$-value. With a short window the ADF test has low power, so a single non-rejection is weak evidence of breakdown.
  4. Roll forward: Advance by $H$ observations, either extending the training window (expanding) or dropping its oldest $H$ observations (rolling), and repeat.
  5. Aggregate: Track the fraction of OOS windows where the spread passes the ADF test, the OOS half-life, and any structural breaks.
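
The window bookkeeping in the steps above can be sketched without any statistics library. Here `estimate_beta` and `test_spread` are hypothetical callables supplied by the user (for example, a Johansen fit and an ADF statistic); the point of the sketch is only that the estimator never sees indices from the evaluation window.

```python
def walk_forward_windows(n_obs, train_size, test_size, expanding=False):
    """Yield (train, test) index ranges with train strictly before test."""
    train_end = train_size
    while train_end + test_size <= n_obs:
        train_start = 0 if expanding else train_end - train_size
        yield range(train_start, train_end), range(train_end, train_end + test_size)
        train_end += test_size

def walk_forward(prices, train_size, test_size, estimate_beta, test_spread,
                 expanding=False):
    """Re-estimate beta on each training window and score the next window."""
    results = []
    for train, test in walk_forward_windows(len(prices), train_size,
                                            test_size, expanding):
        beta = estimate_beta([prices[i] for i in train])   # past data only
        oos = [sum(b * x for b, x in zip(beta, prices[i])) for i in test]
        results.append({
            "train": (train.start, train.stop),
            "test": (test.start, test.stop),
            "beta": beta,
            "oos_stat": test_spread(oos),
        })
    return results

# Index layout for 10 observations, W = 4, H = 2 (rolling window).
for train, test in walk_forward_windows(10, 4, 2):
    print((train.start, train.stop), (test.start, test.stop))
```

Passing `expanding=True` keeps the training start fixed at the first observation, which uses more data per fit but adapts more slowly if the cointegrating vector drifts.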

Critical safeguards against look-ahead bias:

  • Never use future data for estimation. The cointegrating vector at time $t$ is estimated using only data up to time $t$.
  • No peeking at OOS data for tuning. Lag order, deterministic specification, and other hyperparameters are chosen within the training window only.
  • Re-estimate $\hat{\boldsymbol{\beta}}$ each window. Cointegrating vectors can drift -- assuming a fixed $\boldsymbol{\beta}$ over the full sample is a common source of spurious backtesting results.
  • Track the cointegration rank across windows. If the Johansen test within a training window fails to find $r = 1$, the relationship may have broken down, and the spread from that window should not be traded.

Answer: Use the Johansen trace or max-eigenvalue test to sequentially test $r = 0, 1, 2$. If $r = 1$, the cointegrating vector is the eigenvector associated with the largest eigenvalue (the largest squared canonical correlation), and the spread $s(t) = \boldsymbol{\beta}' \mathbf{p}(t)$ is stationary. Validate OOS with a walk-forward procedure (rolling or expanding window): estimate $\hat{\boldsymbol{\beta}}$ on past data, apply it forward, and test the OOS spread for stationarity via ADF. Re-estimate each window and never use future data for any estimation step.

Intuition

Cointegration testing is one of the most important -- and most commonly botched -- procedures in quantitative trading. The Johansen test is the right tool when you have multiple potentially cointegrated series, because it handles multiple cointegrating relationships simultaneously and does not require you to specify which variable is "dependent." The sequential testing procedure (starting from $r = 0$ and stopping at the first non-rejection) asymptotically keeps the probability of overestimating the rank at roughly the chosen significance level.

The walk-forward validation in part (iii) is where most practitioners fail. It is easy to find cointegration in-sample, especially with three or more series -- you are fitting a linear combination to maximize stationarity, which can overfit noise. The true test is whether the same linear combination remains stationary on unseen data. Rolling the estimation window forward and tracking OOS performance gives you an honest assessment. If the spread breaks down frequently out of sample, the "cointegration" was likely spurious or the relationship is too unstable to trade.
