Bayesian Regime Filter for Alpha Signal

Stochastic Processes · Hard · Free problem

Daily returns $A_t$ are generated by one of two regimes. In regime $z_t = 1$ (the "alpha" regime), returns are drawn from $N(\mu_1, \sigma^2)$, while in regime $z_t = 0$ (the "no-alpha" regime), returns come from $N(\mu_0, \sigma^2)$ with $\mu_1 > \mu_0$. The latent regime $z_t \in \{0, 1\}$ follows a two-state Markov chain with transition probabilities:

$P(z_t = 1 \mid z_{t-1} = 1) = p, \quad P(z_t = 0 \mid z_{t-1} = 0) = q$

You are given the parameters $(\mu_0, \mu_1, \sigma^2, p, q)$ and a sequence of observed returns $A_1, A_2, \ldots, A_T$.

Write the forward recursion that computes the filtered probability $\pi_t = P(z_t = 1 \mid A_{1:t})$ at each time step.

Design a position-sizing rule that reduces exposure when $\pi_t$ drops below a threshold $\pi^{*}$, and explain how you would choose $\pi^{*}$ to control the rate of false de-risking (reducing exposure when the alpha regime is actually active).

Hints

The filtered probability has two steps at each time: a prediction step using the Markov transition, and an update step using Bayes' rule with the Gaussian likelihood.
Write the likelihood ratio $L_t = \phi(A_t; \mu_1, \sigma^2) / \phi(A_t; \mu_0, \sigma^2)$ and note that $\log L_t$ is linear in $A_t$ because both densities share the same variance.
For the threshold, think of it as a Bayes decision problem: $\pi^{*}$ is where the expected cost of de-risking (missing alpha) equals the expected cost of staying exposed (no alpha). Simulate the model to compute the false de-risking rate at each candidate $\pi^{*}$.

Worked Solution

How to Think About It: This is a classic hidden Markov model (HMM) filtering problem applied to regime detection. The core idea is simple: you have a belief about which regime you are in, and each new return observation updates that belief via Bayes' rule. The Markov structure means your prior for today's regime depends on yesterday's posterior and the transition matrix. If you have been seeing fat returns consistent with $\mu_1$, your belief $\pi_t$ stays high. If returns start looking like $\mu_0$, $\pi_t$ drifts down. The practical question is: how low does $\pi_t$ have to go before you reduce your position?

Quick Estimate: Suppose $\mu_1 = 5$ bps, $\mu_0 = 0$ bps, $\sigma = 100$ bps (daily), $p = 0.98$, $q = 0.95$. The likelihood ratio for a single observation $A_t$ is $L_t = \phi(A_t; \mu_1, \sigma^2) / \phi(A_t; \mu_0, \sigma^2)$. Since $\mu_1 - \mu_0 = 5$ bps is tiny relative to $\sigma = 100$ bps, a single observation barely moves the posterior -- you need a string of positive or negative returns to shift $\pi_t$ meaningfully. This is why regime filters in practice are sluggish: the signal-to-noise ratio per observation is low, and detection relies on accumulating evidence over many days.
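To put rough numbers on "barely moves", the sketch below evaluates the log-likelihood ratio for the quick-estimate parameters. It uses the closed form $\log L_t = \frac{\mu_1 - \mu_0}{\sigma^2}\left(A_t - \frac{\mu_1 + \mu_0}{2}\right)$, which holds because the two densities share a variance. The function name and the choice of a +100 bps example day are illustrative assumptions, not part of the problem.

```python
# Quick-estimate parameters, all in basis points (bps).
mu0, mu1, sigma = 0.0, 5.0, 100.0

def log_likelihood_ratio(a, mu0, mu1, sigma):
    """log L_t for one return a; linear in a because the variances match."""
    return (mu1 - mu0) / sigma**2 * (a - 0.5 * (mu0 + mu1))

# Illustrative assumption: a one-sigma up day of +100 bps.
one_day = log_likelihood_ratio(100.0, mu0, mu1, sigma)

# Average log-odds gain per day while the alpha regime is truly on:
# E[log L_t | z_t = 1] = (mu1 - mu0)^2 / (2 * sigma^2).
drift_per_day = (mu1 - mu0) ** 2 / (2 * sigma**2)

print(f"log L for a +100 bps day:            {one_day:.5f}")
print(f"expected log L per alpha-regime day: {drift_per_day:.5f}")
```

A one-sigma day moves the log-odds by less than 0.05, and the average daily evidence in favor of the alpha regime is more than an order of magnitude smaller still, so the transition probabilities $p$ and $q$ dominate the day-to-day behavior of the filter at these parameter values.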

Part 1: The Forward Recursion

Let $\pi_t = P(z_t = 1 \mid A_{1:t})$ denote the filtered probability. The recursion has two steps:

Prediction step. Use the Markov transition matrix to compute the prior for $z_t$ given yesterday's posterior:

$\pi_{t|t-1} = P(z_t = 1 \mid A_{1:t-1}) = p \cdot \pi_{t-1} + (1 - q) \cdot (1 - \pi_{t-1})$

This says: you stay in regime 1 with probability $p$ if you were there, and you enter regime 1 with probability $1 - q$ if you were in regime 0.

Update step. Incorporate the new observation $A_t$ via Bayes' rule:

$\pi_t = \frac{\pi_{t|t-1} \cdot \phi(A_t; \mu_1, \sigma^2)}{\pi_{t|t-1} \cdot \phi(A_t; \mu_1, \sigma^2) + (1 - \pi_{t|t-1}) \cdot \phi(A_t; \mu_0, \sigma^2)}$

where $\phi(x; \mu, \sigma^2)$ is the Gaussian density. This can be simplified using the likelihood ratio $L_t = \phi(A_t; \mu_1, \sigma^2) / \phi(A_t; \mu_0, \sigma^2)$:

$\pi_t = \frac{\pi_{t|t-1} \cdot L_t}{\pi_{t|t-1} \cdot L_t + (1 - \pi_{t|t-1})}$

Since both densities are Gaussian with the same variance, the log-likelihood ratio has a clean form:

$\log L_t = \frac{(\mu_1 - \mu_0)}{\sigma^2} \left(A_t - \frac{\mu_1 + \mu_0}{2}\right)$

This is a linear function of the observation $A_t$, which is computationally convenient. Initialize with $\pi_0 = P(z_0 = 1)$, typically the ergodic probability $(1 - q) / (2 - p - q)$.
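The recursion can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `forward_filter` and its signature are assumptions, it assumes $0 < p, q < 1$ so the predicted probability stays strictly inside $(0, 1)$, and it performs the update in log-odds form, which is algebraically the same as the Bayes update with $L_t$.

```python
import math

def forward_filter(returns, mu0, mu1, sigma, p, q, pi0=None):
    """Return the filtered probabilities pi_1..pi_T of the alpha regime.

    Assumes 0 < p < 1 and 0 < q < 1 (hypothetical helper, not from the problem).
    """
    if pi0 is None:
        pi0 = (1 - q) / (2 - p - q)  # ergodic probability of regime 1
    slope = (mu1 - mu0) / sigma**2
    midpoint = 0.5 * (mu0 + mu1)
    pi, path = pi0, []
    for a in returns:
        # Prediction step: push yesterday's posterior through the transition.
        prior = p * pi + (1 - q) * (1 - pi)
        # Update step: posterior log-odds = prior log-odds + log L_t.
        log_odds = math.log(prior / (1 - prior)) + slope * (a - midpoint)
        pi = 1 / (1 + math.exp(-log_odds))
        path.append(pi)
    return path

# Illustrative returns in bps (made up for the example).
path = forward_filter([120.0, -30.0, 80.0, -150.0],
                      mu0=0.0, mu1=5.0, sigma=100.0, p=0.98, q=0.95)
print([round(x, 4) for x in path])
```

Working in log-odds turns the update into an addition, which makes the linearity of $\log L_t$ in $A_t$ directly visible in the code.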

Part 2: De-Risking Rule and Threshold Selection

Key Insight: The threshold $\pi^{*}$ controls a trade-off between two costs: false de-risking (cutting exposure when the alpha regime is actually on, forfeiting returns) and false staying-in (keeping full exposure when alpha has disappeared, eating losses).

The Rule: Scale position size linearly with the excess of the filtered probability over the threshold, with a hard cutoff:

$w_t = \begin{cases} w_{\text{full}} \cdot \frac{\pi_t - \pi^{*}}{1 - \pi^{*}} & \text{if } \pi_t \geq \pi^{*} \\ 0 & \text{if } \pi_t < \pi^{*} \end{cases}$

This ramps down linearly as confidence in the alpha regime fades, and goes flat at zero below the threshold. A simpler binary version just toggles between full position ($\pi_t \geq \pi^{*}$) and no position ($\pi_t < \pi^{*}$).
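Both versions of the rule are one-liners in code. The sketch below assumes $\pi^{*} < 1$ (otherwise the ramp divides by zero); the function names and the default `w_full=1.0` are illustrative choices.

```python
def position_size(pi_t, pi_star, w_full=1.0):
    """Linear ramp: 0 at pi_star, w_full at pi_t = 1, flat at 0 below pi_star.

    Assumes pi_star < 1 (hypothetical helper for illustration).
    """
    if pi_t < pi_star:
        return 0.0
    return w_full * (pi_t - pi_star) / (1 - pi_star)

def binary_position(pi_t, pi_star, w_full=1.0):
    """Binary variant: full position at or above the threshold, none below."""
    return w_full if pi_t >= pi_star else 0.0

for pi_t in (0.2, 0.4, 0.7, 1.0):
    print(pi_t, position_size(pi_t, 0.4), binary_position(pi_t, 0.4))
```

The ramp is continuous at $\pi^{*}$, so small fluctuations of $\pi_t$ around the threshold produce small position changes, whereas the binary rule flips the full position each time $\pi_t$ crosses $\pi^{*}$.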

Choosing $\pi^{*}$:

Simulation approach. Simulate the full Markov-switching model with known parameters. For each candidate threshold $\pi^{*}$, compute the false de-risking rate (the fraction of days with $z_t = 1$ on which $\pi_t < \pi^{*}$) and the false exposure rate (the fraction of days with $z_t = 0$ on which $\pi_t \geq \pi^{*}$). Plotting one rate against the other as $\pi^{*}$ varies traces out the ROC curve; choose the $\pi^{*}$ that hits a target false de-risking rate (e.g., 5%).
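A minimal simulation of this procedure might look as follows. The helper names, the seed, the path length, and the candidate thresholds are all assumptions. The example deliberately uses a larger regime gap ($\mu_1 = 100$ bps, hypothetical) than the quick estimate, because with a 5 bps gap the filter hardly leaves its ergodic level and every threshold gives nearly trivial rates.

```python
import math
import random

def simulate_and_filter(T, mu0, mu1, sigma, p, q, seed=0):
    """Simulate regimes and returns, running the forward filter alongside."""
    rng = random.Random(seed)
    stationary = (1 - q) / (2 - p - q)
    z = 1 if rng.random() < stationary else 0  # z_0 drawn from the ergodic law
    pi = stationary
    slope, mid = (mu1 - mu0) / sigma**2, 0.5 * (mu0 + mu1)
    regimes, probs = [], []
    for _ in range(T):
        # Regime transition: stay with probability p (regime 1) or q (regime 0).
        if rng.random() >= (p if z == 1 else q):
            z = 1 - z
        a = rng.gauss(mu1 if z == 1 else mu0, sigma)
        # Predict, then update in log-odds form.
        prior = p * pi + (1 - q) * (1 - pi)
        log_odds = math.log(prior / (1 - prior)) + slope * (a - mid)
        pi = 1 / (1 + math.exp(-log_odds))
        regimes.append(z)
        probs.append(pi)
    return regimes, probs

def error_rates(regimes, probs, pi_star):
    """Return (false de-risking rate, false exposure rate) for one threshold."""
    n1 = sum(regimes)
    n0 = len(regimes) - n1
    fd = sum(1 for z, pr in zip(regimes, probs) if z == 1 and pr < pi_star)
    fe = sum(1 for z, pr in zip(regimes, probs) if z == 0 and pr >= pi_star)
    return fd / max(n1, 1), fe / max(n0, 1)

# Hypothetical parameters chosen so the filter actually moves.
regimes, probs = simulate_and_filter(20000, 0.0, 100.0, 100.0, 0.98, 0.95)
for pi_star in (0.2, 0.3, 0.4, 0.5, 0.6):
    fd, fe = error_rates(regimes, probs, pi_star)
    print(f"pi*={pi_star:.1f}  false de-risk={fd:.3f}  false exposure={fe:.3f}")
```

Raising $\pi^{*}$ can only increase the false de-risking rate and decrease the false exposure rate, so the target rate pins down the threshold by a simple scan over the candidates.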

Cost-ratio approach. Let $c_{\text{miss}}$ be the cost of missing one day of alpha and $c_{\text{exposed}}$ be the cost of one day of false exposure. The optimal threshold satisfies:

$\pi^{*} = \frac{c_{\text{exposed}}}{c_{\text{miss}} + c_{\text{exposed}}}$

This is just the classical Bayes decision boundary. If missing alpha is much more costly than false exposure ($c_{\text{miss}} \gg c_{\text{exposed}}$), you set $\pi^{*}$ low and stay in longer. If drawdown risk dominates, you set $\pi^{*}$ high and cut early.
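The boundary comes from comparing expected one-day costs at belief $\pi_t$: de-risking costs $\pi_t \, c_{\text{miss}}$ in expectation, staying exposed costs $(1 - \pi_t) \, c_{\text{exposed}}$, and the two are equal exactly at $\pi^{*}$. The short sketch below checks this numerically; the function names and the cost values are hypothetical.

```python
def bayes_threshold(c_miss, c_exposed):
    """Belief at which de-risking and staying exposed cost the same in expectation."""
    return c_exposed / (c_miss + c_exposed)

def expected_costs(pi_t, c_miss, c_exposed):
    """Return (expected cost of de-risking, expected cost of staying exposed)."""
    return pi_t * c_miss, (1 - pi_t) * c_exposed

# Hypothetical costs: missing a day of alpha is 3x as costly as a day of false exposure.
c_miss, c_exposed = 3.0, 1.0
pi_star = bayes_threshold(c_miss, c_exposed)
print(pi_star, expected_costs(pi_star, c_miss, c_exposed))

# Below the threshold, staying exposed is the more expensive action, so de-risk.
print(expected_costs(0.10, c_miss, c_exposed))
```

This treats each day as an isolated decision. It ignores transaction costs and the persistence of regimes, which is one reason the simulation-based scan is a useful cross-check.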

Practical calibration. In practice, you estimate $p$, $q$, $\mu_0$, $\mu_1$, $\sigma$ from historical data (often via the EM algorithm), then simulate many paths to find the threshold that maximizes risk-adjusted returns (for example, the Sharpe ratio of the filtered strategy). Typical values of $\pi^{*}$ in equity regime models are in the range 0.3-0.5.

Answer: The forward filter is a predict-update recursion: predict via the Markov transition $\pi_{t|t-1} = p \pi_{t-1} + (1-q)(1-\pi_{t-1})$, then update via Bayes' rule using the Gaussian likelihood ratio. Position sizing is tied to $\pi_t$ with a threshold $\pi^{*}$ chosen by balancing the cost of false de-risking against the cost of false exposure, either through simulation-based ROC analysis or the Bayes-optimal decision boundary $\pi^{*} = c_{\text{exposed}} / (c_{\text{miss}} + c_{\text{exposed}})$.

Intuition

This problem is the workhorse of systematic trading: you have a signal that might or might not be active, and you need to decide in real time how much to bet. The HMM filter is elegant because it compresses the entire history of observations into a single sufficient statistic -- the filtered probability $\pi_t$. Every new return shifts this probability via a likelihood ratio that measures how much more consistent the observation is with the alpha regime versus the null. Because daily returns are noisy (signal-to-noise per observation is tiny), the filter relies heavily on the Markov persistence parameters $p$ and $q$ to maintain its beliefs. High persistence means the filter is sluggish but stable; low persistence makes it responsive but noisy.

The threshold decision is where theory meets PnL. Setting it too high means you cut exposure at the first sign of trouble and miss a lot of legitimate alpha. Setting it too low means you ride through regime changes and eat drawdowns. The Bayes-optimal threshold depends on the asymmetry of these costs, which in practice is driven by the ratio of alpha magnitude to drawdown severity. This is the same structure that appears in every signal-following system: how much evidence do you need before you change your position? The answer is never a fixed number -- it depends on what you stand to gain versus what you stand to lose.
