You observe a return series $r_1, r_2, \ldots, r_T$ that you believe is generated by a two-regime switching model. The hidden regime $Z_t \in \{1, 2\}$ follows a Markov chain with a $2 \times 2$ transition matrix $P$, where $P_{ij} = \Pr(Z_{t+1} = j \mid Z_t = i)$. Conditional on the regime $Z_t = i$, the returns follow an AR(1) process:

$$r_t = \mu_i + \rho_i (r_{t-1} - \mu_i) + \sigma_i \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, 1).$$
The parameter set is $\theta = \{\mu_i, \rho_i, \sigma_i, P_{ij}\}$ for $i, j \in \{1, 2\}$, and the regime sequence $Z_1, \ldots, Z_T$ is unobserved.
(1) E-step. Write out the forward-backward recursions that produce the smoothed state probabilities $\gamma_t(i) = \Pr(Z_t = i \mid r_1, \ldots, r_T)$ and the smoothed transition probabilities $\xi_t(i, j) = \Pr(Z_t = i, Z_{t+1} = j \mid r_1, \ldots, r_T)$. Define the forward and backward variables clearly.
(2) M-step. Using $\gamma_t(i)$ and $\xi_t(i, j)$ from part (1), derive closed-form updates for all parameters: $\mu_i$, $\rho_i$, $\sigma_i^2$, and $P_{ij}$.
(3) Practical issues. Discuss the label-switching identification problem (why can you always swap the regime labels and get an identical likelihood?) and propose a reasonable stopping rule for the EM iterations.
Hints
The E-step is the standard HMM forward-backward algorithm -- you just need to figure out what the emission density $b_t(i)$ is for an AR(1) observation.
For the M-step, notice that given the soft regime assignments $\gamma_t(i)$, each regime's AR(1) update is just a weighted least squares regression of $r_t$ on $r_{t-1}$.
For label switching, think about what happens if you swap every occurrence of regime 1 and regime 2 in the model. Why does the likelihood not change? How would you break this symmetry?
Worked Solution
How to Think About It: This is the bread and butter of regime-switching models in finance -- the Hamilton filter extended with a backward pass. The idea is simple: you cannot observe which regime the market is in, but each regime produces returns with different mean, persistence, and volatility. The EM algorithm alternates between (a) figuring out which regime you were probably in at each time step given the current parameter guesses (E-step), and (b) re-estimating the parameters using those regime probabilities as soft weights (M-step). Think of it as iteratively running a weighted regression where the weights themselves depend on the regression output.
Key Insight: The E-step is just the standard HMM forward-backward algorithm, with the emission density being a Gaussian whose mean depends on the previous return (because of the AR(1) structure). The M-step reduces to weighted least squares for the AR(1) parameters and frequency counting for the transition matrix.
---
### Part (1): E-step -- Forward-Backward Recursions
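Emission density. For regime $i$ at time $t \geq 2$, the observation likelihood given $r_{t-1}$ is:

$$b_t(i) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\!\left( -\frac{\bigl(r_t - \mu_i - \rho_i (r_{t-1} - \mu_i)\bigr)^2}{2\sigma_i^2} \right).$$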
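Forward variable. Define $\alpha_t(i) = \Pr(Z_t = i, r_1, \ldots, r_t)$. Initialize with some prior $\pi_i$ (e.g., the stationary distribution of $P$):

$$\alpha_1(i) = \pi_i \, b_1(i).$$

Because $r_0$ is unobserved, $b_1(i)$ needs a convention; the simplest is to condition on $r_1$ and set $b_1(i) = 1$, which matches the M-step sums that start at $t = 2$. Then recurse forward for $t = 2, \ldots, T$:

$$\alpha_t(j) = b_t(j) \sum_{i \in \{1,2\}} \alpha_{t-1}(i) \, P_{ij}.$$

Backward variable. Define $\beta_t(i) = \Pr(r_{t+1}, \ldots, r_T \mid Z_t = i, r_t)$. Initialize $\beta_T(i) = 1$ and recurse backward for $t = T-1, \ldots, 1$:

$$\beta_t(i) = \sum_{j \in \{1,2\}} P_{ij} \, b_{t+1}(j) \, \beta_{t+1}(j).$$

Smoothed probabilities. Combine the two passes:

$$\gamma_t(i) = \frac{\alpha_t(i) \, \beta_t(i)}{\sum_{k} \alpha_t(k) \, \beta_t(k)}, \qquad \xi_t(i, j) = \frac{\alpha_t(i) \, P_{ij} \, b_{t+1}(j) \, \beta_{t+1}(j)}{\sum_{k} \sum_{l} \alpha_t(k) \, P_{kl} \, b_{t+1}(l) \, \beta_{t+1}(l)}.$$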
Implementation note: In practice, work with log-scaled or normalized versions (dividing $\alpha_t$ by $\sum_i \alpha_t(i)$ at each step) to avoid numerical underflow over long series.
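The normalized recursions can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: it assumes the series is conditioned on its first observation (so the first emission is set to 1), and the function names `gaussian_ar1_density` and `forward_backward` are hypothetical names chosen for this example.

```python
import math

def gaussian_ar1_density(r_t, r_prev, mu, rho, sigma):
    """Emission density b_t(i): N(r_t; mu + rho * (r_prev - mu), sigma^2)."""
    mean = mu + rho * (r_prev - mu)
    z = (r_t - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def forward_backward(r, mu, rho, sigma, P, pi):
    """Scaled forward-backward pass for a K-regime AR(1) switching model.

    Assumption: we condition on r[0], so the emission at the first time
    step is set to 1. Returns (gamma, xi, loglik), where xi[t][i][j]
    refers to the transition from index t to index t + 1.
    """
    T, K = len(r), len(pi)
    # Emission table b[t][i]; the first row is 1 by the conditioning convention.
    b = [[1.0] * K]
    for t in range(1, T):
        b.append([gaussian_ar1_density(r[t], r[t - 1], mu[i], rho[i], sigma[i])
                  for i in range(K)])

    # Forward pass: normalize alpha at each step and store the scale factors.
    alpha = [[0.0] * K for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(K):
            if t == 0:
                prior = pi[j]
            else:
                prior = sum(alpha[t - 1][i] * P[i][j] for i in range(K))
            alpha[t][j] = prior * b[t][j]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass: reuse the forward scale factors so products stay O(1).
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(P[i][j] * b[t + 1][j] * beta[t + 1][j]
                             for j in range(K)) / scale[t + 1]

    # Smoothed state and transition probabilities.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(K)] for t in range(T)]
    xi = [[[alpha[t][i] * P[i][j] * b[t + 1][j] * beta[t + 1][j] / scale[t + 1]
            for j in range(K)] for i in range(K)] for t in range(T - 1)]

    # Log-likelihood is the sum of the log normalizing constants.
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik
```

With this scaling, each `gamma[t]` sums to one and each `xi[t]` sums to one over all `(i, j)` pairs, which makes for a cheap sanity check after every E-step.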
---
### Part (2): M-step -- Parameter Updates
The M-step maximizes the expected complete-data log-likelihood using $\gamma_t(i)$ and $\xi_t(i,j)$ as weights. Let $\hat{\gamma}_i = \sum_{t=2}^{T} \gamma_t(i)$ be the effective sample size for regime $i$ (starting from $t=2$ because the AR(1) conditions on $r_{t-1}$).
Transition matrix. Each row is re-estimated by counting expected transitions:

$$\hat{P}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}.$$
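As a small illustration of the expected-count update, here is a sketch in plain Python. It assumes `gamma` and `xi` are nested lists indexed as `gamma[t][i]` and `xi[t][i][j]`, with `xi` having one fewer time step than `gamma`; the function name `update_transition_matrix` is hypothetical.

```python
def update_transition_matrix(gamma, xi):
    """Re-estimate P by expected transition counts.

    P_new[i][j] = sum_t xi[t][i][j] / sum_t gamma[t][i],
    where both sums run over the time steps that have a successor.
    """
    K = len(gamma[0])
    n_steps = len(xi)  # T - 1 transitions
    P_new = []
    for i in range(K):
        expected_visits = sum(gamma[t][i] for t in range(n_steps))
        row = [sum(xi[t][i][j] for t in range(n_steps)) / expected_visits
               for j in range(K)]
        P_new.append(row)
    return P_new
```

Each row sums to one whenever `xi` and `gamma` come from the same E-step, because summing `xi[t][i][j]` over `j` recovers `gamma[t][i]`.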
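AR(1) parameters ($\mu_i$, $\rho_i$). For each regime $i$, the AR(1) model $r_t = \mu_i + \rho_i(r_{t-1} - \mu_i) + \sigma_i \varepsilon_t$ can be rewritten as $r_t = \mu_i(1 - \rho_i) + \rho_i \, r_{t-1} + \sigma_i \varepsilon_t$, which is a linear regression of $r_t$ on $r_{t-1}$ with intercept $c_i = \mu_i(1-\rho_i)$ and slope $\rho_i$. The M-step is weighted least squares with weights $\gamma_t(i)$:

$$\hat{\rho}_i = \frac{\sum_{t=2}^{T} \gamma_t(i) \, (r_t - \bar{r}_i)(r_{t-1} - \bar{r}_i^{\text{lag}})}{\sum_{t=2}^{T} \gamma_t(i) \, (r_{t-1} - \bar{r}_i^{\text{lag}})^2}, \qquad \hat{c}_i = \bar{r}_i - \hat{\rho}_i \, \bar{r}_i^{\text{lag}}, \qquad \hat{\mu}_i = \frac{\hat{c}_i}{1 - \hat{\rho}_i},$$

where $\bar{r}_i = \hat{\gamma}_i^{-1} \sum_{t=2}^{T} \gamma_t(i) \, r_t$ and $\bar{r}_i^{\text{lag}} = \hat{\gamma}_i^{-1} \sum_{t=2}^{T} \gamma_t(i) \, r_{t-1}$ are the $\gamma_t(i)$-weighted means of $r_t$ and $r_{t-1}$ respectively. More cleanly, define the regression vectors $y = (r_2, \ldots, r_T)$ and $x = (r_1, \ldots, r_{T-1})$. Let $W_i = \text{diag}(\gamma_2(i), \ldots, \gamma_T(i))$. Stack the design matrix $X = [\mathbf{1}, x]$. Then:

$$\begin{pmatrix} \hat{c}_i \\ \hat{\rho}_i \end{pmatrix} = (X^\top W_i X)^{-1} X^\top W_i \, y.$$

Variance ($\sigma_i^2$). The update is the weighted mean of the squared regression residuals:

$$\hat{\sigma}_i^2 = \frac{1}{\hat{\gamma}_i} \sum_{t=2}^{T} \gamma_t(i) \, \bigl(r_t - \hat{c}_i - \hat{\rho}_i \, r_{t-1}\bigr)^2.$$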
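The weighted regression for a single regime can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: `gamma_i[t]` is the smoothed probability of the regime at index `t`, the first observation only serves as a lagged regressor, and the function name `update_ar1_regime` is hypothetical. It also assumes the estimated slope is not equal to 1, since the mean is recovered as intercept divided by one minus slope.

```python
def update_ar1_regime(r, gamma_i):
    """Weighted least squares update of (mu, rho, sigma^2) for one regime.

    r        : list of returns r[0], ..., r[T-1]
    gamma_i  : list of smoothed probabilities for this regime, same length
    Only indices 1..T-1 enter the regression, because each r[t] is
    regressed on its lag r[t-1].
    """
    w = gamma_i[1:]   # weights for the dependent observations
    y = r[1:]         # dependent variable r_t
    x = r[:-1]        # lagged regressor r_{t-1}
    n_eff = sum(w)    # effective sample size for this regime

    # Weighted means of the dependent variable and the regressor.
    y_bar = sum(wt * yt for wt, yt in zip(w, y)) / n_eff
    x_bar = sum(wt * xt for wt, xt in zip(w, x)) / n_eff

    # Weighted covariance over weighted variance gives the slope.
    s_xy = sum(wt * (xt - x_bar) * (yt - y_bar) for wt, xt, yt in zip(w, x, y))
    s_xx = sum(wt * (xt - x_bar) ** 2 for wt, xt in zip(w, x))
    rho = s_xy / s_xx
    c = y_bar - rho * x_bar

    # Map the intercept back to the regime mean (requires rho != 1).
    mu = c / (1.0 - rho)

    # Weighted mean of squared residuals.
    sigma2 = sum(wt * (yt - c - rho * xt) ** 2
                 for wt, xt, yt in zip(w, x, y)) / n_eff
    return mu, rho, sigma2
```

Calling this once per regime with that regime's column of smoothed probabilities gives the full set of AR(1) updates for one M-step.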
---
### Part (3): Practical Issues
Label switching. If you relabel the regimes $1 \leftrightarrow 2$ everywhere -- swap $\mu_1 \leftrightarrow \mu_2$, $\rho_1 \leftrightarrow \rho_2$, $\sigma_1 \leftrightarrow \sigma_2$, and permute the rows and columns of $P$ accordingly -- the likelihood is unchanged. The model is symmetric under permutation of regime indices. This means the likelihood surface has (at least) two global maxima that are mirror images. EM will converge to one or the other depending on initialization, but the result is economically identical.
To pin down the labeling, impose an ordering constraint, e.g., $\sigma_1 < \sigma_2$ (regime 1 is the low-volatility regime). Alternatively, $\mu_1 < \mu_2$. This breaks the symmetry without constraining the model.
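One simple way to apply such an ordering is to relabel the fitted parameters after EM has converged. The sketch below assumes the parameters are stored as plain lists and a nested-list transition matrix; the function name `order_regimes_by_sigma` is hypothetical.

```python
def order_regimes_by_sigma(mu, rho, sigma, P):
    """Relabel regimes so that sigma is increasing (regime 0 = low volatility).

    Returns the permuted parameters and the permutation itself, so the
    same reordering can be applied to the columns of gamma if needed.
    """
    order = sorted(range(len(sigma)), key=lambda i: sigma[i])
    mu_o = [mu[i] for i in order]
    rho_o = [rho[i] for i in order]
    sigma_o = [sigma[i] for i in order]
    # Permute both rows and columns of the transition matrix.
    P_o = [[P[i][j] for j in order] for i in order]
    return mu_o, rho_o, sigma_o, P_o, order
```

Relabeling after the fact leaves the likelihood untouched, which is why the ordering identifies the labels without restricting the fit.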
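Stopping rule. Monitor the observed-data log-likelihood:

$$\ell(\theta) = \log \Pr(r_1, \ldots, r_T \mid \theta) = \log \sum_{i \in \{1,2\}} \alpha_T(i)$$

(or equivalently, accumulate the log normalizing constants from the forward pass). EM guarantees $\ell(\theta^{(k+1)}) \geq \ell(\theta^{(k)})$. A standard stopping rule is to stop when

$$\ell(\theta^{(k+1)}) - \ell(\theta^{(k)}) < \epsilon$$

(e.g., $\epsilon = 10^{-6}$), or alternatively when the relative change $|\ell(\theta^{(k+1)}) - \ell(\theta^{(k)})| \, / \, |\ell(\theta^{(k)})|$ drops below the threshold. Also cap the number of iterations (e.g., 500) to avoid spinning in flat regions near saddle points. Running from multiple random initializations and keeping the best final log-likelihood guards against local optima.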
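The stopping logic can be sketched as a small driver loop. This is an assumption-laden illustration: `em_step` is a hypothetical callable that takes the current parameters, runs one E-step and one M-step, and returns the updated parameters together with the log-likelihood evaluated at the parameters it was given. The defaults mirror the tolerance and iteration cap mentioned above.

```python
def run_em(em_step, theta0, tol=1e-6, max_iter=500):
    """Iterate EM until the log-likelihood increment falls below tol.

    em_step(theta) -> (new_theta, loglik_at_theta) is supplied by the caller.
    Returns (theta, history, converged), where history holds the
    log-likelihood recorded at each iteration.
    """
    theta = theta0
    prev_ll = None
    history = []
    for _ in range(max_iter):
        new_theta, ll = em_step(theta)
        history.append(ll)
        # Compare consecutive log-likelihoods; abs() guards against tiny
        # negative increments caused by floating-point noise.
        if prev_ll is not None and abs(ll - prev_ll) < tol:
            return new_theta, history, True
        theta, prev_ll = new_theta, ll
    # Iteration cap reached without meeting the tolerance.
    return theta, history, False
```

For multiple restarts, one option is to call `run_em` from several starting points and keep the run whose last `history` entry is largest. A `history` that ever decreases by more than rounding error usually signals a bug in the E-step or M-step.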
Answer: The E-step uses standard HMM forward-backward recursions with Gaussian AR(1) emission densities to produce $\gamma_t(i)$ and $\xi_t(i,j)$. The M-step updates the transition matrix by expected transition counts and the AR(1) parameters by weighted least squares with weights $\gamma_t(i)$. Label switching is resolved by imposing an ordering constraint (e.g., $\sigma_1 < \sigma_2$), and EM is iterated until the log-likelihood increment falls below a tolerance, with multiple restarts to avoid local optima.
Intuition
Regime-switching models are the workhorse for capturing the empirical fact that financial markets behave very differently in calm vs. stressed periods -- different mean returns, different persistence, and especially different volatility. The EM algorithm is the natural estimation tool because the regime labels are latent: you never observe whether the market is in the "bull" or "bear" state, you only see the returns. The forward-backward pass is the mechanism that propagates information both from the past and from the future to give you the best guess of the regime at each point in time, and the M-step is just "re-run the regression with updated weights."
The practical pitfalls are important for real work. Label switching is not just a theoretical curiosity -- if you run EM from different starting points without an ordering constraint, you will get solutions that look different but are economically identical, which wreaks havoc on averaging across runs or interpreting parameter paths. The stopping rule matters because EM can crawl near flat ridges on the likelihood surface, and blindly iterating to full convergence wastes compute without improving the economic content of the model. In production, multiple restarts with an ordering convention and a sensible tolerance are standard practice.