Stochastic Volatility and Particle Filtering
Consider a stochastic volatility model where log-returns follow $r_t = \sigma_t \varepsilon_t$ with $\varepsilon_t \overset{\text{iid}}{\sim} N(0,1)$, and log-volatility evolves as an AR(1) process: $h_t = \log \sigma_t^2 = \mu + \rho(h_{t-1} - \mu) + \sigma_\eta \xi_t, \quad \xi_t \sim N(0,1)$ with parameters $\theta = (\mu, \rho, \sigma_\eta^2)$ and $|\rho| < 1$.
- Write the model in state-space form (observation equation and state transition equation). Design a bootstrap particle filter to sequentially estimate the filtering distribution $p(h_t \mid r_{1:t})$.
- Using the particle approximation, derive an expression for the one-step-ahead variance forecast $E[\sigma_{t+1}^2 \mid r_{1:t}]$.
- Explain how to compute an approximation to the log-likelihood $\log p(r_{1:T}; \theta)$ using the particle filter output. How would you use this to perform MLE for $\theta$?
Hints
- The observation equation $r_t = e^{h_t/2} \varepsilon_t$ is nonlinear in the state $h_t$, ruling out the Kalman filter. Think about how to represent the posterior $p(h_t \mid r_{1:t})$ with a cloud of weighted samples.
- In the bootstrap filter, use the transition prior as the proposal -- propagate each particle through $h_t^{(i)} \sim N(\mu + \rho(h_{t-1}^{(i)} - \mu), \sigma_\eta^2)$, then weight by $p(r_t \mid h_t^{(i)}) = N(r_t; 0, e^{h_t^{(i)}})$.
- The log-likelihood is $\sum_t \log p(r_t \mid r_{1:t-1})$. Each term equals the log of the average unnormalized weight at step $t$: $\log\left(\frac{1}{M}\sum_i w_t^{(i)}\right)$.
Worked Solution
How to Think About It: This is a Hidden Markov Model where the latent state $h_t$ (log-volatility) drives the observable returns. You never see $h_t$ directly -- you infer it from the noisy signal $r_t$. Kalman filtering would work if both equations were linear-Gaussian, but the observation equation $r_t = e^{h_t/2} \varepsilon_t$ is nonlinear (the variance is $e^{h_t}$, which is a nonlinear function of $h_t$). This is why we need a particle filter -- it approximates the posterior $p(h_t \mid r_{1:t})$ with a weighted cloud of samples rather than a Gaussian.
Key Insight: The log-likelihood factors as $\log p(r_{1:T}) = \sum_{t=1}^{T} \log p(r_t \mid r_{1:t-1})$, and the particle filter naturally estimates each predictive density term $p(r_t \mid r_{1:t-1})$ as a byproduct of the resampling step.
Part 1 -- State-Space Form and Bootstrap Particle Filter:
State-space form: - Observation equation: $r_t \mid h_t \sim N(0, e^{h_t})$ - State transition: $h_t \mid h_{t-1} \sim N(\mu + \rho(h_{t-1} - \mu),\ \sigma_\eta^2)$ - Initial state: $h_0 \sim N(\mu,\ \sigma_\eta^2 / (1 - \rho^2))$ (stationary distribution)
Bootstrap particle filter with $M$ particles $\{h_t^{(i)}\}_{i=1}^{M}$:
- Initialize ($t = 0$): Draw $h_0^{(i)} \overset{\text{iid}}{\sim} N(\mu,\ \sigma_\eta^2 / (1-\rho^2))$ for $i = 1, \ldots, M$.
2. Propagate (predict step): For each particle, draw from the transition: $\tilde{h}_t^{(i)} \sim N\!\left(\mu + \rho(h_{t-1}^{(i)} - \mu),\ \sigma_\eta^2\right)$
3. Weight (update step): Compute unnormalized weights using the observation likelihood: $w_t^{(i)} \propto p(r_t \mid \tilde{h}_t^{(i)}) = \frac{1}{\sqrt{2\pi e^{\tilde{h}_t^{(i)}}}} \exp\!\left(-\frac{r_t^2}{2 e^{\tilde{h}_t^{(i)}}}\right)$ Normalize: $\bar{w}_t^{(i)} = w_t^{(i)} / \sum_j w_t^{(j)}$.
- Resample: Draw $M$ indices with replacement from $\{1,\ldots,M\}$ with probabilities $\{\bar{w}_t^{(i)}\}$. Set $h_t^{(i)}$ to the selected particles. (Systematic resampling is preferred for lower variance.)
- Repeat from step 2 at time $t+1$.
The particle approximation to the filtering distribution is $p(h_t \mid r_{1:t}) \approx \frac{1}{M} \sum_{i=1}^{M} \delta(h_t - h_t^{(i)})$.
Part 2 -- One-Step-Ahead Variance Forecast:
The forecast $E[\sigma_{t+1}^2 \mid r_{1:t}] = E[e^{h_{t+1}} \mid r_{1:t}]$. Using the law of total expectation and the transition: $E[e^{h_{t+1}} \mid r_{1:t}] = E\!\left[E[e^{h_{t+1}} \mid h_t] \,\middle|\, r_{1:t}\right]$
Since $h_{t+1} \mid h_t \sim N(\mu + \rho(h_t - \mu),\ \sigma_\eta^2)$, the inner expectation is a log-normal moment: $E[e^{h_{t+1}} \mid h_t] = \exp\!\left(\mu + \rho(h_t - \mu) + \frac{\sigma_\eta^2}{2}\right)$
Plugging in the particle approximation: $E[\sigma_{t+1}^2 \mid r_{1:t}] \approx \frac{1}{M} \sum_{i=1}^{M} \exp\!\left(\mu + \rho(h_t^{(i)} - \mu) + \frac{\sigma_\eta^2}{2}\right)$
Part 3 -- Log-Likelihood via Particle Filtering:
The log-likelihood decomposes as: $\log p(r_{1:T}; \theta) = \sum_{t=1}^{T} \log p(r_t \mid r_{1:t-1}; \theta)$
Each predictive factor is estimated by the average weight before normalization at step $t$: $p(r_t \mid r_{1:t-1}; \theta) \approx \frac{1}{M} \sum_{i=1}^{M} w_t^{(i)}$
where $w_t^{(i)} = p(r_t \mid \tilde{h}_t^{(i)})$ are the unnormalized weights. The particle log-likelihood estimate is: $\widehat{\log p}(r_{1:T}; \theta) = \sum_{t=1}^{T} \log\!\left(\frac{1}{M} \sum_{i=1}^{M} w_t^{(i)}\right)$
For MLE, treat this as an objective function in $\theta = (\mu, \rho, \sigma_\eta^2)$ and maximize numerically. Because the estimate is noisy (different across runs), use iterated filtering (IF2) or particle MCMC rather than vanilla gradient methods. With $M = 1000$--$5000$ particles, the log-likelihood estimate is reasonably smooth and gradient-free optimizers (Nelder-Mead, differential evolution) work well.
Answer: - Particle filter: propagate from prior, weight by observation likelihood $N(0, e^{h_t})$, resample. - Forecast: $E[\sigma_{t+1}^2 \mid r_{1:t}] \approx \frac{1}{M} \sum_i \exp(\mu + \rho(h_t^{(i)} - \mu) + \sigma_\eta^2/2)$. - Log-likelihood: sum of log average weights; maximize over $\theta$ using derivative-free optimization.
Intuition
Stochastic volatility models sit at the intersection of two core quant challenges: the state is latent (you never observe volatility directly), and the model is nonlinear (variance is an exponential function of the log-volatility state). The particle filter handles both by brute-force Monte Carlo -- it maintains a population of plausible volatility trajectories and reweights them every time a new return arrives. High-volatility particles get upweighted after large returns, low-volatility particles get upweighted after quiet days.
The marginal likelihood trick -- estimating $\log p(r_{1:T})$ as a sum of log-mean-weights -- is one of the most practically useful ideas in sequential Monte Carlo. It turns an intractable integral into an average, and makes MLE feasible even for models where the likelihood has no closed form. The catch is variance: with too few particles, the log-likelihood estimate is noisy and optimization becomes unreliable. In practice, variance reduction (stratified resampling, auxiliary particle filters) and robust optimization (particle MCMC, IF2) are essential for production-quality estimation.