Filtrations and the Central Limit Theorem
This is a two-part conceptual question:
- What is a filtration in probability theory? Give the formal definition and explain the intuition. How does it relate to martingales and adapted processes?
- What are the assumptions of the Central Limit Theorem? State the classical (Lindeberg-Levy) version precisely, including the convergence statement. What changes in the Lindeberg-Feller generalization?
Hints
- For filtrations, think about what information you have at each point in time. The sigma-algebra at time $t$ encodes everything you can observe up to and including time $t$.
- For the CLT, there are three key assumptions. Which one does the Lindeberg-Feller version drop, and what replaces it?
- A martingale is defined relative to a filtration: $E[X_t \mid \mathcal{F}_s] = X_s$ for $s \leq t$. The classical CLT requires i.i.d. variables with finite variance; the Lindeberg condition handles the non-identical case.
Worked Solution
How to Think About It: These are two foundational concepts that come up in almost every quant interview at some level. For filtrations, the key is the idea of "information growing over time" -- think of it as watching a stock price tick by tick, where at each moment you know everything that has happened so far but nothing about the future. For the CLT, the key is understanding exactly which assumptions you need and what breaks when you relax them.
Key Insight: A filtration formalizes the concept of "what you know at time $t$," and it is the backbone of stochastic calculus and options pricing. The CLT tells you when sums of random variables look Gaussian, and knowing its assumptions tells you when you can and cannot rely on normal approximations.
The Method:
Part 1: Filtration
A filtration on a probability space $(\Omega, \mathcal{F}, P)$ is a family of sigma-algebras $\{\mathcal{F}_t\}_{t \geq 0}$ satisfying:
$\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F} \quad \text{for all } s \leq t$
Intuition: $\mathcal{F}_t$ is the collection of all events whose occurrence or non-occurrence you can determine at time $t$. As time progresses, you learn more, so the sigma-algebra grows. You never "forget" information.
Key related concepts:
- Adapted process: A stochastic process $X_t$ is adapted to $\{\mathcal{F}_t\}$ if $X_t$ is $\mathcal{F}_t$-measurable for every $t$. In plain language: the value of $X_t$ is knowable at time $t$.
- Natural filtration: $\mathcal{F}_t = \sigma(X_s : s \leq t)$ -- the smallest sigma-algebra that makes all past values of $X$ measurable. This is the "no extra information" filtration.
- Martingale: An adapted, integrable process ($E|X_t| < \infty$ for every $t$) satisfying $E[X_t \mid \mathcal{F}_s] = X_s$ for all $s \leq t$. The best forecast of the future value, given current information, is the current value.
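A minimal sketch of these definitions on a finite sample space, assuming three fair coin tosses and a simple $\pm 1$ random walk (the names `atoms`, `walk`, `is_measurable`, and `cond_exp` are illustrative, not from any library). Each $\mathcal{F}_t$ is represented by its atoms: the groups of outcomes that cannot be told apart after $t$ tosses. A function is $\mathcal{F}_t$-measurable exactly when it is constant on each atom, and the conditional expectation is the average over the atom.

```python
from fractions import Fraction
from itertools import product

N_TOSSES = 3
OMEGA = list(product("HT", repeat=N_TOSSES))  # 8 equally likely outcomes


def atoms(t):
    """Atoms of F_t: outcomes grouped by their first t tosses."""
    groups = {}
    for w in OMEGA:
        groups.setdefault(w[:t], []).append(w)
    return list(groups.values())


def walk(t, w):
    """S_t(w): +1 per head and -1 per tail among the first t tosses."""
    return sum(1 if c == "H" else -1 for c in w[:t])


def S(t):
    """The random variable S_t as a function of the outcome w."""
    return lambda w: walk(t, w)


def is_measurable(f, t):
    """f is F_t-measurable iff it is constant on every atom of F_t."""
    return all(len({f(w) for w in atom}) == 1 for atom in atoms(t))


def cond_exp(f, t, w):
    """E[f | F_t](w): exact average of f over the atom of F_t containing w."""
    atom = next(a for a in atoms(t) if w in a)
    return Fraction(sum(f(v) for v in atom), len(atom))


# Information grows: the partition gets finer as t increases.
atom_counts = [len(atoms(t)) for t in range(N_TOSSES + 1)]

# Adapted: S_t is F_t-measurable for every t.
adapted = all(is_measurable(S(t), t) for t in range(N_TOSSES + 1))

# S_1 is NOT F_0-measurable: it cannot be known before the first toss.
known_too_early = is_measurable(S(1), 0)

# Martingale: E[S_t | F_s] = S_s for all s <= t and every outcome.
martingale = all(
    cond_exp(S(t), s, w) == walk(s, w)
    for t in range(N_TOSSES + 1)
    for s in range(t + 1)
    for w in OMEGA
)

print(atom_counts, adapted, known_too_early, martingale)
```

Representing a sigma-algebra by its atoms only works on a finite sample space; in continuous time the same ideas hold but cannot be enumerated this way.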
Part 2: Central Limit Theorem
Classical (Lindeberg-Levy) CLT:
Let $X_1, X_2, \dots$ be i.i.d. random variables with mean $\mu$ and finite, nonzero variance $0 < \sigma^2 < \infty$. Then:
$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty$
where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$.
Three required assumptions:
1. Independence of the $X_i$
2. Identical distribution (every $X_i$ has the same law, which in particular gives a common mean and variance)
3. Finite variance ($\sigma^2 < \infty$)
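The convergence statement can be checked numerically. The sketch below assumes Exponential(1) variables (so $\mu = \sigma = 1$), a sample size of 200, and 2000 replications; these choices and the helper name `standardized_means` are illustrative only. It standardizes each sample mean exactly as in the theorem and compares a few summary statistics with their $N(0,1)$ counterparts.

```python
import math
import random
import statistics


def standardized_means(n, reps, rng):
    """Return `reps` values of (Xbar_n - mu) / (sigma / sqrt(n)) for Exponential(1) data."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    out = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append((xbar - mu) / (sigma / math.sqrt(n)))
    return out


rng = random.Random(0)  # fixed seed so the run is reproducible
z = standardized_means(n=200, reps=2000, rng=rng)

z_mean = statistics.fmean(z)   # N(0,1) has mean 0
z_sd = statistics.stdev(z)     # N(0,1) has standard deviation 1
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)  # N(0,1) gives about 0.95

print(f"mean={z_mean:.3f}  sd={z_sd:.3f}  P(|Z|<=1.96)={coverage:.3f}")
```

The exponential distribution is deliberately skewed: the individual draws look nothing like a Gaussian, yet the standardized mean does. A simulation like this illustrates the theorem but does not prove it.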
Lindeberg-Feller generalization: Drops the "identically distributed" requirement. For independent (not necessarily identically distributed) $X_i$ with $E[X_i] = \mu_i$ and finite variances $\text{Var}(X_i) = \sigma_i^2$, let $s_n^2 = \sum_{i=1}^n \sigma_i^2$. Then $\frac{1}{s_n} \sum_{i=1}^n (X_i - \mu_i) \xrightarrow{d} N(0, 1)$ whenever the Lindeberg condition is satisfied: for every $\epsilon > 0$,
$\frac{1}{s_n^2} \sum_{i=1}^n E\left[(X_i - \mu_i)^2 \, \mathbf{1}\{|X_i - \mu_i| > \epsilon \, s_n\}\right] \to 0 \quad \text{as } n \to \infty$
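The condition forces each individual variance to be negligible relative to the total ($\max_{i \leq n} \sigma_i^2 / s_n^2 \to 0$), so no single variable dominates the sum.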
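To see the Lindeberg condition at work, here is a small sketch under a simplifying assumption: each $X_i$ equals $\pm c_i$ with probability $1/2$, so $\mu_i = 0$, $\sigma_i^2 = c_i^2$, and $|X_i| = c_i$ exactly. The truncated second moment is then $c_i^2$ if $c_i > \epsilon s_n$ and $0$ otherwise, which makes the Lindeberg ratio computable in closed form. The function name `lindeberg_ratio` and the two scale sequences are illustrative choices.

```python
import math


def lindeberg_ratio(scales, eps):
    """Lindeberg ratio for independent X_i = +/- c_i, each sign with probability 1/2.

    Since |X_i| = c_i, E[X_i^2 * 1{|X_i| > eps * s_n}] is c_i^2 when
    c_i > eps * s_n and 0 otherwise.
    """
    s_n_sq = sum(c * c for c in scales)
    s_n = math.sqrt(s_n_sq)
    tail = sum(c * c for c in scales if c > eps * s_n)
    return tail / s_n_sq


EPS = 0.5

# Equal scales c_i = 1: once s_n = sqrt(n) >= 1 / EPS, no term exceeds the cutoff.
equal = {n: lindeberg_ratio([1.0] * n, EPS) for n in (1, 4, 100)}

# Geometric scales c_i = 2^i: the last term alone carries about 3/4 of s_n^2.
geometric = {
    n: lindeberg_ratio([2.0 ** i for i in range(1, n + 1)], EPS) for n in (5, 10, 30)
}

print("equal scales:    ", equal)
print("geometric scales:", geometric)
```

With equal scales the ratio drops to zero, so the condition holds. With geometric scales it stays near $3/4$ for this $\epsilon$, so the condition fails and the theorem gives no guarantee; in this particular example the standardized sum is bounded, so it cannot converge to a Gaussian.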
Practical Considerations:
- Filtrations are essential for pricing derivatives: the price of an option at time $t$ must be $\mathcal{F}_t$-measurable (you can only use information available now).
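- The CLT justifies the widespread use of normal approximations in risk management, but it requires finite variance. For heavy-tailed data with infinite variance (e.g., Cauchy), the classical CLT does not apply: the sample mean of i.i.d. Cauchy variables is itself Cauchy, and more generally the limits of normalized sums, when they exist, are non-Gaussian stable distributions.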
- In finite samples, the CLT is an approximation. Berry-Esseen bounds quantify the approximation error: the largest gap between the CDF of the standardized mean and the standard normal CDF is $O(1/\sqrt{n})$ for i.i.d. variables with finite third absolute moment.
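The $O(1/\sqrt{n})$ rate in the Berry-Esseen bound can be observed exactly in a simple case. The sketch below assumes fair coin flips, for which $E|X - \mu|^3 / \sigma^3 = 1$, so the bound reads $\sup_x |F_n(x) - \Phi(x)| \leq C / \sqrt{n}$ for an absolute constant $C$ (known to be below 1). It computes the left-hand side exactly from the binomial distribution; the function names are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance_fair_coin(n):
    """sup_x |P(Z_n <= x) - Phi(x)| where Z_n = (S_n - n/2) / (sqrt(n)/2), S_n ~ Bin(n, 1/2).

    The CDF of Z_n is a step function, so the supremum is attained at a jump:
    both the left limit and the value at each jump point are checked.
    """
    mean, sd = n / 2.0, math.sqrt(n) / 2.0
    total = 2 ** n
    cum = 0  # exact integer count of outcomes with S_n < k (updated to <= k below)
    worst = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - mean) / sd)
        left = cum / total        # P(S_n < k)
        cum += math.comb(n, k)
        right = cum / total       # P(S_n <= k)
        worst = max(worst, abs(left - phi), abs(right - phi))
    return worst


# If the error is of order 1/sqrt(n), then sqrt(n) * distance should stay bounded.
scaled = {n: math.sqrt(n) * kolmogorov_distance_fair_coin(n) for n in (10, 100, 1000)}
print(scaled)
```

For these values of $n$ the scaled distance stays roughly constant at about 0.4, which is consistent with the $1/\sqrt{n}$ rate being attained rather than merely an upper bound for lattice distributions such as the binomial.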
Answer: A filtration is an increasing family of sigma-algebras representing growing information over time; it underpins the definitions of adapted processes and martingales. The CLT requires independent, identically distributed random variables with finite variance, and states that the standardized sample mean converges in distribution to $N(0,1)$. The Lindeberg-Feller version relaxes identical distribution but imposes the Lindeberg condition to prevent any single variable from dominating.
Intuition
Filtrations and the CLT are two pillars of quantitative finance that play complementary roles. Filtrations formalize the flow of information -- every pricing formula, every conditional expectation, every hedging strategy is defined relative to a filtration. When a trader says "based on what we know now," they are implicitly invoking $\mathcal{F}_t$-measurability. The CLT, meanwhile, is a main reason normal distributions show up so often in finance: aggregate many small, independent risks and the total looks approximately Gaussian. But knowing the assumptions is critical -- when independence or finite variance breaks down (correlated risks, fat tails), the normal approximation can fail badly and understate tail risk, which is a recurring feature of periods of financial stress.
The common mistake is treating the CLT as a universal truth rather than a theorem with specific conditions. In an interview, stating the assumptions precisely -- and knowing what goes wrong without them -- signals real understanding.