Hawkes Process Basics

Stochastic Processes · Hard · Free problem

Buy orders arrive according to a Hawkes process with baseline intensity $\mu > 0$ and exponential kernel $\phi(t) = \alpha e^{-\beta t}$ where $\alpha, \beta > 0$.

The conditional intensity at time $t$ is:

$\lambda(t) = \mu + \sum_{t_i < t} \alpha e^{-\beta(t - t_i)}$

Each event bumps the intensity by $\alpha$, and the excitement decays exponentially at rate $\beta$.
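As a minimal sketch of the formula above, the intensity can be evaluated directly from a list of past event times. The function name `hawkes_intensity`, the sample event times, and the parameter values are illustrative assumptions, not part of the problem statement.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity lambda(t) for an exponential-kernel Hawkes process.

    Only events strictly before t contribute, matching the sum over t_i < t.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return mu + excitation

# Illustrative values (assumed): mu = 2, alpha = 0.3, beta = 0.5
events = [1.0, 1.5, 4.0]
lam_before = hawkes_intensity(1.0, events, 2.0, 0.3, 0.5)         # no event strictly before t = 1.0
lam_after = hawkes_intensity(1.0 + 1e-9, events, 2.0, 0.3, 0.5)   # just after the first event
print(lam_before, lam_after)
```

The intensity sits at the baseline until the first event, then jumps by roughly $\alpha$ immediately afterwards.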

  1. State the stability condition and compute the branching ratio.
  2. Compute the stationary mean intensity $\bar{\lambda}$.
  3. Write the log-likelihood for observed event times $\{t_1, t_2, \ldots, t_n\}$ on the interval $[0, T]$.
  4. An immigrant arrives at time $0$. Compute the expected total number of descendants (children, grandchildren, etc.) by time $T$.

Hints

  1. The branching ratio is the expected number of direct children per event -- integrate the kernel over the positive half-line. The process is stable when this ratio is less than 1.
  2. For the stationary mean intensity, write a self-consistency equation: the mean rate equals the baseline plus the mean rate times the branching ratio.
  3. For part (iv), set up a renewal equation: the expected descendants equal the expected children plus the expected descendants of each child in the remaining time window. Solve via Laplace transform.

Worked Solution

How to Think About It: A Hawkes process is a self-exciting point process -- each event temporarily raises the probability of further events, then the excitement fades. Think of it like order flow: a buy comes in, it triggers more buys (momentum, stop-losses, copycat algos), but the effect decays. The key parameter is the branching ratio $n = \alpha / \beta$, which is the expected number of direct children per event. If $n > 1$, cascades grow on average and the expected intensity increases without bound; at $n = 1$ the process is critical and no stationary regime with finite mean intensity exists. If $n < 1$, each cascade dies out and the process admits a stationary version. This is the same math as a sub-critical Galton-Watson branching process.

Quick Estimate: If $\alpha = 0.3$ and $\beta = 0.5$, the branching ratio is $n = 0.6$. With baseline $\mu = 2$, the stationary intensity is $\mu / (1 - n) = 2 / 0.4 = 5$ events per unit time. So self-excitation multiplies the base rate by $1/(1-n) = 2.5\times$. The expected descendants of one immigrant over a long horizon converge to $n/(1-n) = 1.5$.

---

(i) Stability Condition and Branching Ratio

The branching ratio is the expected number of direct offspring (first-generation children) triggered by a single event. An event at time $s$ contributes intensity $\alpha e^{-\beta(t-s)}$ for $t > s$. The expected number of children is:

$n = \int_0^{\infty} \alpha e^{-\beta u} \, du = \frac{\alpha}{\beta}$

Stability condition: $n < 1$, i.e.,

$\frac{\alpha}{\beta} < 1 \quad \Longleftrightarrow \quad \alpha < \beta$

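When $n < 1$, each generation of offspring is smaller on average than the one before, so cascades die out and the process has a well-defined stationary distribution. When $n > 1$, cascades grow on average without bound; at the critical value $n = 1$ the expected cascade size is infinite, so no finite stationary mean intensity exists.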
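The kernel integral can be checked numerically. The sketch below compares the closed form $\alpha/\beta$ with a midpoint-rule approximation; the helper names, the truncation point `upper`, and the step count are assumptions chosen for illustration.

```python
import math

def branching_ratio(alpha, beta):
    """Closed-form integral of the kernel alpha * exp(-beta * u) over [0, inf)."""
    return alpha / beta

def branching_ratio_numeric(alpha, beta, upper=200.0, steps=200_000):
    """Midpoint-rule approximation of the same integral, truncated at `upper`."""
    h = upper / steps
    return sum(alpha * math.exp(-beta * (k + 0.5) * h) for k in range(steps)) * h

alpha, beta = 0.3, 0.5            # values from the quick estimate
n = branching_ratio(alpha, beta)
n_numeric = branching_ratio_numeric(alpha, beta)
is_stable = n < 1.0               # stability condition alpha < beta
print(n, n_numeric, is_stable)
```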

---

(ii) Stationary Mean Intensity

In stationarity, the mean intensity $\bar{\lambda}$ satisfies a self-consistency equation. The total rate equals the baseline plus the contribution from all past events:

$\bar{\lambda} = \mu + \bar{\lambda} \int_0^{\infty} \alpha e^{-\beta u} \, du = \mu + \bar{\lambda} \cdot \frac{\alpha}{\beta}$

Solving:

$\bar{\lambda} = \mu + \bar{\lambda} \cdot n \quad \Rightarrow \quad \bar{\lambda}(1 - n) = \mu$

$\boxed{\bar{\lambda} = \frac{\mu}{1 - \alpha/\beta} = \frac{\mu \beta}{\beta - \alpha}}$

The factor $1/(1 - n)$ is the self-excitation multiplier. As $\alpha \to \beta$, the process becomes critical and the mean intensity diverges.
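One way to sanity-check the stationary mean is to simulate the process and compare the empirical event rate with $\mu/(1 - n)$. The source does not prescribe a simulation method; the sketch below uses Ogata-style thinning as one standard choice, with an assumed horizon and seed. Because the simulation starts with an empty history, the empirical rate is slightly below the stationary value on a finite horizon.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] by thinning.

    `excite` tracks sum_i alpha * exp(-beta * (t - t_i)). Between events it only
    decays, so the current intensity bounds the intensity until the next event.
    """
    rng = random.Random(seed)
    t, excite, events = 0.0, 0.0, []
    while True:
        lam_bar = mu + excite                  # upper bound on the intensity
        w = rng.expovariate(lam_bar)           # waiting time to the next candidate
        t += w
        if t > horizon:
            return events
        excite *= math.exp(-beta * w)          # decay excitation to the candidate time
        if rng.random() * lam_bar <= mu + excite:   # accept with prob lambda(t) / lam_bar
            events.append(t)
            excite += alpha                    # accepted event bumps the intensity by alpha

mu, alpha, beta, horizon = 2.0, 0.3, 0.5, 5000.0   # quick-estimate values, assumed horizon
events = simulate_hawkes(mu, alpha, beta, horizon)
empirical_rate = len(events) / horizon
theoretical_rate = mu / (1.0 - alpha / beta)
print(empirical_rate, theoretical_rate)
```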

---

(iii) Log-Likelihood

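For a point process on $[0, T]$ with conditional intensity $\lambda(t)$, the log-likelihood is given below. In this part, $n$ in the summation limits denotes the number of observed events, as in the problem statement, not the branching ratio $\alpha/\beta$.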

$\ell = \sum_{i=1}^{n} \ln \lambda(t_i) - \int_0^T \lambda(t) \, dt$

First, the conditional intensity at each event time:

$\lambda(t_i) = \mu + \sum_{j < i} \alpha e^{-\beta(t_i - t_j)}$

Second, the compensator (integrated intensity). Split the integral:

$\int_0^T \lambda(t) \, dt = \mu T + \sum_{i=1}^{n} \int_{t_i}^{T} \alpha e^{-\beta(t - t_i)} \, dt$

Each kernel integral evaluates to:

$\int_{t_i}^{T} \alpha e^{-\beta(t - t_i)} \, dt = \frac{\alpha}{\beta}\left(1 - e^{-\beta(T - t_i)}\right)$

Putting it together:

$\boxed{\ell = \sum_{i=1}^{n} \ln\!\left(\mu + \sum_{j < i} \alpha e^{-\beta(t_i - t_j)}\right) - \mu T - \frac{\alpha}{\beta} \sum_{i=1}^{n} \left(1 - e^{-\beta(T - t_i)}\right)}$

Note: the double sum in the first term can be computed in $O(n)$ time using a recursive update. Define $R_i = \sum_{j < i} e^{-\beta(t_i - t_j)}$. Then $R_1 = 0$ and $R_i = e^{-\beta(t_i - t_{i-1})}(R_{i-1} + 1)$ for $i \geq 2$. This makes MLE fitting practical even for large datasets.
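The recursion can be sketched as follows, with a direct $O(n^2)$ evaluation kept alongside purely as a cross-check. The function names and the sample event times are made up for illustration; the input is assumed to be sorted and to lie in $[0, T]$.

```python
import math

def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood of sorted event times on [0, T] via the O(n) recursion."""
    loglik = -mu * T
    r = 0.0        # R_i = sum_{j<i} exp(-beta * (t_i - t_j)), with R_1 = 0
    prev = None
    for t in times:
        if prev is not None:
            r = math.exp(-beta * (t - prev)) * (r + 1.0)
        loglik += math.log(mu + alpha * r)                              # ln lambda(t_i)
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (T - t)))    # compensator term
        prev = t
    return loglik

def hawkes_loglik_naive(times, T, mu, alpha, beta):
    """Direct O(n^2) evaluation of the same formula, used only as a cross-check."""
    total = -mu * T
    for i, ti in enumerate(times):
        lam = mu + sum(alpha * math.exp(-beta * (ti - tj)) for tj in times[:i])
        total += math.log(lam) - (alpha / beta) * (1.0 - math.exp(-beta * (T - ti)))
    return total

times = [0.5, 1.2, 1.3, 4.0, 4.1, 4.15]   # made-up event times for illustration
fast = hawkes_loglik(times, 5.0, 2.0, 0.3, 0.5)
slow = hawkes_loglik_naive(times, 5.0, 2.0, 0.3, 0.5)
print(fast, slow)
```

In a fitting routine, a function like `hawkes_loglik` would be negated and passed to a numerical optimizer over $(\mu, \alpha, \beta)$.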

---

(iv) Expected Descendants by Time $T$

Let $D(T)$ be the expected total number of descendants (all generations) of an immigrant at time $0$, counted up to time $T$.

First generation (direct children): The immigrant triggers offspring at rate $\alpha e^{-\beta t}$. The expected number of direct children by time $T$ is:

$m_1(T) = \int_0^{T} \alpha e^{-\beta t} \, dt = \frac{\alpha}{\beta}\left(1 - e^{-\beta T}\right) = n\left(1 - e^{-\beta T}\right)$

All generations: A child born at time $s$ itself triggers a sub-cascade with the same statistics, but the remaining observation window is $T - s$. By the branching property, $D(T)$ satisfies the renewal equation:

$D(T) = m_1(T) + \int_0^{T} \alpha e^{-\beta s} \, D(T - s) \, ds$

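This is solved via Laplace transforms. Let $\hat{D}(z) = \int_0^{\infty} e^{-zt} D(t) \, dt$ and $\hat{\phi}(z) = \int_0^{\infty} e^{-zt} \phi(t) \, dt = \alpha/(z + \beta)$. Because $m_1(T) = \int_0^T \phi(t) \, dt$, its transform is $\hat{\phi}(z)/z$, and the convolution transforms to the product $\hat{\phi}(z)\hat{D}(z)$. The renewal equation therefore becomes $\hat{D}(z) = \hat{\phi}(z)/z + \hat{\phi}(z)\hat{D}(z)$, which gives: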

$\hat{D}(z) = \frac{\hat{\phi}(z)}{z(1 - \hat{\phi}(z))} = \frac{\alpha}{z(z + \beta - \alpha)}$

Partial fractions:

$\hat{D}(z) = \frac{\alpha}{\beta - \alpha}\left(\frac{1}{z} - \frac{1}{z + \beta - \alpha}\right)$

Inverting:

$\boxed{D(T) = \frac{\alpha}{\beta - \alpha}\left(1 - e^{-(\beta - \alpha)T}\right) = \frac{n}{1 - n}\left(1 - e^{-(\beta - \alpha)T}\right)}$

Sanity checks:

  - As $T \to \infty$: $D(\infty) = n/(1 - n)$, the total progeny of a sub-critical branching process.
  - As $T \to 0$: $D(T) \approx \alpha T$, just the first-generation children in a short window.
  - As $\alpha \to 0$ (no excitation): $D(T) \to 0$.
  - The rate of convergence is governed by $\beta - \alpha$: the closer to criticality, the longer the cascade takes to die out.
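The closed form can also be checked against a direct numerical solution of the renewal equation. The sketch below discretizes the convolution with the trapezoid rule on a uniform grid; the function names, grid size, and test values are assumptions for illustration only.

```python
import math

def expected_descendants(T, alpha, beta):
    """Closed form D(T) = alpha/(beta-alpha) * (1 - exp(-(beta-alpha)*T)), for alpha < beta."""
    return alpha / (beta - alpha) * (1.0 - math.exp(-(beta - alpha) * T))

def expected_descendants_numeric(T, alpha, beta, steps=1000):
    """Solve D(t) = m1(t) + int_0^t phi(s) D(t - s) ds on a grid (trapezoid rule)."""
    h = T / steps
    phi = [alpha * math.exp(-beta * k * h) for k in range(steps + 1)]
    D = [0.0] * (steps + 1)                      # D(0) = 0
    for k in range(1, steps + 1):
        m1 = (alpha / beta) * (1.0 - math.exp(-beta * k * h))
        # Interior trapezoid nodes; the s = t endpoint vanishes because D(0) = 0.
        conv = h * sum(phi[j] * D[k - j] for j in range(1, k))
        # The s = 0 endpoint contains the unknown D[k] with weight h/2, so solve for it.
        D[k] = (m1 + conv) / (1.0 - 0.5 * h * phi[0])
    return D[steps]

alpha, beta, T = 0.3, 0.5, 10.0   # quick-estimate values with an assumed window
closed = expected_descendants(T, alpha, beta)
numeric = expected_descendants_numeric(T, alpha, beta)
print(closed, numeric)
```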

Answer: The stability condition is $\alpha / \beta < 1$. The branching ratio is $n = \alpha / \beta$. The stationary mean intensity is $\bar{\lambda} = \mu\beta/(\beta - \alpha)$. The log-likelihood is the standard point-process formula with the exponential kernel admitting an $O(n)$ recursive computation. The expected descendants by time $T$ are $D(T) = \frac{n}{1-n}(1 - e^{-(\beta - \alpha)T})$.

Intuition

The Hawkes process is the workhorse model for self-exciting event data in finance -- order arrivals, trade clustering, default contagion. The key insight is the branching interpretation: each event is either an "immigrant" (arriving at baseline rate $\mu$) or an "offspring" triggered by a previous event. This makes the Hawkes process equivalent to a Poisson cluster process, where immigrant arrivals are Poisson($\mu$) and each event spawns children according to a Galton-Watson branching tree with mean offspring $n = \alpha/\beta$. The stability condition $n < 1$ is exactly the sub-criticality condition from branching process theory.

In practice, the branching ratio $n$ tells you how much of the observed activity is endogenous (self-generated feedback) versus exogenous (genuine new information). Empirical estimates for equity order flow often give $n \approx 0.6$ to $0.8$, meaning the majority of events are triggered by other events, not by new information. This has deep implications for market microstructure: most of what looks like "activity" is just the market talking to itself. The exponential kernel is the simplest choice and makes everything analytically tractable (the intensity is a Markov process), but real data often show power-law decay, which requires more careful modeling.
