Kelly Criterion and Execution Cost Models
You're a PM building a systematic equity strategy. Walk through the following:
- What is the Kelly criterion, and how does it apply to portfolio construction? Start with the single-bet case, then extend to a multi-asset portfolio.
- In practice, why do most quant funds use "fractional Kelly" rather than full Kelly?
- Now suppose you need to actually execute trades. Describe the main models for slippage and market impact -- specifically the linear impact model, the square-root model, and the Almgren-Chriss framework. What are the key differences between them, and when would you choose one over another?
- How do execution costs interact with Kelly sizing? If your impact costs are significant, how should that change your position sizing?
Hints
- Start with the simplest case: a single binary bet. What fraction of your bankroll maximizes the long-run growth rate, and why is it log-wealth that matters?
- For the execution cost models, think about how impact scales with order size. Is doubling your order twice as costly? What does the empirical evidence say?
- Consider how execution costs modify the Kelly formula -- if trading costs are a function of position size, your effective edge shrinks as you size up. What does that imply about strategy capacity?
Worked Solution
How to Think About It: This question is really about the gap between theory and practice in portfolio management. Kelly tells you the theoretically optimal bet size to maximize long-run wealth growth -- but it assumes you can execute at the prices you want, with no friction. In reality, execution costs eat into your edge, and the bigger you size, the more they eat. A senior quant thinks about Kelly and execution costs as two sides of the same coin: your optimal size depends on your edge, your uncertainty about that edge, AND your cost of expressing it.
Key Insight: Kelly sizing and execution cost modeling are not separate topics -- they are coupled. Your true optimal position size is the Kelly fraction adjusted downward for the impact costs of getting into and out of the position.
The Method:
Part 1: Kelly Criterion
For a single bet with win probability $p$ and payoff odds $b:1$, the Kelly fraction is:
$f^{*} = \frac{bp - (1 - p)}{b} = \frac{p(b + 1) - 1}{b}$
This maximizes $E[\log W]$, the expected log-wealth, which is equivalent to maximizing the long-run geometric growth rate.
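As a quick sanity check, the formula can be compared against a brute-force search over stake fractions. This is a minimal sketch; the values of `p` and `b` and the helper names are illustrative assumptions, not part of the original question.

```python
import math

def kelly_fraction(p: float, b: float) -> float:
    """Kelly fraction for win probability p and net odds b:1."""
    return (b * p - (1.0 - p)) / b

def expected_log_growth(f: float, p: float, b: float) -> float:
    """Expected log-wealth growth per bet when staking a fraction f."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)

p, b = 0.55, 1.0  # illustrative values (assumed): 55% win rate at even odds
f_star = kelly_fraction(p, b)

# Brute-force check: scan stake fractions on a grid and keep the best one.
grid = [i / 1000 for i in range(0, 999)]
f_best = max(grid, key=lambda f: expected_log_growth(f, p, b))

print(f"closed form: {f_star:.3f}, grid search: {f_best:.3f}")
```

The grid search and the closed form should agree to within the grid spacing, because expected log growth is concave in the stake fraction and has a single maximum.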
For a single asset whose returns are approximately normal (mean $\mu$, variance $\sigma^2$, risk-free rate $r_f$), the continuous-time approximation gives:
$f^{*} = \frac{\mu - r_f}{\sigma^2}$
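One way to see where this comes from, as a sketch under the continuous-time (log-normal) approximation: investing a fraction $f$ of wealth in the risky asset and $1 - f$ at the risk-free rate gives an expected log-growth rate of
$g(f) = r_f + f(\mu - r_f) - \tfrac{1}{2} f^2 \sigma^2$
Setting $g'(f) = (\mu - r_f) - f\sigma^2 = 0$ recovers the fraction above, and substituting back gives the maximal growth rate
$g(f^{*}) = r_f + \frac{(\mu - r_f)^2}{2\sigma^2}$
that is, the risk-free rate plus half the squared Sharpe ratio.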
For a portfolio of $n$ assets with expected excess return vector $\boldsymbol{\mu} - r_f \mathbf{1}$ and covariance matrix $\Sigma$:
$\mathbf{f}^{*} = \Sigma^{-1}(\boldsymbol{\mu} - r_f \mathbf{1})$
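Notice that this is proportional to the mean-variance tangency portfolio: dividing $\mathbf{f}^{*}$ by $\mathbf{1}^{\top}\mathbf{f}^{*}$ gives the tangency weights, and the sum $\mathbf{1}^{\top}\mathbf{f}^{*}$ is the implied leverage. Kelly and Markowitz share the same math; Kelly additionally tells you how much leverage to use.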
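The multi-asset formula might be computed along these lines. This is a sketch using NumPy; the three-asset inputs are made-up illustrative numbers, and the variable names are assumptions rather than anything from the original question.

```python
import numpy as np

# Illustrative annualized inputs (assumed, not from the text).
mu = np.array([0.08, 0.06, 0.10])    # expected returns
r_f = 0.02                           # risk-free rate
vol = np.array([0.20, 0.15, 0.30])   # volatilities
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.1],
                 [0.2, 0.1, 1.0]])
Sigma = np.outer(vol, vol) * corr    # covariance matrix

excess = mu - r_f
# Solve Sigma f = excess rather than forming the inverse explicitly.
f_kelly = np.linalg.solve(Sigma, excess)

leverage = f_kelly.sum()             # total fraction of wealth in risky assets
w_tangency = f_kelly / leverage      # tangency weights, summing to 1

# At the optimum, excess growth equals half the squared maximum Sharpe ratio.
max_sharpe_sq = excess @ f_kelly
growth_excess = excess @ f_kelly - 0.5 * f_kelly @ Sigma @ f_kelly

print("Kelly fractions:", np.round(f_kelly, 3))
print("Leverage:", round(float(leverage), 3))
print("Tangency weights:", np.round(w_tangency, 3))
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical-stability choice; with a noisy or near-singular covariance estimate, the resulting fractions can still be extreme, which is one motivation for shrinking them in practice.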
Part 2: Why Fractional Kelly?
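Full Kelly maximizes long-run growth but with enormous volatility along the way. The drawdowns are brutal: in the continuous-time approximation, the probability that a full Kelly bettor's wealth ever falls to a fraction $x$ of its starting value is $x$, so there is a 50% chance of losing half your capital at some point. Practically: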
- Your estimates of $\mu$ and $\Sigma$ are noisy. Kelly is highly sensitive to estimation error: if you overestimate your edge by 2x, you overbet by 2x, and betting twice the true Kelly fraction drives the long-run excess growth rate to zero.
- Half-Kelly ($f^{*}/2$) gives you 75% of the excess growth rate with half the volatility and much less drawdown risk.
- Most funds use something in the range of 0.25 to 0.5 times Kelly, depending on confidence in their edge estimates.
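The trade-off between growth and risk at different Kelly multiples can be tabulated directly. The sketch below assumes the continuous-time approximation $g(f) = f(\mu - r_f) - \tfrac{1}{2}f^2\sigma^2$ for excess growth; the Sharpe ratio of 1.0 and the function name are illustrative assumptions.

```python
def growth_and_vol(c: float, sharpe: float) -> tuple[float, float]:
    """Excess log-growth rate and volatility when betting c times full Kelly.

    With f = c * (mu - r_f) / sigma^2, the approximation
    g(f) = f*(mu - r_f) - 0.5*f^2*sigma^2 reduces to the expressions below.
    """
    growth = (2.0 * c - c * c) * 0.5 * sharpe ** 2
    vol = c * sharpe
    return growth, vol

sharpe = 1.0  # illustrative annual Sharpe ratio (assumed)
full_growth, _ = growth_and_vol(1.0, sharpe)
table = {c: growth_and_vol(c, sharpe) for c in (0.25, 0.5, 1.0, 2.0)}

for c, (g, v) in table.items():
    print(f"{c:>4}x Kelly: growth {g / full_growth:6.1%} of full, vol {v:.2f}")
```

The relative growth factor $2c - c^2$ is symmetric around $c = 1$, so overbetting by a given amount costs as much growth as underbetting by the same amount while carrying far more volatility. That asymmetry in risk is the usual argument for erring on the low side.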
Part 3: Slippage and Market Impact Models
*Linear Impact Model:*
$\Delta P = \lambda \cdot Q$
where $Q$ is order size and $\lambda$ is Kyle's lambda (price impact per unit traded). Simple, tractable, good for small orders. But because empirical impact is concave in order size, a linear model fitted at a typical order size tends to underestimate impact for smaller orders and overestimate it for larger ones.
*Square-Root Model:*
$\text{Impact} \approx \sigma \cdot \sqrt{\frac{Q}{V}}$
where $\sigma$ is daily volatility and $V$ is average daily volume, up to a prefactor of order one that has to be calibrated. This is the empirical workhorse: studies across many markets and periods support roughly square-root scaling. It captures the concavity of impact: doubling your order size multiplies impact by about $\sqrt{2} \approx 1.41$ rather than 2.
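To make the scaling difference concrete, the two models can be compared side by side. This is a sketch with assumed inputs: the volatility, ADV, the order-one prefactor `y`, and the choice to calibrate the linear model so both agree at 1% of ADV are all illustrative. Impact is expressed as a fractional price move in both functions.

```python
import math

def linear_impact(q: float, lam: float) -> float:
    """Fractional price move under the linear model: lam * q."""
    return lam * q

def sqrt_impact(q: float, adv: float, sigma: float, y: float = 1.0) -> float:
    """Fractional price move under the square-root model: y * sigma * sqrt(q / adv)."""
    return y * sigma * math.sqrt(q / adv)

sigma, adv = 0.02, 1_000_000  # 2% daily vol, 1M shares ADV (assumed)

# Calibrate the linear model so both agree at 1% of ADV (an arbitrary choice).
q_ref = 0.01 * adv
lam = sqrt_impact(q_ref, adv, sigma) / q_ref

for pct in (0.001, 0.01, 0.10):
    q = pct * adv
    print(f"{pct:6.1%} of ADV: linear {linear_impact(q, lam) * 1e4:6.1f} bps, "
          f"sqrt {sqrt_impact(q, adv, sigma) * 1e4:6.1f} bps")

# Doubling the order multiplies square-root impact by sqrt(2), not 2.
ratio = sqrt_impact(2 * q_ref, adv, sigma) / sqrt_impact(q_ref, adv, sigma)
```

Under this calibration the linear model sits below the square-root curve for orders smaller than the calibration point and above it for larger ones, which is what concavity implies.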
*Almgren-Chriss Framework:*
This adds a time dimension. You are not just asking "how much does my trade move the price?" but "how should I split my order across time to minimize total cost?" The total cost has two components:
- Temporary impact -- price displacement that decays after you stop trading
- Permanent impact -- information content of your trade that shifts the equilibrium price
The optimal execution schedule balances impact cost (which favors trading slowly) against timing risk (which favors trading quickly, before the price moves against you). The solution is a deterministic trading trajectory that minimizes $E[\text{cost}] + \lambda \cdot \text{Var}[\text{cost}]$, where $\lambda$ here is a risk-aversion parameter, not the Kyle lambda of the linear model.
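With linear temporary impact, the optimal holdings path has a well-known closed form, $x(t) = X \sinh(\kappa(T - t)) / \sinh(\kappa T)$. The sketch below uses the continuous-time approximation $\kappa = \sqrt{\lambda\sigma^2/\eta}$, where $\eta$ is the temporary impact coefficient; the parameter values and the function name are illustrative assumptions.

```python
import math

def ac_holdings(total_shares, horizon, n_steps, sigma, eta, risk_aversion):
    """Remaining holdings at each step under the Almgren-Chriss solution.

    Uses the continuous-time approximation
    kappa = sqrt(risk_aversion * sigma^2 / eta).
    As risk_aversion -> 0 this tends to a straight-line (TWAP-like) schedule.
    """
    times = [horizon * j / n_steps for j in range(n_steps + 1)]
    kappa = math.sqrt(risk_aversion * sigma ** 2 / eta)
    if kappa * horizon < 1e-8:  # risk-neutral limit: linear liquidation
        return [total_shares * (1.0 - t / horizon) for t in times]
    denom = math.sinh(kappa * horizon)
    return [total_shares * math.sinh(kappa * (horizon - t)) / denom for t in times]

# Illustrative parameters (assumed): 1M shares over 5 days in 5 daily slices.
params = dict(total_shares=1_000_000, horizon=5.0, n_steps=5, sigma=0.95, eta=2.5e-6)
twap = ac_holdings(risk_aversion=0.0, **params)
urgent = ac_holdings(risk_aversion=2e-6, **params)

first_day_twap = twap[0] - twap[1]
first_day_urgent = urgent[0] - urgent[1]
print("risk-neutral holdings:", [round(x) for x in twap])
print("risk-averse holdings: ", [round(x) for x in urgent])
```

Higher risk aversion front-loads the schedule: more shares are sold early to cut exposure to price moves, at the price of higher temporary impact.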
When to use which:
- Linear: quick back-of-envelope calculations, small orders, academic models
- Square-root: realistic cost estimation for institutional orders, pre-trade analytics
- Almgren-Chriss: optimal execution scheduling, VWAP/TWAP algorithm design, balancing urgency vs. cost
Part 4: Interaction Between Kelly and Execution Costs
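Execution costs reduce your effective edge. Suppose trading costs you $c(f)$ per unit of capital deployed, expressed in the same per-period units as $\mu$ (that is, amortized over the holding period), so your average net edge is $\mu - c(f)$. The net growth rate is then approximately $g(f) = r_f + f(\mu - r_f) - f\,c(f) - \tfrac{1}{2} f^2 \sigma^2$. What matters at the optimum is the marginal cost of the last dollar, $c(f) + f\,c'(f)$, not the average cost; with square-root impact, $c(f) = k\sqrt{f}$ and the marginal cost is $\tfrac{3}{2}k\sqrt{f}$. Setting $g'(f) = 0$, the adjusted Kelly fraction solves:
$f^{*}_{\text{net}} = \frac{\mu - r_f - c(f^{*}_{\text{net}}) - f^{*}_{\text{net}}\, c'(f^{*}_{\text{net}})}{\sigma^2}$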
This is smaller than the frictionless Kelly. The practical upshot: capacity constraints are real. Even if your signal is strong, there is a maximum size beyond which execution costs eat your entire edge.
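A small numerical sketch can show how much a size-dependent cost shrinks the optimal fraction. It assumes a net excess growth rate of $f(\mu - r_f) - k f^{3/2} - \tfrac{1}{2}f^2\sigma^2$, that is, a per-unit cost $c(f) = k\sqrt{f}$ amortized to the same period as $\mu$. The cost coefficient `k` and all numbers are illustrative assumptions.

```python
import math

def kelly_with_sqrt_cost(edge: float, sigma: float, k: float) -> float:
    """Fraction maximizing g(f) = edge*f - k*f**1.5 - 0.5*sigma**2*f**2.

    The first-order condition edge - 1.5*k*sqrt(f) - sigma**2*f = 0 is a
    quadratic in s = sqrt(f); we take its positive root.
    """
    a, b, c = sigma ** 2, 1.5 * k, -edge
    s = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return s * s

edge, sigma = 0.05, 0.20  # excess return and volatility (assumed)
k = 0.02                  # cost coefficient in c(f) = k * sqrt(f) (assumed)

f_frictionless = edge / sigma ** 2
f_net = kelly_with_sqrt_cost(edge, sigma, k)
f_breakeven = (edge / k) ** 2  # size at which average cost k*sqrt(f) equals the edge

print(f"frictionless Kelly: {f_frictionless:.3f}")
print(f"cost-adjusted Kelly: {f_net:.3f}")
print(f"break-even size:     {f_breakeven:.3f}")
```

Maximizing the growth rate directly means the marginal cost enters the optimality condition, so the optimal size falls faster than a naive "edge minus average cost" adjustment would suggest. The break-even size is one crude way to express capacity: beyond it, costs exceed the gross edge regardless of risk.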
Practical Considerations:
- Estimation error in $\mu$ matters far more than in $\sigma$ for Kelly sizing -- a small bias in expected returns leads to large overbetting
- Impact models need calibration to your specific market and order type -- equity impact is very different from futures or FX
- Temporary vs. permanent impact matters for how you think about round-trip costs: if most impact is temporary, you can recover some cost by being patient on the exit
- In practice, most systematic funds compute Kelly as a theoretical upper bound and then apply a discount factor (typically 0.25-0.5x) that implicitly accounts for estimation error, execution costs, and model risk
Answer: The Kelly criterion gives the growth-optimal bet size: $f^{*} = (\mu - r_f)/\sigma^2$ for a single asset, $\mathbf{f}^{*} = \Sigma^{-1}(\boldsymbol{\mu} - r_f\mathbf{1})$ for a portfolio. In practice, fractional Kelly (0.25-0.5x) is used because of parameter uncertainty and drawdown risk. Execution costs are modeled via linear impact (simple, small orders), square-root impact (empirically validated for institutional sizes), or Almgren-Chriss (optimal scheduling over time). These costs reduce effective edge and lower the optimal position size, creating a natural capacity constraint on any strategy.
Intuition
The Kelly criterion and execution cost modeling are two halves of the same question every systematic trader faces: how big should my position be? Kelly gives you the theoretical ceiling -- the size that maximizes geometric growth given your edge and variance. But it assumes frictionless markets, which do not exist. Execution costs create a wedge between your theoretical edge and your realized edge, and that wedge grows with position size. This is why capacity is such a central concept in quantitative finance -- even the best signal has a point where trading costs consume the entire alpha.
The deeper lesson is about humility in estimation. Kelly is exquisitely sensitive to your estimate of expected returns, which is the hardest thing to estimate in finance. Overestimate your edge by a factor of two and you will overbet by a factor of two, leading to catastrophic drawdowns. This is why practitioners almost universally use fractional Kelly, and why the best quant funds spend as much time on execution cost modeling and capacity analysis as they do on signal research. The signal tells you what to trade; Kelly tells you how much; execution modeling tells you whether you actually can.