Rank IC vs. Pearson IC Under Monotone Transforms

Finance · Hard · Free problem

You have a cross-section of stocks with signals $s_i$ and returns $r_i = s_i + \varepsilon_i$, where $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$ and the noise is independent of $s_i$. You apply a strictly increasing function $g$ to the signal, producing a transformed signal $\tilde{s}_i = g(s_i)$. Throughout, IC denotes the cross-sectional correlation between a signal and the subsequent returns.

  1. Rank invariance. Prove that the Spearman rank IC between $\tilde{s}$ and $r$ equals the Spearman rank IC between $s$ and $r$. That is, $\text{Spearman}(\tilde{s}, r) = \text{Spearman}(s, r)$.
  2. Pearson IC changes. Explain why the Pearson IC between $\tilde{s}$ and $r$ is generally different from the Pearson IC between $s$ and $r$, even though the same monotone $g$ leaves the rank IC untouched.
  3. Turnover trap. Construct an explicit example -- choose a distribution for $s_i$ and a specific function $g$ -- where the transformed signal carries the *same predictive ranking* as the raw signal, yet a portfolio sized off the transformed values earns a *worse* out-of-sample Sharpe ratio once turnover costs are charged. Compute the Pearson IC before and after the transform, explain the mechanism that pulls signal quality and net performance apart, and say what it implies about using Pearson IC as your signal-quality metric.

Hints

  1. Think about what Spearman IC actually computes -- it only uses the ranks of the data. What happens to ranks when you apply a strictly increasing function, and are the ranks of $r$ affected at all?
  2. For Pearson IC, write out $\text{Corr}(g(s), r)$ explicitly. With $\varepsilon \perp s$ the numerator is $\text{Cov}(g(s), s)$ and the denominator carries $\text{Var}(g(s))$ -- both depend on the shape of $g$, not just the ordering. Then ask: can a monotone $g$ ever push this above the raw IC? (Consider $E[r \mid s] = s$ and Cauchy-Schwarz.)
  3. For Part 3, do not try to make Pearson IC rise -- it cannot under this model. Take $g(x) = x^3$ with $s \sim N(0,1)$: the ranking is fixed, Pearson IC falls to $3/\sqrt{30}$, but the derivative $g'(x) = 3x^2$ amplifies period-to-period signal changes, inflating turnover by about $3\times$ and hence lowering net Sharpe via $\text{SR}_{\text{net}} \approx \text{SR}_{\text{gross}} - \lambda \cdot \text{Turnover}$.

Worked Solution

How to Think About It: This problem tests whether you really understand what the information coefficient (IC) measures and when it can mislead you. In quant equity, IC is the cross-sectional correlation between your signal and subsequent returns -- the go-to metric for evaluating alpha signals. But there is a subtlety: Spearman IC uses ranks, Pearson IC uses raw values. A strictly increasing transform $g$ preserves ranks but can wildly change the raw values. So rank-based IC is invariant to any strictly increasing rescaling of your signal, while Pearson IC is not. The trap to avoid is the naive story that you can always 'juice' Pearson IC with a clever $g$. Under the linear model $r = s + \varepsilon$ with $\varepsilon$ independent of $s$, the identity is optimal -- $E[r \mid s] = s$ -- so no monotone $g$ can raise the population Pearson IC; it can only stay flat or fall. The real lesson is sharper: a monotone $g$ can hold the predictive ranking fixed, lower the Pearson IC, and still wreck net performance by inflating turnover.

Quick Estimate: Before any algebra, anchor on signs and magnitudes. Spearman IC should not move at all -- $g$ only respaces values without changing their order. Pearson IC should be at most the raw value: with $s, \varepsilon \sim N(0,1)$ the raw IC is $1/\sqrt{2} \approx 0.71$, and cubing -- which stretches the tails hard -- should drag it down, not up. (It lands at $3/\sqrt{30} \approx 0.55$.) And the cube's derivative $3s^2$ is large in the tails, so a value-sized portfolio should churn its tail names harder, pushing turnover up by a factor of order $3$. So expect: ranking flat, Pearson IC down, net Sharpe down.

Approach: Part 1 uses that Spearman IC is Pearson correlation of ranks and that strict monotonicity preserves ranks. Part 2 writes Pearson IC as $\text{Cov}(g(s), s) / \sqrt{\text{Var}(g(s))\, \text{Var}(r)}$ and notes both pieces depend on the shape of $g$. Part 3 takes $s, \varepsilon \sim N(0,1)$ and $g(x) = x^3$, computes both ICs exactly via Gaussian moments, and links the cube's amplified spread and derivative to turnover and hence net Sharpe.

Formal Solution:

Part 1 -- Rank invariance. Spearman IC is the Pearson correlation of the ranks:

$\text{Spearman}(\tilde{s}, r) = \text{Pearson}(\text{rank}(\tilde{s}), \text{rank}(r)).$

Because $g$ is strictly increasing, $g(s_i) > g(s_j)$ if and only if $s_i > s_j$, so $g$ never reorders the cross-section:

$\text{rank}(\tilde{s}_i) = \text{rank}(g(s_i)) = \text{rank}(s_i) \quad \text{for all } i.$

The ranks of $r$ are untouched (we never transformed $r$). Substituting the identical rank vectors,

$\text{Spearman}(\tilde{s}, r) = \text{Pearson}(\text{rank}(s), \text{rank}(r)) = \text{Spearman}(s, r). \qquad \blacksquare$

Spearman depends on the data only through its ranks, and a strictly increasing map preserves ranks, so the rank IC is invariant.
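The rank argument can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`ranks`, `pearson`, `spearman`), the sample size, the seed, and the three example transforms are illustrative choices, not part of the problem statement. It assumes no ties, which holds almost surely for continuous draws.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks (assumes no ties, as with continuous draws)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for position, index in enumerate(order, start=1):
        out[index] = position
    return out


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


rng = random.Random(0)  # fixed seed, illustrative only
n = 2000
s = [rng.gauss(0.0, 1.0) for _ in range(n)]
r = [si + rng.gauss(0.0, 1.0) for si in s]  # r = s + eps with sigma = 1

# Three strictly increasing example transforms (hypothetical choices).
transforms = {"cube": lambda x: x ** 3, "exp": math.exp, "atan": math.atan}

base = spearman(s, r)
transformed = {name: spearman([g(x) for x in s], r) for name, g in transforms.items()}

print(f"Spearman(s, r) = {base:.6f}")
for name, value in transformed.items():
    print(f"Spearman({name}(s), r) = {value:.6f}")
```

Because each transform leaves the rank vector of the signal unchanged, the printed Spearman values coincide exactly, not just approximately.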

Part 2 -- Pearson IC changes. Pearson IC reads the actual values of $g(s_i)$, not just their order:

$\text{Pearson}(\tilde{s}, r) = \frac{\text{Cov}(g(s), r)}{\sqrt{\text{Var}(g(s))\, \text{Var}(r)}}.$

Since $\varepsilon$ is independent of $s$ (hence of $g(s)$), the covariance collapses to

$\text{Cov}(g(s), r) = \text{Cov}(g(s), s + \varepsilon) = \text{Cov}(g(s), s).$

Both the numerator $\text{Cov}(g(s), s)$ and the normalizer $\text{Var}(g(s))$ depend on the *shape* of $g$, not merely on the ordering it induces. A nonlinear monotone $g$ moves both -- e.g. stretching the tails pushes the covariance up but the variance up even faster -- so the ratio generally changes. This is exactly why Pearson IC is sensitive to $g$ while Spearman, which sees only ranks, is not. Note the direction is constrained: with $\varepsilon \perp s$ we have $E[r \mid s] = s$ and $\text{Corr}(s, r) = \sqrt{\text{Var}(s)/\text{Var}(r)}$, so $\text{Corr}(g(s), r) = \text{Corr}(g(s), s)\,\sqrt{\text{Var}(s)/\text{Var}(r)} = \text{Corr}(g(s), s)\,\text{Corr}(s, r) \le \text{Corr}(s, r)$, where the inequality is Cauchy-Schwarz ($\text{Corr}(g(s), s) \le 1$), with equality only for increasing affine $g$. So a monotone $g$ can never raise the population Pearson IC; it can only hold it or lower it.
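A simulation makes the bound concrete. This is a rough sketch under the stated model with $\sigma = 1$ and $s \sim N(0,1)$; the candidate transforms, sample size, and seed are assumptions chosen for illustration. It compares the sample Pearson IC of each $g(s)$ with the product $\text{Corr}(g(s), s) \cdot \text{Corr}(s, r)$, which should agree up to sampling noise.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed, illustrative only
n = 100_000
s = [rng.gauss(0.0, 1.0) for _ in range(n)]
r = [si + rng.gauss(0.0, 1.0) for si in s]  # noise independent of s

raw_ic = pearson(s, r)

# Strictly increasing candidates (hypothetical choices for illustration).
candidates = {
    "affine 2x+1": lambda x: 2.0 * x + 1.0,
    "cube": lambda x: x ** 3,
    "exp": math.exp,
    "atan": math.atan,
}

results = {}
for name, g in candidates.items():
    gs = [g(x) for x in s]
    ic = pearson(gs, r)                   # Pearson IC of the transformed signal
    factorised = pearson(gs, s) * raw_ic  # Corr(g(s), s) * Corr(s, r)
    results[name] = (ic, factorised)
    print(f"{name:12s} IC = {ic:.4f}   Corr(g(s), s) * raw IC = {factorised:.4f}")

print(f"raw IC = {raw_ic:.4f}")
```

In a finite sample a nonlinear $g$ can land marginally above the raw IC by chance; the bound is a statement about population correlations.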

Part 3 -- Turnover trap. Take $s_i \sim N(0, 1)$ and $\varepsilon_i \sim N(0, 1)$, so $r_i = s_i + \varepsilon_i$, and use $g(x) = x^3$.

*Step 1 -- the ranking is preserved.* Cubing is strictly increasing, so by Part 1 the Spearman IC is unchanged: the transformed signal carries exactly the same predictive ranking as the raw one.

*Step 2 -- the Pearson IC actually falls.* Raw signal:

$\rho(s, r) = \frac{\text{Cov}(s, r)}{\sqrt{\text{Var}(s)\, \text{Var}(r)}} = \frac{1}{\sqrt{1 \cdot 2}} = \frac{1}{\sqrt{2}} \approx 0.707.$

Cubed signal, using Gaussian moments $E[s^4] = 3$ and $E[s^6] = 15$ (so $\text{Cov}(s^3, s) = E[s^4] = 3$ and $\text{Var}(s^3) = E[s^6] - (E[s^3])^2 = 15$):

$\rho(s^3, r) = \frac{\text{Cov}(s^3, s)}{\sqrt{\text{Var}(s^3)\, \text{Var}(r)}} = \frac{3}{\sqrt{15 \cdot 2}} = \frac{3}{\sqrt{30}} \approx 0.548.$

So the Pearson IC *falls*, from $1/\sqrt{2}$ to $3/\sqrt{30}$, consistent with the Part 2 bound. The trap is *not* that Pearson IC went up.
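The two closed-form values can be reproduced from the even Gaussian moments $E[Z^{2k}] = (2k-1)!!$. The sketch below assumes $s \sim N(0,1)$ and $\sigma^2 = 1$ as in this part; the function name `gaussian_even_moment` is a hypothetical helper.

```python
import math


def gaussian_even_moment(k):
    """E[Z^(2k)] for Z ~ N(0, 1), i.e. the double factorial (2k - 1)!!."""
    result = 1
    for odd in range(1, 2 * k, 2):
        result *= odd
    return result


sigma2 = 1.0                          # noise variance (assumed, as in Part 3)
var_s = gaussian_even_moment(1)       # E[s^2] = 1
var_r = var_s + sigma2                # Var(r) = Var(s) + sigma^2 = 2

# Raw signal: Cov(s, r) = Var(s).
raw_ic = var_s / math.sqrt(var_s * var_r)

# Cubed signal: Cov(s^3, s) = E[s^4] = 3 and Var(s^3) = E[s^6] = 15,
# because the odd moments E[s] and E[s^3] are zero.
cov_cube = gaussian_even_moment(2)
var_cube = gaussian_even_moment(3)
cubed_ic = cov_cube / math.sqrt(var_cube * var_r)

print(f"raw IC   = {raw_ic:.4f}")             # 0.7071
print(f"cubed IC = {cubed_ic:.4f}")           # 0.5477
print(f"ratio    = {cubed_ic / raw_ic:.4f}")  # 0.7746 = Corr(s^3, s) = 3/sqrt(15)
```

The ratio of the two ICs is $\text{Corr}(s^3, s)$, which is the factor by which the Part 2 identity shrinks the raw IC.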

*Step 3 -- the mechanism that destroys net Sharpe.* Cubing magnifies the spread of signal values, so a strategy that sizes positions off values loads up on extreme tail names. Suppose signals persist across periods, $s_i^{(t+1)} = \phi\, s_i^{(t)} + \eta_i$ with $\phi$ close to $1$ and a small innovation $\eta_i$ independent of $s_i^{(t)}$. A move $\eta$ in $s$ shifts the cubed signal by about $g'(s)\,\eta = 3 s^2 \eta$, so tail names swing hard. With weights set proportional to signal values and no rescaling of gross exposure, turnover (the $L^1$ change in weights) is about $E[3 s^2 |\eta|] = 3\, E[s^2]\, E|\eta|$, roughly $3\times$ the raw signal's $E|\eta|$ here. Gross Sharpe is about the same (the ranking is unchanged and the Pearson IC is no higher), but net of costs

$\text{SR}_{\text{net}} \approx \text{SR}_{\text{gross}} - \lambda \cdot \text{Turnover},$

for cost $\lambda$ per unit turnover, so the extra churn the cube creates eats the edge and net Sharpe drops.
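The turnover multiple can be sanity-checked by simulation. This is a sketch under assumptions that are not in the text: $\phi = 0.99$, innovation variance $1 - \phi^2$ so that $\text{Var}(s)$ stays at $1$, and weights set equal to the signal values with no gross-exposure normalization. The values of `sr_gross` and `cost_per_unit_turnover` are hypothetical numbers used only to show the direction of the effect.

```python
import math
import random

rng = random.Random(2)                # fixed seed, illustrative only
phi = 0.99                            # assumed persistence
eta_sd = math.sqrt(1.0 - phi ** 2)    # keeps the signal stationary with Var(s) = 1
n = 200_000                           # number of simulated (stock, period) pairs

raw_turnover = 0.0
cubed_turnover = 0.0
for _ in range(n):
    s_now = rng.gauss(0.0, 1.0)
    s_next = phi * s_now + rng.gauss(0.0, eta_sd)
    # Unnormalized weights: the weight is the signal value itself.
    raw_turnover += abs(s_next - s_now)
    cubed_turnover += abs(s_next ** 3 - s_now ** 3)
raw_turnover /= n
cubed_turnover /= n
turnover_ratio = cubed_turnover / raw_turnover


def net_sharpe(sr_gross, cost_per_unit_turnover, turnover):
    """First-order approximation SR_net = SR_gross - lambda * Turnover."""
    return sr_gross - cost_per_unit_turnover * turnover


sr_gross = 1.0                # hypothetical gross Sharpe, same for both books
cost_per_unit_turnover = 0.5  # hypothetical lambda
net_raw = net_sharpe(sr_gross, cost_per_unit_turnover, raw_turnover)
net_cubed = net_sharpe(sr_gross, cost_per_unit_turnover, cubed_turnover)

print(f"turnover ratio (cubed / raw) = {turnover_ratio:.2f}")
print(f"net Sharpe raw = {net_raw:.3f}, cubed = {net_cubed:.3f}")
```

The simulated ratio should sit near $3$ for small innovations. Rescaling both books to the same gross exposure would shrink the multiple, since the cubed book carries more gross, but it would not remove the extra tail churn.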

Answer:

  1. Spearman IC is invariant under $g$: strict monotonicity preserves ranks, and Spearman IC depends only on ranks, so $\text{Spearman}(\tilde{s}, r) = \text{Spearman}(s, r)$.
  2. Pearson IC changes because it reads the raw values of $g(s_i)$: with $\varepsilon \perp s$, $\text{Cov}(g(s), r) = \text{Cov}(g(s), s)$, and both this numerator and the normalizer $\text{Var}(g(s))$ depend on the shape of $g$. The direction is bounded -- $E[r \mid s] = s$ plus Cauchy-Schwarz gives $\text{Corr}(g(s), r) \le \text{Corr}(s, r)$ (equality only for affine $g$) -- so a monotone $g$ can only hold or lower the population Pearson IC.
  3. Take $s_i \sim N(0,1)$, $r_i = s_i + \varepsilon_i$, and $g(x) = x^3$. The Spearman IC is unchanged, so the predictive ranking is identical; the Pearson IC *falls* from $1/\sqrt{2} \approx 0.707$ to $3/\sqrt{30} \approx 0.548$. Yet with signal persistence $s_i^{(t+1)} = \phi\, s_i^{(t)} + \eta_i$, a value-sized long-short portfolio has turnover scaling like $3 s^2 |\eta|$ versus $|\eta|$ for the raw signal -- about $3 E[s^2] = 3$ times higher -- so $\text{SR}_{\text{net}} \approx \text{SR}_{\text{gross}} - \lambda \cdot \text{Turnover}$ drops even though gross performance is similar. The takeaway: $g$ rescales the signal's *representation*, not its predictive ranking, so chasing Pearson IC (which is not rank-invariant) optimizes an artifact and ignores turnover. Practitioners use Spearman IC for signal quality and decide representation by portfolio construction and cost.

Intuition

The core lesson here is one that every systematic quant learns (often painfully): the way you scale your signal has no effect on its informational content about the cross-section, but it has enormous effects on implementation. Spearman IC captures the pure informational content -- do stocks with higher signals tend to have higher returns? Pearson IC also cares about *how much* higher, which drags in the distributional shape of your signal. A tempting but wrong instinct is that you can find a monotone rescaling that makes Pearson IC look better; under $r = s + \varepsilon$ with $\varepsilon \perp s$ you cannot, because the identity is already optimal ($E[r \mid s] = s$), so a monotone $g$ can only hold or lower the population Pearson IC while the predictive ranking stays put.

The turnover piece is where the real damage hides. In theory, if you rank-neutralize your portfolio weights, the choice of $g$ does not matter. But in practice many systems use signal values (not just ranks) to set position sizes, and even rank-based systems rebalance on signal *changes*. A transform like cubing amplifies small signal movements into large weight changes -- its derivative $3x^2$ blows up in the tails -- driving up transaction costs and sinking net Sharpe even as the ranking, and gross performance, are unchanged. The meta-lesson: any signal metric that is not invariant to monotone transforms is measuring something about your signal's *representation*, not its *quality*. Representation choices should be driven by portfolio construction and cost considerations, not by IC optimization.
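The contrast between rank-based and value-based sizing can be sketched in a few lines. The construction below is an assumption for illustration, not a prescribed method: both books are demeaned and scaled to unit gross exposure, and `top_decile_share` is a hypothetical concentration measure.

```python
import random


def rank_weights(signal):
    """Dollar-neutral weights from centred ranks, scaled to unit gross exposure."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    centred = [0.0] * n
    for position, index in enumerate(order):
        centred[index] = position - (n - 1) / 2
    gross = sum(abs(w) for w in centred)
    return [w / gross for w in centred]


def value_weights(signal):
    """Dollar-neutral weights proportional to demeaned signal values."""
    mean = sum(signal) / len(signal)
    centred = [x - mean for x in signal]
    gross = sum(abs(w) for w in centred)
    return [w / gross for w in centred]


def top_decile_share(weights):
    """Fraction of gross exposure held by the 10% largest absolute weights."""
    sizes = sorted((abs(w) for w in weights), reverse=True)
    k = max(1, len(sizes) // 10)
    return sum(sizes[:k]) / sum(sizes)


rng = random.Random(3)  # fixed seed, illustrative only
s = [rng.gauss(0.0, 1.0) for _ in range(1000)]
cubed = [x ** 3 for x in s]

same_rank_book = rank_weights(s) == rank_weights(cubed)
raw_share = top_decile_share(value_weights(s))
cubed_share = top_decile_share(value_weights(cubed))

print(f"rank-weighted books identical: {same_rank_book}")
print(f"top-decile share of gross, raw values:   {raw_share:.2f}")
print(f"top-decile share of gross, cubed values: {cubed_share:.2f}")
```

The rank-weighted book does not change under cubing, while the value-weighted book concentrates far more of its gross exposure in the tail names, which are the positions that churn hardest.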
