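You compute a daily information coefficient (IC) series $IC(1), \ldots, IC(T)$ and estimate the mean $\bar{IC}$. Assume the IC series follows a stationary AR(1) process:
$IC(t) = \mu + \phi\,\bigl(IC(t-1) - \mu\bigr) + \varepsilon(t), \qquad |\phi| < 1$
where $\varepsilon(t)$ is white noise with variance $\sigma^2$.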
Derive $\operatorname{Var}(\bar{IC})$ in closed form as a function of $T$, $\phi$, and $\sigma^2$.
Define an "effective sample size" $T_{\text{eff}}$ so that $\operatorname{Var}(\bar{IC}) = \sigma_{IC}^2 / T_{\text{eff}}$, where $\sigma_{IC}^2 = \sigma^2 / (1 - \phi^2)$ is the marginal variance of the IC series.
Show how $T_{\text{eff}}$ behaves as $\phi \to 1^{-}$, and interpret the result.
Hints
The variance of the sample mean involves summing all pairwise covariances -- for an AR(1), each autocovariance $\gamma(h) = \sigma_{IC}^2 \phi^{|h|}$ is a geometric function of the lag.
Use the geometric series $\sum_{h=0}^{\infty} \phi^h = 1/(1 - \phi)$ to simplify the double sum. The large-$T$ limit is cleaner than the exact finite-$T$ expression.
Define $T_{\text{eff}}$ by matching $\operatorname{Var}(\bar{IC}) = \sigma_{IC}^2 / T_{\text{eff}}$ and isolate the ratio $T_{\text{eff}} / T = (1 - \phi)/(1 + \phi)$.
Worked Solution
How to Think About It: When you average an autocorrelated series, the effective amount of information is much less than the raw sample size $T$ suggests. Intuitively, consecutive IC values are not independent draws; they are "echoes" of the same shocks. The stronger the autocorrelation, the fewer truly independent observations you have. This is a common trap in quant research: you run a strategy for 1,000 days, compute a t-stat using $\sqrt{1000}$, and feel great, but if daily ICs have $\phi = 0.5$, your effective sample size is only about $1000/3 \approx 333$. Your t-stat is inflated by a factor of $\sqrt{3} \approx 1.7$.
Quick Estimate: For $\phi = 0.5$ and $T = 252$ (one year of daily data), the effective sample size should be roughly $T \cdot (1 - \phi)/(1 + \phi) = 252 \cdot (0.5/1.5) = 84$. So you have roughly one-third the information you thought. For $\phi = 0.9$, it drops to $252 \cdot (0.1/1.9) \approx 13$ -- barely two weeks of independent observations from a full year of data.
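The quick estimate is easy to reproduce. The following is a minimal sketch of the large-$T$ approximation only; the function name `effective_sample_size` is an illustrative choice, not something defined in the problem.

```python
def effective_sample_size(T: int, phi: float) -> float:
    """Large-T approximation T_eff ~= T * (1 - phi) / (1 + phi) for a stationary AR(1)."""
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must satisfy |phi| < 1 for a stationary AR(1)")
    return T * (1.0 - phi) / (1.0 + phi)


# One year of daily data at a few autocorrelation levels.
for phi in (0.0, 0.5, 0.9):
    print(f"phi={phi:.1f}: T_eff ~= {effective_sample_size(252, phi):.1f}")
# phi=0.0: T_eff ~= 252.0
# phi=0.5: T_eff ~= 84.0
# phi=0.9: T_eff ~= 13.3
```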
Approach: We compute $\operatorname{Var}(\bar{IC})$ by summing all autocovariances of the AR(1) process, then extract $T_{\text{eff}}$.
Formal Solution:
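*Step 1: Autocovariance of AR(1).* For a stationary AR(1), the autocovariance at lag $h$ is:
$\gamma(h) = \operatorname{Cov}\bigl(IC(t), IC(t+h)\bigr) = \sigma_{IC}^2\,\phi^{|h|}, \qquad \sigma_{IC}^2 = \frac{\sigma^2}{1 - \phi^2}$
*Step 2: Variance of the sample mean.* Sum all $T^2$ pairwise covariances and group them by lag. There are $T$ pairs at lag 0 and $T - h$ pairs at each lag $h \ge 1$, each counted twice by symmetry:
$\operatorname{Var}(\bar{IC}) = \frac{1}{T^2}\sum_{s=1}^{T}\sum_{t=1}^{T}\gamma(t - s) = \frac{\sigma_{IC}^2}{T^2}\left[T + 2\sum_{h=1}^{T-1}(T - h)\phi^h\right]$
Evaluating the sum: $\sum_{h=1}^{T-1}(T - h)\phi^h = T \cdot \frac{\phi(1 - \phi^{T-1})}{1 - \phi} - \frac{(T-1)\phi^{T+1} - T\phi^T + \phi}{(1-\phi)^2} = \frac{\phi\,[T(1 - \phi) - (1 - \phi^T)]}{(1 - \phi)^2}$. For large $T$ with $|\phi| < 1$, the terms involving $\phi^T$ vanish, the remaining $O(1)$ term is negligible next to the $O(T)$ term, and we get the clean asymptotic result:
$\operatorname{Var}(\bar{IC}) \approx \frac{\sigma_{IC}^2}{T} \cdot \frac{1 + \phi}{1 - \phi} = \frac{\sigma^2}{T(1 - \phi)^2}$
*Step 3: Effective sample size.* Matching $\operatorname{Var}(\bar{IC}) = \sigma_{IC}^2 / T_{\text{eff}}$ against the asymptotic variance gives:
$T_{\text{eff}} = T \cdot \frac{1 - \phi}{1 + \phi}$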
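*Step 4: Behavior as $\phi \to 1^{-}$.* As $\phi \to 1^{-}$, the ratio $T_{\text{eff}}/T = (1 - \phi)/(1 + \phi) \to 0$, with $T_{\text{eff}} \approx T(1 - \phi)/2$ for $\phi$ near 1. The effective sample size collapses relative to $T$: a highly persistent series provides almost no independent information about the mean unless $T$ is much larger than $1/(1 - \phi)$. Note that the approximation itself requires $T(1 - \phi) \gg 1$; at fixed $T$ and $\phi \ge 0$, the exact effective sample size never falls below 1, the value it approaches when all observations are perfectly correlated.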
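*Exact (finite-T) formula.* For completeness, the exact variance is:
$\operatorname{Var}(\bar{IC}) = \frac{\sigma_{IC}^2}{T}\left[\frac{1 + \phi}{1 - \phi} - \frac{2\phi(1 - \phi^T)}{T(1 - \phi)^2}\right]$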
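As a numerical sanity check, the sketch below sums the autocovariances $\sigma_{IC}^2 \phi^{|i-j|}$ over all pairs directly and compares the result with the closed-form finite-$T$ expression and the large-$T$ approximation. The function names and the parameter grid are illustrative assumptions, and `sigma2` denotes the innovation variance $\sigma^2$.

```python
import numpy as np


def var_mean_brute_force(T, phi, sigma2=1.0):
    """Var of the sample mean by summing every pairwise autocovariance."""
    sigma_ic2 = sigma2 / (1.0 - phi**2)          # marginal variance of the AR(1)
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])   # |i - j| for all T^2 pairs
    return sigma_ic2 * np.sum(phi ** lags) / T**2


def var_mean_exact(T, phi, sigma2=1.0):
    """Closed-form finite-T variance of the sample mean."""
    sigma_ic2 = sigma2 / (1.0 - phi**2)
    correction = 2.0 * phi * (1.0 - phi**T) / (T * (1.0 - phi) ** 2)
    return sigma_ic2 / T * ((1.0 + phi) / (1.0 - phi) - correction)


def var_mean_asymptotic(T, phi, sigma2=1.0):
    """Large-T approximation sigma^2 / (T (1 - phi)^2)."""
    return sigma2 / (T * (1.0 - phi) ** 2)


T = 252
for phi in (0.5, 0.9, 0.99):
    sigma_ic2 = 1.0 / (1.0 - phi**2)
    brute = var_mean_brute_force(T, phi)
    exact = var_mean_exact(T, phi)
    asym = var_mean_asymptotic(T, phi)
    # T_eff is defined by Var(mean) = sigma_IC^2 / T_eff.
    print(f"phi={phi}: brute={brute:.5f} exact={exact:.5f} asym={asym:.5f} "
          f"T_eff exact={sigma_ic2 / exact:.1f} asym={sigma_ic2 / asym:.1f}")
```

For positive $\phi$ the finite-$T$ correction is subtracted, so the asymptotic formula slightly overstates the variance, and the gap widens as $\phi$ approaches 1 at fixed $T$.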
Answer: For large $T$, $\operatorname{Var}(\bar{IC}) \approx \frac{\sigma^2}{T(1 - \phi)^2}$ and $T_{\text{eff}} \approx T \cdot \frac{1 - \phi}{1 + \phi}$. As $\phi \to 1^{-}$, $T_{\text{eff}}/T \to 0$: a near-unit-root IC series provides very little information about its mean relative to its sample length.
Intuition
This result is one of the most practically important facts in quantitative research. Whenever you compute a t-statistic, a Sharpe ratio, or any significance measure from time series data, you are implicitly scaling by $\sqrt{T}$. But if the data are positively autocorrelated, $\sqrt{T}$ overstates how much information you actually have. The correction factor $(1 - \phi)/(1 + \phi)$ shows that even moderate autocorrelation ($\phi = 0.5$) cuts your effective sample by a factor of 3, and high autocorrelation ($\phi = 0.9$) cuts it by a factor of 19. In practice, daily alpha signals, IC series, and P&L streams are often significantly autocorrelated, so naive significance tests can substantially overstate confidence. The fix is simple: replace $T$ with $T_{\text{eff}}$ in your standard error calculations, or, more generally, use Newey-West or other HAC standard errors that account for serial dependence without assuming an AR(1) structure.
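The $T \to T_{\text{eff}}$ replacement can be sketched as follows. This is a rough illustration under the AR(1) assumption, not a substitute for a proper HAC estimator: the function name `ar1_adjusted_tstat`, the clipping bounds on $\hat\phi$, and the simulated series (true $\phi = 0.5$, mean IC of 0.01, innovation scale 0.02) are all hypothetical choices made for the example.

```python
import numpy as np


def ar1_adjusted_tstat(ic):
    """Naive and AR(1)-adjusted t-stats for the mean of a series."""
    ic = np.asarray(ic, dtype=float)
    T = ic.size
    mean = ic.mean()
    dev = ic - mean
    # Lag-1 sample autocorrelation as a plug-in estimate of phi.
    phi_hat = np.dot(dev[1:], dev[:-1]) / np.dot(dev, dev)
    # Hypothetical guard so the ratio below stays finite.
    phi_hat = float(np.clip(phi_hat, -0.99, 0.99))
    t_eff = T * (1.0 - phi_hat) / (1.0 + phi_hat)
    sd = ic.std(ddof=1)
    naive_t = mean / (sd / np.sqrt(T))
    adj_t = mean / (sd / np.sqrt(t_eff))
    return naive_t, adj_t, phi_hat, t_eff


# Simulated AR(1) IC series (illustrative parameters, not real data).
rng = np.random.default_rng(0)
phi, T = 0.5, 1000
eps = rng.normal(scale=0.02, size=T)
ic = np.empty(T)
ic[0] = eps[0] / np.sqrt(1.0 - phi**2)  # start from the stationary distribution
for t in range(1, T):
    ic[t] = phi * ic[t - 1] + eps[t]
ic += 0.01  # shift to a hypothetical true mean IC

naive_t, adj_t, phi_hat, t_eff = ar1_adjusted_tstat(ic)
print(f"phi_hat={phi_hat:.2f}, T_eff={t_eff:.0f}, "
      f"naive t={naive_t:.2f}, adjusted t={adj_t:.2f}")
```

The adjusted statistic is the naive one scaled by $\sqrt{T_{\text{eff}}/T}$, so with $\hat\phi$ near 0.5 it shrinks by roughly $\sqrt{3}$.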