Overlapping Returns and Sharpe Ratio Inference

Statistics · Hard · Free problem

A strategy produces daily returns $r_1, r_2, \ldots, r_T$. To estimate performance at a longer horizon, you construct $h$-day overlapping returns:

$R_t^{(h)} = \sum_{j=0}^{h-1} r_{t+j}, \quad t = 1, \ldots, T - h + 1$

You then compute the sample mean $\bar{R}^{(h)}$ and sample standard deviation $s^{(h)}$ of these overlapping returns to estimate the $h$-day Sharpe ratio $\text{SR}^{(h)} = \bar{R}^{(h)} / s^{(h)}$.

(i) Naive standard errors are wrong. Show that the overlapping construction induces autocorrelation in $R_t^{(h)}$, and explain why treating these $T - h + 1$ observations as i.i.d. leads to standard errors that understate the true sampling uncertainty of $\bar{R}^{(h)}$.

(ii) HAC standard errors. Derive a Newey-West (HAC) standard error for $\bar{R}^{(h)}$ that correctly accounts for the serial correlation in the overlapping returns. Specify the bandwidth choice and kernel.

(iii) Confidence interval for the Sharpe ratio. Using the delta method together with HAC standard errors, construct an asymptotically valid confidence interval for $\text{SR}^{(h)}$. State the required regularity conditions.

Hints

Think about how many daily observations two consecutive overlapping windows share, and what that does to the covariance between $R_t^{(h)}$ and $R_{t+1}^{(h)}$.
The overlapping construction creates an MA($h-1$) process. The Newey-West HAC estimator with bandwidth $L = h - 1$ and Bartlett kernel spans exactly the lags with nonzero autocovariance.
For the Sharpe CI, write $\text{SR} = \mu / \sigma$ as a function of $(\mu, \sigma^2)$, compute $\nabla g$, and apply the delta method using the $2 \times 2$ HAC long-run covariance matrix.

Worked Solution

How to Think About It: This is a bread-and-butter problem in performance evaluation. Every quant has computed a Sharpe ratio from overlapping returns at some point -- and the trap is always the same: overlapping returns are mechanically correlated because consecutive windows share $h - 1$ daily observations. If you pretend they are independent, your standard errors shrink by a factor that can be massive (roughly $\sqrt{h}$), and you end up thinking your strategy is much more precisely estimated than it really is. The fix is HAC standard errors, plus the delta method to push those corrected SEs through to the Sharpe ratio.

Key Insight: Overlapping $h$-day returns built from daily data create an MA($h-1$) autocorrelation structure. The variance of the sample mean depends on all the autocovariances, not just the variance.

---

### Part (i): Why Naive Standard Errors Are Biased

Let the daily returns $r_t$ be stationary with mean $\mu$ and autocovariance $\gamma_r(k) = \text{Cov}(r_t, r_{t+k})$. Define the overlapping return:

$R_t^{(h)} = \sum_{j=0}^{h-1} r_{t+j}$

The autocovariance of the overlapping series at lag $k$ is:

$\gamma_R(k) = \text{Cov}\left(R_t^{(h)}, R_{t+k}^{(h)}\right) = \sum_{i=0}^{h-1} \sum_{j=0}^{h-1} \gamma_r(k + j - i)$

Even if daily returns are uncorrelated ($\gamma_r(k) = 0$ for $k \neq 0$), consecutive overlapping returns share $h - 1$ common daily observations. For $|k| < h$, we get:

$\gamma_R(k) = (h - |k|) \cdot \sigma_r^2, \quad |k| < h$

and $\gamma_R(k) = 0$ for $|k| \geq h$. So $R_t^{(h)}$ follows an MA($h - 1$) process.
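The triangular autocovariance pattern can be checked numerically. The sketch below is illustrative only: the helper names `overlapping_sums` and `overlap_autocov` and the daily variance `sigma2 = 1e-4` are assumptions, not part of the problem. It builds the overlapping sums from a cumulative sum and evaluates the double-sum formula for $\gamma_R(k)$ under white-noise daily returns.

```python
import numpy as np

def overlapping_sums(r, h):
    """Return R_t = r_t + ... + r_{t+h-1} for every window start t."""
    r = np.asarray(r, dtype=float)
    c = np.concatenate(([0.0], np.cumsum(r)))  # c[t] = r_0 + ... + r_{t-1}
    return c[h:] - c[:-h]                      # length T - h + 1

def overlap_autocov(gamma_r, h, k):
    """gamma_R(k) as the double sum of daily autocovariances gamma_r(k + j - i)."""
    return sum(gamma_r(k + j - i) for i in range(h) for j in range(h))

# Assumption for illustration: white-noise daily returns with variance sigma2.
sigma2 = 1e-4
white = lambda lag: sigma2 if lag == 0 else 0.0

h = 5
gammas = [overlap_autocov(white, h, k) for k in range(h + 2)]
for k, g in enumerate(gammas):
    print(k, g / sigma2)  # multiples of sigma2: h - k for k < h, then 0
```

The cumulative-sum trick avoids an explicit loop over windows; any other way of forming the $h$-day sums would serve equally well.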

The true variance of the sample mean $\bar{R}^{(h)}$ (with $N = T - h + 1$ overlapping observations) is:

$\text{Var}(\bar{R}^{(h)}) = \frac{1}{N^2} \sum_{s=1}^{N} \sum_{t=1}^{N} \gamma_R(s - t) = \frac{1}{N} \sum_{k=-(N-1)}^{N-1} \left(1 - \frac{|k|}{N}\right) \gamma_R(k)$

The naive standard error treats observations as i.i.d. and uses:

$\text{Var}_{\text{naive}}(\bar{R}^{(h)}) = \frac{\gamma_R(0)}{N} = \frac{h \sigma_r^2}{N}$

This ignores all the cross-covariance terms. Since $\gamma_R(k) > 0$ for $|k| < h$, the true variance is strictly larger than the naive estimate. In the i.i.d. daily returns case, a quick calculation gives:

$\text{Var}(\bar{R}^{(h)}) \approx \frac{1}{N} \sum_{k=-(h-1)}^{h-1} \gamma_R(k) = \frac{\sigma_r^2}{N} \left(h + 2 \sum_{k=1}^{h-1} (h - k)\right) = \frac{h^2 \sigma_r^2}{N} \approx \frac{h^2 \sigma_r^2}{T}$

for large $T$, while the naive variance is $h \sigma_r^2 / N \approx h \sigma_r^2 / T$. The ratio is roughly $h$, meaning the naive SE understates the true SE by a factor of approximately $\sqrt{h}$. With monthly overlapping returns ($h = 21$), that is a factor of about 4.6 -- your t-stats are inflated by nearly 5x.
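To see how close the ratio is to $h$ at realistic sample sizes, the exact finite-sample formula for $\text{Var}(\bar{R}^{(h)})$ can be evaluated directly. This is a minimal sketch assuming i.i.d. daily returns; the function name `exact_variance_ratio` and the choice of `n=2500` overlapping observations are illustrative.

```python
def exact_variance_ratio(h, n):
    """Exact Var(mean) / naive Var(mean) for white-noise daily returns.

    Uses Var(mean) = (1/n) * sum_k (1 - |k|/n) * gamma_R(k) with
    gamma_R(k) = (h - |k|) * sigma^2, and naive = h * sigma^2 / n.
    Assumes n >= h so that every lag below h is observed.
    """
    total = float(h)  # lag-0 term, in units of sigma^2
    for k in range(1, h):
        total += 2.0 * (1.0 - k / n) * (h - k)
    return total / h

for h in (5, 21, 63):
    ratio = exact_variance_ratio(h, n=2500)
    # ratio is the variance inflation; its square root is the SE inflation
    print(h, round(ratio, 2), round(ratio ** 0.5, 2))
```

The finite-sample ratio sits slightly below $h$ because the $(1 - |k|/N)$ factors trim the higher lags, and it approaches $h$ as $N$ grows.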

---

### Part (ii): HAC/Newey-West Standard Error

The Newey-West HAC estimator of the long-run variance of $\bar{R}^{(h)}$ is:

$\hat{\sigma}_{\text{HAC}}^2 = \hat{\gamma}_R(0) + 2 \sum_{k=1}^{L} w(k, L) \, \hat{\gamma}_R(k)$

where the sample autocovariances are:

$\hat{\gamma}_R(k) = \frac{1}{N} \sum_{t=1}^{N-k} \left(R_t^{(h)} - \bar{R}^{(h)}\right)\left(R_{t+k}^{(h)} - \bar{R}^{(h)}\right)$

and the Bartlett kernel weights are:

$w(k, L) = 1 - \frac{k}{L + 1}$

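Bandwidth choice: Since the overlapping returns have an MA($h - 1$) structure when daily returns are serially uncorrelated, the natural bandwidth is $L = h - 1$, which includes every lag with nonzero autocovariance. Note that the Bartlett weights $1 - k/h$ shrink those autocovariances rather than counting them in full: in the i.i.d. daily case the weighted sum of population autocovariances is $(2h^2 + 1)\sigma_r^2 / 3$, about two-thirds of the true long-run variance $h^2 \sigma_r^2$. A larger $L$, or the equal-weight Hansen-Hodrick estimator, reduces this downward bias. The Bartlett kernel gives a positive semi-definite estimator by construction for any $L$.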

The HAC standard error of $\bar{R}^{(h)}$ is then:

$\text{SE}_{\text{HAC}}(\bar{R}^{(h)}) = \sqrt{\frac{\hat{\sigma}_{\text{HAC}}^2}{N}}$

This accounts for the serial correlation induced by the overlapping construction, up to the downweighting imposed by the Bartlett kernel.
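The estimator above translates almost line for line into NumPy. The following is a sketch, not a reference implementation: the names `newey_west_lrv` and `hac_se_mean`, the simulated daily volatility of 1%, the sample length, and the seed are all assumptions made for illustration.

```python
import numpy as np

def newey_west_lrv(x, L):
    """Bartlett-kernel (Newey-West) long-run variance of a scalar series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    lrv = d @ d / n                      # gamma_hat(0), normalised by n
    for k in range(1, L + 1):
        gamma_k = d[:-k] @ d[k:] / n     # gamma_hat(k), also normalised by n
        lrv += 2.0 * (1.0 - k / (L + 1)) * gamma_k
    return lrv

def hac_se_mean(x, L):
    """HAC standard error of the sample mean of x."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(newey_west_lrv(x, L) / x.size)

# Illustrative data: i.i.d. normal daily returns, 21-day overlapping sums.
rng = np.random.default_rng(0)
r = rng.normal(0.0, 0.01, size=2500)
h = 21
R = np.convolve(r, np.ones(h), mode="valid")   # length T - h + 1

naive_se = R.std() / np.sqrt(R.size)
hac_se = hac_se_mean(R, L=h - 1)
print(naive_se, hac_se, hac_se / naive_se)
```

With `L=0` the function reduces to the naive variance, which makes it easy to compare the two standard errors on the same series.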

---

### Part (iii): Delta Method CI for the Sharpe Ratio

The Sharpe ratio is a function of two quantities:

$\text{SR}^{(h)} = g(\mu, \sigma) = \frac{\mu}{\sigma}$

where $\mu = E[R_t^{(h)}]$ and $\sigma = \text{sd}(R_t^{(h)})$. Define $\theta = (\mu, \sigma^2)^\top$ with sample estimator $\hat{\theta} = (\bar{R}^{(h)}, \hat{\sigma}_{R}^2)^\top$.

By the CLT for dependent data (under stationarity and mixing conditions):

$\sqrt{N}(\hat{\theta} - \theta) \xrightarrow{d} \mathcal{N}(0, \Sigma)$

where $\Sigma$ is the long-run covariance matrix of $Z_t = \left(R_t^{(h)}, (R_t^{(h)} - \mu)^2\right)^\top$, the summands whose sample averages give $\hat{\theta}$ to first order. It is estimated via HAC, with $\mu$ replaced by $\bar{R}^{(h)}$:

$\hat{\Sigma} = \hat{\Gamma}(0) + \sum_{k=1}^{L} w(k, L) \left[\hat{\Gamma}(k) + \hat{\Gamma}(k)^\top\right]$

where $\hat{\Gamma}(k)$ is the $2 \times 2$ sample cross-autocovariance matrix of $Z_t$ at lag $k$.

The gradient of $g(\mu, \sigma^2) = \mu / \sqrt{\sigma^2}$ is:

$\nabla g = \left(\frac{1}{\sigma}, \; -\frac{\mu}{2\sigma^3}\right)^\top$

By the delta method:

$\sqrt{N}\left(\widehat{\text{SR}}^{(h)} - \text{SR}^{(h)}\right) \xrightarrow{d} \mathcal{N}\left(0, \; \nabla g^\top \, \Sigma \, \nabla g\right)$

The asymptotic variance of the Sharpe ratio estimator is estimated by plugging in $\hat{\Sigma}$ and evaluating $\nabla g$ at $\hat{\theta}$:

$V_{\text{SR}} = \nabla g^\top \, \hat{\Sigma} \, \nabla g$

and the $(1 - \alpha)$ confidence interval is:

$\widehat{\text{SR}}^{(h)} \pm z_{\alpha/2} \sqrt{\frac{V_{\text{SR}}}{N}}$
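For reference, the quadratic form in $V_{\text{SR}}$ can be written out term by term. The entry notation $\hat{\Sigma}_{ij}$ is introduced here for illustration, with index 1 for the mean and index 2 for the variance:

$V_{\text{SR}} = \frac{\hat{\Sigma}_{11}}{\hat{\sigma}^2} - \frac{\hat{\mu} \, \hat{\Sigma}_{12}}{\hat{\sigma}^4} + \frac{\hat{\mu}^2 \, \hat{\Sigma}_{22}}{4 \hat{\sigma}^6}$

As a sanity check, if the observations were serially independent and normally distributed, then $\Sigma_{11} = \sigma^2$, $\Sigma_{12} = 0$ and $\Sigma_{22} = 2\sigma^4$, and the expression collapses to the familiar $1 + \text{SR}^2 / 2$. With overlapping returns the HAC entries are larger, and so is $V_{\text{SR}}$.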

Regularity conditions:

- Stationarity and ergodicity of daily returns $r_t$
- Finite fourth moments: $E[r_t^4] < \infty$ (needed for the CLT on the variance estimator and for the HAC estimator to be consistent)
- A mixing condition (e.g., strong mixing with summable mixing coefficients) so that the dependent CLT applies
- $\sigma > 0$ (the delta method requires us to divide by $\sigma$, so the denominator must be nonzero)
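Putting the pieces of Part (iii) together, one possible implementation is sketched below. The function name `sharpe_ci_hac`, the default `z=1.96` (a 95% interval), and the simulated inputs are assumptions for illustration. The second component of $Z_t$ is the squared deviation from the sample mean, so that the gradient with respect to $(\mu, \sigma^2)$ applies directly.

```python
import numpy as np

def sharpe_ci_hac(R, L, z=1.96):
    """Delta-method CI for mean/sd using a Bartlett-kernel HAC covariance.

    Returns (sharpe, standard_error, lower, upper).
    """
    R = np.asarray(R, dtype=float)
    n = R.size
    mu = R.mean()
    var = R.var()                 # normalised by n, matching the HAC terms
    sd = np.sqrt(var)
    sr = mu / sd

    # Centred Z_t = (R_t - mean, (R_t - mean)^2 - var); both columns average zero.
    Z = np.column_stack((R - mu, (R - mu) ** 2 - var))
    S = Z.T @ Z / n               # Gamma_hat(0)
    for k in range(1, L + 1):
        G = Z[:-k].T @ Z[k:] / n  # Gamma_hat(k)
        S += (1.0 - k / (L + 1)) * (G + G.T)

    grad = np.array([1.0 / sd, -mu / (2.0 * sd ** 3)])
    v = grad @ S @ grad           # estimate of V_SR
    se = np.sqrt(v / n)
    return sr, se, sr - z * se, sr + z * se

# Illustrative use on simulated 21-day overlapping returns.
rng = np.random.default_rng(1)
r = rng.normal(0.0005, 0.01, size=2500)
h = 21
R = np.convolve(r, np.ones(h), mode="valid")
print(sharpe_ci_hac(R, L=0))      # treats observations as uncorrelated
print(sharpe_ci_hac(R, L=h - 1))  # HAC-corrected interval
```

Setting `L=0` reproduces the interval one would get by ignoring serial correlation, which gives a direct comparison with the HAC-corrected interval on the same data.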

---

Answer:

(i) Overlapping $h$-day returns share $h - 1$ daily observations, creating MA($h-1$) autocorrelation. The naive SE ignores these positive autocovariances and understates the true sampling uncertainty by a factor of roughly $\sqrt{h}$.

(ii) Use the Newey-West HAC estimator with Bartlett kernel and bandwidth $L = h - 1$:

$\hat{\sigma}_{\text{HAC}}^2 = \hat{\gamma}_R(0) + 2 \sum_{k=1}^{h-1} \left(1 - \frac{k}{h}\right) \hat{\gamma}_R(k)$

(iii) Apply the delta method to $\text{SR} = \mu / \sigma$ using the $2 \times 2$ HAC covariance matrix of $(\bar{R}^{(h)}, \hat{\sigma}_R^2)$. The CI is $\widehat{\text{SR}} \pm z_{\alpha/2} \sqrt{V_{\text{SR}} / N}$, valid under stationarity, finite fourth moments, and a mixing condition.

Intuition

The core issue is one of effective sample size. When you build overlapping returns, you are reusing data -- each daily return appears in $h$ different overlapping windows. The raw observation count $T - h + 1$ massively overstates how much independent information you have. The autocorrelation is not some nuisance that might or might not matter; it is a mechanical, guaranteed consequence of the construction. Ignoring it is one of the most common mistakes in strategy backtesting, and it produces Sharpe ratios that look far more precisely estimated than they are.

The broader lesson applies everywhere in quant finance: whenever your observations are constructed from shared underlying data (overlapping returns, rolling averages, cumulative sums), your effective sample size is much smaller than the number of observations. HAC standard errors are the standard fix, and the delta method lets you propagate that correction through any smooth function of your estimates -- here the Sharpe ratio, but the same machinery applies to information ratios, risk-adjusted alphas, or any other ratio statistic. Getting the standard errors right is the difference between a real signal and a backtest artifact.
