Asymptotic Distribution of the Sharpe Ratio

Statistics · Hard · Free problem

You have $n$ i.i.d. returns $r_1, r_2, \ldots, r_n$ with mean $\mu$ and variance $\sigma^2$. The sample Sharpe ratio is defined as

$\hat{S} = \frac{\bar{r}}{s}$

where $\bar{r}$ is the sample mean and $s$ is the sample standard deviation.

Using the delta method, derive the asymptotic variance of $\hat{S}$ in terms of $\mu$, $\sigma$, and the population kurtosis $\kappa = E[(r - \mu)^4]/\sigma^4$.

Construct a large-sample $100(1 - \alpha)\%$ confidence interval for the true Sharpe ratio $S = \mu / \sigma$.

Now suppose the returns exhibit serial correlation. Explain why the i.i.d. asymptotic variance formula understates the true sampling variability, and describe how a HAC (Heteroskedasticity and Autocorrelation Consistent) estimator can correct for this.

Hints

The Sharpe ratio is a smooth function of two sample moments -- the sample mean and sample variance. What general technique gives you the asymptotic distribution of a function of jointly normal estimators?
Write $g(\mu, \sigma^2) = \mu/\sqrt{\sigma^2}$ and compute its gradient. The delta method says $\sqrt{n}(g(\hat{\theta}) - g(\theta))$ is asymptotically normal with variance $\nabla g^T \Sigma \nabla g$, where $\Sigma$ is the joint covariance of the sample moments.
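For the moment covariance matrix, you need the asymptotic variances and covariance, scaled by $n$: $n\,\text{Var}(\bar{r}) = \sigma^2$, $n\,\text{Var}(s^2) \to \mu_4 - \sigma^4 = (\kappa - 1)\sigma^4$, and $n\,\text{Cov}(\bar{r}, s^2) \to \mu_3$, where $\mu_3$ and $\mu_4$ are the third and fourth central moments. Plug these in and simplify using $S = \mu/\sigma$.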

Worked Solution

How to Think About It: The Sharpe ratio is a ratio of two estimated quantities -- the sample mean and sample standard deviation. Whenever you have a smooth function of sample moments, the delta method is your go-to tool for getting an asymptotic distribution. The idea is simple: linearize the function around the true parameter values, then use the known joint CLT for sample moments to read off the variance. The practical payoff is a confidence interval that tells you how much noise is in your Sharpe estimate -- which, for typical hedge fund track records (5-10 years of monthly data), is a lot.

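Quick Estimate: For normal returns, the asymptotic variance of $\hat{S}$ simplifies to $(1 + S^2/2)/n$, where $S$ is measured at the same frequency as the $n$ observations. If the per-period Sharpe is $S = 0.5$ and $n = 60$, the standard error is $\sqrt{(1 + 0.125)/60} \approx \sqrt{0.0188} \approx 0.137$, so a 95% CI is roughly $0.5 \pm 0.27$, from $0.23$ to $0.77$. If instead $0.5$ is an annualized Sharpe (a decent fund) estimated from 60 monthly observations, the monthly Sharpe is $0.5/\sqrt{12} \approx 0.144$, its standard error is $\sqrt{(1 + 0.0104)/60} \approx 0.130$, and under i.i.d. returns the annualized standard error is $0.130 \times \sqrt{12} \approx 0.45$. The 95% CI is then roughly $0.5 \pm 0.88$. Either way the range is enormous -- the Sharpe ratio is measured with far less precision than most people realize.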
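The arithmetic behind this kind of estimate can be sketched in a few lines of Python. The function name `normal_sharpe_se` is hypothetical, and the sketch assumes i.i.d. normal returns with the Sharpe ratio measured at the same frequency as the observations.

```python
import math
from statistics import NormalDist


def normal_sharpe_se(sharpe: float, n_obs: int) -> float:
    """Asymptotic standard error of the sample Sharpe ratio under i.i.d.
    normal returns: sqrt((1 + S^2 / 2) / n)."""
    return math.sqrt((1.0 + 0.5 * sharpe**2) / n_obs)


# Illustrative inputs: per-period Sharpe of 0.5 estimated from 60 observations.
S, n = 0.5, 60
se = normal_sharpe_se(S, n)
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
lower, upper = S - z * se, S + z * se
print(f"SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the standard error shrinks at rate $1/\sqrt{n}$, quadrupling the sample length only halves the width of the interval.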

Formal Derivation:

Part 1: Delta Method for $\hat{S}$

Define the parameter vector $\theta = (\mu, \sigma^2)$ and the function $g(\mu, \sigma^2) = \mu / \sqrt{\sigma^2} = \mu / \sigma$. The sample analogue is $\hat{S} = g(\bar{r}, s^2)$.

By the multivariate CLT (which requires a finite fourth moment, $E[r^4] < \infty$), we have

$\sqrt{n} \begin{pmatrix} \bar{r} - \mu \\ s^2 - \sigma^2 \end{pmatrix} \xrightarrow{d} N\left(0, \Sigma\right)$

where the covariance matrix of the sample moments is

$\Sigma = \begin{pmatrix} \sigma^2 & \mu_3 \\ \mu_3 & \mu_4 - \sigma^4 \end{pmatrix}$

Here $\mu_3 = E[(r - \mu)^3]$ is the third central moment (related to skewness) and $\mu_4 = E[(r - \mu)^4]$ is the fourth central moment. In terms of kurtosis $\kappa = \mu_4/\sigma^4$, we have $\mu_4 - \sigma^4 = (\kappa - 1)\sigma^4$.

The gradient of $g$ is

$\nabla g = \left(\frac{\partial g}{\partial \mu}, \frac{\partial g}{\partial \sigma^2}\right) = \left(\frac{1}{\sigma}, -\frac{\mu}{2\sigma^3}\right)$

By the delta method,

$\sqrt{n}(\hat{S} - S) \xrightarrow{d} N(0, \nabla g^T \Sigma \, \nabla g)$

Computing the asymptotic variance:

$V = \nabla g^T \Sigma \, \nabla g = \frac{1}{\sigma^2} \cdot \sigma^2 + 2 \cdot \frac{1}{\sigma} \cdot \left(-\frac{\mu}{2\sigma^3}\right) \cdot \mu_3 + \frac{\mu^2}{4\sigma^6} \cdot (\kappa - 1)\sigma^4$

$V = 1 - \frac{\mu \, \mu_3}{\sigma^4} + \frac{S^2}{4}(\kappa - 1)$

Let $\gamma = \mu_3/\sigma^3$ denote the skewness. Then $\mu_3/\sigma^4 = \gamma/\sigma$ and $\mu = S\sigma$, so

$V = 1 - S\gamma + \frac{S^2}{4}(\kappa - 1)$

Thus the asymptotic distribution is

$\hat{S} \overset{\text{approx}}{\sim} N\left(S, \; \frac{1}{n}\left[1 - S\gamma + \frac{S^2}{4}(\kappa - 1)\right]\right)$

Special case (normal returns): For Gaussian returns, $\gamma = 0$ and $\kappa = 3$, giving

$V_{\text{normal}} = 1 + \frac{S^2}{2}$

which is the well-known Lo (2002) result.
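As a sanity check on the formula, the sketch below compares $V$ with a Monte Carlo estimate of $n \cdot \text{Var}(\hat{S})$ under i.i.d. normal returns. All function names and simulation settings ($S = 0.5$, $n = 200$, 2000 replications, fixed seed) are illustrative assumptions, not part of the original problem.

```python
import math
import random
from typing import Sequence


def sharpe_asymptotic_variance(sharpe: float, skew: float, kurt: float) -> float:
    """V = 1 - S*gamma + (S^2 / 4) * (kappa - 1), with kappa the raw
    (non-excess) kurtosis, so kappa = 3 for a normal distribution."""
    return 1.0 - sharpe * skew + 0.25 * sharpe**2 * (kurt - 1.0)


def sample_sharpe(returns: Sequence[float]) -> float:
    """Sample mean divided by the sample standard deviation (n - 1 divisor)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var)


def simulated_scaled_variance(sharpe: float, n_obs: int, n_reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of n * Var(S_hat) for i.i.d. N(sharpe, 1) returns."""
    rng = random.Random(seed)
    estimates = [
        sample_sharpe([rng.gauss(sharpe, 1.0) for _ in range(n_obs)])
        for _ in range(n_reps)
    ]
    mean_est = sum(estimates) / n_reps
    var_est = sum((e - mean_est) ** 2 for e in estimates) / (n_reps - 1)
    return n_obs * var_est


# Normal returns: skewness 0 and kurtosis 3, so V should equal 1 + S^2 / 2.
theory = sharpe_asymptotic_variance(0.5, skew=0.0, kurt=3.0)
simulated = simulated_scaled_variance(0.5, n_obs=200, n_reps=2000)
print(f"theory V = {theory:.3f}, simulated n*Var = {simulated:.3f}")
```

The two numbers should agree up to simulation noise and a small finite-sample effect. Passing a negative skewness or a kurtosis above 3 to `sharpe_asymptotic_variance` shows how non-normality widens the interval for a positive Sharpe ratio.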

Part 2: Confidence Interval

A large-sample $100(1-\alpha)\%$ confidence interval for $S$ is

$\hat{S} \pm z_{\alpha/2} \sqrt{\frac{\hat{V}}{n}}$

where $z_{\alpha/2}$ is the standard normal critical value and $\hat{V}$ is $V$ evaluated at sample estimates:

$\hat{V} = 1 - \hat{S}\hat{\gamma} + \frac{\hat{S}^2}{4}(\hat{\kappa} - 1)$

Here $\hat{\gamma}$ and $\hat{\kappa}$ are the sample skewness and kurtosis.
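A minimal sketch of this plug-in interval, assuming i.i.d. returns supplied as a plain list. The function name `sharpe_confidence_interval` and the example returns are made up for illustration. Using a $1/n$ divisor for the skewness and kurtosis moments is one common convention among several.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def sharpe_confidence_interval(
    returns: Sequence[float], alpha: float = 0.05
) -> Tuple[float, float, float]:
    """Return (S_hat, lower, upper) for the large-sample i.i.d. interval."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    # Central sample moments with a 1/n divisor (used for skewness and kurtosis).
    m2 = sum(d**2 for d in dev) / n
    m3 = sum(d**3 for d in dev) / n
    m4 = sum(d**4 for d in dev) / n
    skew = m3 / m2**1.5
    kurt = m4 / m2**2  # raw kurtosis, equal to 3 for a normal distribution
    s = math.sqrt(m2 * n / (n - 1))  # sample standard deviation (n - 1 divisor)
    s_hat = mean / s
    v_hat = 1.0 - s_hat * skew + 0.25 * s_hat**2 * (kurt - 1.0)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(v_hat / n)
    return s_hat, s_hat - half_width, s_hat + half_width


# Made-up monthly returns, purely for illustration.
example_returns = [0.021, -0.013, 0.008, 0.034, -0.027, 0.015,
                   0.002, 0.019, -0.006, 0.011, 0.028, -0.018]
s_hat, lower, upper = sharpe_confidence_interval(example_returns)
print(f"S_hat = {s_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval is expressed at the sampling frequency of the input. With only twelve observations the normal approximation is rough, so the example is meant to show the mechanics rather than a trustworthy interval.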

Part 3: Serial Correlation and HAC Correction

When returns are serially correlated, the i.i.d. CLT no longer applies directly. Under stationarity and weak dependence a CLT still holds, but $\Sigma$ is replaced by the long-run covariance matrix of the sample moments, which adds autocovariance terms to each entry. For the sample mean, $\text{Var}(\bar{r})$ is no longer $\sigma^2/n$ but, for large $n$,

$\text{Var}(\bar{r}) \approx \frac{1}{n}\left(\sigma^2 + 2\sum_{j=1}^{\infty} \text{Cov}(r_t, r_{t+j})\right)$

When the autocovariances are positive (momentum in returns), these extra terms inflate the variance, so the i.i.d. formula makes the Sharpe ratio appear more precisely estimated than it actually is. Negative autocorrelation has the opposite effect. The same issue affects $s^2$ and the cross-moment.

A HAC estimator (such as Newey-West) replaces $\Sigma$ with a consistent estimate $\hat{\Sigma}_{\text{HAC}}$ that accounts for autocovariances up to some bandwidth $L$:

$\hat{\Sigma}_{\text{HAC}} = \hat{\Gamma}_0 + \sum_{j=1}^{L} w_j (\hat{\Gamma}_j + \hat{\Gamma}_j^T)$

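where $\hat{\Gamma}_j$ is the sample autocovariance matrix at lag $j$ of the centered moment vector $h_t = (r_t - \bar{r},\ (r_t - \bar{r})^2 - s^2)$ and $w_j$ are kernel weights (e.g., Bartlett weights $w_j = 1 - j/(L+1)$). You then apply the delta method using $\hat{\Sigma}_{\text{HAC}}$ in place of $\Sigma$, so that $\hat{V}_{\text{HAC}} = \nabla g^T \hat{\Sigma}_{\text{HAC}} \nabla g$ gives a corrected standard error $\sqrt{\hat{V}_{\text{HAC}}/n}$ for $\hat{S}$.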
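The HAC step can be sketched as follows, taking the moment vector to be $h_t = (r_t - \bar{r},\ (r_t - \bar{r})^2 - s^2)$. Since the delta method only needs the quadratic form $\nabla g^T \hat{\Sigma}_{\text{HAC}} \nabla g$, the sketch works with the scalar series $\psi_t = \nabla g^T h_t$ and computes its Bartlett-weighted long-run variance, which gives the same number. The function name, the $1/n$ variance divisor, the bandwidth $L = 10$, and the AR(1) test data are all assumptions made for illustration.

```python
import math
import random
from typing import Sequence


def hac_sharpe_variance(returns: Sequence[float], max_lag: int) -> float:
    """Newey-West (Bartlett kernel) estimate of the asymptotic variance V of
    sqrt(n) * (S_hat - S). With max_lag = 0 it equals the plug-in i.i.d. formula."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    var = sum(d**2 for d in dev) / n  # 1/n divisor, an assumption of this sketch
    sd = math.sqrt(var)
    # Gradient of g(mu, sigma^2) = mu / sigma at the sample estimates.
    g_mu, g_var = 1.0 / sd, -mean / (2.0 * sd**3)
    # psi_t = grad(g) . h_t, with h_t = (r_t - mean, (r_t - mean)^2 - var).
    # Because grad' (Gamma_j + Gamma_j') grad = 2 * autocov_psi(j), the HAC
    # quadratic form collapses to a scalar long-run variance of psi_t.
    psi = [g_mu * d + g_var * (d**2 - var) for d in dev]

    def autocov(lag: int) -> float:
        return sum(psi[t] * psi[t - lag] for t in range(lag, n)) / n

    v = autocov(0)
    for j in range(1, max_lag + 1):
        weight = 1.0 - j / (max_lag + 1.0)  # Bartlett weight w_j
        v += 2.0 * weight * autocov(j)
    return v


# Illustrative AR(1) returns with positive autocorrelation (made-up parameters).
rng = random.Random(0)
phi, mu, n_obs = 0.5, 0.01, 2000
x, ar1_returns = 0.0, []
for _ in range(n_obs):
    x = phi * x + rng.gauss(0.0, 0.05)
    ar1_returns.append(mu + x)

v_iid = hac_sharpe_variance(ar1_returns, max_lag=0)
v_hac = hac_sharpe_variance(ar1_returns, max_lag=10)
print(f"V (i.i.d. plug-in) = {v_iid:.2f}, V (HAC, L = 10) = {v_hac:.2f}")
```

The standard error of $\hat{S}$ is then `math.sqrt(v_hac / n_obs)`. The bandwidth is a tuning choice: too small a value misses autocovariance at longer lags, while too large a value adds estimation noise.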

Answer:

The asymptotic variance of the sample Sharpe ratio under i.i.d. returns is

$\text{Var}(\hat{S}) \approx \frac{1}{n}\left(1 - S\gamma + \frac{S^2}{4}(\kappa - 1)\right)$

For normal returns this reduces to $(1 + S^2/2)/n$. The $100(1-\alpha)\%$ CI is $\hat{S} \pm z_{\alpha/2}\sqrt{\hat{V}/n}$. Serial correlation inflates sampling variability through autocovariance terms that the i.i.d. formula ignores; a HAC estimator (e.g., Newey-West) corrects this by incorporating estimated autocovariances into the moment covariance matrix before applying the delta method.

Intuition

The Sharpe ratio is the most quoted performance metric in finance, yet its sampling distribution is surprisingly wide. The delta method reveals that even under i.i.d. normality, the standard error is approximately $\sqrt{(1 + S^2/2)/n}$ at the sampling frequency. With typical fund track records of 60-120 monthly observations, the annualized standard error is roughly $0.3$ to $0.45$, so the 95% confidence interval for a "good" annualized Sharpe of 0.5 spans from below zero to above 1.0. This means most Sharpe comparisons between strategies are statistically meaningless unless the sample is very long or the difference is very large.

The serial correlation piece is equally important in practice. Many strategies -- especially those that hold positions over multiple periods -- exhibit autocorrelated returns. Positive autocorrelation makes the sample mean look more precisely estimated than it really is, which artificially shrinks the Sharpe's confidence interval. The HAC correction is not optional for real-world performance evaluation; ignoring it is the statistical equivalent of overfitting your backtest. Andrew Lo's 2002 paper "The Statistics of Sharpe Ratios" formalized much of this and showed that annualizing monthly Sharpes by multiplying by $\sqrt{12}$ is only valid under the i.i.d. assumption -- with serial correlation, the scaling factor changes.
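To make the annualization point concrete: for stationary returns with lag-$k$ autocorrelation $\rho_k$, the sum of $q$ consecutive returns has mean $q\mu$ and variance $\sigma^2\left[q + 2\sum_{k=1}^{q-1}(q-k)\rho_k\right]$, so the $q$-period Sharpe ratio equals the one-period Sharpe ratio times $q / \sqrt{q + 2\sum_{k=1}^{q-1}(q-k)\rho_k}$. The sketch below evaluates that factor. The function name and the AR(1)-style autocorrelations are hypothetical choices for illustration.

```python
import math
from typing import Sequence


def sharpe_scaling_factor(q: int, autocorrs: Sequence[float]) -> float:
    """Factor eta(q) such that SR(q) = eta(q) * SR for q-period aggregated returns:
    eta(q) = q / sqrt(q + 2 * sum_{k=1}^{q-1} (q - k) * rho_k).
    autocorrs[k - 1] holds rho_k, the lag-k autocorrelation of one-period returns."""
    weighted = sum((q - k) * autocorrs[k - 1] for k in range(1, q))
    return q / math.sqrt(q + 2.0 * weighted)


q = 12  # months per year
no_autocorr = sharpe_scaling_factor(q, [0.0] * (q - 1))
# Hypothetical AR(1)-style autocorrelations rho_k = 0.2**k, for illustration only.
with_autocorr = sharpe_scaling_factor(q, [0.2**k for k in range(1, q)])
print(f"i.i.d. factor = {no_autocorr:.3f}, with autocorrelation = {with_autocorr:.3f}")
```

With zero autocorrelation the factor is exactly $\sqrt{12}$. Positive autocorrelation lowers it, so multiplying a monthly Sharpe by $\sqrt{12}$ overstates the annual figure in that case.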
