Chi-Square Confidence Interval for Variance and Sample Sizing

Statistics · Hard · Free problem

Let $Y_1, Y_2, \ldots, Y_n$ be i.i.d. $N(\mu, \sigma^2)$ with both $\mu$ and $\sigma^2$ unknown. Let $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$ be the sample variance.

1. Show that $(n-1)S^2 / \sigma^2 \sim \chi^2_{n-1}$.

2. Using part (1), derive a $100(1 - \alpha)\%$ confidence interval for $\sigma^2$.

3. Suppose you want the confidence interval to have a target relative half-width -- that is, you want the CI radius to equal $\delta \sigma^2$ for some specified $\delta > 0$ (e.g., $\delta = 0.1$ means the half-width is $10\%$ of $\sigma^2$). Approximate the minimum sample size $n$ required to achieve this.

Hints

Start by decomposing $\sum(Y_i - \mu)^2$ into a piece involving $\bar{Y}$ and a piece involving $S^2$. What distributions do those pieces follow?
To build the CI, use $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$ as a pivot and invert the probability statement. Remember the chi-square is not symmetric.
For the sample size approximation, use the fact that $\chi^2_{n-1} \approx N(n-1, 2(n-1))$ for large $n$, which makes the relative half-width approximately $z_{\alpha/2}\sqrt{2/(n-1)}$.
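
The decomposition in the first hint can be written out explicitly. The sketch below assumes the standard result that $\bar{Y}$ and $S^2$ are independent for normal data, then compares moment generating functions (written $M(t)$, a notation introduced here). It is one common route to part (1), not necessarily the intended one.

```latex
\begin{align*}
\sum_{i=1}^{n} (Y_i - \mu)^2
  &= \sum_{i=1}^{n} (Y_i - \bar{Y})^2 + n(\bar{Y} - \mu)^2
  && \text{(cross term vanishes since } \textstyle\sum_i (Y_i - \bar{Y}) = 0\text{)} \\
\underbrace{\sum_{i=1}^{n} \left(\frac{Y_i - \mu}{\sigma}\right)^2}_{\sim\, \chi^2_n}
  &= \frac{(n-1)S^2}{\sigma^2}
   + \underbrace{\left(\frac{\bar{Y} - \mu}{\sigma/\sqrt{n}}\right)^2}_{\sim\, \chi^2_1} \\
(1 - 2t)^{-n/2}
  &= M_{(n-1)S^2/\sigma^2}(t)\,(1 - 2t)^{-1/2}, \qquad t < \tfrac{1}{2} \\
M_{(n-1)S^2/\sigma^2}(t)
  &= (1 - 2t)^{-(n-1)/2}
\end{align*}
```

The last line is the moment generating function of a $\chi^2_{n-1}$ variable, which identifies the distribution of $(n-1)S^2/\sigma^2$.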

Worked Solution

How to Think About It: This is a foundational inference problem about variance. The key fact is that when you have normal data, the sample variance (properly scaled) follows a chi-square distribution. This is one of the few exact pivotal results in statistics -- most confidence intervals rely on asymptotics, but this one is exact for any $n$. The practical question (part 3) is the one interviewers care about most: how many observations do you need before your variance estimate is "tight enough"? Variance is notoriously hard to estimate precisely, and the answer will surprise you -- you need a lot of data.
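
For part (2), inverting the pivot might look like the following. The notation $\chi^2_{p,\,n-1}$ for the lower-tail $p$-quantile of the $\chi^2_{n-1}$ distribution is an assumption made here, and the interval shown is the usual equal-tailed one, which is not the shortest possible because the chi-square is not symmetric.

```latex
\begin{align*}
1 - \alpha
  &= P\!\left( \chi^2_{\alpha/2,\,n-1}
       \le \frac{(n-1)S^2}{\sigma^2}
       \le \chi^2_{1-\alpha/2,\,n-1} \right) \\
  &= P\!\left( \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}
       \le \sigma^2
       \le \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \right)
\end{align*}
```

Note that the upper quantile ends up in the lower endpoint, since dividing by the pivot reverses the inequalities.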

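Quick Estimate: For large $n$, a $\chi^2_{n-1}$ is approximately normal with mean $n-1$ and variance $2(n-1)$, so the relative half-width of the CI is approximately $z_{\alpha/2}\sqrt{2/(n-1)}$. Setting this equal to $\delta$ and solving gives $n \approx 1 + 2z_{\alpha/2}^2/\delta^2$.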
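
The large-$n$ approximation from the hints can be turned into a small calculator. This is a minimal sketch, assuming the relative half-width $z_{\alpha/2}\sqrt{2/(n-1)}$ is set equal to $\delta$ and the result is rounded up. The function name `approx_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_sample_size(delta: float, alpha: float = 0.05) -> int:
    """Approximate smallest n with z * sqrt(2 / (n - 1)) <= delta.

    Relies on the normal approximation to the chi-square, so it is
    only a rough guide when the resulting n is small.
    """
    if delta <= 0 or not 0 < alpha < 1:
        raise ValueError("need delta > 0 and 0 < alpha < 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal critical value
    # Solve z * sqrt(2 / (n - 1)) = delta for n, then round up.
    return ceil(1 + 2 * (z / delta) ** 2)


if __name__ == "__main__":
    for delta in (0.2, 0.1, 0.05):
        print(f"delta={delta}: n is roughly {approx_sample_size(delta)}")
```

Because the result is rounded up, it can differ by one from figures quoted with a different rounding convention. Halving $\delta$ roughly quadruples the required $n$.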

Intuition

Variance is much harder to estimate precisely than the mean. A mean CI shrinks like $1/\sqrt{n}$, and the relative precision of a variance estimate also scales like $1/\sqrt{n}$ -- the catch is that the constant is $\sqrt{2}$ times larger. At 95% confidence, pinning down variance to within $\pm 10\%$ requires about 769 observations, compared to roughly 384 for a mean with known variance (half-width $0.1\sigma$). This is a practical reality that bites in finance: if you are estimating realized volatility from daily returns, a year of data (252 points) gives you a relative half-width of roughly 18%. Two years gets you to about 12%. Getting down to 10% takes a little over three years. This is why experienced quants are skeptical of "precise" volatility estimates from short samples, and why techniques like EWMA or GARCH that impose structure can outperform raw sample variance despite their model risk.
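
The daily-returns figures can be reproduced by running the same approximation in the forward direction, from $n$ to relative half-width. This is a sketch under the normal approximation $z_{\alpha/2}\sqrt{2/(n-1)}$. The 252-day year follows the text, and the function name `relative_half_width` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

TRADING_DAYS_PER_YEAR = 252  # convention used in the text


def relative_half_width(n: int, alpha: float = 0.05) -> float:
    """Approximate CI half-width for the variance, as a fraction of sigma^2."""
    if n < 2:
        raise ValueError("need at least two observations")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sqrt(2 / (n - 1))


if __name__ == "__main__":
    for years in (1, 2, 3, 5, 10):
        n = years * TRADING_DAYS_PER_YEAR
        print(f"{years:>2} yr (n={n}): +/- {relative_half_width(n):.1%}")
```

The approximation treats the interval as symmetric. The exact chi-square interval is skewed to the right, so the upper side is somewhat wider than these figures suggest, especially for short samples.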

The chi-square pivot is also one of the cleanest examples of a pivotal quantity: its distribution does not depend on $\mu$ or $\sigma^2$. Cochran's theorem does the heavy lifting: it tells you the sample mean and sample variance are independent for normal data, which is a surprising and powerful result. This independence is what makes the $t$-test and $F$-test work, and it fails for non-normal data -- a fact that matters a lot when working with heavy-tailed financial returns.
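
A quick simulation can illustrate the independence claim. The sketch below estimates the correlation between the sample mean and sample variance across many small samples, once for normal data and once for exponential data. The exponential is an illustrative stand-in for non-normal data, chosen here because its skewness makes the dependence visible in a correlation. All function names are hypothetical, and near-zero correlation is only consistent with independence, not a proof of it.

```python
import random
from math import sqrt


def mean_and_variance(xs):
    """Return the sample mean and the (n - 1)-denominator sample variance."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, s2


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean_variance_correlation(draw, n=10, reps=5000, seed=0):
    """Correlation between Y-bar and S^2 over `reps` samples of size `n`.

    `draw` takes a random.Random instance and returns one observation.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        m, s2 = mean_and_variance([draw(rng) for _ in range(n)])
        means.append(m)
        variances.append(s2)
    return pearson(means, variances)


if __name__ == "__main__":
    normal = mean_variance_correlation(lambda rng: rng.gauss(0.0, 1.0))
    skewed = mean_variance_correlation(lambda rng: rng.expovariate(1.0))
    print(f"normal data:      corr(mean, variance) = {normal:+.3f}")
    print(f"exponential data: corr(mean, variance) = {skewed:+.3f}")
```

For symmetric heavy-tailed distributions the correlation can still be zero even though the mean and variance are dependent, so a check like this would need a different diagnostic there, such as the correlation between the squared mean and the variance.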