Rolling Heteroskedastic t-Statistic Under GARCH

Time Series · Hard · Free problem

Consider a return series $r_t$ with conditional mean $\mu$ and residuals $\varepsilon_t = r_t - \mu$. The conditional variance follows a GARCH(1,1) model:

$\sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2$
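To make the recursion concrete, here is a minimal simulation sketch. It assumes Gaussian shocks, $\alpha + \beta < 1$, and a start at the unconditional variance $\omega / (1 - \alpha - \beta)$; the function name, the parameter values, and the burn-in length are illustrative choices, not part of the problem.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, mu=0.0, burn=500, seed=0):
    """Simulate r_t = mu + eps_t with a GARCH(1,1) conditional variance.

    Returns (r, sigma2), each of length n, after dropping `burn` warm-up steps.
    """
    if alpha + beta >= 1.0:
        raise ValueError("need alpha + beta < 1 for a finite unconditional variance")
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n + burn)          # i.i.d. N(0,1) shocks (an assumption)
    sigma2 = np.empty(n + burn)
    eps = np.empty(n + burn)
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n + burn):
        # sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return mu + eps[burn:], sigma2[burn:]

# Illustrative parameter values only.
r, sigma2 = simulate_garch11(5_000, omega=0.05, alpha=0.08, beta=0.90)
```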

You want to test whether a rolling mean signal $\bar{r}_W = \frac{1}{W}\sum_{i=0}^{W-1} r_{t-i}$ is significantly different from zero using a heteroskedasticity-robust t-statistic computed over a window of width $W$.

Define a heteroskedasticity-robust rolling t-statistic for testing $H_0\colon \mu = 0$.

State its asymptotic distribution under $H_0$ as $W \to \infty$.

In practice, $\sigma_t^2$ is estimated (not known). Show how this estimation introduces bias into the t-statistic and describe how to debias it.

Hints

The variance of the rolling mean under GARCH is not $\sigma^2/W$ -- think about what replaces the constant variance when each observation has its own $\sigma_t^2$.
The standardized residuals $z_t = \varepsilon_t / \sigma_t$ are i.i.d. under the GARCH model, so a martingale CLT gives the asymptotic distribution.
For the estimated-variance step, apply the delta method to $\hat{V}_W$ as a function of $\hat{\theta}$: the parameter error is $O_p(T^{-1/2})$, so with $T \gg W$ the ratio $W/T \to 0$ and the null stays $N(0,1)$ -- the finite-sample fix is a filtered bootstrap, not a closed-form rescale.

Worked Solution

How to Think About It: A standard t-statistic divides the sample mean by $s / \sqrt{W}$, where $s$ is computed assuming homoskedastic residuals. But under GARCH, the variance changes over time -- some observations are much noisier than others. If you ignore this and use the plain sample variance, your t-stat will be distorted: its magnitude will tend to be too large in volatile periods and too small in calm ones. The fix is to use a variance estimator that accounts for heteroskedasticity, or alternatively to weight each observation by the inverse of its conditional variance (a GLS-style variant, which is a different statistic rather than a rewrite of the first). Once you have the right denominator, the standardized residuals $z_i = \varepsilon_i / \sigma_i$ are i.i.d. and a martingale CLT delivers a clean $N(0,1)$ null. The only subtlety is that in practice $\sigma_t^2$ is estimated, not known -- so you must ask whether plugging in the GARCH estimate changes that null law.

Quick Estimate: Before any algebra, sanity-check the denominator. If every $\sigma_{t-i}^2$ were equal to a constant $\sigma^2$, then $\hat{V}_W = W^{-2}\sum \sigma_{t-i}^2$ collapses to $\sigma^2 / W$ -- exactly the textbook variance of the sample mean. So the heteroskedastic estimator must reduce to the homoskedastic one, which it does. The asymptotic null should therefore be $N(0,1)$, the same as the ordinary t-stat, since we have only re-weighted i.i.d.-after-standardization terms. A naive worry is that estimating the GARCH parameters must blow this up -- but with the estimation sample $T$ much larger than the window $W$, the parameter error is $O_p(T^{-1/2})$ and the ratio $W/T \to 0$, so the effect should vanish asymptotically, not inflate the variance. Expect the null to stay $N(0,1)$; any finite-sample fix should be small.

Approach: Build a GLS-style rolling t-statistic that divides the rolling mean by the GARCH-weighted variance of the mean, invoke the martingale CLT on the standardized residuals for the $N(0,1)$ null, and then analyze the plug-in error from estimating $\theta = (\omega, \alpha, \beta)$ via a delta-method expansion to show it is asymptotically negligible when $T \gg W$.

Formal Solution:

Step 1 -- Define the estimator. The heteroskedasticity-robust rolling t-statistic is:

$t_W = \frac{\bar{r}_W}{\sqrt{\hat{V}_W}}, \quad \text{where } \hat{V}_W = \frac{1}{W^2}\sum_{i=0}^{W-1} \hat{\sigma}_{t-i}^2$

Here $\bar{r}_W = \frac{1}{W}\sum_{i=0}^{W-1} r_{t-i}$ is the rolling mean and $\hat{\sigma}_{t-i}^2$ is the GARCH-fitted conditional variance at time $t - i$. Because GARCH residuals are serially uncorrelated, the variance of the sample mean given the volatility path is the sum of the conditional variances, $V_W = W^{-2}\sum_{i=0}^{W-1}\sigma_{t-i}^2$, not $\sigma^2/W$ with a single constant $\sigma^2$. The denominator $\hat{V}_W$ is the plug-in estimate of $V_W$.
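The definition translates directly into a few lines of NumPy. This is a sketch under the assumption that the fitted variances are already available as an array aligned with the returns; the function name `rolling_het_tstat` is made up for illustration.

```python
import numpy as np

def rolling_het_tstat(r, sigma2_hat, W):
    """Rolling t_W = rbar_W / sqrt(W^-2 * sum of sigma2_hat over the window).

    Returns an array of length len(r) - W + 1; entry k is the statistic for
    the window ending at index k + W - 1.
    """
    r = np.asarray(r, dtype=float)
    s2 = np.asarray(sigma2_hat, dtype=float)
    if r.ndim != 1 or r.shape != s2.shape:
        raise ValueError("r and sigma2_hat must be 1-D arrays of equal length")
    if not 1 <= W <= r.size:
        raise ValueError("W must be between 1 and len(r)")
    # Rolling sums via differences of cumulative sums.
    cr = np.concatenate(([0.0], np.cumsum(r)))
    cs = np.concatenate(([0.0], np.cumsum(s2)))
    rbar = (cr[W:] - cr[:-W]) / W            # rolling mean
    v_hat = (cs[W:] - cs[:-W]) / W**2        # estimated variance of the mean
    return rbar / np.sqrt(v_hat)
```

With a constant variance array the output matches the textbook $\bar{r}_W / (\sigma/\sqrt{W})$, which is a convenient unit test for any implementation.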

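Alternatively, you can weight each observation by its inverse conditional variance. This is a distinct statistic, not an algebraic rearrangement of the one above; the two coincide only when the variance is constant across the window. With standardized residuals $z_i = \varepsilon_{t-i} / \sigma_{t-i}$, each numerator term below equals $z_i / \sigma_{t-i}$, and under $H_0$ we have $\varepsilon_{t-i} = r_{t-i}$. Then: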

$t_W = \frac{\sum_{i=0}^{W-1} \varepsilon_{t-i} / \sigma_{t-i}^2}{\sqrt{\sum_{i=0}^{W-1} 1/\sigma_{t-i}^2}}$

This is the GLS-style t-stat that downweights volatile observations.
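A small sketch makes the downweighting visible. It assumes $H_0$ holds so that $\varepsilon_{t-i} = r_{t-i}$, and it takes the conditional variances as given; both function names are illustrative.

```python
import numpy as np

def gls_tstat(r_window, sigma2_window):
    """Inverse-variance-weighted t-stat: sum(r / s2) / sqrt(sum(1 / s2))."""
    r = np.asarray(r_window, dtype=float)
    w = 1.0 / np.asarray(sigma2_window, dtype=float)   # precision weights
    return np.sum(w * r) / np.sqrt(np.sum(w))

def equal_weight_tstat(r_window, sigma2_window):
    """Equal-weight version: mean(r) / sqrt(W^-2 * sum(s2)) = sum(r) / sqrt(sum(s2))."""
    r = np.asarray(r_window, dtype=float)
    s2 = np.asarray(sigma2_window, dtype=float)
    return np.sum(r) / np.sqrt(np.sum(s2))

# Two equal returns, the second observed in a period with 4x the variance.
r_win, s2_win = [1.0, 1.0], [1.0, 4.0]
print(gls_tstat(r_win, s2_win), equal_weight_tstat(r_win, s2_win))
```

The two statistics agree when every variance in the window is the same and differ otherwise, so they should be treated as separate tests with the same $N(0,1)$ null rather than as one formula written two ways.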

Step 2 -- Asymptotic distribution. Under $H_0\colon \mu = 0$ and standard GARCH regularity conditions (stationarity, finite fourth moments), as $W \to \infty$:

$t_W \xrightarrow{d} N(0, 1)$

This follows because $\bar{r}_W / \sqrt{V_W}$ is a sum of martingale differences $\varepsilon_{t-i}$ scaled by known $\sigma_{t-i}$, and the martingale CLT applies. The key condition is that the standardized residuals $z_i = \varepsilon_i / \sigma_i$ are i.i.d. with mean 0 and variance 1.
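The limit can be checked by simulation. The sketch below draws many independent GARCH(1,1) paths under $H_0$ with the true $\sigma_t^2$ known, computes one $t_W$ per path, and compares the sample moments with those of $N(0,1)$. Gaussian shocks, the parameter values, and the path count are all assumptions chosen for illustration.

```python
import numpy as np

def mc_null_tstats(n_paths=4000, W=250, omega=0.05, alpha=0.05, beta=0.90,
                   burn=200, seed=1):
    """One known-variance t_W per simulated path, under mu = 0."""
    rng = np.random.default_rng(seed)
    sigma2 = np.full(n_paths, omega / (1.0 - alpha - beta))  # unconditional start
    sum_eps = np.zeros(n_paths)
    sum_s2 = np.zeros(n_paths)
    for t in range(burn + W):
        eps = np.sqrt(sigma2) * rng.standard_normal(n_paths)
        if t >= burn:                 # accumulate only over the test window
            sum_eps += eps
            sum_s2 += sigma2
        sigma2 = omega + alpha * eps**2 + beta * sigma2       # GARCH update
    # rbar / sqrt(V_W) simplifies to sum(eps) / sqrt(sum(sigma2)).
    return sum_eps / np.sqrt(sum_s2)

t_stats = mc_null_tstats()
print("mean:", t_stats.mean())
print("variance:", t_stats.var())
print("share with |t| > 1.96:", np.mean(np.abs(t_stats) > 1.96))
```

If the normal approximation is adequate at this $W$, the three printed numbers should sit near 0, 1, and 0.05 respectively, up to Monte Carlo noise.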

Step 3 -- Effect of estimating $\sigma_t^2$. When $\sigma_t^2$ is replaced by the GARCH estimate $\hat{\sigma}_t^2$, two distinct effects arise:

Estimation error in $\hat{\theta}$: The GARCH parameters $(\hat{\omega}, \hat{\alpha}, \hat{\beta})$ are estimated (typically by QMLE) on a history of length $T$, so each $\hat{\sigma}_t^2$ carries parameter noise.

Filter initialization bias: The recursive GARCH filter depends on an initial value $\sigma_0^2$. Early in the window, $\hat{\sigma}_t^2$ carries this initialization error.
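The initialization effect is easy to see numerically. For fixed parameters and a fixed residual series, two filters that differ only in $\sigma_0^2$ differ at step $t$ by exactly $\beta^t$ times the initial gap, because the $\omega$ and $\alpha\,\varepsilon_{t-1}^2$ terms cancel in the difference. The sketch below checks this; the residuals are placeholder noise and the parameter values are illustrative.

```python
import numpy as np

def garch_filter(eps, omega, alpha, beta, sigma2_0):
    """Run the GARCH(1,1) variance recursion from a chosen starting value."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(eps.size)
    sigma2[0] = sigma2_0
    for t in range(1, eps.size):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(0)
eps = rng.standard_normal(300)            # placeholder residuals
omega, alpha, beta = 0.05, 0.08, 0.90     # illustrative values

low = garch_filter(eps, omega, alpha, beta, sigma2_0=0.1)
high = garch_filter(eps, omega, alpha, beta, sigma2_0=10.0)
gap = high - low                          # shrinks geometrically at rate beta

print(gap[[0, 10, 50, 100]])
```

The decay rate depends only on $\beta$, so a highly persistent filter needs a longer warm-up before the starting value stops mattering.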

Quantify the parameter effect by a delta-method (first-order) expansion of the denominator. With $\hat{\theta} - \theta = O_p(T^{-1/2})$,

$\sum_{i=0}^{W-1}\big(\hat{\sigma}_{t-i}^2 - \sigma_{t-i}^2\big) = \Big(\sum_{i=0}^{W-1}\partial_\theta \sigma_{t-i}^2\Big)^{\!\top}(\hat{\theta} - \theta) + O_p\!\big(\tfrac{W}{T}\big) = O_p\!\big(W\,T^{-1/2}\big).$

Since $\sum_{i=0}^{W-1}\sigma_{t-i}^2 = O_p(W)$, the relative error in the variance is

$\frac{\hat{V}_W - V_W}{V_W} = O_p\!\big(T^{-1/2}\big), \qquad\text{hence}\qquad t_W(\hat{\theta}) - t_W(\theta) = O_p\!\big(T^{-1/2}\big).$
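The "hence" step can be spelled out. Writing $\delta_W = (\hat{V}_W - V_W)/V_W$ (shorthand introduced here), a first-order expansion of the inverse square root gives

$t_W(\hat{\theta}) = \frac{\bar{r}_W}{\sqrt{V_W}}\,(1 + \delta_W)^{-1/2} = t_W(\theta)\,\Big(1 - \tfrac{1}{2}\,\delta_W + O_p\!\big(\delta_W^2\big)\Big)$

so that $t_W(\hat{\theta}) - t_W(\theta) = -\tfrac{1}{2}\,t_W(\theta)\,\delta_W + O_p\!\big(\delta_W^2\big)$. Since $t_W(\theta) = O_p(1)$ and $\delta_W = O_p(T^{-1/2})$, the difference is $O_p(T^{-1/2})$.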

In the realistic regime where GARCH is fit on a long lookback, $T \gg W$, so $W/T \to 0$ and the plug-in error is asymptotically negligible. The feasible statistic therefore keeps the *same* limiting null law as the known-variance case:

$t_W(\hat{\theta}) \xrightarrow{d} N(0,1).$

There is *no* first-order variance inflation of the form $1 + W\,c(\theta)/T$, and no universal closed-form rescale $t_W/\sqrt{1 + W\,c(\theta)/T}$ is warranted: the estimation error enters only through the denominator at order $O_p(T^{-1/2})$ and washes out. (It does not create a systematic over-rejection.)

For *finite samples*, where the negligible-order argument is only approximate, the right remedy is not a closed-form correction but a filtered (parametric) bootstrap:

Re-estimate $\hat{\theta}$ on the data and form standardized residuals $\hat{z}_t = \hat{\varepsilon}_t/\hat{\sigma}_t$.
For each replication, resample $\hat{z}_t^{*}$, propagate the GARCH recursion to rebuild $\sigma_t^{2*}$ and a synthetic return path under the fitted null ($\mu = 0$), and re-estimate $\theta$ inside the replication (so the bootstrap inherits the same plug-in error as the real statistic).
Recompute $t_W$ on each replication under that re-fitted null and read the critical values / p-value from the bootstrap distribution.
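The three steps above can be sketched as follows. This is an outline under stated assumptions, not a production implementation: `placeholder_fit` is a hypothetical stand-in that fixes $\alpha$ and $\beta$ and variance-targets $\omega$, and it should be replaced by a real QMLE routine; the residuals are recentred and rescaled before resampling, which is one common choice among several; and all function names are invented for illustration.

```python
import numpy as np

def garch_filter(eps, omega, alpha, beta):
    """GARCH(1,1) variance recursion, started at the sample variance."""
    sigma2 = np.empty(eps.size)
    sigma2[0] = eps.var()
    for t in range(1, eps.size):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

def last_window_tstat(r, sigma2, W):
    """t_W for the most recent window: sum(r) / sqrt(sum(sigma2))."""
    return r[-W:].sum() / np.sqrt(sigma2[-W:].sum())

def placeholder_fit(r, alpha=0.05, beta=0.90):
    """NOT a QMLE. Fixes alpha, beta and variance-targets omega so the
    sketch runs end to end; swap in a real GARCH estimator."""
    return r.var() * (1.0 - alpha - beta), alpha, beta

def filtered_bootstrap_pvalue(r, W, fit_fn=placeholder_fit, n_boot=199, seed=0):
    rng = np.random.default_rng(seed)
    r = np.asarray(r, dtype=float)
    n = r.size
    # Step 1: fit on the data (under mu = 0, eps = r) and standardize.
    omega, alpha, beta = fit_fn(r)
    sigma2_hat = garch_filter(r, omega, alpha, beta)
    t_obs = last_window_tstat(r, sigma2_hat, W)
    z_hat = r / np.sqrt(sigma2_hat)
    z_hat = (z_hat - z_hat.mean()) / z_hat.std()   # recentre and rescale
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        # Step 2: resample shocks and rebuild a path under the fitted null.
        z = rng.choice(z_hat, size=n, replace=True)
        r_b = np.empty(n)
        s2 = r.var()
        for t in range(n):
            r_b[t] = np.sqrt(s2) * z[t]
            s2 = omega + alpha * r_b[t] ** 2 + beta * s2
        # Re-estimate inside the replication so plug-in error is inherited.
        theta_b = fit_fn(r_b)
        # Step 3: recompute the statistic on the synthetic path.
        t_star[b] = last_window_tstat(r_b, garch_filter(r_b, *theta_b), W)
    # Two-sided Monte Carlo p-value with the usual +1 adjustment.
    p_value = (1 + np.sum(np.abs(t_star) >= abs(t_obs))) / (n_boot + 1)
    return t_obs, p_value, t_star

# Usage on placeholder data (white noise, so H0 is true by construction).
demo_r = np.random.default_rng(42).standard_normal(400)
t_obs, p_value, t_star = filtered_bootstrap_pvalue(demo_r, W=60, n_boot=99)
print(t_obs, p_value)
```

Passing the estimator in as `fit_fn` keeps the resampling logic independent of how $\theta$ is fitted, so the same loop works once a proper likelihood-based fit is available.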

Apply burn-in trimming alongside this: discard the first $B$ observations of the GARCH filter (commonly $B = 50$-$100$) so initialization error does not contaminate the early window. The bootstrap captures parameter uncertainty, filter dynamics, persistence, and estimation/test-window overlap jointly -- effects a single scalar rescale cannot represent.

Practical Considerations:

If the conditional mean is misspecified (e.g., an AR term is missing), the residuals $\hat{\varepsilon}_t$ carry mean-model error, which biases both the GARCH variance estimates and the t-stat. A two-step procedure (first fit the mean model, then fit GARCH to residuals) with HAC-robust standard errors on the mean parameters is standard.
For short windows ($W < 50$), the asymptotic normal approximation is poor for reasons unrelated to parameter estimation. Use a $t$-distribution with effective degrees of freedom, or the filtered bootstrap.
In live trading, GARCH parameters are typically estimated on a longer lookback than the signal window $W$. This separation ($T \gg W$) is exactly what drives $W/T \to 0$ and makes the plug-in effect negligible; the standard $N(0,1)$ critical values remain valid to first order.

Answer: The rolling heteroskedastic t-stat is $t_W = \bar{r}_W \big/ \sqrt{W^{-2}\sum \hat{\sigma}_{t-i}^2}$. Under $H_0$ and GARCH regularity, $t_W \xrightarrow{d} N(0,1)$. When $\sigma_t^2$ is estimated on a long history $T \gg W$, the plug-in error is $O_p(T^{-1/2})$ with $W/T \to 0$, so it is asymptotically negligible and $t_W$ retains the *same* $N(0,1)$ null -- there is no universal $1 + W\,c(\theta)/T$ inflation and no closed-form rescale is needed. The correct finite-sample remedy is a filtered/parametric bootstrap that re-estimates $\theta$ within each replication and recomputes $t_W$ under the fitted null, combined with burn-in trimming of the GARCH filter.

Intuition

The fundamental issue here is that a t-statistic is only as good as its variance estimate. In a GARCH world, using the unconditional variance ignores the time-varying nature of risk -- you are treating a calm Tuesday the same as a volatile FOMC day. The heteroskedastic t-stat fixes this by letting each observation contribute to the variance according to its own conditional volatility. This is exactly the logic behind GLS: observations with higher noise get less weight.

The estimation subtlety is important in practice. GARCH parameters are estimated, not known, and that estimation error propagates into your test statistic. For large estimation samples relative to the window ($T \gg W$), this is negligible and the $N(0,1)$ critical values hold to first order. But in live trading systems where you might be re-estimating GARCH parameters frequently on limited data, the first-order argument is only approximate and the nominal critical values can be miscalibrated, so you may see more false signals than you expect. That is the case for calibrating with the filtered bootstrap rather than reaching for a closed-form correction.
