Online Mean and Variance for Streaming Returns

Statistics · Medium · Free problem
You receive a stream of returns for $N$ instruments over $T$ days, delivered in mini-batches of $B$ days at a time. $N$ can be up to
0^4$ and $T$ up to
0^6$. Design an algorithm that computes the exact sample mean vector $\hat{\mu} \in \mathbb{R}^N$ and diagonal variance vector $\hat{\sigma}^2 \in \mathbb{R}^N$ in a single pass using $O(N)$ memory. 1. State the update formulas (Welford-style) for incorporating each new mini-batch. 2. Explain how you handle missing values (NaNs) -- some instruments may have no return on certain days. 3. State the overall time complexity. **Constraints:** -
\le N \le 10^4$ -
\le T \le 10^6$ -
\le B \le T$ - Returns may contain NaN entries (missing data) - Only $O(N)$ memory allowed -- you cannot store the full $N \times T$ matrix

Open the full interactive solver, hints, and worked solution →