Ridge vs. Lasso Regression: Theory and Practice
Consider a linear model $y = X\beta + \varepsilon$ with design matrix $X \in \mathbb{R}^{n \times d}$, coefficient vector $\beta \in \mathbb{R}^d$, noise $\varepsilon \in \mathbb{R}^n$, and response $y \in \mathbb{R}^n$.
1. Write the optimization problems for ridge regression and lasso, including how the intercept is typically handled.
2. Derive the closed-form solution for ridge regression when $X^\top X$ is invertible.
3. Explain mathematically (not just heuristically) why lasso can produce exact zeros in the coefficient vector while ridge typically does not.
4. Describe a cross-validation scheme to select the regularization hyperparameter $\lambda$, and recommend a metric you would optimize when building a trading signal.
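
For checking answers to questions 2 and 3 numerically, here is a minimal sketch, not a worked solution. It assumes the data are centered so that the intercept is left unpenalized, uses the scaling $\tfrac12\|y - X\beta\|_2^2 + \tfrac{\lambda}{2}\|\beta\|_2^2$ for ridge and $\tfrac12\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$ for lasso, and solves the lasso with cyclic coordinate descent. The function names, the synthetic data, and the values of $\lambda$ are illustrative assumptions, not part of the problem.

```python
import numpy as np


def ridge_closed_form(X, y, lam):
    """Solve the ridge normal equations (X^T X + lam * I) beta = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X beta||^2 + lam * ||beta||_1 coordinate by coordinate."""
    d = X.shape[1]
    beta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)  # squared norm of each column
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(d):
            # Correlation of feature j with the residual that excludes feature j.
            rho = X[:, j] @ residual + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            # Keep the residual consistent with the updated coefficient.
            residual += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


# Synthetic data (an assumption for illustration): 3 of 10 true coefficients are nonzero.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[[0, 1, 9]] = [3.0, -2.0, 1.5]
y = 0.7 + X @ beta_true + 0.5 * rng.standard_normal(n)

# Centering X and y removes the intercept from the penalized problem.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

lam_ridge, lam_lasso = 50.0, 50.0  # illustrative values, not tuned
beta_ridge = ridge_closed_form(Xc, yc, lam_ridge)
beta_lasso = lasso_coordinate_descent(Xc, yc, lam_lasso)

print("exact zeros, ridge:", int(np.sum(beta_ridge == 0.0)))
print("exact zeros, lasso:", int(np.sum(beta_lasso == 0.0)))
```

Other scalings of the objectives (for example, dividing the squared error by $n$) only rescale $\lambda$; the qualitative contrast between the two penalties is unchanged.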