Closed-Form Lasso Under Orthonormal Design
Suppose $X$ is an $n \times p$ design matrix with orthonormal columns ($X^T X = I_p$) and $y \in \mathbb{R}^n$ is the response vector.
1. Show that the Lasso solution $\hat{\beta} = \arg\min_{\beta} \frac{1}{2}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1$ reduces to componentwise soft-thresholding of $X^T y$. Give the exact formula for each $\hat{\beta}_j$.
2. Explain how the threshold $\lambda$ controls feature selection: which coefficients are zeroed out and which survive?
3. Compare the Lasso shrinkage to Ridge regression shrinkage under the same orthonormal design. How do the two operators differ in their treatment of small vs. large coefficients?
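A small numerical sketch can serve as a sanity check on the three parts above. It relies on the standard result that, with $z = X^T y$, the Lasso objective separates into one-dimensional problems $\frac{1}{2}(z_j - \beta_j)^2 + \lambda|\beta_j|$, whose minimizer is $\hat{\beta}_j = \operatorname{sign}(z_j)\,(|z_j| - \lambda)_+$. The Ridge penalty is not specified in the problem, so the sketch assumes $\frac{\lambda}{2}\|\beta\|_2^2$, which gives $\hat{\beta}_j^{\text{ridge}} = z_j / (1 + \lambda)$. The values in `z`, the choice `lam = 1.0`, and all function names are hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Lasso coordinate under X^T X = I: sign(z) * max(|z| - lam, 0)."""
    mag = abs(z) - lam
    if mag <= 0:
        return 0.0  # |z| <= lam: the coefficient is set exactly to zero
    return mag if z > 0 else -mag


def ridge_shrink(z, lam):
    """Ridge coordinate under X^T X = I, ASSUMING penalty (lam/2)*||beta||_2^2."""
    return z / (1.0 + lam)


def grid_argmin(z, lam, lo=-4.0, hi=4.0, steps=8000):
    """Brute-force minimizer of 0.5*(z - b)^2 + lam*|b| on a uniform grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda b: 0.5 * (z - b) ** 2 + lam * abs(b))


# Hypothetical entries of z = X^T y (illustrative values, not from the problem).
z = [3.0, -0.5, 1.5, -2.5, 0.25]
lam = 1.0

lasso = [soft_threshold(zj, lam) for zj in z]
ridge = [ridge_shrink(zj, lam) for zj in z]

print("z     :", z)
print("lasso :", lasso)  # [2.0, 0.0, 0.5, -1.5, 0.0]
print("ridge :", ridge)  # [1.5, -0.25, 0.75, -1.25, 0.125]

# The closed form should agree with a direct search over the 1-D objective.
for zj in z:
    assert abs(grid_argmin(zj, lam) - soft_threshold(zj, lam)) < 1e-6
```

In this example the two entries with $|z_j| \le \lambda$ are set exactly to zero by the Lasso, while the survivors are each moved toward zero by the same amount $\lambda$. The Ridge operator under the assumed penalty scales every entry by the same factor $1/(1+\lambda)$, so no coefficient becomes exactly zero and larger entries lose more in absolute terms.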