Mean-Variance Optimization via Linear Algebra
You have $n$ assets with expected return vector $\mu \in \mathbb{R}^n$ and covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$ (symmetric, positive definite).
- Formulate the mean-variance optimization (MVO) problem in matrix notation: minimize portfolio variance subject to a target return and the constraint that weights sum to one.
- Derive the closed-form solution for the optimal weight vector $w^{*}$ using Lagrange multipliers.
- Show that as the target return varies, the optimal portfolios trace out the efficient frontier -- a parabola in $(\sigma_p^2, r_p)$ space.
- Discuss the practical challenges of implementing MVO, particularly the sensitivity to input estimation errors.
Hints
- Set up the Lagrangian with two constraints (target return and weights summing to one) and take the first-order condition with respect to $w$.
- You will need to invert $\Sigma$ -- define the scalars $A = \mathbf{1}^\top \Sigma^{-1} \mu$, $B = \mu^\top \Sigma^{-1} \mu$, $C = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$ to keep the algebra clean.
- After solving for the multipliers, substitute back into $w^{*} = \Sigma^{-1}(\lambda_1 \mu + \lambda_2 \mathbf{1})$ and compute $\sigma_p^2 = (w^{*})^\top \Sigma w^{*}$ as a function of $r_{\text{target}}$ to see the parabola.
Worked Solution
How to Think About It: MVO is just constrained quadratic programming dressed up in finance language. You are minimizing a quadratic form $w^\top \Sigma w$ (portfolio variance) subject to linear equality constraints. The math is clean because the objective is strictly convex ($\Sigma$ is positive definite) and the constraints are linear equalities, so the KKT conditions reduce to a linear system with a unique closed-form solution. The real challenge is not the algebra; it is that the solution is extremely sensitive to the expected return vector $\mu$, which is notoriously hard to estimate. A senior quant once told me: "MVO is an error-maximizing machine -- it takes your worst estimates and gives them the biggest weights."
Key Insight: The optimal weight vector is a linear (strictly speaking, affine) function of the target return. This means the entire frontier is generated by just two portfolios (the two-fund separation theorem).
The Method:
*Step 1 -- Formulation:*
$\min_w \frac{1}{2} w^\top \Sigma w \quad \text{subject to} \quad w^\top \mu = r_{\text{target}}, \quad \mathbf{1}^\top w = 1$
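The factor of $\frac{1}{2}$ is a convenience: it cancels the 2 produced by differentiating the quadratic form and does not change the minimizer.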
*Step 2 -- Lagrangian and KKT Conditions:*
$\mathcal{L} = \frac{1}{2} w^\top \Sigma w - \lambda_1(w^\top \mu - r_{\text{target}}) - \lambda_2(\mathbf{1}^\top w - 1)$
First-order condition: $\nabla_w \mathcal{L} = \Sigma w - \lambda_1 \mu - \lambda_2 \mathbf{1} = 0$, so:
$w^{*} = \Sigma^{-1}(\lambda_1 \mu + \lambda_2 \mathbf{1})$
*Step 3 -- Solve for Multipliers:*
Define the three scalars:
- $A = \mathbf{1}^\top \Sigma^{-1} \mu$
- $B = \mu^\top \Sigma^{-1} \mu$
- $C = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$
Substituting $w^{*}$ into the two constraints gives:
$\lambda_1 B + \lambda_2 A = r_{\text{target}}, \quad \lambda_1 A + \lambda_2 C = 1$
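Solving this $2 \times 2$ linear system, with $\Delta = BC - A^2$ (positive by the Cauchy-Schwarz inequality whenever $\mu$ is not proportional to $\mathbf{1}$), gives: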
$\lambda_1 = \frac{C \cdot r_{\text{target}} - A}{\Delta}, \quad \lambda_2 = \frac{B - A \cdot r_{\text{target}}}{\Delta}$
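Steps 2 and 3 can be sketched in NumPy as follows. This is a minimal illustration, not a production implementation: the function name `mvo_weights` and the three-asset inputs are made up for the example, and the code solves linear systems rather than forming $\Sigma^{-1}$ explicitly.

```python
import numpy as np

def mvo_weights(mu, Sigma, r_target):
    """Closed-form minimum-variance weights for a target return (short sales allowed)."""
    ones = np.ones(len(mu))
    # Solve Sigma x = b instead of inverting Sigma explicitly.
    Sinv_mu = np.linalg.solve(Sigma, mu)
    Sinv_1 = np.linalg.solve(Sigma, ones)
    A = ones @ Sinv_mu
    B = mu @ Sinv_mu
    C = ones @ Sinv_1
    delta = B * C - A**2  # determinant of the 2x2 system for the multipliers
    lam1 = (C * r_target - A) / delta
    lam2 = (B - A * r_target) / delta
    return lam1 * Sinv_mu + lam2 * Sinv_1

# Illustrative inputs (hypothetical numbers, not estimates of real assets).
mu = np.array([0.05, 0.08, 0.12])
Sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.090, 0.012],
                  [0.010, 0.012, 0.160]])

w = mvo_weights(mu, Sigma, r_target=0.09)
print("weights:", w)
print("sum of weights:", w.sum())
print("portfolio return:", w @ mu)
```

A quick sanity check is that the printed weights sum to one and reproduce the target return up to floating-point error.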
*Step 4 -- Efficient Frontier:*
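Left-multiplying the first-order condition $\Sigma w^{*} = \lambda_1 \mu + \lambda_2 \mathbf{1}$ by $(w^{*})^\top$ and using both constraints gives $(w^{*})^\top \Sigma \, w^{*} = \lambda_1 r_{\text{target}} + \lambda_2$. Substituting the multipliers, the minimum variance at target return $r_{\text{target}}$ is: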
$\sigma_p^2 = (w^{*})^\top \Sigma \, w^{*} = \frac{C \cdot r_{\text{target}}^2 - 2A \cdot r_{\text{target}} + B}{\Delta}$
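This is a parabola in $(\sigma_p^2, r_p)$ space, with its vertex at the global minimum-variance portfolio ($r_p = A/C$, $\sigma_p^2 = 1/C$). Only the upper branch, $r_p \geq A/C$, is efficient: it gives the highest return for a given variance, and it is the part investors care about.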
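The frontier formula can be checked numerically. The sketch below, again with made-up inputs and hypothetical helper names (`weights`, `frontier_variance`), compares the variance computed directly from the optimal weights with the quadratic in $r_{\text{target}}$, and evaluates the vertex of the parabola at $r = A/C$.

```python
import numpy as np

# Illustrative inputs (hypothetical numbers).
mu = np.array([0.05, 0.08, 0.12])
Sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.090, 0.012],
                  [0.010, 0.012, 0.160]])

ones = np.ones(len(mu))
Sinv_mu = np.linalg.solve(Sigma, mu)
Sinv_1 = np.linalg.solve(Sigma, ones)
A, B, C = ones @ Sinv_mu, mu @ Sinv_mu, ones @ Sinv_1
delta = B * C - A**2

def weights(r):
    """Optimal weights for target return r, from the closed form."""
    lam1 = (C * r - A) / delta
    lam2 = (B - A * r) / delta
    return lam1 * Sinv_mu + lam2 * Sinv_1

def frontier_variance(r):
    """Minimum variance as a quadratic function of the target return."""
    return (C * r**2 - 2 * A * r + B) / delta

targets = np.linspace(0.04, 0.14, 6)
direct = np.array([weights(r) @ Sigma @ weights(r) for r in targets])
closed = frontier_variance(targets)
max_gap = np.max(np.abs(direct - closed))

# Vertex of the parabola: the global minimum-variance portfolio.
r_gmv = A / C
var_gmv = 1.0 / C

print("largest gap between direct and closed-form variance:", max_gap)
print("minimum-variance return and variance:", r_gmv, var_gmv)
```

Plotting `closed` against `targets` (variance on the horizontal axis) would show the parabola; in volatility space the same curve becomes a hyperbola.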
Practical Considerations:
- Estimation of $\mu$: Expected returns are estimated with large error. Small changes in $\mu$ can flip the optimal portfolio entirely. Shrinkage estimators (e.g., James-Stein) and Bayesian approaches that blend views with equilibrium returns (e.g., Black-Litterman) help stabilize the solution.
- Estimation of $\Sigma$: More stable than $\mu$, but for large $n$ the sample covariance matrix may be ill-conditioned. Shrinkage toward a structured target (e.g., Ledoit-Wolf) or factor models are standard remedies.
- Short-sale constraints: The closed form above allows negative weights (short positions). Adding $w_i \geq 0$ constraints turns the problem into an inequality-constrained QP, which in general has no closed-form solution and requires a numerical QP solver.
- Two-fund separation: Any frontier portfolio can be written as a combination of any two distinct frontier portfolios, with combination weights that sum to one. This is a direct consequence of $w^{*}$ being affine (linear plus a constant) in $r_{\text{target}}$.
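The two-fund separation property is easy to confirm numerically. In this sketch (illustrative inputs, hypothetical helper `weights`), the frontier portfolio for a third target return is rebuilt from two other frontier portfolios, with the mixing coefficient chosen so that the target returns match.

```python
import numpy as np

# Illustrative inputs (hypothetical numbers).
mu = np.array([0.05, 0.08, 0.12])
Sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.090, 0.012],
                  [0.010, 0.012, 0.160]])

ones = np.ones(len(mu))
Sinv_mu = np.linalg.solve(Sigma, mu)
Sinv_1 = np.linalg.solve(Sigma, ones)
A, B, C = ones @ Sinv_mu, mu @ Sinv_mu, ones @ Sinv_1
delta = B * C - A**2

def weights(r):
    """Closed-form frontier weights for target return r."""
    lam1 = (C * r - A) / delta
    lam2 = (B - A * r) / delta
    return lam1 * Sinv_mu + lam2 * Sinv_1

r1, r2, r3 = 0.06, 0.11, 0.09
w1, w2, w3 = weights(r1), weights(r2), weights(r3)

# Choose alpha so that alpha * r1 + (1 - alpha) * r2 equals r3.
alpha = (r3 - r2) / (r1 - r2)
w_mix = alpha * w1 + (1 - alpha) * w2

gap = np.max(np.abs(w_mix - w3))
print("alpha:", alpha)
print("largest weight difference between mix and direct solution:", gap)
```

The gap should be at floating-point level, since both constructions describe the same point on the line $r \mapsto w^{*}(r)$.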
Answer: The optimal weight vector is $w^{*} = \Sigma^{-1}(\lambda_1 \mu + \lambda_2 \mathbf{1})$, where $\lambda_1 = (C r_{\text{target}} - A)/\Delta$ and $\lambda_2 = (B - A r_{\text{target}})/\Delta$ are determined by the constraints. The frontier is the parabola $\sigma_p^2 = (C r_{\text{target}}^2 - 2A r_{\text{target}} + B)/\Delta$, with $A = \mathbf{1}^\top \Sigma^{-1} \mu$, $B = \mu^\top \Sigma^{-1} \mu$, $C = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$, and $\Delta = BC - A^2$. In practice, MVO is highly sensitive to estimation error in $\mu$, making robust estimation or regularization essential.
Intuition
MVO is the foundation of modern portfolio theory, and every quant should be able to derive it from scratch. The key mathematical insight is that minimizing a quadratic form subject to linear constraints is one of the few optimization problems with a clean closed-form solution. The two-fund separation result -- that the entire efficient frontier is spanned by two portfolios -- is a direct consequence of the optimal weights being linear in the target return.
The practical lesson is more sobering: MVO is famously unstable. The optimizer concentrates weight on assets with the highest estimated returns and lowest estimated correlations, which are precisely the quantities with the most estimation error. This is why practitioners almost never use raw MVO. Instead, they use shrinkage estimators for $\mu$ and $\Sigma$, impose constraints on position sizes, or bypass $\mu$ estimation entirely (risk parity, minimum variance). Understanding why MVO fails in practice is arguably more important than being able to derive it.
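To make the instability concrete, here is a small sketch with made-up numbers: two highly correlated assets with nearly identical expected returns, plus a third low-volatility asset. The expected return of the first asset is bumped by half a percentage point and the closed-form weights are recomputed. The function name `mvo_weights`, the volatilities, correlations, and returns are all assumptions chosen for illustration.

```python
import numpy as np

def mvo_weights(mu, Sigma, r_target):
    """Closed-form minimum-variance weights for a target return (short sales allowed)."""
    ones = np.ones(len(mu))
    Sinv_mu = np.linalg.solve(Sigma, mu)
    Sinv_1 = np.linalg.solve(Sigma, ones)
    A, B, C = ones @ Sinv_mu, mu @ Sinv_mu, ones @ Sinv_1
    delta = B * C - A**2
    lam1 = (C * r_target - A) / delta
    lam2 = (B - A * r_target) / delta
    return lam1 * Sinv_mu + lam2 * Sinv_1

# Hypothetical inputs: assets 1 and 2 are near substitutes (correlation 0.95).
vols = np.array([0.20, 0.20, 0.10])
corr = np.array([[1.00, 0.95, 0.20],
                 [0.95, 1.00, 0.20],
                 [0.20, 0.20, 1.00]])
Sigma = np.outer(vols, vols) * corr

mu = np.array([0.080, 0.082, 0.050])
mu_bumped = mu + np.array([0.005, 0.0, 0.0])  # +0.5 percentage points on asset 1

r_target = 0.07
w_base = mvo_weights(mu, Sigma, r_target)
w_bump = mvo_weights(mu_bumped, Sigma, r_target)

# Sum of absolute weight changes caused by the small change in mu.
shift = np.abs(w_bump - w_base).sum()

print("base weights:  ", np.round(w_base, 3))
print("bumped weights:", np.round(w_bump, 3))
print("total absolute weight change:", round(shift, 3))
```

Because the two correlated assets are close substitutes, the optimizer treats a tiny return difference between them as a strong signal, which is the error-maximizing behavior described above.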