LAD Estimator as MLE Under Laplace Errors

Statistics · Medium · Free problem

Consider the location model $y_i = \mu + \varepsilon_i$ where the errors $\varepsilon_i$ are i.i.d. $\text{Laplace}(0, b)$ with density $f(\varepsilon) = \frac{1}{2b}\exp\!\left(-\frac{|\varepsilon|}{b}\right)$.

Show that the maximum likelihood estimator of $\mu$ is the least absolute deviations (LAD) estimator -- i.e., the value of $\mu$ that minimizes $\sum_{i=1}^n |y_i - \mu|$.

Show that the LAD estimator equals the sample median of $y_1, \ldots, y_n$.

Derive the asymptotic variance of the LAD estimator and compare its efficiency to OLS (the sample mean) under Laplace errors.

Hints

Write out the log-likelihood for Laplace errors -- what loss function does maximizing it correspond to?
To show the LAD minimizer is the median, examine the subgradient of $\sum |y_i - \mu|$ and find where it equals zero.
For the asymptotic variance, use the standard result that the sample median has asymptotic variance $1/(4nf(0)^2)$ where $f$ is the error density, and compare to $\text{Var}(\varepsilon_i)/n$ for the mean.

Worked Solution

How to Think About It: The connection between MLE and loss functions is direct -- taking the negative log-likelihood turns a maximization into a minimization of some loss. For Gaussian errors, the negative log-likelihood is, up to an additive constant, proportional to the sum of squared residuals, giving OLS. For Laplace errors, it is, up to an additive constant, proportional to the sum of absolute residuals, giving LAD. This is one reason the median is popular in robust statistics: it is the MLE when errors follow the Laplace distribution, whose tails are heavier than the Gaussian's, while the mean is the MLE when errors are Gaussian. Under Laplace errors, the median is asymptotically twice as efficient as the mean.

Key Insight: The Laplace density's absolute value in the exponent directly produces an $L^1$ loss function in the log-likelihood, making the LAD estimator the MLE.

The Method:

Part 1: LAD is the MLE.

The joint density of $y_1, \ldots, y_n$ given $\mu$ is:

$L(\mu) = \prod_{i=1}^n \frac{1}{2b} \exp\!\left(-\frac{|y_i - \mu|}{b}\right) = \frac{1}{(2b)^n} \exp\!\left(-\frac{1}{b}\sum_{i=1}^n |y_i - \mu|\right)$

The log-likelihood is:

$\ell(\mu) = -n\ln(2b) - \frac{1}{b}\sum_{i=1}^n |y_i - \mu|$

Maximizing $\ell(\mu)$ over $\mu$ is equivalent to minimizing:

$\sum_{i=1}^n |y_i - \mu|$

This is precisely the LAD objective. So the MLE $\hat{\mu}_{\text{MLE}} = \hat{\mu}_{\text{LAD}} = \arg\min_{\mu} \sum_i |y_i - \mu|$.
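The equivalence in Part 1 can be checked numerically. The sketch below evaluates the Laplace log-likelihood and the LAD objective on the same grid of candidate values for $\mu$ and confirms that the maximizer of one is the minimizer of the other. The data, the scale `b`, and the grid are illustrative assumptions, not part of the problem.

```python
import math

def laplace_loglik(mu, ys, b):
    """Log-likelihood of the location model with Laplace(0, b) errors."""
    n = len(ys)
    return -n * math.log(2 * b) - sum(abs(y - mu) for y in ys) / b

def lad_loss(mu, ys):
    """Sum of absolute deviations from mu."""
    return sum(abs(y - mu) for y in ys)

# Illustrative data and scale (assumed for this sketch).
ys = [0.3, -1.2, 2.5, 0.9, 4.1]
b = 1.5

# Candidate values of mu on a grid with step 0.01.
grid = [i / 100 for i in range(-200, 501)]

mu_mle = max(grid, key=lambda m: laplace_loglik(m, ys, b))
mu_lad = min(grid, key=lambda m: lad_loss(m, ys))

print(mu_mle, mu_lad)
```

Because the log-likelihood is a decreasing affine function of the LAD objective, the two grid searches pick the same point for any choice of $b > 0$; the value of $b$ affects the height of the likelihood but not where it peaks.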

Part 2: LAD equals the sample median.

The function $g(\mu) = \sum_{i=1}^n |y_i - \mu|$ is convex and piecewise linear, with kinks at each $y_i$. Its subdifferential at $\mu$ is:

$\partial g(\mu) = \sum_{i=1}^n \text{sign}(\mu - y_i)$

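where $\text{sign}(0)$ is read as the whole interval $[-1, 1]$, so each observation with $y_i = \mu$ contributes $[-1, 1]$ to the sum. The minimum occurs where $0 \in \partial g(\mu)$. Since $\partial g(\mu)$ is the interval centered at $|\{i : y_i < \mu\}| - |\{i : y_i > \mu\}|$ with half-width $|\{i : y_i = \mu\}|$, this requires:

$\big|\,|\{i : y_i < \mu\}| - |\{i : y_i > \mu\}|\,\big| \le |\{i : y_i = \mu\}|$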

This is exactly the condition that $\mu$ is a median of $y_1, \ldots, y_n$. For odd $n$, the minimizer is the middle order statistic. For even $n$, any value between the two middle order statistics minimizes $g$ (conventionally, we take the average).
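A small numerical check of Part 2, using made-up samples: for odd $n$ the LAD objective has a unique minimizer at the middle order statistic, and for even $n$ it is flat between the two middle order statistics. The sample values and the search grid are assumptions chosen for illustration.

```python
import statistics

def lad_loss(mu, ys):
    """Sum of absolute deviations from mu."""
    return sum(abs(y - mu) for y in ys)

# Illustrative samples (assumed): one with odd n, one with even n.
odd = [5.0, 1.0, 9.0, 3.0, 7.0]
even = [1.0, 3.0, 7.0, 9.0]

# Odd n: a grid search over [0, 10] should land on the sample median.
grid = [m / 10 for m in range(0, 101)]
best_odd = min(grid, key=lambda m: lad_loss(m, odd))
median_odd = statistics.median(odd)

# Even n: the loss takes a single value everywhere between the two
# middle order statistics (3 and 7), and is larger outside that interval.
flat_values = {lad_loss(m, even) for m in (3.0, 4.0, 5.0, 6.0, 7.0)}
outside = lad_loss(2.0, even)

print(best_odd, median_odd, flat_values, outside)
```

`statistics.median` returns the average of the two middle values for even $n$, which is one of the minimizers rather than the only one.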

Part 3: Asymptotic variance and efficiency comparison.

For the LAD estimator (sample median), the asymptotic distribution is:

$\sqrt{n}(\hat{\mu}_{\text{LAD}} - \mu) \xrightarrow{d} N\!\left(0, \frac{1}{4f(0)^2}\right)$

where $f(0) = \frac{1}{2b}$ is the density of $\varepsilon_i$ at zero. Substituting:

$\text{Var}_{\text{asy}}(\hat{\mu}_{\text{LAD}}) = \frac{1}{n} \cdot \frac{1}{4 \cdot (1/(2b))^2} = \frac{b^2}{n}$

For OLS (sample mean $\bar{y}$):

$\text{Var}(\bar{y}) = \frac{\text{Var}(\varepsilon_i)}{n} = \frac{2b^2}{n}$

since $\text{Var}(\text{Laplace}(0,b)) = 2b^2$.

Efficiency comparison: The asymptotic relative efficiency of LAD to OLS is:

$\text{ARE}(\text{LAD}, \text{OLS}) = \frac{\text{Var}(\bar{y})}{\text{Var}(\hat{\mu}_{\text{LAD}})} = \frac{2b^2/n}{b^2/n} = 2$

So under Laplace errors, the sample median is asymptotically twice as efficient as the sample mean. The median has asymptotic variance $b^2/n$, while the mean has variance $2b^2/n$.
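The efficiency ratio from Part 3 can be sanity-checked with a short Monte Carlo sketch. It draws Laplace errors as the difference of two independent exponentials with mean $b$ (which has the $\text{Laplace}(0, b)$ distribution), then compares the empirical variances of the sample mean and sample median across replications. The sample size, number of replications, and seed are arbitrary assumptions; since the result is asymptotic, the simulated ratio should only be close to 2, not equal to it.

```python
import random
import statistics

def laplace_sample(b, rng):
    """Draw from Laplace(0, b) as a difference of two Exp(mean b) draws."""
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)

def simulate(n=201, reps=2000, b=1.0, mu=0.0, seed=0):
    """Return empirical variances of the sample mean and sample median."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        ys = [mu + laplace_sample(b, rng) for _ in range(n)]
        means.append(sum(ys) / n)
        medians.append(statistics.median(ys))
    return statistics.pvariance(means), statistics.pvariance(medians)

var_mean, var_median = simulate()
ratio = var_mean / var_median  # empirical analogue of ARE(LAD, OLS)

print(var_mean, var_median, ratio)
```

With $b = 1$ and $n = 201$, the theoretical values to compare against are $2b^2/n \approx 0.00995$ for the mean and $b^2/n \approx 0.00498$ for the median; the median's finite-sample variance may sit somewhat above its asymptotic value.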
