Numerically Stable Log-Sum-Exp and Softmax

Coding · Medium · Free problem
Naive computation of $\log \sum_i e^{x_i}$ and $\text{softmax}(x)_j = e^{x_j} / \sum_i e^{x_i}$ blows up or underflows when entries of $x$ are large in magnitude. Your job is to implement numerically stable versions of both, and then build cross-entropy loss and its gradient on top. 1. Implement `log_sum_exp(x)` that computes $\log \sum_{i=1}^{n} e^{x_i}$ without overflow or underflow. 2. Implement `softmax(x)` that returns the vector $s$ with $s_j = e^{x_j} / \sum_i e^{x_i}$, also numerically stable. 3. Implement `cross_entropy_loss(logits, target)` where `target` is the index of the correct class. This should return $-\log s_{\text{target}}$ computed in a stable way. 4. Implement `cross_entropy_gradient(logits, target)` that returns $\partial L / \partial x$. 5. Write unit tests covering: - Normal inputs (small values) - Extreme inputs (entries on the order of $\pm 10^3$) - Inputs containing $-\infty$ - Inputs where all entries are equal - A single-element input **Constraints:** - Input is a 1-D array of floats with
\le n \le 10^5$. - You may use NumPy but not any library softmax / logsumexp. - All outputs must be free of `NaN` and `Inf` for valid inputs. **Example 1:** ``` Input: x = [1.0, 2.0, 3.0] log_sum_exp(x) = 3.40761... softmax(x) = [0.09003, 0.24473, 0.66524] ``` **Example 2:** ``` Input: x = [1000.0, 1000.0, 1000.0] log_sum_exp(x) = 1001.0986... softmax(x) = [0.33333, 0.33333, 0.33333] ``` **Example 3:** ``` Input: x = [-1000.0, -999.0, -998.0] log_sum_exp(x) = -997.5924... softmax(x) = [0.09003, 0.24473, 0.66524] ```

Open the full interactive solver, hints, and worked solution →