QR Factorization via Householder Reflectors
Let $A \in \mathbb{R}^{n \times p}$ be a full-rank matrix with $n \geq p$.
- Construct the first Householder reflector $H_1 = I - 2\mathbf{u}\mathbf{u}^T$ (where $\mathbf{u}$ is a unit vector) that zeroes out entries 2 through $n$ of the first column of $A$.
- Prove that $H$ is both symmetric and orthogonal.
- Derive the flop count for computing the full QR factorization via Householder reflectors, showing it is $O(np^2)$.
- Explain the numerical advantages of Householder QR over the classical Gram-Schmidt process.
Hints
- To zero entries below the diagonal in column $\mathbf{a}_1$, you need a reflection that maps $\mathbf{a}_1$ to $\alpha \mathbf{e}_1$ where $\alpha = \pm\|\mathbf{a}_1\|$. The reflection hyperplane is perpendicular to $\mathbf{a}_1 - \alpha \mathbf{e}_1$.
- For symmetry and orthogonality: expand $(I - 2\mathbf{u}\mathbf{u}^T)^2$ and use $\mathbf{u}^T\mathbf{u} = 1$.
- At step $k$, applying the reflector to the trailing submatrix costs $O((n-k)(p-k))$ flops. Sum over $k = 1, \ldots, p$ and extract the leading term in $n$ and $p$.
Worked Solution
How to Think About It: QR factorization writes $A = QR$ where $Q$ is orthogonal and $R$ is upper triangular. The Householder approach builds $R$ one column at a time: at each step, you apply a reflection that zeros out everything below the diagonal in the current column. Each reflection is an orthogonal transformation, so the product of reflections gives you $Q$. The beauty is that reflections are numerically stable -- they preserve vector norms exactly (in exact arithmetic) -- unlike Gram-Schmidt, which can lose orthogonality when columns are nearly dependent.
Formal Solution:
Part 1: Constructing the first Householder reflector.
Let $\mathbf{a}_1$ be the first column of $A$. We want $H_1 \mathbf{a}_1 = \alpha \mathbf{e}_1$ where $\mathbf{e}_1 = (1, 0, \ldots, 0)^T$ and $\alpha = \pm\|\mathbf{a}_1\|_2$.
Define the vector:
$\mathbf{v} = \mathbf{a}_1 - \alpha\, \mathbf{e}_1, \quad \alpha = -\text{sign}(a_{11})\|\mathbf{a}_1\|_2$
The sign choice $\alpha = -\text{sign}(a_{11})\|\mathbf{a}_1\|$ avoids catastrophic cancellation when $a_{11}$ and $\|\mathbf{a}_1\|$ have the same sign.
Set the unit vector:
$\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|_2}$
Then the Householder reflector is:
$H_1 = I - 2\mathbf{u}\mathbf{u}^T$
Verification: $H_1 \mathbf{a}_1 = \mathbf{a}_1 - 2\mathbf{u}(\mathbf{u}^T\mathbf{a}_1)$. Since $\mathbf{v} = \mathbf{a}_1 - \alpha\mathbf{e}_1$, we have $\mathbf{u}^T\mathbf{a}_1 = \frac{\mathbf{v}^T\mathbf{a}_1}{\|\mathbf{v}\|}$, and $\mathbf{v}^T\mathbf{a}_1 = \|\mathbf{a}_1\|^2 - \alpha a_{11}$. Working through the algebra:
$\|\mathbf{v}\|^2 = \|\mathbf{a}_1\|^2 - 2\alpha a_{11} + \alpha^2 = 2(\alpha^2 - \alpha a_{11})$
So $\mathbf{u}^T\mathbf{a}_1 = \frac{\alpha^2 - \alpha a_{11}}{\|\mathbf{v}\|} = \frac{\|\mathbf{v}\|^2}{2\|\mathbf{v}\|} = \frac{\|\mathbf{v}\|}{2}$.
Therefore $H_1\mathbf{a}_1 = \mathbf{a}_1 - 2\mathbf{u}\cdot\frac{\|\mathbf{v}\|}{2} = \mathbf{a}_1 - \mathbf{v} = \alpha\mathbf{e}_1$. The entries below position 1 are zeroed.
Part 2: Proving $H$ is symmetric and orthogonal.
Symmetry: $H^T = (I - 2\mathbf{u}\mathbf{u}^T)^T = I^T - 2(\mathbf{u}\mathbf{u}^T)^T = I - 2\mathbf{u}\mathbf{u}^T = H$.
Orthogonality: $H^T H = H H = (I - 2\mathbf{u}\mathbf{u}^T)(I - 2\mathbf{u}\mathbf{u}^T)$
$= I - 4\mathbf{u}\mathbf{u}^T + 4\mathbf{u}(\mathbf{u}^T\mathbf{u})\mathbf{u}^T = I - 4\mathbf{u}\mathbf{u}^T + 4\mathbf{u}\mathbf{u}^T = I$
using $\mathbf{u}^T\mathbf{u} = 1$ (since $\mathbf{u}$ is a unit vector).
Since $H^T = H$ and $H^T H = I$, we also have $H^2 = I$, so $H$ is an involution (its own inverse). Geometrically, it is a reflection through the hyperplane perpendicular to $\mathbf{u}$. $\blacksquare$
Part 3: Flop count for full QR.
The full QR factorization applies $p$ successive Householder reflectors. At step $k$ ($k = 1, \ldots, p$), the reflector $H_k$ acts on the trailing $(n - k + 1) \times (p - k + 1)$ submatrix.
At step $k$: - Computing $\mathbf{v}_k$: $O(n - k)$ flops. - Applying $H_k$ to the remaining $p - k$ columns: for each column, compute $\mathbf{u}_k^T \mathbf{c}_j$ (an inner product costing