Eigenvectors of an Equicorrelation Matrix
Consider $n$ assets, each with variance $\sigma^2$ and every pairwise correlation equal to $\rho$. The covariance matrix has the form
$\Sigma = \sigma^2[(1 - \rho)I + \rho \mathbf{1}\mathbf{1}^T]$
where $I$ is the $n \times n$ identity and $\mathbf{1}$ is the all-ones vector.
- Find all eigenvalues and eigenvectors of $\Sigma$.
- What does the first principal component represent? What fraction of total variance does it explain?
Hints
- The matrix is a scaled identity plus a rank-1 perturbation. Think about how a rank-1 update affects the spectrum.
- Try the all-ones vector $\mathbf{1}$ as a candidate eigenvector. What happens when $\mathbf{1}\mathbf{1}^T$ acts on $\mathbf{1}$?
- Any vector whose components sum to zero is killed by the $\rho \mathbf{1}\mathbf{1}^T$ term, leaving only the $(1 - \rho)I$ piece.
Worked Solution
How to Think About It: This matrix has a very special structure: it is a scaled identity plus a rank-1 perturbation. Whenever you see "identity plus rank-1," you should immediately recognize the eigenstructure. The rank-1 piece $\mathbf{1}\mathbf{1}^T$ acts only along the direction of $\mathbf{1}$ and annihilates everything orthogonal to it, so vectors orthogonal to $\mathbf{1}$ see only the scaled identity. The eigenvectors therefore split into two groups: the all-ones direction and everything perpendicular to it. This pattern comes up constantly in portfolio theory.
Quick Estimate: Before doing any algebra, think about what the eigenvalues should look like. If $\rho = 0$, we get $\Sigma = \sigma^2 I$, so all eigenvalues equal $\sigma^2$. If $\rho = 1$, the matrix is $\sigma^2 \mathbf{1}\mathbf{1}^T$, which has one eigenvalue $n\sigma^2$ and the rest are zero. So as $\rho$ increases from 0 to 1, one eigenvalue should grow from $\sigma^2$ toward $n\sigma^2$ while the others shrink toward zero. This is exactly what we will find.
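The quick estimate above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `equicorrelation_cov` and the choice $n = 5$, $\sigma = 1$ are illustrative and not part of the original problem.

```python
import numpy as np

def equicorrelation_cov(n, rho, sigma=1.0):
    """Build sigma^2 * [(1 - rho) * I + rho * 1 1^T] (hypothetical helper)."""
    return sigma**2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

n = 5  # illustrative size
for rho in (0.0, 0.5, 1.0):
    # eigvalsh handles symmetric matrices and returns eigenvalues in ascending order
    eigenvalues = np.linalg.eigvalsh(equicorrelation_cov(n, rho))
    print(f"rho={rho}: smallest={eigenvalues[0]:.4f}, largest={eigenvalues[-1]:.4f}")
```

With these inputs the largest eigenvalue should move from 1 to 3 to 5 (that is, toward $n\sigma^2$), while the smallest moves from 1 to 0.5 to 0, up to floating-point error.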
Approach: Use the rank-1 perturbation structure to identify eigenvectors by inspection.
Formal Solution:
Eigenvector 1 -- the market direction: Let $\mathbf{v}_1 = \frac{1}{\sqrt{n}}\mathbf{1}$. Then:
$\Sigma \mathbf{v}_1 = \sigma^2[(1 - \rho)\mathbf{v}_1 + \rho \mathbf{1}(\mathbf{1}^T \mathbf{v}_1)]$
Since $\mathbf{1}^T \mathbf{v}_1 = \sqrt{n}$, we get $\mathbf{1}(\mathbf{1}^T \mathbf{v}_1) = \sqrt{n} \cdot \mathbf{1} = n \mathbf{v}_1$. Therefore:
$\Sigma \mathbf{v}_1 = \sigma^2[(1 - \rho) + n\rho] \mathbf{v}_1 = \sigma^2[1 + (n-1)\rho] \mathbf{v}_1$
$\lambda_1 = \sigma^2[1 + (n-1)\rho]$
Eigenvectors 2 through $n$ -- the long-short directions: Take any vector $\mathbf{v}$ with $\mathbf{1}^T \mathbf{v} = 0$ (i.e., components sum to zero). Then:
$\Sigma \mathbf{v} = \sigma^2[(1 - \rho)\mathbf{v} + \rho \mathbf{1}(\underbrace{\mathbf{1}^T \mathbf{v}}_{= 0})] = \sigma^2(1 - \rho)\mathbf{v}$
$\lambda_2 = \lambda_3 = \cdots = \lambda_n = \sigma^2(1 - \rho)$
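This eigenvalue has multiplicity $n-1$, matching the dimension of the subspace $\{\mathbf{v} : \mathbf{1}^T \mathbf{v} = 0\}$, so any orthonormal basis of that subspace works. Since $\lambda_1 - \lambda_2 = n\rho\sigma^2$, the all-ones direction carries the largest eigenvalue whenever $\rho > 0$, which makes $\mathbf{v}_1$ the first principal component.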
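To make the degenerate eigenspace concrete, here is a sketch that builds one orthonormal basis of the sum-to-zero subspace and checks that every basis vector, and every rotation of the basis, is an eigenvector with eigenvalue $\sigma^2(1 - \rho)$. The QR-based construction and the values $n = 6$, $\rho = 0.3$, $\sigma = 2$ are assumptions chosen for illustration; any other way of producing an orthonormal basis orthogonal to $\mathbf{1}$ would do.

```python
import numpy as np

n, rho, sigma = 6, 0.3, 2.0  # illustrative values
Sigma = sigma**2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))
lam2 = sigma**2 * (1.0 - rho)

# QR of [1 | random columns]: the first column of Q is proportional to the
# all-ones vector, so the remaining columns are orthonormal and sum to zero.
rng = np.random.default_rng(0)
A = np.column_stack([np.ones(n), rng.standard_normal((n, n - 1))])
Q, _ = np.linalg.qr(A)
basis = Q[:, 1:]  # n x (n-1) orthonormal basis of {v : 1^T v = 0}

sums_to_zero = np.allclose(basis.sum(axis=0), 0.0)
is_eigenbasis = np.allclose(Sigma @ basis, lam2 * basis)

# Rotate within the eigenspace using an arbitrary orthogonal (n-1) x (n-1) matrix.
R, _ = np.linalg.qr(rng.standard_normal((n - 1, n - 1)))
rotated = basis @ R
rotated_is_eigenbasis = np.allclose(Sigma @ rotated, lam2 * rotated)

print(sums_to_zero, is_eigenbasis, rotated_is_eigenbasis)
```

All three checks should come out true, which is the numerical counterpart of the statement that the basis of this subspace is not unique.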
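Fraction of variance explained by PC1: Every diagonal entry of $\Sigma$ equals $\sigma^2$, so $\text{tr}(\Sigma) = n\sigma^2$, which also equals $\lambda_1 + (n-1)\lambda_2$. Hence: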
$\frac{\lambda_1}{\text{tr}(\Sigma)} = \frac{\sigma^2[1 + (n-1)\rho]}{n\sigma^2} = \frac{1 + (n-1)\rho}{n}$
For large $n$, this approaches $\rho$, since it equals $\rho + (1 - \rho)/n$ exactly. So if you have 100 stocks with a common pairwise correlation of 0.3, the first PC explains $(1 + 99 \cdot 0.3)/100 = 30.7\%$ of total variance, or roughly 30%.
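The convergence toward $\rho$ is easy to tabulate. This is a small sketch in plain Python; the function name `pc1_variance_fraction` is a hypothetical helper introduced here, and the universe sizes are arbitrary.

```python
def pc1_variance_fraction(n, rho):
    """Fraction of total variance on the all-ones direction: (1 + (n - 1) * rho) / n."""
    return (1.0 + (n - 1) * rho) / n

rho = 0.3
for n in (10, 100, 1000):
    fraction = pc1_variance_fraction(n, rho)
    # The gap to rho is (1 - rho) / n, so it shrinks like 1/n.
    print(f"n={n}: fraction={fraction:.4f}, gap to rho={fraction - rho:.4f}")
```

For $\rho = 0.3$ the fractions are about 0.37, 0.307, and 0.3007, so the approximation "fraction $\approx \rho$" is already within one percentage point at $n = 100$.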
Answer: The eigenvalues are $\lambda_1 = \sigma^2[1 + (n-1)\rho]$ (with eigenvector $\propto \mathbf{1}$) and $\lambda_2 = \cdots = \lambda_n = \sigma^2(1 - \rho)$ (with eigenvectors spanning the orthogonal complement of $\mathbf{1}$). For $\rho > 0$, the first PC is the equal-weighted market factor, capturing a fraction $\frac{1 + (n-1)\rho}{n} \approx \rho$ of total variance for large $n$. The remaining PCs represent zero-net-investment (long-short) portfolios: relative value bets that are orthogonal to the market.
Intuition
The equicorrelation matrix is the simplest model of a market driven by one common factor plus idiosyncratic noise. The first PC (the equal-weighted portfolio) captures that common factor, and its eigenvalue grows linearly with $n$ and $\rho$ because adding more correlated assets amplifies the shared signal. Every other eigenvector is a long-short portfolio -- its components sum to zero, so the common factor cancels out and you are left with pure idiosyncratic risk, which has eigenvalue $\sigma^2(1 - \rho)$ regardless of $n$.
This decomposition is the backbone of single-factor risk models in practice. Under this model, when someone says "the market explains 30% of variance" for a large universe, they are implicitly saying the average pairwise correlation is about 0.3. A subtlety that is easy to miss is that the $(n-1)$-fold degeneracy means PCs 2 through $n$ are not uniquely determined: any rotation within that eigenspace works equally well. In real data with heterogeneous correlations, the degeneracy breaks and you get interpretable sector/style factors.
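As a toy illustration of the degeneracy breaking, consider a hypothetical two-sector correlation matrix: correlation 0.5 within a sector and 0.2 across sectors, with three assets per sector. These numbers are assumptions chosen purely for illustration, not estimates from data.

```python
import numpy as np

n_per_sector = 3
rho_within, rho_across = 0.5, 0.2  # hypothetical correlations

sector = np.repeat([0, 1], n_per_sector)          # sector label per asset
same_sector = sector[:, None] == sector[None, :]  # boolean block structure
corr = np.where(same_sector, rho_within, rho_across)
np.fill_diagonal(corr, 1.0)

# eigh returns ascending eigenvalues; reverse to get PC1 first.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

print(np.round(eigenvalues, 4))
print(np.round(eigenvectors[:, 1], 4))  # second PC (sign is arbitrary)
```

In this sketch the spectrum is 2.6, 1.4, and then 0.5 repeated four times. The second eigenvalue has separated from the rest, and its eigenvector has equal magnitude on every asset with opposite signs across the two sectors, so it reads as a sector-versus-sector long-short factor. The remaining four directions are still degenerate because assets within a sector are exchangeable.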