1. **Problem Statement:** We want to understand Principal Component Analysis (PCA) in high dimensions, focusing on dimension reduction, spectral properties of covariance matrices, and the spike model.
2. **PCA as Best d-Dimensional Affine Fit:** Given data points $x_1, \ldots, x_n \in \mathbb{R}^p$, PCA finds a $d$-dimensional affine subspace minimizing the sum of squared distances:
$$\min_{\mu, V, \beta_k} \sum_{k=1}^n \|x_k - (\mu + V \beta_k)\|^2 \quad \text{s.t. } V^T V = I_d,$$
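where the columns of $V = [v_1 \cdots v_d] \in \mathbb{R}^{p \times d}$ form an orthonormal basis of the subspace directions and $\beta_k \in \mathbb{R}^d$ holds the coordinates of $x_k$ in that basis.
3. **Optimal Translation $\mu^*$:** Setting the gradient with respect to $\mu$ to zero shows that the sample mean is optimal (any shift of $\mu$ within the span of $V$ can be absorbed into the $\beta_k$, so the mean is the conventional choice):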
$$\mu^* = \mu_n = \frac{1}{n} \sum_{k=1}^n x_k.$$
4. **Optimal Coefficients $\beta_k$:** For fixed $V$, the best $\beta_k$ is the projection:
$$\beta_k = V^T (x_k - \mu_n).$$
5. **Equivalent Optimization:** Substituting $\mu_n$ and $\beta_k$ into the objective, the problem reduces to maximizing the variance captured by the subspace:
$$\max_{V^T V = I} \mathrm{Tr}(V^T \Sigma_n V),$$
where $\Sigma_n = \frac{1}{n-1} \sum_{k=1}^n (x_k - \mu_n)(x_k - \mu_n)^T$ is the sample covariance.
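The reduction can be spelled out in one worked step, using only the quantities defined above. With $\beta_k = V^T (x_k - \mu_n)$ the residual is $(I - V V^T)(x_k - \mu_n)$, and since $V V^T$ is an orthogonal projection,
$$\sum_{k=1}^n \|(I - V V^T)(x_k - \mu_n)\|^2 = \sum_{k=1}^n \|x_k - \mu_n\|^2 - \sum_{k=1}^n \|V^T (x_k - \mu_n)\|^2 = \sum_{k=1}^n \|x_k - \mu_n\|^2 - (n-1)\, \mathrm{Tr}(V^T \Sigma_n V).$$
The first term does not depend on $V$, so minimizing the residual is the same as maximizing $\mathrm{Tr}(V^T \Sigma_n V)$.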
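6. **Solution:** An optimal $V$ has as its columns the eigenvectors of $\Sigma_n$ associated with its $d$ largest eigenvalues.
7. **PCA as Variance Maximization:** Starting instead from the goal of finding $d$ orthonormal directions that maximize the sample variance of the projected data leads to the same problem, $\max_{V^T V = I} \mathrm{Tr}(V^T \Sigma_n V)$, so the best-fit and variance-maximization views give the same solution.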
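A minimal NumPy sketch of steps 3 to 6 follows. The function name `pca_eig`, the row-per-point layout of `X`, and the synthetic data are assumptions made for illustration, not part of the original solution.

```python
import numpy as np

def pca_eig(X, d):
    """PCA via eigen-decomposition of the sample covariance.

    X is an (n, p) array with one data point per row (assumed layout).
    Returns the sample mean, the (p, d) basis V, and all eigenvalues
    of the sample covariance sorted in decreasing order.
    """
    mu = X.mean(axis=0)                       # optimal translation: sample mean
    Xc = X - mu                               # centered data
    sigma = Xc.T @ Xc / (X.shape[0] - 1)      # sample covariance, 1/(n-1) scaling
    evals, evecs = np.linalg.eigh(sigma)      # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]           # reorder to descending
    return mu, evecs[:, order[:d]], evals[order]

# Synthetic data with very different variances along each coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])

mu, V, evals = pca_eig(X, d=2)
beta = (X - mu) @ V                 # coefficients beta_k = V^T (x_k - mu_n)
residual = (X - mu) - beta @ V.T    # part of each point not explained by the fit
print("captured variance:", evals[:2].sum(), "of", evals.sum())
```

The squared residual norm summed over all points equals $(n-1)$ times the sum of the discarded eigenvalues, which is a quick sanity check on any implementation.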
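8. **Computational Complexity:** Forming $\Sigma_n$ costs $O(np^2)$ and its eigen-decomposition costs $O(p^3)$. Alternatively, with $X = [x_1 \cdots x_n] \in \mathbb{R}^{p \times n}$, a thin SVD of the centered data matrix $X - \mu_n 1^T$ costs $O(\min(n^2 p, p^2 n))$ and avoids forming $\Sigma_n$: the left singular vectors are the principal directions, and the squared singular values divided by $n-1$ are the eigenvalues of $\Sigma_n$. The saving is largest when $n \ll p$.
9. **Choosing $d$:** Plot the eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots$ in decreasing order (a scree plot) and look for an "elbow" beyond which they level off; choose $d$ as the number of eigenvalues before the elbow.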
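The SVD route can be sketched as below. The name `pca_svd` and the row-per-point layout are assumptions; with points stored as rows, the centered matrix is the transpose of $X - \mu_n 1^T$, so the principal directions appear as the right singular vectors.

```python
import numpy as np

def pca_svd(X, d):
    """PCA via a thin SVD of the centered data, without forming the covariance.

    X is an (n, p) array with one data point per row (assumed layout).
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # Thin SVD: Xc = U @ diag(s) @ Vt, with s sorted in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    evals = s ** 2 / (X.shape[0] - 1)   # eigenvalues of the sample covariance
    return mu, Vt[:d].T, evals

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8))
mu, V, evals = pca_svd(X, d=3)

# Cross-check against the covariance route (np.cov uses 1/(n-1) by default).
sigma = np.cov(X, rowvar=False)
print(np.allclose(evals, np.sort(np.linalg.eigvalsh(sigma))[::-1]))
```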
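Reading an elbow off a plot is a visual judgment. As a simple numerical stand-in (a different heuristic, not the elbow rule itself), one can pick the smallest $d$ whose eigenvalues account for a chosen fraction of the total variance. The function names and the thresholds below are illustrative assumptions.

```python
import numpy as np

def cumulative_variance_ratio(evals):
    """Fraction of total variance captured by the top 1, 2, ... eigenvalues."""
    evals = np.sort(np.asarray(evals, dtype=float))[::-1]
    return np.cumsum(evals) / evals.sum()

def smallest_d(evals, threshold):
    """Smallest d whose top-d eigenvalues capture at least `threshold` of the variance."""
    ratio = cumulative_variance_ratio(evals)
    d = int(np.searchsorted(ratio, threshold)) + 1
    return min(d, len(ratio))   # clamp in case the threshold exceeds 1

evals = [5.0, 3.0, 1.0, 0.5, 0.5]
print(cumulative_variance_ratio(evals))
print(smallest_d(evals, 0.85))
```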
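10. **High-Dimensional PCA and Marchenko-Pastur Law:** Suppose the columns of $X = [x_1 \cdots x_n] \in \mathbb{R}^{p \times n}$ are i.i.d. $N(0, I_p)$, i.e. pure noise with mean zero. When $p, n \to \infty$ with $p/n \to \gamma \in (0, 1]$, the empirical distribution of the eigenvalues of the sample covariance $S_n = \frac{1}{n} X X^T$ converges to the Marchenko-Pastur distribution: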
$$dF_\gamma(\lambda) = \frac{\sqrt{(\gamma_+ - \lambda)(\lambda - \gamma_-)}}{2 \pi \gamma \lambda} 1_{[\gamma_-, \gamma_+]}(\lambda) d\lambda,$$
where $\gamma_\pm = (1 \pm \sqrt{\gamma})^2$.
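A small simulation can make the law concrete. The sketch below draws pure-noise Gaussian data, computes the eigenvalues of $S_n$, and compares their range with $[\gamma_-, \gamma_+]$; the sizes $n = 2000$, $p = 500$ and the name `mp_density` are illustrative assumptions, and at finite size the agreement is only approximate.

```python
import numpy as np

def mp_density(lam, gamma):
    """Marchenko-Pastur density for unit-variance noise and aspect ratio gamma <= 1."""
    lo, hi = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
    lam = np.asarray(lam, dtype=float)
    dens = np.zeros_like(lam)
    inside = (lam > lo) & (lam < hi)          # density is zero outside the support
    dens[inside] = np.sqrt((hi - lam[inside]) * (lam[inside] - lo)) / (
        2 * np.pi * gamma * lam[inside]
    )
    return dens

rng = np.random.default_rng(2)
n, p = 2000, 500
gamma = p / n
X = rng.normal(size=(p, n))                   # columns are i.i.d. N(0, I_p)
S = X @ X.T / n                               # sample covariance S_n
evals = np.linalg.eigvalsh(S)

gamma_minus, gamma_plus = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
print("empirical range:", evals.min(), evals.max())
print("predicted support:", gamma_minus, gamma_plus)

# Numerical check that the density integrates to about 1 (manual trapezoid rule).
grid = np.linspace(gamma_minus, gamma_plus, 20001)
dens = mp_density(grid, gamma)
total_mass = np.sum((dens[1:] + dens[:-1]) / 2 * np.diff(grid))
```

Overlaying a histogram of `evals` on `mp_density(grid, gamma)` gives the usual picture: the eigenvalues spread over the whole support even though every population eigenvalue equals 1.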
11. **Spike Model:** Consider $\Sigma = I + \beta v v^T$ with $v$ a unit vector and $\beta \geq 0$. The data are i.i.d. samples $x_k \sim N(0, \Sigma)$.
12. **BBP Phase Transition:** As $p, n \to \infty$ with $p/n \to \gamma$, the largest eigenvalue $\lambda_{max}(S_n)$ of the sample covariance exhibits a phase transition:
- If $\beta \leq \sqrt{\gamma}$, $\lambda_{max}(S_n) \to \gamma_+$ (edge of Marchenko-Pastur support).
- If $\beta > \sqrt{\gamma}$, $\lambda_{max}(S_n) \to (1 + \beta)(1 + \frac{\gamma}{\beta}) > \gamma_+$.
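13. **Eigenvector Alignment:** The leading eigenvector $v_{max}$ of $S_n$ retains a nonvanishing correlation with $v$ only if $\beta > \sqrt{\gamma}$, in which case $|\langle v_{max}, v \rangle|^2 \to \frac{1 - \gamma/\beta^2}{1 + \gamma/\beta}$; for $\beta \leq \sqrt{\gamma}$ the correlation tends to $0$.
14. **Summary:** PCA finds the principal components as the leading eigenvectors of the sample covariance matrix. In high dimensions, random matrix theory (the Marchenko-Pastur law) describes how the sample eigenvalues spread out even when the data are pure noise. Spike models reveal the detectability threshold (the BBP transition) for a low-rank signal in noise.
**Final answer:** The PCA components are the top $d$ eigenvectors of $\Sigma_n$. In high dimensions with pure-noise data, the eigenvalues of $S_n$ follow the Marchenko-Pastur law on $[\gamma_-, \gamma_+]$. In the spike model, the top eigenvalue separates from the bulk edge $\gamma_+$, and the top eigenvector correlates with $v$, exactly when $\beta > \sqrt{\gamma}$.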
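The transition can be observed numerically. The sketch below samples from the spike model and compares the top eigenvalue of $S_n$ with the limits stated in step 12. The function names, the choice $v = e_1$, and the sizes are assumptions for illustration, and finite-size results fluctuate around the limits.

```python
import numpy as np

def simulate_spike(n, p, beta, rng):
    """Draw n samples from N(0, I + beta v v^T) with v = e_1 (assumed choice of v).

    Returns the largest eigenvalue of S_n = X X^T / n and |<v_max, v>|.
    """
    v = np.zeros(p)
    v[0] = 1.0
    Z = rng.normal(size=(p, n))               # isotropic noise, covariance I
    g = rng.normal(size=n)                    # independent scalar signal per sample
    X = Z + np.sqrt(beta) * np.outer(v, g)    # columns have covariance I + beta v v^T
    S = X @ X.T / n
    evals, evecs = np.linalg.eigh(S)          # ascending order
    return evals[-1], abs(evecs[:, -1] @ v)

def predicted_top_eigenvalue(beta, gamma):
    """Limit of lambda_max(S_n) on either side of the BBP threshold."""
    if beta <= np.sqrt(gamma):
        return (1 + np.sqrt(gamma)) ** 2      # stuck at the bulk edge gamma_+
    return (1 + beta) * (1 + gamma / beta)    # pops out of the bulk

n, p = 2000, 500
gamma = p / n                                 # threshold is sqrt(gamma) = 0.5
rng = np.random.default_rng(3)
for beta in (0.1, 0.5, 1.0, 2.0):
    top, align = simulate_spike(n, p, beta, rng)
    print(beta, top, predicted_top_eigenvalue(beta, gamma), align)
```

Below the threshold the printed alignment should be close to zero and the top eigenvalue close to $\gamma_+$; above it both move away from their noise-only values.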