Contributions to high-dimensional principal component analysis
- Principal component analysis is a widely used dimension reduction method, but difficulties can arise when it is applied to very high-dimensional data. In this thesis, motivated by examples in chemometrics, signal processing, econometrics, etc., we investigate two aspects of high-dimensional principal component analysis for a class of "low-rank signal plus noise" models under a Gaussian assumption. In the first part, we study rates of convergence for the distributions of extreme sample eigenvalues to their Tracy-Widom limits, when there is no signal in the observations and the sample size and the dimensionality of the feature space grow to infinity proportionally. With a careful choice of the rescaling constants, we improve the rate to the second order for the largest eigenvalue. An analogous result is established for the smallest eigenvalue. Numerical experiments show that the asymptotic distributions are informative even when the sample size or the feature dimensionality is as small as 2. In the second part, we consider recovery of principal subspaces under the assumption that the principal axes have a sparse representation. We find that a new iterative thresholding approach recovers the leading principal subspace consistently, and even achieves a near-optimal rate of convergence, in a wide range of high-dimensional settings. Both statistical and computational properties of the approach are studied. Its competitive performance is demonstrated on a collection of simulated examples.
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Statistics
|Statement of responsibility
|Submitted to the Department of Statistics.
|Thesis (Ph. D.)--Stanford University, 2010.
- © 2010 by Zongming Ma
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).