Selecting the dimension of a subspace in principal component analysis and canonical correlation analysis

Abstract/Contents

Abstract: Dimension reduction is common practice in statistical data analysis as modern data sets grow larger and more complex. Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are two of the most popular methods for dimension reduction. Despite their popularity, there is no widely adopted standard approach for selecting the dimension of the subspace obtained by PCA or CCA. To address this issue, we propose a novel method that uses the hypothesis testing framework to test whether the subspace currently selected via PCA or CCA captures all the statistically significant signals in the given data set. While existing hypothesis testing approaches do not provide exact type 1 error control and lose power under some scenarios, the proposed method provides exact type 1 error control along with reasonable power in detecting signals. Central to our work is the post-selection inference framework, which facilitates valid inference after data-driven model selection; the proposed hypothesis testing method achieves exact type 1 error control by conditioning on the selection event that leads to the inference.

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2016
Issuance	monographic
Language	English

Associated with	Choi, Yunjin
Associated with	Stanford University, Department of Statistics.
Primary advisor	Taylor, Jonathan
Primary advisor	Tibshirani, Robert
Thesis advisor	Taylor, Jonathan
Thesis advisor	Tibshirani, Robert
Thesis advisor	Johnstone, Iain
Advisor	Johnstone, Iain

Genre	Theses

Statement of responsibility	Yunjin Choi.
Note	Submitted to the Department of Statistics.
Thesis	Thesis (Ph.D.)--Stanford University, 2016.
Location	electronic resource

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
