Optimization methods for regularized high dimensional graphical model selection
- Graphical models yield compact representations of the dependencies present in a multivariate random vector via graphs/networks. Nodes in the graph encode random variables in a high dimensional random vector, and the edges represent different types of associations, such as conditional or marginal dependences. Sparse graphical models have been useful for encoding complex multivariate dependencies in ultra high dimensional, sample starved settings, where limited sample sizes often only allow for the estimation of sparse graphs. Given the wide applicability of such models, the field has seen several key contributions from a wide spectrum of communities, including the statistics, machine learning, mathematics, computer science, computational mathematics and optimization communities. Despite tremendous efforts, the vast majority of work on graphical model selection for continuous data has been centered around the multivariate Gaussian distribution. This restriction often poses serious shortcomings in various applications. In this thesis we propose a comprehensive methodology for graphical model selection that goes beyond the Gaussian paradigm. In particular, we propose a nested sequence of families of distributions rooted in probability and statistical theory that enrich the Gaussian, so as to yield a more flexible family. We demonstrate that our proposed class of distributions, the log-concave elliptical family, has deep and interesting structure. Moreover, this family of multivariate distributions is constructed so as to take advantage of convex optimization tools that yield fast algorithms for estimating high dimensional partial correlation graphs. We develop rigorous theory to give a firm foundation to our proposed approach, from both optimization and statistical perspectives. Statistical issues such as identifiability, calculation of the Fisher information, consistency and asymptotic normality are considered. Consistent, computationally tractable estimation of the additional shape parameters of the log-concave elliptical family is carefully developed. From the optimization perspective, first and second order proximal methods are used for maximizing l1 regularized log-concave elliptical likelihoods, and linear and quadratic rates of convergence for these approaches are established. To our knowledge, our endeavour is the only such approach in the literature with established theory that is applicable in moderate or high dimensions. The methodology is illustrated on both real and simulated data to demonstrate its efficacy.
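As a rough illustration of what a first order proximal method for an l1 regularized objective looks like, the following is a minimal generic sketch and not the algorithm developed in the thesis. It alternates a gradient step on a smooth term with elementwise soft-thresholding (the proximal operator of the l1 penalty). The function names, the toy quadratic smooth term, the step size and the penalty weight are all assumptions chosen for illustration; the thesis applies proximal methods to log-concave elliptical likelihoods, which are not reproduced here.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def proximal_gradient(grad, x0, lam, step, n_iter=200):
    """Minimize f(x) + lam * ||x||_1, where grad(x) returns the gradient of smooth f.

    Hypothetical helper for illustration: fixed step size, fixed iteration count.
    """
    x = list(x0)
    for _ in range(n_iter):
        g = grad(x)
        # Gradient step on the smooth part, then the l1 proximal step.
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)], step * lam)
    return x


# Toy smooth term (an assumption): f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b.
b = [3.0, -0.5, 1.2, -2.0]


def grad_f(x):
    return [xi - bi for xi, bi in zip(x, b)]


x_hat = proximal_gradient(grad_f, [0.0] * len(b), lam=1.0, step=0.5)
```

For this toy problem the minimizer is known in closed form as `soft_threshold(b, 1.0)`, so small entries of `b` are set exactly to zero. This is the same mechanism by which an l1 penalty produces sparse estimates.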
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Dalal, Onkar Anant
|Stanford University, Department of Computational and Mathematical Engineering.
|Romano, Joseph P, 1960-
|Statement of responsibility
|Onkar Anant Dalal.
|Submitted to the Department of Computational and Mathematical Engineering.
|Ph.D. Stanford University 2013
- © 2013 by Onkar Anant Dalal
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).