Learning graphical models fundamental limits and efficient algorithms

Placeholder Show Content

Abstract/Contents

Abstract
Graphical models provide a flexible and yet powerful language to describe high dimensional probability distributions. Over the last 20 years, graphical models methods have been successfully applied to a broad range of problems, from computer vision to error correcting codes to computational biology, to name a few. In a graphical model, the graph structure characterizes the conditional independence properties of the underlying probability distributions. Roughly speaking, it encodes key information about which variables influence each other. It allows us to answer questions of the type: are variables X and Y dependent because they 'interact' directly, or because they are both dependent on a third variable Z? In many applications, this information has utmost practical importance, and it is therefore crucial to develop efficient algorithms to learn the graphical structure from data. This problem is largely unsolved and for a long time several heuristics have been used without a solid theoretical foundation in place. In the first part of this work, we consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms often fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it). An example of an important algorithm that exhibits this behavior is the L1- regularized logistic regression estimator introduced by Ravikumar et al. Ravikumar et al. proved a set sufficient conditions under which this algorithm exactly learns Ising models, the most interesting being a so-called incoherence condition. In this thesis we show that this incoherence condition is also necessary and analytically establish whether it holds for several families of graphs. In particular, denoting by [theta] the edge strength and by \Delta the maximum degree, we prove that regularized logistic regression succeeds on any graph with \Delta [less than or equal to] 3/(10 \theta) and fails on most regular graphs with \Delta [greater than or equal to] 2 / \theta. In the second part of this work, we address the important scenario in which data is not composed of i.i.d. samples. We focus on the problem of learning the drift coefficient of a p-dimensional stochastic differential equation (SDE) from a sample path of length T. We assume that the drift is parametrized by a high-dimensional vector, and study the support recovery problem in the case where p is allowed to grow with T. In particular, we describe a general lower bound on the sample-complexity T by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear SDEs, the drift coefficient is parametrized by a p-by-p matrix which describes which degrees of freedom interact under the dynamics. In this case, we analyze an L1-regularized least-squares estimator and describe an upper bound on T that nearly matches the lower bound on specific classes of sparse matrices. We describe how this same algorithm can be used to learn non-linear SDEs and in addition show by means of a numerical experiment why one should expect the sample-complexity to be of the same order as that for linear SDEs.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2012
Issuance monographic
Language English

Creators/Contributors

Associated with Bento Ayres Pereira, Jose
Associated with Stanford University, Department of Electrical Engineering
Primary advisor Montanari, Andrea
Thesis advisor Montanari, Andrea
Thesis advisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Johari, Ramesh, 1976-
Advisor Candès, Emmanuel J. (Emmanuel Jean)
Advisor Johari, Ramesh, 1976-

Subjects

Genre Theses

Bibliographic information

Statement of responsibility José Bento.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2012.
Location electronic resource

Access conditions

Copyright
© 2012 by Jose Bento Ayres Pereira
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

Loading usage metrics...