LP-based MAP inference in rich structured graphical models
- Probabilistic graphical models are a ubiquitous tool in artificial intelligence and related domains, including computer vision, natural language processing, and computational biology. To deploy a graphical model, a crucial step is to perform inference: reasoning about probable states of the system (assignments of variables) given the model and its parameters. The inference problem can be formulated mainly in two ways: (1) finding marginal or conditional probabilities of states, and (2) finding the most probable states. The latter is often referred as MAP (maximum a posteriori) inference or energy minimization, which will be the primary focus of this dissertation. Because of its wide applicability, there are many existing methods and software packages for solving or approximating the MAP inference problem. Although the problem in general has shown to be NP-hard, efficient solutions have been found in various special cases (such as tree-structured problems and submodular problems). How to find better approximations in more difficult situations has been a field of intensive study in recent years. A large family of these approximation methods are based on LP (linear programming) relaxation. The tightness of the relaxation (accuracy of approximation) depends on whether or not the graphical model has rich structures, including dense connectivity and strong interactions among variables. Richer structures are preferable in the modeling perspective because of the promise of better capturing real world interactions, however they also induce major difficulties in inference. Existing methods are usually very slow or inaccurate, or both, when applied to rich-structured problems. The main challenge in improving them lies in how to construct tighter LP relaxations and how to solve them more efficiently. In this dissertation, we investigate the LP based approach to MAP inference. We first describe a flexible framework based on dual decompostion for constructing and optimizing tightened LP relaxations. A series of new techniques and algorithms are then proposed under this framework. By putting these techniques together we developed a powerful MAP inference toolbox. We empirically evaluated our toolbox over a large number of datasets with real world models across different domains. The results show that our toolbox achieves state-of-the-art performance across a wide spectrum of problems. The contributions of this dissertation are summarize as follows. For solving the aforementioned LP, we developed a unified framework for convergent message passing algorithms. It is previously known that these algorithms can be interpreted as block coordinate ascent (BCA) in the dual. However only for a rather limited selection of blocks we knew how to achieve dual optimal via ``message passing''. Our framework helped identify a much larger family of tractable block selections, which gives rise to new and potentially more powerful message passing algorithms. This broadened family of messages passing algorithms are called subproblem-tree calibration (STC). STC still enforces some restriction on the move-direction (block coordinates) in dual space. It could suffer from slow convergence if the restricted moves do not align well with good ascent directions. That is, we would observe ``messages'' oscillating indefinitely, making very slow progress. To tackle this problem we investigated applying general purpose optimization techniques (such as L-BFGS) to dual subspaces that are chosen adaptively without the restrictions of STC. Convergence is greatly accelerated by (1) adaptively choosing a good search direction from a given dual state, and (2) fast evaluating of subgradient in a projected lower dimensional space. As we will show, a bottleneck subroutine in using tightened LP relaxations is performing MAP inference for a cycle subproblem. To that end we developed a fast and exact algorithm for this subroutine. Our algorithm achieves high performance by exploiting the structure of the problem and pruning unnecessary computation. Empirically it is more than an order of magnitude faster than state-of-the-art alternatives. We put all these techniques together and developed a powerful MAP inference toolbox. We also performed a thorough evaluation featuring 32 datasets with 3,574 problem instances in total. The models are from several different domains with extremely large variance in their characteristics. We compared our method with over 20 different state-of-the-art methods. It is shown that our method achieves the best performance in a wide spectrum of scenarios. Detailed comparisons and ablation analysis are also shown to provide more empirical evidence for future customizing and deploying of our toolbox in practical situations.
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Computer Science.
|Li, Fei Fei, 1976-
|Li, Fei Fei, 1976-
|Statement of responsibility
|Submitted to the Department of Computer Science.
|Ph.D. Stanford University 2014
- © 2014 by Huayan Wang
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...