Thallo : a domain specific language for non-linear least squares optimization in visual computing
Abstract/Contents
- Abstract
- Large-scale non-linear least squares (NLLS) optimization problems lie at the core of many graphics, vision, and imaging applications. The mathematical descriptions of these objective functions are extremely concise, but in order to use them in interactive applications, solvers are often implemented by hand in tedious and error-prone processes (particularly on GPUs). High-level libraries exist but are orders of magnitude slower than specialized optimized implementations. Even competing handwritten implementations can vary in performance by multiple orders of magnitude, depending on the problem. A survey of existing work reveals that the keys for high performance are twofold: exposing the data-parallelism in these problems and a problem-specific schedule that enables efficient usage of the underlying hardware. This thesis argues that cleanly separating the energy formulation and organization of computation from implementation details of NLLS using a domain-specific language (DSL) enables the generation of high performance massively parallel GPU solvers from high-level specifications. Thus I introduce Thallo, a DSL for large-scale non-linear least-squares optimization problems. I observed various code reorganizations performed by implementers of high-performance solvers in the literature and then define a set of basic operations that span these scheduling choices, thereby defining a large scheduling space. Users can either specify code transformations in a scheduling co-language or use an autoscheduler. Thallo takes as input a compact, shader-like representation of an energy function and a (potentially auto-generated) schedule, translating the combination into high-performance GPU solvers. Since Thallo can generate solvers from a large scheduling space, it can handle a large set of large-scale non-linear and non-smooth problems with various degrees of non-locality and compute-to-memory ratios, including diverse applications such as as-rigid-as-possible mesh deformation, bundle adjustment, blendshape fitting, and spatially-varying Poisson deconvolution. By abstracting schedules from the energy formulation and using a domain-specific compiler, we outperform the fastest state-of-the-art comparable high-level systems, the Ceres solver library, by multiple orders of magnitude and even outperform hand-tuned problem-specific GPU implementations.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2020; ©2020 |
Publication date | 2020; 2020 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Mara, Michael Thomas |
---|---|
Degree supervisor | Hanrahan, P. M. (Patrick Matthew) |
Thesis advisor | Hanrahan, P. M. (Patrick Matthew) |
Thesis advisor | Fatahalian, Kayvon |
Thesis advisor | James, Doug L |
Degree committee member | Fatahalian, Kayvon |
Degree committee member | James, Doug L |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Michael Mara. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2020. |
Location | electronic resource |
Access conditions
- Copyright
- © 2020 by Michael Thomas Mara
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...