Effective usage of muscle simulation and deep learning for high-quality facial performance capture

Abstract/Contents

Abstract
This dissertation explores the problem of the uncanny valley in digital facial performances; it remains unclear exactly what causes the psychological rejection of near-realistic digital human faces. While the uncanny valley may be caused in part by an imperfect digital rendering of the face, this dissertation hypothesizes that the largest hurdle to solving the uncanny valley problem is unrealistic facial shapes and motions. Such imperfections stem from sources of error in the facial deformation model, the minimized objective function(s), and the regularization and optimization method of choice. Linear blendshape-based deformation models have parameter spaces that permit a wide range of subtly implausible facial shapes. Commonly used objective functions assume perfect correspondences between the captured data and the synthetic model, an assumption that is rarely correct. Lastly, in an attempt to prevent the model from wandering into uncanny territory, simple regularization terms are used to pull the parameter values towards zero. This dissertation addresses these problems by introducing and exploring a fully differentiable muscle simulation model, deep learning in objective functions, and an alternative minimization method that avoids ad hoc regularization. Firstly, we build upon the previously introduced facial muscle track simulation model and make it fully differentiable end-to-end. This is accomplished by driving the muscle tracks with a parallel set of blendshape parameter values. We prove that this model is not only fully differentiable but also as expressive as the muscle track simulation model and, in certain cases, mathematically equivalent to it. We also demonstrate that this model is effective as the deformation model in an optimization problem targeting 3D geometry and 2D monocular RGB images.
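The key property the abstract relies on — that a differentiable deformation model lets the fitting energy be minimized by gradient-based optimization against target 3D geometry — can be illustrated with a minimal sketch. This is not the dissertation's muscle track model; a toy linear blendshape model stands in for it, and all names (`deform`, `neutral`, `deltas`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a "face" with V vertices and K blendshape deltas.
V, K = 30, 4
neutral = rng.normal(size=(V, 3))     # rest-pose vertex positions
deltas = rng.normal(size=(K, V, 3))   # per-blendshape vertex offsets

def deform(w):
    """Linear blendshape model: x(w) = neutral + sum_k w_k * delta_k."""
    return neutral + np.tensordot(w, deltas, axes=1)

# Target geometry generated from hidden "true" parameters.
w_true = np.array([0.5, -0.2, 0.8, 0.1])
target = deform(w_true)

# Because the model is differentiable, the geometry-fitting energy
# E(w) = ||x(w) - target||^2 has an analytic gradient, and plain
# gradient descent recovers the parameters.
w = np.zeros(K)
lr = 0.002
for _ in range(1000):
    residual = deform(w) - target                                  # (V, 3)
    grad = 2.0 * np.tensordot(deltas, residual, axes=([1, 2], [0, 1]))
    w -= lr * grad

print(np.round(w, 3))  # close to w_true
```

The same structure carries over to the dissertation's setting: only the inner `deform` changes, from a linear blendshape sum to the differentiable muscle simulation driven by blendshape-like parameters.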
Furthermore, we show that the resulting activation values are a promising basis for future work on semantic interpretability. Secondly, we address the manual correspondence problem when capturing from 2D RGB images by applying the same pretrained deep neural networks to both the captured image and a synthetic differentiable render of the face model. Such an approach can be used seamlessly in an optimization problem because both the neural network and the differentiable renderer are fully differentiable. We demonstrate the efficacy of this approach for estimating facial pose and expression using facial alignment and optical flow networks. By relying on a trained network, we remove human judgement from the facial performance capture process, which presents a clear path towards improvement in future work. Lastly, we briefly explore the usage of regularization for facial performance capture and demonstrate how an alternative nonlinear least squares optimization method can produce comparable results without modifying the energy landscape.
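The correspondence-free idea above — comparing the captured image and the synthetic render in the feature space of one shared, frozen network rather than via hand-marked landmarks — can be sketched in miniature. This is not the dissertation's actual pipeline (which uses pretrained facial alignment and optical flow networks plus a differentiable renderer); here a fixed random projection plays the role of the frozen network, a 1-D bump plays the role of the render, and the pose parameter is a single shift. All names are hypothetical, and a coarse grid search replaces gradient-based minimization.

```python
import numpy as np

rng = np.random.default_rng(1)

def render(shift):
    """Toy 'renderer': a 1-D bump whose position stands in for head pose."""
    x = np.arange(64)
    return np.exp(-0.5 * ((x - 32 - shift) / 3.0) ** 2)

# A fixed random projection stands in for a pretrained network; the key
# point is only that the *same* frozen feature extractor is applied to
# both the captured image and the synthetic render.
W = rng.normal(size=(16, 64))

def features(img):
    return np.tanh(W @ img)

captured = render(5.0)  # "captured" data with unknown true shift of 5

# Feature-space energy E(s) = ||phi(captured) - phi(render(s))||^2,
# minimized here by grid search over candidate poses.
candidates = np.linspace(-10, 10, 201)
losses = [np.sum((features(captured) - features(render(s))) ** 2)
          for s in candidates]
best = candidates[int(np.argmin(losses))]
print(best)  # recovers the true shift (approx. 5.0)
```

No pixel-to-vertex correspondences are ever specified: the network's features implicitly define what must match, which is exactly what removes human judgement from the loop.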

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2019
Publication date 2019
Issuance monographic
Language English

Creators/Contributors

Author Bao, Michael H
Degree supervisor Fedkiw, Ronald P, 1968-
Thesis advisor Fedkiw, Ronald P, 1968-
Thesis advisor Grabli, Stéphane, 1977-
Thesis advisor Liu, Cheng-Yun Karen, 1977-
Degree committee member Grabli, Stéphane, 1977-
Degree committee member Liu, Cheng-Yun Karen, 1977-
Associated with Stanford University, Computer Science Department.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Michael H. Bao.
Note Submitted to the Computer Science Department.
Thesis Thesis (Ph.D.), Stanford University, 2019.
Location electronic resource

Access conditions

Copyright
© 2019 by Michael H Bao
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license (CC BY-NC-ND 3.0).
