Embedding spaces and implicit representations for recovering high frequency cloth geometry

Abstract

Recovering cloth geometry from simulation or real-world data using machine learning has shown promise, but existing methods still struggle to capture high-frequency folds and wrinkles and to generalize beyond the limited 3D data currently available. To address these challenges, this dissertation explores a general paradigm whereby high-frequency information is procedurally embedded into low-frequency data so that when the latter is smoothed by the network, the former still retains its high-frequency detail. This paradigm is applied to various aspects of the cloth capture pipeline, ranging from simulation to data acquisition to reconstruction.

The first part of my Ph.D. dissertation addresses the challenge of choosing the right data representations for inferring high-frequency geometry; how data is parameterized is crucial to effectively training learning-based models and improving generalizability to unseen distributions. The first parameterization we developed is texture sliding, which we define as the changing of 2D texture coordinates on a per-camera basis such that any point visible from some stereo pair of cameras can be triangulated back to its ground-truth 3D position. The second parameterization we developed is a skinned tetrahedral mesh framework for virtual cloth, which we demonstrate improves the prediction of high-frequency wrinkles and folds. Intrinsically, a tetrahedral mesh provides a more robust parameterization of three-dimensional space, since it contains a true extra degree of freedom as compared to the degenerate codimension-one body surface.

The second part of my Ph.D. dissertation investigates how to democratize clothed human reconstruction by moving from models trained only on limited scans of people to weakly supervised methods that can make use of large-scale image datasets that are readily available. To alleviate the dependency on labeled 3D data, we present a weakly supervised approach to monocular clothed human reconstruction that uses 2D normal maps as the only ground-truth labels during training. Given a single RGB image, our neural network infers a signed distance function (SDF) discretized on a tetrahedral mesh surrounding the body. Subsequently, estimated skinning and camera parameters are used to generate a triangle mesh and then a normal map from the SDF. Additionally, we propose a number of SDF regularizers to aid in silhouette matching, as well as losses based on motion by mean curvature to encourage smoothness. The system is trained end-to-end, and we analytically compute backpropagation gradients for Marching Tetrahedra, rasterization, and the SDF regularizers.
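As a concrete illustration of the triangulation constraint behind texture sliding, the following is a minimal sketch of linear (DLT) stereo triangulation in Python with NumPy. The function name `triangulate` and the pinhole projection matrices `P1` and `P2` are illustrative assumptions, not names taken from the dissertation.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a point seen by two cameras.

    P1, P2 : (3, 4) camera projection matrices.
    x1, x2 : (2,) pixel coordinates of the same surface point.
    Returns the least-squares 3D point in world coordinates.
    """
    # Each view contributes two linear constraints on the
    # homogeneous 3D point X: x * (P[2] @ X) = P[0] @ X, etc.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest
    # singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

Texture sliding chooses per-camera texture coordinates so that points triangulated this way land on the ground-truth surface; the sketch shows only the geometric step, not the learned parameterization.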
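The abstract mentions analytic backpropagation gradients for Marching Tetrahedra. As background, this sketch shows the edge-interpolation step at the core of Marching Tetrahedra for a single tetrahedron, assuming an SDF sampled at its four vertices; names are illustrative, and the full algorithm additionally uses a sign-configuration table to connect the crossings into one or two triangles.

```python
import numpy as np

def tet_surface_crossings(verts, sdf, iso=0.0):
    """Zero-isosurface crossings inside one tetrahedron.

    verts : (4, 3) vertex positions.
    sdf   : (4,) signed distance values at the vertices.
    Returns the linearly interpolated crossing point on each
    sign-changing edge (3 or 4 points when the surface cuts the tet).
    """
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    points = []
    for i, j in edges:
        si, sj = sdf[i] - iso, sdf[j] - iso
        if si * sj < 0:  # edge straddles the zero level set
            t = si / (si - sj)  # parameter where the SDF vanishes
            points.append(verts[i] + t * (verts[j] - verts[i]))
    return points
```

Because `t` is a smooth function of the SDF values, the crossing positions are differentiable with respect to the network's SDF output, which is what makes end-to-end training through the meshing step possible.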
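The smoothness losses are described as being based on motion by mean curvature. A common discrete stand-in (an assumption here, not necessarily the dissertation's exact formulation) penalizes the uniform graph Laplacian of the mesh vertices, whose negative gradient moves each vertex toward the average of its neighbors.

```python
import torch

def laplacian_smoothness_loss(verts, edges):
    """Uniform-Laplacian smoothness penalty on a triangle mesh.

    verts : (V, 3) float tensor of vertex positions.
    edges : (E, 2) long tensor of unique undirected edges.
    Descending this loss approximates motion by mean curvature:
    each vertex flows toward its one-ring neighborhood average.
    """
    V = verts.shape[0]
    i, j = edges[:, 0], edges[:, 1]
    # Accumulate neighbor position sums and vertex valences.
    nbr_sum = torch.zeros_like(verts)
    nbr_sum.index_add_(0, i, verts[j])
    nbr_sum.index_add_(0, j, verts[i])
    ones = torch.ones(edges.shape[0], device=verts.device)
    valence = torch.zeros(V, device=verts.device)
    valence.index_add_(0, i, ones)
    valence.index_add_(0, j, ones)
    lap = nbr_sum / valence.clamp(min=1).unsqueeze(1) - verts
    return (lap ** 2).sum(dim=1).mean()
```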

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Wu, Jane Hong
Degree supervisor Fedkiw, Ron
Thesis advisor Fedkiw, Ron
Thesis advisor Bohg, Jeannette
Thesis advisor Liu, Karen
Degree committee member Bohg, Jeannette
Degree committee member Liu, Karen
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Jane Wu.
Note Submitted to the Computer Science Department.
Thesis Ph.D., Stanford University, 2023.
Location https://purl.stanford.edu/jf782pp1653

Access conditions

Copyright
© 2023 by Jane Hong Wu
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
