Embedding spaces and implicit representations for recovering high frequency cloth geometry
Abstract/Contents
- Abstract
- Recovering cloth geometry from simulation or real-world data using machine learning has shown promise, but existing methods still struggle to capture high-frequency folds and wrinkles and to generalize beyond the limited 3D data currently available. To address these challenges, this dissertation explores a general paradigm whereby high-frequency information is procedurally embedded into low-frequency data so that, even when the latter is smoothed by the network, the former retains its high-frequency detail. This paradigm is applied across the cloth capture pipeline, from simulation to data acquisition to reconstruction. The first part of my Ph.D. dissertation addresses the challenge of choosing the right data representations for inferring high-frequency geometry; how data is parameterized is crucial to effectively training learning-based models and to generalizing to unseen distributions. The first parameterization we developed is texture sliding, which changes 2D texture coordinates on a per-camera basis such that any point visible from some stereo pair of cameras can be triangulated back to its ground truth 3D position. The second parameterization we developed is a skinned tetrahedral mesh framework for virtual cloth, which we demonstrate improves the prediction of high-frequency wrinkles and folds. Intrinsically, a tetrahedral mesh provides a more robust parameterization of three-dimensional space, since it contains a true extra degree of freedom as compared to the degenerate codimension-one body surface. The second part of my Ph.D. dissertation investigates how to democratize clothed human reconstruction by moving from models trained only on limited scans of people to weakly supervised methods that can exploit the large-scale image datasets that are readily available.
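The texture sliding parameterization relies on standard two-view triangulation: a point observed by a stereo pair of cameras can be recovered in 3D from its two pixel locations and the cameras' projection matrices. A minimal sketch of that step, using the classical linear (DLT) triangulation method (the function name and setup are illustrative, not taken from the dissertation):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a 3D point from two pixel
    observations x1, x2 (2D) and 3x4 camera projection matrices P1, P2."""
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The right singular vector with the smallest singular value
    # is the least-squares homogeneous solution.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

Given a noiseless stereo pair, this recovers the ground-truth 3D position exactly, which is the property texture sliding exploits when choosing per-camera texture coordinates.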
To alleviate the dependency on labeled 3D data, we present a weakly supervised approach to monocular clothed human reconstruction that uses 2D normal maps as the only ground-truth labels during training. Given a single RGB image, our neural network infers a signed distance function (SDF) discretized on a tetrahedral mesh surrounding the body. Estimated skinning and camera parameters are then used to generate a triangle mesh from the SDF and, subsequently, a normal map. Additionally, we propose a number of SDF regularizers to aid in silhouette matching, as well as losses based on motion by mean curvature to encourage smoothness. The system is trained end-to-end, and we analytically compute backpropagation gradients for Marching Tetrahedra, rasterization, and the SDF regularizers.
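The core mesh-extraction step, Marching Tetrahedra, converts the SDF values stored at a tetrahedron's four vertices into zero-isosurface triangles by linearly interpolating crossings along sign-changing edges. A minimal single-tetrahedron sketch under that standard formulation (function and variable names are illustrative; consistent triangle winding and the analytic gradients described above are omitted):

```python
import numpy as np

def marching_tet(verts, sdf, iso=0.0):
    """Extract the iso-surface of one tetrahedron.
    verts: (4, 3) vertex positions; sdf: (4,) signed distances.
    Returns a list of triangles, each a (3, 3) array of positions."""
    inside = [i for i in range(4) if sdf[i] < iso]
    outside = [i for i in range(4) if sdf[i] >= iso]
    if not inside or not outside:
        return []  # no sign change: the surface misses this tet

    def crossing(i, j):
        # Linear interpolation of the zero crossing along edge (i, j).
        t = (iso - sdf[i]) / (sdf[j] - sdf[i])
        return verts[i] + t * (verts[j] - verts[i])

    if len(inside) == 1 or len(outside) == 1:
        # 1-vs-3 split: one triangle, from the lone vertex's three edges.
        a = inside[0] if len(inside) == 1 else outside[0]
        others = [i for i in range(4) if i != a]
        return [np.stack([crossing(a, b) for b in others])]

    # 2-vs-2 split: four crossings form a quad; order them cyclically
    # (consecutive crossings share a tet vertex) and fan into two triangles.
    i0, i1 = inside
    o0, o1 = outside
    p = [crossing(i0, o0), crossing(i0, o1), crossing(i1, o1), crossing(i1, o0)]
    return [np.stack([p[0], p[1], p[2]]), np.stack([p[0], p[2], p[3]])]
```

Because each output vertex is a differentiable function of the per-vertex SDF values, gradients of downstream normal-map losses can be propagated back through this construction.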
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2023 |
Publication date | 2023 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Wu, Jane Hong |
---|---|
Degree supervisor | Fedkiw, Ron |
Thesis advisor | Fedkiw, Ron |
Thesis advisor | Bohg, Jeannette |
Thesis advisor | Liu, Karen |
Degree committee member | Bohg, Jeannette |
Degree committee member | Liu, Karen |
Associated with | Stanford University, School of Engineering |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Jane Wu. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis (Ph.D.)--Stanford University, 2023. |
Location | https://purl.stanford.edu/jf782pp1653 |
Access conditions
- Copyright
- © 2023 by Jane Hong Wu
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).