Synthesizing high-quality and controllable tennis animation from real-world video collections


Abstract/Contents

Abstract
Demonstrations of human performance are important for creating realistic virtual characters. However, collecting high-quality demonstrations via motion capture can be costly in both time and money, because it requires an expensive environment setup and skilled performers to be present at the capture location. In contrast, there is an abundance of video capturing expert-level human performances. Given this observation, this thesis seeks to answer the question: how can we harvest information from internet videos to create realistic virtual human characters? In particular, online videos of athletic events provide a rich sampling of in-activity motion data that encompasses the full spectrum of skills an athlete must perform in a sport. The challenge is that, compared to high-quality motion capture data, demonstrations extracted from monocular real-world videos exhibit errors due to limitations of current machine perception algorithms and to source video deficiencies such as occlusions and motion blur. This thesis focuses on a specific sports domain, tennis, and demonstrates that it is possible to use large-scale observations of athlete performance obtained from real-world video collections to create controllable, high-quality animations of virtual characters playing singles tennis points. These characters successfully conduct tennis rallies, carry out realistic decision-making, and appear photorealistic. Specifically, this thesis shows that a fairly simple state machine, designed using domain knowledge of tennis, is sufficient to provide structure for building complicated tennis controllers from unstructured video demonstrations. Next, this thesis contributes a system for learning a motion controller from real-world video demonstrations that is capable of precisely controlling a physically simulated character to play tennis points using a diverse set of skills.
Finally, this thesis shows that macro-level behavior and player appearance information can be extracted from videos to enhance the realism of synthesized virtual characters. Overall, this thesis creates realistic virtual characters that can be effectively controlled to play singles tennis points involving a diverse array of strokes (serves, forehands, and backhands), spins (topspins and slices), and playing styles (one-handed vs. two-handed, left-handed vs. right-handed), as well as macro-level behavioral choices that reflect player-specific styles, such as shot selection. In the limited case of 2D sprite animation generation, it also demonstrates that appearance data from the videos can be used to produce characters that are photorealistic in appearance.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Zhang, Haotian
Degree supervisor Fatahalian, Kayvon
Thesis advisor Fatahalian, Kayvon
Thesis advisor Fedkiw, Ron
Thesis advisor Liu, Karen
Degree committee member Fedkiw, Ron
Degree committee member Liu, Karen
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Haotian Zhang.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2023.
Location https://purl.stanford.edu/nm925nc7955

Access conditions

Copyright
© 2023 by Haotian Zhang
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
