Learned motion models for the perception and generation of dynamic humans and objects


Abstract/Contents

Abstract
Understanding the motion of humans and objects is key for intelligent systems. Motions are the result of physics, but non-physical dynamics also play an important role; for example, social norms and traffic laws determine how pedestrians and vehicles behave. The ability to perceive and generate these motions enables important applications, such as autonomous robots that operate in the real world, mixed reality that augments the real world, and animation and simulation that imitate the real world. Despite often being approached as separate problems, the perception and generation of motion both fundamentally rely on having an accurate model of dynamics for humans and objects in a scene. Perception problems like pose estimation, tracking, and shape estimation require motion understanding to reason about occlusions and noise from partial and ambiguous inputs. Generation problems such as forecasting future motion rely entirely on being able to predict motion. A promising avenue to solve these problems is learning models of motion; however, it is challenging to develop models that accurately reflect the real world, capture the diversity of motion due to inherent uncertainty, and robustly generalize to many possible scenarios. This thesis explores how to effectively learn models of motion to solve important perception and generation problems. We propose several data-driven methods to accurately capture the dynamics of humans and objects, and how they interact with each other and their environment. In the first part of the thesis, we introduce two methods for perceiving 3D human pose and 3D object shape, respectively. The first uses a robust generative model of 3D human pose transitions, while the second learns a continuous motion representation entirely from point cloud observations. The second part of the thesis focuses on motion models for synthesizing high-level human behavior in the form of 2D top-down trajectories. In these works, we introduce two new generative models that handle complex multi-agent interactions and can be controlled by a user to produce trajectories with desirable properties. We show this is useful to create rare scenarios for testing autonomous vehicles and to animate crowds of pedestrians. The thesis concludes with a discussion of important future directions to continue improving learned models of motion for humans and objects.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Rempe, Davis Winston
Degree supervisor Guibas, Leonidas J
Thesis advisor Guibas, Leonidas J
Thesis advisor Bohg, Jeannette, 1981-
Thesis advisor Liu, Cheng-Yun Karen, 1977-
Degree committee member Bohg, Jeannette, 1981-
Degree committee member Liu, Cheng-Yun Karen, 1977-
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Davis Rempe.
Note Submitted to the Computer Science Department.
Thesis Thesis (Ph.D.)--Stanford University, 2023.
Location https://purl.stanford.edu/kc338bg9787

Access conditions

Copyright
© 2023 by Davis Winston Rempe
