Uncertainty-aware spatiotemporal perception for autonomous vehicles

Itkina, Mikhal

Abstract/Contents

Abstract: Autonomous vehicles are set to revolutionize transportation in terms of safety and efficiency. However, autonomous systems still have challenges operating in complex human environments, such as an autonomous vehicle in a cluttered, dynamic urban setting. A key obstacle to deploying autonomous systems on the road is understanding, anticipating, and making inferences about human behaviors. Autonomous perception builds a general understanding of the environment for a robot. This includes making inferences about human behaviors in both space and time. Humans are difficult to model due to their vastly diverse behaviors and rapidly evolving objectives. Moreover, in cluttered settings, there are computational and visibility limitations. However, humans also possess desirable capabilities, such as their ability to generalize beyond their observed environment. Although learning-based systems have had success in recent years in modeling and imitating human behavior, efficiently capturing the data and model uncertainty for these systems remains an open problem. This thesis proposes algorithmic advances to uncertainty-aware autonomous perception systems in human environments. We make system-level contributions to spatiotemporal robot perception that reasons about human behavior, and foundational advancements in uncertainty-aware machine learning models for trajectory prediction. These contributions enable robotic systems to make uncertainty- and socially-aware spatiotemporal inferences about human behavior. Traditional robot perception is object-centric and modular, consisting of object detection, tracking, and trajectory prediction stages. These systems can fail prior to the prediction stage due to partial occlusions in the environment. We thus propose an alternative end-to-end paradigm for spatiotemporal environment prediction from a map-centric occupancy grid representation. 
Occupancy grids are robust to partial occlusions, can handle an arbitrary number of human agents in the scene, and do not require a priori information regarding the environment. We investigate the performance of computer vision techniques in this context and develop new mechanisms tailored to the task of spatiotemporal environment prediction. Spatially, robots also need to reason about fully occluded agents in their environment, which may occur due to sensor limitations or other agents on the road obstructing the field of view. Humans excel at extrapolating from their experiences by making inferences from observed social behaviors. We draw inspiration from human intuition to fill in portions of the robot's map that are not observable by traditional sensors. We infer occupancy in these occluded regions by learning a multimodal mapping from observed human driver behaviors to the environment ahead of them, thus treating people as sensors. Our system handles multiple observed agents to maximally inform the occupancy map around the robot. In order to safely integrate human behavior modeling into the robot autonomy stack, the perception system must efficiently account for uncertainty. Human behavior is often modeled using discrete latent spaces in learning-based models to capture the multimodality in the distribution. For example, in a trajectory prediction task, there may be multiple valid future predictions given a past trajectory. To accurately model this latent distribution, the latent space needs to be sufficiently large, leading to tractability concerns for downstream tasks, such as path planning. We address this issue by proposing a sparsification algorithm for discrete latent sample spaces that can be applied post hoc without sacrificing model performance. Our approach successfully balances multimodality and sparsity to achieve efficient data uncertainty estimation. 
Aside from modeling data uncertainty, learning-based autonomous systems must be aware of their model uncertainty or what they do not know. Flagging out-of-distribution or unknown scenarios encountered in the real world could be helpful to downstream autonomy stack components and to engineers for further system development. Although the machine learning community has been prolific in model uncertainty estimation for small benchmark problems, relatively little work has been done on estimating this uncertainty in complex, learning-based robotic systems. We propose efficiently learning the model uncertainty over an interpretable, low-dimensional latent space in the context of a trajectory prediction task. The algorithms presented in this thesis were validated on real-world autonomous driving data and baselined against state-of-the-art techniques. We show that drawing inspiration from human-level reasoning while modeling the associated uncertainty can inform environment understanding for autonomous perception systems. The contributions made in this thesis are a step towards uncertainty- and socially-aware autonomous systems that can function seamlessly in human environments.
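As a rough illustration of why sparsifying a discrete latent distribution helps downstream tasks, the sketch below prunes low-probability classes from a categorical distribution and renormalizes the rest. It is a simple threshold-and-renormalize stand-in, not the sparsification algorithm proposed in the thesis; the function name `sparsify`, the `threshold` parameter, and the example probabilities are all hypothetical.

```python
def sparsify(probs, threshold=0.05):
    """Zero out classes below `threshold`, then renormalize the survivors.

    Illustrative stand-in only: `sparsify` and `threshold` are hypothetical
    names and this is not the method described in the thesis.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    if any(p < 0 for p in probs):
        raise ValueError("probs must be non-negative")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probs must have positive total mass")

    # Normalize first so the threshold applies to proper probabilities.
    normalized = [p / total for p in probs]
    kept = [p if p >= threshold else 0.0 for p in normalized]
    mass = sum(kept)

    if mass == 0.0:
        # Every class fell below the threshold: keep only the most likely one.
        best = max(range(len(normalized)), key=normalized.__getitem__)
        return [1.0 if i == best else 0.0 for i in range(len(normalized))]

    return [p / mass for p in kept]


# Hypothetical distribution over five latent classes (e.g. candidate futures).
latent = [0.5, 0.3, 0.15, 0.03, 0.02]
sparse = sparsify(latent, threshold=0.05)
active = [i for i, p in enumerate(sparse) if p > 0.0]
```

Here a planner would only need to consider the classes listed in `active` rather than the full latent space. A fixed threshold is the crudest possible choice and can discard rare but valid modes, which is the trade-off between multimodality and sparsity that the abstract refers to.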

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2022; ©2022
Publication date	2022
Issuance	monographic
Language	English

Creators/Contributors

Author	Itkina, Mikhal
Degree supervisor	Kochenderfer, Mykel J, 1980-
Thesis advisor	Kochenderfer, Mykel J, 1980-
Thesis advisor	Sadigh, Dorsa
Thesis advisor	Schwager, Mac
Degree committee member	Sadigh, Dorsa
Degree committee member	Schwager, Mac
Associated with	Stanford University, Department of Aeronautics and Astronautics

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Masha (Mikhal) Itkina.
Note	Submitted to the Department of Aeronautics and Astronautics.
Thesis	Thesis Ph.D. Stanford University 2022.
Location	https://purl.stanford.edu/yk710jh3806

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
