Large-scale simulation for embodied perception and robot learning


Abstract/Contents

Abstract
Perceiving and interacting with complex human environments has been an important yet challenging problem in robotics for decades. Learning active perception and sensorimotor control by interacting with the physical world is cumbersome: existing algorithms are too slow to learn in real time, and robots are fragile and costly. This has given rise to learning in simulation. To make progress on this problem, efficient simulation infrastructure needs to be developed to support interactive and long-horizon tasks, and sample-efficient learning algorithms need to be developed to solve these tasks. In this dissertation, I present two lines of work contributing to these topics. The first line of work creates large-scale, realistic, and interactive simulation environments, including Gibson Environment and iGibson. Gibson Environment is proposed for learning real-world perception for active agents. It is built from the real world and reflects its semantic complexity. It has a neural network-based renderer and a mechanism named "Goggle" so that no further domain adaptation is needed before results are deployed in the real world. Gibson Environment significantly improves pixel-level realism over existing simulation environments. To build upon Gibson Environment and improve the physical realism of the simulation, I propose iGibson, a simulation environment for developing robotic solutions for interactive tasks in large-scale realistic scenes. The simulated scenes are replicas of 3D-scanned real-world homes, aligning the distribution of objects and layouts with those of the real world. Novel long-horizon problems, including interactive navigation and mobile manipulation, can be defined in this environment, and I show evidence that solutions can be transferred to the real world.
The second line of work studies reinforcement learning (RL) for long-horizon robotics problems enabled by the interactive simulation environments. First, I introduce the interactive navigation problem and associated metrics, and I leverage model-free RL algorithms to solve the proposed interactive navigation problems. Second, to solve challenging tasks in fully interactive simulation environments and improve the sample efficiency of RL, I propose ReLMoGen, a framework that integrates motion generation into RL. I propose to lift the action space from joint control signals to motion generation subgoals. By lifting the action space and leveraging sampling-based motion planners, I can efficiently use RL to solve complex long-horizon tasks that existing RL methods cannot solve in the original action space.
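
The idea of lifting the action space can be illustrated with a toy sketch. This is not the ReLMoGen implementation: the class `SubgoalEnv`, the function `plan_straight_line`, and the 2D point robot are all hypothetical names chosen for illustration, and a straight-line interpolator stands in for the sampling-based motion planner mentioned in the abstract. The sketch only shows how one high-level action (a subgoal) can expand into many low-level motions.

```python
import math


def plan_straight_line(start, goal, step=0.1):
    """Stand-in planner (assumption): waypoints from start to goal,
    spaced at most `step` apart. A real system would use a
    sampling-based planner that also avoids obstacles."""
    dist = math.dist(start, goal)
    n = max(1, math.ceil(dist / step))
    return [
        (
            start[0] + (goal[0] - start[0]) * i / n,
            start[1] + (goal[1] - start[1]) * i / n,
        )
        for i in range(1, n + 1)
    ]


class SubgoalEnv:
    """Toy 2D point robot whose action is a subgoal position,
    not a low-level control signal (hypothetical example)."""

    def __init__(self, target=(1.0, 1.0)):
        self.pos = (0.0, 0.0)
        self.target = target

    def step(self, subgoal):
        # One high-level action is expanded into many low-level moves.
        waypoints = plan_straight_line(self.pos, subgoal)
        for wp in waypoints:
            self.pos = wp  # execute each low-level move
        dist = math.dist(self.pos, self.target)
        reward = -dist          # closer to the target is better
        done = dist < 0.05      # success threshold (assumed)
        return self.pos, reward, done, len(waypoints)


env = SubgoalEnv()
pos, reward, done, n_low_level = env.step((1.0, 1.0))
print(done, n_low_level)
```

In this toy setting, a policy that picks subgoals makes one decision where a policy over low-level moves would make many, which is the intuition behind the shorter effective horizon described above.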

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Xia, Fei, (Researcher in computer vision)
Degree supervisor Guibas, Leonidas J
Degree supervisor Savarese, Silvio
Thesis advisor Guibas, Leonidas J
Thesis advisor Savarese, Silvio
Thesis advisor Haber, Nick
Thesis advisor Sadigh, Dorsa
Thesis advisor Wetzstein, Gordon
Degree committee member Haber, Nick
Degree committee member Sadigh, Dorsa
Degree committee member Wetzstein, Gordon
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Fei Xia.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/rx403rd5035

Access conditions

Copyright
© 2021 by Fei Xia
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
