Large-scale simulation for embodied perception and robot learning

Xia, Fei, (Researcher in computer vision)

Abstract/Contents

Abstract: Perceiving and interacting with complex human environments has been an important yet challenging problem in robotics for decades. Learning active perception and sensorimotor control by interacting with the physical world is cumbersome because existing algorithms are too slow to learn in real time, and robots are fragile and costly. This has given rise to learning in simulation. To make progress on this problem, efficient simulation infrastructure needs to be developed to support interactive and long-horizon tasks, and sample-efficient learning algorithms need to be developed to solve these tasks. In this dissertation, I present two lines of work contributing to these topics. The first line of work creates large-scale, realistic, and interactive simulation environments, including Gibson Environment and iGibson. Gibson Environment is proposed for learning real-world perception for active agents. It is built from the real world and reflects its semantic complexity. It has a neural network-based renderer and a mechanism named "Goggle" to ensure that no further domain adaptation is needed before results are deployed in the real world. Gibson Environment significantly improves pixel-level realism over existing simulation environments. To build upon Gibson Environment and improve the physical realism of the simulation, I propose iGibson, a simulation environment for developing robotic solutions for interactive tasks in large-scale realistic scenes. The simulated scenes are replicas of 3D-scanned real-world homes, aligning the distribution of objects and layouts with those of the real world. Novel long-horizon problems, including interactive navigation and mobile manipulation, can be defined in this environment, and I show evidence that solutions can be transferred to the real world. The second line of work studies reinforcement learning (RL) for long-horizon robotics problems enabled by the interactive simulation environments.
First, I introduce the interactive navigation problem and its associated metrics, and I leverage model-free RL algorithms to solve the proposed interactive navigation problems. Second, to solve challenging tasks in fully interactive simulation environments and improve the sample efficiency of RL, I propose ReLMoGen, a framework that integrates motion generation into RL. I propose lifting the action space from joint control signals to motion generation subgoals. By lifting the action space and leveraging sampling-based motion planners, I can efficiently use RL to solve complex long-horizon tasks that existing RL methods cannot solve in the original action space.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2021; ©2021
Publication date	2021
Issuance	monographic
Language	English

Creators/Contributors

Author	Xia, Fei, (Researcher in computer vision)
Degree supervisor	Guibas, Leonidas J
Degree supervisor	Savarese, Silvio
Thesis advisor	Guibas, Leonidas J
Thesis advisor	Savarese, Silvio
Thesis advisor	Haber, Nick
Thesis advisor	Sadigh, Dorsa
Thesis advisor	Wetzstein, Gordon
Degree committee member	Haber, Nick
Degree committee member	Sadigh, Dorsa
Degree committee member	Wetzstein, Gordon
Associated with	Stanford University, Department of Electrical Engineering

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Fei Xia.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Thesis Ph.D. Stanford University 2021.
Location	https://purl.stanford.edu/rx403rd5035

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
