Scaling deep robotic learning to broad real-world data
- From general object grasping to in-hand manipulation, deep learning has enabled a number of exciting robotic manipulation capabilities in recent years. Despite this progress, the quintessential home robot that can enter a previously unseen home environment and complete a wide range of tasks, as humans can, remains far from reality. While many problems stand between us and this goal, one central bottleneck lies in learning control policies from the robot's sensor inputs that generalize to new tasks, objects, and environments. For example, a robot cooking in a home cannot afford to re-learn from scratch for each new dish, nor is it feasible to hard-code state features for every new kitchen it might encounter.

One potential route to such generalization is to train the robot on a wide distribution of data spanning many tasks, objects, and environments. Indeed, this recipe of large, diverse datasets combined with scalable offline learning algorithms (e.g., self-supervised or cheaply supervised learning) has been the key behind recent successes in natural language processing (NLP) and vision. However, directly extending this recipe to robotics is nontrivial: we neither have sufficiently large and diverse datasets of robot interaction, nor is it obvious what types of learning algorithms or sources of supervision can scalably extract skills from such datasets.

The goal of this thesis is to tackle these challenges and replicate the recipe of large-scale data and learning in the context of robotic manipulation. The first part of this thesis will discuss how we can scalably collect large and diverse datasets of robots interacting in the physical world, and how we can effectively pre-train self-supervised world models on such offline robot datasets.
We will then explore how these pre-trained world models can be combined with planning to solve tasks: first long-horizon manipulation tasks, and second tasks specified in natural language. Finally, we will discuss how we might go beyond robot data and unlock the broad sources of data that exist on the web, such as videos of humans, to enable more effective learning in our robots, specifically through reward learning and visual pre-training. The thesis concludes by discussing open challenges, particularly how we might unify the paradigms of simulation, real-world data collection, and videos of humans to realize the vision of a general-purpose household robot.
|Degree committee member
|Stanford University, School of Engineering
|Stanford University, Computer Science Department
|Statement of responsibility
|Submitted to the Computer Science Department.
|Thesis (Ph.D.)--Stanford University, 2023.
- © 2023 by Suraj Nair
- This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).