Training and deploying visual agents at scale

Abstract/Contents

Abstract
Autonomous agents that perceive and interact with the world, such as home robots and self-driving vehicles, hold great promise for a future that automates mundane tasks and improves living standards for billions of people. However, two major obstacles stand in the way of this grand goal. First, modern AI systems require huge amounts of data to learn meaningful behaviors, yet training them directly on physical robots does not scale because of high cost and low efficiency. Second, mobile robot platforms typically have limited onboard computing resources but demand low reaction latency, which hinders the mass deployment of large-capacity visual models. In this dissertation, we explore an effective recipe for developing algorithms and systems that can train and deploy visual agents at scale. The key idea is to train the agents in rich simulation, then overcome the sim-to-real gap, and finally deploy efficiently on edge devices with lightweight video processing architectures. This dissertation is organized around four primary components of the pipeline. First, we propose an open-source distributed framework that provides a full-stack solution to significantly accelerate reinforcement learning (RL) for complex robotics tasks. Second, we construct an ecologically valid and visually realistic simulator for home robotic tasks. Third, we introduce a novel policy learning method that achieves zero-shot generalization to unseen visual environments with large distributional shifts, which facilitates sim-to-real transfer. Finally, we design a new family of video learning architectures that enables deep video understanding for visual agents on resource-constrained devices. We hope that the techniques and ideas presented in this dissertation will bring us one step closer to a future where intelligent robots are as ubiquitous as smartphones in our lives.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Fan, Linxi
Degree supervisor Li, Fei Fei, 1976-
Thesis advisor Li, Fei Fei, 1976-
Thesis advisor Niebles Duque, Juan Carlos, 1980-
Thesis advisor Wu, Jiajun, (Computer scientist)
Degree committee member Niebles Duque, Juan Carlos, 1980-
Degree committee member Wu, Jiajun, (Computer scientist)
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Linxi Fan.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/jk266yw1361

Access conditions

Copyright
© 2021 by Linxi Fan
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
