See, act, and conceptualize : a learning system for robots to interact with the world


Abstract/Contents

Abstract
Building intelligent systems for robots to interact with the world is a challenging problem due to many factors, such as the high dimensionality of the state and action spaces, the enormous variability of tasks, and the uncertainty of surrounding environments. This dissertation presents our work on guiding robots to perceive objects and motion, master various manipulation skills, and develop concepts to better interact with the world. Robots first need visual systems to perceive objects and motion. When robots move and interact with their surroundings, they autonomously induce motion in the scene. Such motion creates a rich visual sensory signal that facilitates better scene understanding. We introduce our work that jointly estimates the segmentation of a scene into a finite number of rigidly moving objects, the motion trajectories of these objects, and the object scene flow. Once robots perceive objects, they can move their hands to manipulate them. We present approaches for training robots to master both primitive and complex manipulation skills. Choosing the right action representation is important for mastering primitive skills. We present a data-driven grasp synthesis method that considers both object geometry and gripper attributes. Our method leverages contact points as an abstraction that can be re-used by a diverse set of robot hands. Beyond contact points between objects and robotic hands, we propose a contact-point matching representation between two objects and use it to train robots to hang arbitrary objects onto diverse supporting items such as racks or hooks. For complex skills such as tool manipulation and robotic assembly, we describe a learning framework that allows a robot to autonomously modify its environment and discover how such modifications ease manipulation skill learning. As robots master more and more skills, it becomes important for them to learn to abstract and represent these skills. We present a learning framework that endows robots with the ability to acquire various concepts for representing manipulation skills. These manipulation concepts act as mental representations of verbs in natural language instructions. We propose a learning-from-demonstration approach that learns manipulation actions from large-scale video datasets annotated with natural language instructions. Natural language instructions can thus be used to guide robots to better interact with the world. We conclude by summarizing our efforts and discussing future directions.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Shao, Lin
Degree supervisor Bohg, Jeannette, 1981-
Thesis advisor Bohg, Jeannette, 1981-
Thesis advisor Guibas, Leonidas J
Thesis advisor Khatib, Oussama
Degree committee member Guibas, Leonidas J
Degree committee member Khatib, Oussama
Associated with Stanford University, Institute for Computational and Mathematical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Lin Shao.
Note Submitted to the Institute for Computational and Mathematical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2021.
Location https://purl.stanford.edu/bd998td4251

Access conditions

Copyright
© 2021 by Lin Shao
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
