Learning Object-Centric Visual Representations for Common Sense Reasoning
Abstract/Contents
- Abstract
- People understand the world as a sum of its parts. Our effortless mental ability to simulate and imagine what will happen crucially depends on a scene representation that is compositional with respect to objects and the interactions between them. Similarly, learning representations with an emphasis on the perception and understanding of objects plays a key role in building human-like AI capable of supporting higher-level cognitive abilities such as common sense reasoning, causal reasoning, and goal-oriented planning. Many current methods in machine learning focus on learning structured representations in which objects are only implicitly represented, which poses a threat to model interpretability. Motivated by these observations, this work aims to develop algorithms for learning object-centric representations in which objects are explicitly represented. Given the importance of objects in human cognition, we draw inspiration from cognitive science to motivate and provide theoretical underpinnings. Specifically, we take a rationalist approach in augmenting deep neural network architectures to exhibit object-centric representations, and demonstrate empirically how these representations can be efficiently leveraged for downstream tasks such as controllable image synthesis.
Description
Type of resource | text |
---|---|
Date created | June 4, 2021 |
Date modified | December 5, 2022 |
Publication date | September 8, 2021 |
Creators/Contributors
Author | Tan, Kevin |
---|---|
Subjects
Subject | Machine learning |
---|---|
Subject | computer vision |
Subject | cognitive science |
Genre | Text |
Genre | Thesis |
Access conditions
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
Collection
Master's Theses, Symbolic Systems Program, Stanford University
Contact information
- Contact
- kevtan@stanford.edu