Understanding feature use divergences between human and machine vision
Abstract/Contents
- Abstract
- Recent work has highlighted a seemingly sharp divergence between human and machine vision: studies have argued that, whereas people exhibit a shape bias, preferring to classify objects according to their shape (Landau et al., 1988; Geirhos et al., 2018; Kucker et al., 2019), standard ImageNet-trained CNNs privilege texture (Geirhos et al., 2018). How prevalent is this texture bias, and where does it come from? I will present evidence that, while both model architecture and training objective affect a model's level of texture bias, the statistics of the training data are the most important factor, and that naturalistic data augmentation schemes can ameliorate texture bias and improve generalization to out-of-distribution images. On the human side, existing studies have tested people under conditions different from those faced by a feedforward CNN; does the human-machine divergence remain when testing conditions are more fairly aligned? In experiments using brief stimulus presentations, we find that people do still privilege shape over texture. Even so, texture information plays more of a role than previously reported. This work establishes a new benchmark for assessing how "human-like" feedforward vision models are in their shape bias. Shape and texture are two features that are both useful in predicting an object's class. Zooming out, I will study models' treatment of such redundant features in a more general setting. Using synthetic data to explore which input features models learn as a function of their task relevance and difficulty of extraction, we find that CNNs are vulnerable to "feature blindness", privileging a single useful feature even when two features perfectly and redundantly predict image labels. Which of the features is privileged is predictable from the untrained model. Finally, I will discuss challenges and open questions that remain in the quest to build models with human-like visual representations.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2022; ©2022 |
Publication date | 2022 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Hermann, Katherine Laura |
---|---|
Degree supervisor | McClelland, James L |
Thesis advisor | McClelland, James L |
Thesis advisor | Grill-Spector, Kalanit |
Thesis advisor | Yamins, Daniel |
Degree committee member | Grill-Spector, Kalanit |
Degree committee member | Yamins, Daniel |
Associated with | Stanford University, Department of Psychology |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Katherine Hermann. |
---|---|
Note | Submitted to the Department of Psychology. |
Thesis | Thesis Ph.D. Stanford University 2022. |
Location | https://purl.stanford.edu/gc637jd7786 |
Access conditions
- Copyright
- © 2022 by Katherine Laura Hermann
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).