The pragmatics of image description generation
Abstract/Contents
- Abstract
- Finding true words that describe an image seems easy; finding the right words is what poses the communicative challenge. Describing an image to make it accessible to someone who can't see it additionally carries considerable social responsibility in selecting which information to put into words, since those words fundamentally determine what our interlocutor will learn and what they might miss. Recent advances in computational vision-language models bring an opportunity to build systems that can describe the visual world (e.g., images) for us, and do so instantly. This holds special promise for advancing nonvisual accessibility for people who are blind or have low vision, whose access to visual content online is often restricted to alt text descriptions that are rarely present. In this dissertation, I argue that the development of vision-language systems that are useful for social needs such as accessibility needs to be centered around pragmatic communicative principles. This includes the datasets we construct, the way we design and train our models, and how we evaluate them. The opportunities are vast, and now is the time to center computational modeling progress around the intricacies and complexities that make our communication so rich and effective.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2023; ©2023 |
Publication date | 2023 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Kreiss, Elisa |
---|---|
Degree supervisor | Potts, Christopher |
Thesis advisor | Potts, Christopher |
Thesis advisor | Goodman, Noah |
Thesis advisor | Jurafsky, Dan |
Degree committee member | Goodman, Noah |
Degree committee member | Jurafsky, Dan |
Associated with | Stanford University, School of Humanities and Sciences |
Associated with | Stanford University, Department of Linguistics |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Elisa Kreiss. |
---|---|
Note | Submitted to the Department of Linguistics. |
Thesis | Thesis Ph.D. Stanford University 2023. |
Location | https://purl.stanford.edu/th574rc1780 |
Access conditions
- Copyright
- © 2023 by Elisa Kreiss
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).