Improving neural language models with black-box analysis and generalization through memorization
Abstract/Contents
- Abstract
- Neural language models (LMs) have become the workhorse of most natural language processing tasks and systems today. Yet, they are not perfect, and the two most important challenges in improving them further are (1) their lack of interpretability, and (2) their inability to generalize consistently, both in- and out-of-distribution. In this dissertation, I first describe my work on studying these LMs via black-box analysis, in order to understand how their predictions change in response to strategic changes in inputs. This makes model predictions more transparent by highlighting the features of the input that the model relies on. Then, I describe my work on Generalization through Memorization -- exploiting the notion of similarity between examples by using data saved in an external memory and retrieving nearest neighbors from it. This approach improves existing LM and machine translation models in terms of both in- and out-of-domain generalization, without any added training costs. Beyond improving generalization, memorization also makes model predictions more interpretable.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2021; ©2021 |
Publication date | 2021 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Khandelwal, Urvashi |
---|---|
Degree supervisor | Jurafsky, Dan, 1962- |
Thesis advisor | Jurafsky, Dan, 1962- |
Thesis advisor | Liang, Percy |
Thesis advisor | Manning, Christopher D |
Degree committee member | Liang, Percy |
Degree committee member | Manning, Christopher D |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Urvashi Khandelwal. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2021. |
Location | https://purl.stanford.edu/st056pp9441 |
Access conditions
- Copyright
- © 2021 by Urvashi Khandelwal
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).