Improving neural language models with black-box analysis and generalization through memorization

Abstract/Contents

Abstract: Neural language models (LMs) have become the workhorse of most natural language processing tasks and systems today. Yet, they are not perfect, and the two most important challenges in improving them further are (1) their lack of interpretability, and (2) their inability to generalize consistently, both in- and out-of-distribution. In this dissertation, I first describe my work on studying these LMs via black-box analysis, in order to understand how their predictions change in response to strategic changes in inputs. This makes model predictions more transparent by highlighting the features of the input that the model relies on. Then, I describe my work on Generalization through Memorization -- exploiting the notion of similarity between examples by using data saved in an external memory and retrieving nearest neighbors from it. This approach improves existing LM and machine translation models in terms of both in- and out-of-domain generalization, without any added training costs. Beyond improving generalization, memorization also makes model predictions more interpretable.

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2021; ©2021
Publication date	2021
Issuance	monographic
Language	English

Author	Khandelwal, Urvashi
Degree supervisor	Jurafsky, Dan, 1962-
Thesis advisor	Jurafsky, Dan, 1962-
Thesis advisor	Liang, Percy
Thesis advisor	Manning, Christopher D
Degree committee member	Liang, Percy
Degree committee member	Manning, Christopher D
Associated with	Stanford University, Computer Science Department

Genre	Theses
Genre	Text

Statement of responsibility	Urvashi Khandelwal.
Note	Submitted to the Computer Science Department.
Thesis	Thesis Ph.D. Stanford University 2021.
Location	https://purl.stanford.edu/st056pp9441

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
