Building robust natural language processing systems


Abstract/Contents

Abstract
Modern natural language processing (NLP) systems have achieved outstanding performance on benchmark datasets, in large part due to the stunning rise of deep learning. These research advances have led to great improvements in production systems for tasks like machine translation, speech recognition, and question answering. However, these NLP systems still often fail catastrophically when given inputs from different sources or inputs that have been adversarially perturbed. This lack of robustness exposes troubling gaps in current models' language understanding capabilities and creates problems when NLP systems are deployed to real users. In this thesis, I will argue that many different aspects of the current deep learning paradigm for building NLP systems can be significantly improved to ensure greater robustness.

In the first half of this thesis, I will build models that are robust to adversarially chosen perturbations. State-of-the-art models that achieve high average accuracy make surprising errors on inputs that have been slightly perturbed without altering meaning, for example by replacing words with synonyms or inserting typos. For a single sentence, the set of possible word-level perturbations is combinatorially large, so guaranteeing correctness on all perturbations of an input requires new techniques that can reason about this combinatorial space. I will present two methods for building NLP systems that are provably robust to perturbations. First, certifiably robust training creates robust models by minimizing an upper bound on the loss that the worst possible perturbation can induce. Second, robust encodings enforce invariance to perturbations through a carefully constructed encoding layer that can be reused across different tasks and combined with any model architecture. Our improvements in robustness are dramatic: certifiably robust training improves accuracy on examples with adversarially chosen word substitutions from 10% to 75% on the IMDB sentiment analysis dataset, while robust encodings improve accuracy on examples with adversarially chosen typos from 7% to 71% on average across six text classification datasets from the GLUE benchmark.

In the second half of the thesis, I will consider robustness failures that stem from the unrealistic narrowness of modern datasets. Datasets for tasks like question answering or paraphrase detection contain only a narrow slice of all valid inputs, so models trained on such datasets often learn to predict based on shallow heuristics that generalize poorly to other similar, valid inputs. I will present methods both for constructing more challenging test data and for collecting training data that aids generalization. For the task of question answering, I will use adversarially constructed distracting sentences to reveal weaknesses in systems that standard in-distribution test data fails to uncover. In this adversarial setting, the average F1 score of sixteen contemporaneous models on the SQuAD dataset drops from 75% to 36%, and the F1 score of a current state-of-the-art model drops from 92% to 61%. For pairwise classification tasks, I will show that active learning with neural sentence embedding models collects training data that greatly improves generalization to test data with realistic label imbalance, compared to standard training datasets collected heuristically. On a realistically imbalanced version of the Quora Question Pairs paraphrase detection dataset, our method improves average precision from 2% to 32%.

Overall, this thesis shows that state-of-the-art deep learning models have serious robustness defects, but also argues that by modifying different parts of the standard deep learning paradigm, we can make significant progress towards building robust NLP systems.
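To make the certified-training idea concrete, the sketch below follows the interval-bound-propagation (IBP) style of analysis that this line of work builds on: compute elementwise bounds on the embeddings over all allowed word substitutions, propagate them through the network, and train on the worst-case logits. This is a minimal illustrative sketch under simplifying assumptions, not the thesis's actual code; the `substitutes` map and the single linear layer are hypothetical stand-ins.

import torch
import torch.nn.functional as F

def embedding_bounds(embed, token_ids, substitutes):
    """Elementwise lower/upper bounds on each position's embedding, taken over
    the original token and all of its allowed substitutions (assumed given)."""
    lowers, uppers = [], []
    for t in token_ids:
        cands = torch.stack([embed.weight[s] for s in [t] + substitutes.get(t, [])])
        lowers.append(cands.min(dim=0).values)
        uppers.append(cands.max(dim=0).values)
    return torch.stack(lowers), torch.stack(uppers)

def interval_linear(lower, upper, layer):
    """Propagate an axis-aligned box through one linear layer (one IBP step)."""
    center, radius = (upper + lower) / 2, (upper - lower) / 2
    new_center = F.linear(center, layer.weight, layer.bias)
    new_radius = F.linear(radius, layer.weight.abs())  # |W| scales the box radius
    return new_center - new_radius, new_center + new_radius

def certified_loss(lower_logits, upper_logits, label):
    """Cross-entropy on worst-case logits: the gold class at its lower bound,
    every other class at its upper bound. This upper-bounds the loss that the
    worst possible perturbation can induce, so minimizing it trains for
    robustness to every allowed substitution at once."""
    worst = upper_logits.clone()
    worst[label] = lower_logits[label]
    return F.cross_entropy(worst.unsqueeze(0), torch.tensor([label]))

If the gold label's lower-bound logit exceeds every other label's upper bound, the model's prediction is provably correct under every allowed substitution, which is what makes the resulting robustness certifiable rather than merely empirical.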
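The data-collection idea in the second half can be sketched similarly. Below is generic pool-based uncertainty sampling with a sentence-embedding model, a simplified stand-in for the thesis's actual active-learning procedure; the `encode`/`classify` model interface is hypothetical.

import numpy as np

def select_for_labeling(model, pool, k=64):
    """Pool-based uncertainty sampling for pairwise classification: score each
    unlabeled candidate pair with the current model and return the k pairs
    closest to the decision boundary, to be sent for human labeling."""
    probs = np.array([model.classify(model.encode(a), model.encode(b))
                      for a, b in pool])   # estimated P(pair is a positive)
    uncertainty = -np.abs(probs - 0.5)     # peaks near p = 0.5
    return np.argsort(uncertainty)[-k:]    # indices of pairs to label

Because most random pairs in a realistically imbalanced pool are easy negatives, selecting near-boundary pairs concentrates annotation effort on informative examples instead of wasting labels on obvious non-paraphrases.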

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2020
Publication date 2020
Issuance monographic
Language English

Creators/Contributors

Author Jia, Robin
Degree supervisor Liang, Percy
Thesis advisor Liang, Percy
Thesis advisor Jurafsky, Dan, 1962-
Thesis advisor Manning, Christopher D.
Degree committee member Jurafsky, Dan, 1962-
Degree committee member Manning, Christopher D.
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Robin Jia
Note Submitted to the Computer Science Department
Thesis Thesis (Ph.D.)--Stanford University, 2020.
Location electronic resource

Access conditions

Copyright
© 2020 by Robin Jia
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License (CC BY-NC 3.0).
