Arc-factored biaffine dependency parsing
Abstract/Contents
- Abstract
- This thesis describes a simple approach to neural arc-factored dependency parsing, building on neural machine learning techniques that have gained considerable popularity in recent years. Dependency parsing is a way of identifying the latent syntactic and semantic relationships between words in a sentence, with solid foundations in linguistic theory that I describe in some detail. In this work, I introduce new classification techniques that extend the affine softmax classifier, which is ubiquitous in machine learning but would otherwise be inappropriate for parsing. What's more, I demonstrate that the new biaffine classification techniques can be derived mathematically from the same principles that yield the affine softmax classifier. Related works either use an alternative to the proposed biaffine classifiers (based on feedforward neural attention) or else use an entirely different parsing algorithm (known as transition-based parsing) based on constituency parsing. In this work, I find evidence that the biaffine classifiers outperform the traditional attention-based classifiers, and that the arc-factored system outperforms transition-based parsers more broadly. I also demonstrate that the hyperparameter choices are optimal or near optimal, with significant deviations leading to either overfitting or underfitting. Consequently, any modifications to the architecture that yield better accuracy are unlikely to be due to simply compensating for poor hyperparameters. The basic system can be batched to parse large documents very quickly, and achieves accuracy comparable to the state of the art on the most popular English benchmark. However, the original system makes a few design choices that introduce complications for other languages, namely a reliance on whole word tokens and part-of-speech tags.
To solve the first limitation, I have the system construct word representations from characters, so that the model can learn how morphology expressed through orthography reflects syntactic structure. To solve the second, I minimally adapt the architecture of the parser so it can be trained as a sequence labeler. A tagger that directly uses insights gleaned from the parser can be trained on any dependency treebank with gold part-of-speech tags. This approach achieved the highest performance at tagging and parsing on the 2017 CoNLL shared task on dependency parsing, inspiring most of the top-performing systems of the 2018 shared task. I also extend the system for multitask tagging, such that morphological features and language-specific part-of-speech tags are conditioned on the predicted coarse-grained universal tag. Finally, I modify the edge classifier to condition predictions directly on the relative location of words, so the system can more effectively leverage linearization and distance. Both of these make statistically significant improvements to accuracy. In order to accommodate dependency formalisms that don't adhere to strict tree structures, I minimally adapt the parser once more to produce arbitrary dependency graphs instead of dependency trees. I again ablate the system to explore how important the different hyperparameters and components of the system are, finding that while most of them do make a statistically significant difference, in general the differences are very small and the system is very robust. The work in this thesis not only contributes narrowly to the field of dependency parsing, but also more broadly provides tools for tasks with more complex dependencies than sequence labeling or classification.
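The abstract does not give the scoring formula, so the following is only a minimal sketch of one common form of biaffine arc scoring, not the thesis's implementation. It assumes each word has a "dependent" vector and a "head" vector, and scores every candidate arc as a bilinear term plus a bias term on the head. All names (`biaffine_arc_scores`, `deps`, `heads`, `U`, `u`) are hypothetical, and the weights are toy values rather than trained parameters.

```python
import math


def biaffine_arc_scores(deps, heads, U, u):
    """Return scores[i][j] = deps[i]^T U heads[j] + u . heads[j].

    deps, heads: lists of equal-length vectors, one per word (assumed inputs).
    U: square weight matrix for the bilinear term (assumed parameter).
    u: weight vector for the head-only bias term (assumed parameter).
    """
    scores = []
    for d in deps:
        # Row vector d^T U.
        dU = [sum(d[a] * U[a][b] for a in range(len(d))) for b in range(len(U[0]))]
        # (d^T U + u^T) h for every candidate head h.
        scores.append([sum((dU[b] + u[b]) * h[b] for b in range(len(h))) for h in heads])
    return scores


def softmax(row):
    """Numerically stable softmax over one dependent's candidate heads."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    z = sum(exps)
    return [e / z for e in exps]


# Toy example with two words and two-dimensional vectors.
deps = [[1.0, 0.0], [0.0, 1.0]]
heads = [[1.0, 0.0], [0.0, 1.0]]
U = [[2.0, 0.0], [0.0, 3.0]]
u = [0.5, -0.5]

scores = biaffine_arc_scores(deps, heads, U, u)
head_probs = [softmax(row) for row in scores]
# Greedy choice per dependent; this is a simplification that does not
# guarantee a well-formed tree.
greedy_heads = [row.index(max(row)) for row in scores]
```

Each row of `head_probs` is a distribution over candidate heads for one word, which is how a per-word softmax classifier can be applied to a variable number of classes. Picking the highest-scoring head independently for each word is used here only for brevity.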
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2019 |
Publication date | 2019 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Dozat, Timothy Allen | |
---|---|---|
Degree supervisor | Manning, Christopher D | |
Thesis advisor | Manning, Christopher D | |
Thesis advisor | Jurafsky, Dan, 1962- | |
Thesis advisor | Kay, Martin | |
Degree committee member | Jurafsky, Dan, 1962- | |
Degree committee member | Kay, Martin | |
Associated with | Stanford University, Department of Linguistics. |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Timothy Dozat. |
---|---|
Note | Submitted to the Department of Linguistics. |
Thesis | Thesis Ph.D. Stanford University 2019. |
Location | electronic resource |
Access conditions
- Copyright
- © 2019 by Timothy Allen Dozat
- License
- This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license (CC BY-SA 3.0).