Arc-factored biaffine dependency parsing


Abstract/Contents

Abstract
This thesis describes a simple approach to neural arc-factored dependency parsing, building on neural network techniques that have gained considerable popularity in recent years. Dependency parsing identifies the latent syntactic and semantic relationships between words in a sentence, and it has solid foundations in linguistic theory that I describe in some detail. In this work, I introduce new classification techniques that extend the affine softmax classifier ubiquitous in machine learning, which would otherwise be inappropriate for parsing. Moreover, I demonstrate that the new biaffine classification techniques can be derived mathematically from the same principles that yield the affine softmax classifier. Related works either use an alternative to the proposed biaffine classifiers, based on feedforward neural attention, or else use an entirely different parsing algorithm, known as transition-based parsing, which originates in constituency parsing. In this work, I find evidence that the biaffine classifiers outperform the traditional attention-based classifiers, and that the arc-factored system outperforms transition-based parsers more broadly. I also demonstrate that the hyperparameter choices are optimal or near-optimal, with significant deviations leading to either overfitting or underfitting; consequently, any modifications to the architecture that yield better accuracy are unlikely to do so simply by compensating for poor hyperparameters. The basic system can be batched to parse large documents very quickly, and it achieves accuracy comparable to the state of the art on the most popular English benchmark. However, the original system makes a few design choices that introduce complications for other languages, namely a reliance on whole-word tokens and part-of-speech tags.
To address the first limitation, I have the system construct word representations from characters, so that the model can learn how morphology expressed through orthography reflects syntactic structure. To address the second, I minimally adapt the architecture of the parser so that it can be trained as a sequence labeler: a tagger that directly uses insights gleaned from the parser can be trained on any dependency treebank with gold part-of-speech tags. This approach achieved the highest tagging and parsing performance at the 2017 CoNLL shared task on dependency parsing, and it inspired most of the top-performing systems of the 2018 shared task. I also extend the system for multitask tagging, such that morphological features and language-specific part-of-speech tags are conditioned on the predicted coarse-grained universal tag. Finally, I modify the edge classifier to condition predictions directly on the relative location of words, so that the system can more effectively leverage linearization and distance. Both of these modifications yield statistically significant improvements in accuracy. To accommodate dependency formalisms that do not adhere to strict tree structures, I minimally adapt the parser once more to produce arbitrary dependency graphs instead of dependency trees. I again ablate the system to explore how important its different hyperparameters and components are, finding that while most of them do make a statistically significant difference, the differences are generally very small and the system is very robust. The work in this thesis not only contributes narrowly to the field of dependency parsing, but also more broadly provides tools for tasks with more complex dependencies than sequence labeling or classification.
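The biaffine arc scoring described above can be illustrated with a minimal sketch. This is not the thesis's exact formulation (the function names, dimensions, and random toy inputs here are invented for illustration): a biaffine scorer combines a bilinear term over dependent/head vector pairs with a linear term over head vectors, producing a score matrix from which each word's head can be selected in arc-factored fashion.

```python
import numpy as np

def biaffine_arc_scores(H_dep, H_head, U, w, b):
    """Score every (dependent, head) pair in a sentence.

    H_dep, H_head: (n, d) arrays of dependent- and head-role word vectors.
    U: (d, d) bilinear weight matrix capturing pairwise interactions.
    w: (d,) linear weight vector applied to head vectors (a head-only prior).
    b: scalar bias.
    Returns an (n, n) matrix whose entry (i, j) scores word j as the head
    of word i.
    """
    bilinear = H_dep @ U @ H_head.T        # (n, n) dependent-head interactions
    linear = H_head @ w                    # (n,) per-head scores
    return bilinear + linear[None, :] + b  # broadcast linear term over rows

def predict_heads(scores):
    # Greedy arc-factored decoding: each word independently picks its
    # best-scoring head (a full parser would decode a tree or graph).
    return scores.argmax(axis=1)

# Toy example with random vectors standing in for learned representations.
rng = np.random.default_rng(0)
n, d = 5, 8                                # sentence length, vector size
H = rng.normal(size=(n, d))                # shared vectors for both roles
scores = biaffine_arc_scores(H, H, rng.normal(size=(d, d)),
                             rng.normal(size=d), 0.0)
heads = predict_heads(scores)              # one head index per word
```

Because the score decomposes over individual arcs, whole batches of sentences can be scored with a single batched matrix multiplication, which is what makes the arc-factored approach fast to run over large documents.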

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2019
Publication date 2019
Issuance monographic
Language English

Creators/Contributors

Author Dozat, Timothy Allen
Degree supervisor Manning, Christopher D
Thesis advisor Manning, Christopher D
Thesis advisor Jurafsky, Dan, 1962-
Thesis advisor Kay, Martin
Degree committee member Jurafsky, Dan, 1962-
Degree committee member Kay, Martin
Associated with Stanford University, Department of Linguistics.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Timothy Dozat.
Note Submitted to the Department of Linguistics.
Thesis Thesis (Ph.D.)--Stanford University, 2019.
Location electronic resource

Access conditions

Copyright
© 2019 by Timothy Allen Dozat
License
This work is licensed under a Creative Commons Attribution Share Alike 3.0 Unported license (CC BY-SA).
