Mixed-initiative natural language translation

Green, Spence; Stanford University, Computer Science Department.

Abstract/Contents

Abstract: There are two classical applications of the automatic translation of natural language. Assimilation is translation when a gist of the meaning is sufficient, and speed and convenience are prioritized. Dissemination is translation with the intent to communicate, so there is usually a predefined quality threshold. The most common assimilation scenario is cross-lingual web browsing, where fully automatic machine translation (MT) best satisfies the speed and convenience requirements. Dissemination is the setting for professional translators, who produce translations with the intent to communicate. MT output does not yet come with quality guarantees, so it is best incorporated as an assistive technology in this setting. This dissertation proposes a mixed-initiative approach to translation for the dissemination scenario. In a mixed-initiative system, human users and intelligent machine agents collaborate to complete some task. The central question is how to design an efficient human/machine interface. By efficient we mean that human productivity should be enhanced, and the machine should be able to self-correct its model by observing human interactions. We separate human productivity into two measurable components: translation time and quality. We first compare unaided translation to post-editing, the simplest form of machine assistance. Human translators manipulate machine output to arrive at a final translation. We find that simple post-editing decisively improves translation along both coordinates, a result that motivates more advanced machine assistance. However, it is widely observed in prior work that users regard post-editing as a tedious task. The main contribution of this dissertation is therefore a more interactive mode of machine assistance that can improve both productivity and the user experience. We present Predictive Translation Memory (PTM), a new interactive, mixed-initiative translation system. 
The machine suggests future translations based on previous interactions. For example, if the user has typed part of a translation for a given input sentence, PTM can propose a completion. We also show how PTM can self-correct its model via incremental machine learning. A human evaluation shows that PTM helps translators produce higher-quality translations than post-editing when baseline MT quality is high. This is the desired result for dissemination. The translators are slightly slower, but we observe a significant learning curve, suggesting practice may close the time gap. In addition, PTM enables better translation model adaptation than post-editing. We describe novel machine learning techniques that result in significant reductions in human-targeted Translation Edit Rate (HTER), which is an interpretable measure of human effort. Our results suggest that adaptation could amplify time and quality gains by shifting the balance of routinizable work toward the machine agent.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2014
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Green, Spence
Associated with	Stanford University, Computer Science Department.
Primary advisor	Heer, Jeffrey Michael
Primary advisor	Manning, Christopher D
Thesis advisor	Heer, Jeffrey Michael
Thesis advisor	Manning, Christopher D
Thesis advisor	DeNero, John
Thesis advisor	Jurafsky, Dan, 1962-
Advisor	DeNero, John
Advisor	Jurafsky, Dan, 1962-

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Spence Green.
Note	Submitted to the Department of Computer Science.
Thesis	Thesis (Ph.D.)--Stanford University, 2014.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0).
