What's that supposed to mean? Modeling the pragmatic meaning of utterances


Abstract/Contents

Abstract
Many strands of natural language processing work, by and large, capture only the literal meaning of sentences. However, in even our most mundane interactions, much of what we communicate is not said explicitly but rather inferred from the context. If I ask a friend to lunch and she replies, "I had a very large breakfast", I will infer that she does not want go, even though she (perhaps deliberately) avoided saying so directly. This dissertation focuses on building computational models of such pragmatic enrichment. I aim at capturing aspects of pragmatic meaning, the kind of information that a reader will reliably extract from an utterance within a discourse. I investigate three phenomena for which humans readily make inferences. The first study concentrates on interpreting answers to yes/no questions which do not straightforwardly convey a 'yes' or 'no' answer. I focus on questions involving scalar modifiers (Was the movie wonderful? It was worth seeing.) and numerical answers (Are your kids little? I have a 10 year-old and a 7 year-old.). To determine whether the intended answer is 'yes' or 'no', we need to evaluate how "worth seeing" relates to "wonderful", and how "10 and 7 year-old" relate to "little". Can we automatically learn from real texts what meanings people assign to these modifiers? I exploit the availability of a large amount of text to learn meanings from words and sentences in contexts. I show that we can ground scalar modifier meaning based on large unstructured databases, and that such meanings can drive pragmatic inference. The second study detects conflicting statements. If an article about a factory says that 100 people were working inside the plant where the police defused the rockets, whereas a second about the same factory reports that 100 people were injured, and we understand these statements, we will infer that they are contradictory. I created the first available corpus of contradictions which, departing from the traditional view in formal semantics, I have defined as pieces of text that are extremely unlikely to be considered true simultaneously. I argue that such a definition, rather than a logical notion of contradiction, better fits people's intuitions of what a contradiction is. Through a detailed analysis of such naturally-occurring conflicting statements, I identified linguistic factors which give rise to contradiction. I then used a logistic regression model to learn the best way of weighing these different factors, and put this model to use to predict whether a new set of sentence pairs was contradictory. The third study targets veridicality -- whether events described in a text are viewed as actual, non-actual or uncertain. What do people infer from a sentence such as "At a news conference, Mr. Fournier accused Paribas of planning to pay for the takeover by selling parts of the company"? Is Paribas going to pay for the takeover by selling parts of the company? I show that not only lexical semantic properties but context and world knowledge shape veridicality judgments. Since such judgments are not always categorical, I suggest they should be modeled as distributions. I build and describe a classifier, which balances both lexical and contextual factors and can faithfully model human judgments of veridicality distributions. Together these studies illustrate how computer systems begin to recover hearers' readings by exploiting probabilistic methods and learning from large amounts of data in context. 
My dissertation highlights the importance of modeling pragmatic meaning in order to achieve real natural language understanding. Humans rely on context in their everyday use of language. Computer programs must do likewise, and the work presented here shows that it is feasible to automatically capture some aspects of pragmatic meaning.
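
To make the contradiction-detection approach mentioned above concrete, the following Python sketch shows how a logistic regression could weigh a few hand-coded linguistic features of a sentence pair to score it as contradictory. It is purely illustrative and not the dissertation's actual code: the feature names, values, and the use of scikit-learn are assumptions made for the example.

# Hypothetical sketch (not taken from the dissertation): weighing invented
# linguistic mismatch features for a sentence pair with logistic regression.
from sklearn.linear_model import LogisticRegression

# Each row encodes one sentence pair:
# [polarity_mismatch, numeric_mismatch, antonym_overlap, structural_match]
X_train = [
    [1, 0, 0, 1],  # negation flips an aligned predicate
    [0, 1, 0, 1],  # incompatible numbers for the same event
    [0, 0, 1, 1],  # antonymous predicates over the same arguments
    [0, 0, 0, 1],  # compatible restatement
    [0, 0, 0, 0],  # unrelated sentences
]
y_train = [1, 1, 1, 0, 0]  # 1 = contradictory, 0 = not contradictory

model = LogisticRegression()
model.fit(X_train, y_train)

# Score a new pair: aligned sentences that disagree on a number.
print(model.predict_proba([[0, 1, 0, 1]])[0][1])

In this setup, the learned coefficients play the role of the weights on the linguistic factors described in the abstract; the actual feature inventory in the dissertation is richer than these toy mismatch flags.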

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2012
Issuance monographic
Language English

Creators/Contributors

Associated with De Marneffe, Marie-Catherine
Associated with Stanford University, Department of Linguistics
Primary advisor Manning, Christopher D
Thesis advisor Jurafsky, Dan, 1962-
Thesis advisor Levin, Beth
Thesis advisor Potts, Christopher, 1977-

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Marie-Catherine de Marneffe.
Note Submitted to the Department of Linguistics.
Thesis Thesis (Ph.D.)--Stanford University, 2012.
Location electronic resource

Access conditions

Copyright
© 2012 by Marie-Catherine H.J.N.L. de Marneffe
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
