My research is in theoretical and computational linguistics. Most of my work is in formal semantics and pragmatics, although I have also applied machine learning techniques to various interpretation tasks.
I am mainly interested in natural language interpretation. I focus much of my research on modelling discourse coherence and its interaction with semantics. I'm involved in developing formal models of semantics that use techniques from commonsense reasoning, dynamic semantics and semantic underspecification. I'm also involved in building dialogue systems, using a combination of symbolic and statistical methods.
Much of my research is the result of collaborations with colleagues in various disciplines, including Nicholas Asher (Texas), Jason Baldridge (Austin), Ann Copestake (Cambridge), Claire Grover (Edinburgh), Markus Guhe, Alexander Koller (Saarbrucken), Mirella Lapata (Edinburgh), Caroline Sporleder (Saarbrucken) and Matthew Stone (Rutgers).
I am particularly interested in the following areas:
In collaboration with Nicholas Asher, I have developed a dynamic semantic theory of discourse interpretation called Segmented Discourse Representation Theory (SDRT).
SDRT explores the interplay between discourse interpretation and discourse coherence. It consists of several connected components. First, it supplies a language for representing the logical form of discourse and of dialogue. A discourse is represented as a set of labels, each one standing for a segment of the discourse, and each label is associated with a representation of its content. That content can feature rhetorical relations like Explanation and Contrast between labels. Consequently, a coherent discourse is a segment consisting of rhetorically connected subsegments.
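The representation described above can be pictured with a small data structure. The following is only an illustrative sketch, not SDRT's formal definition: the names `Relation` and `is_connected`, the labels `pi1` and `pi2`, and the string stand-ins for logical forms are all assumptions made for the example. Coherence is approximated here as graph connectedness over the labels of a segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A rhetorical relation between two labelled segments (hypothetical type)."""
    name: str   # e.g. "Explanation" or "Contrast"
    left: str   # label of the first argument
    right: str  # label of the second argument


def is_connected(members, relations):
    """Return True if every label in `members` is linked to every other
    through some chain of rhetorical relations (direction is ignored).

    Assumes each relation only mentions labels that occur in `members`.
    """
    members = set(members)
    if not members:
        return False
    neighbours = {label: set() for label in members}
    for rel in relations:
        neighbours[rel.left].add(rel.right)
        neighbours[rel.right].add(rel.left)
    # Depth-first traversal from an arbitrary label.
    seen, stack = set(), [next(iter(members))]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(neighbours[current] - seen)
    return seen == members


# Toy discourse: "Max fell. John pushed him."
content = {"pi1": "fall(max)", "pi2": "push(john, max)"}
relations = [Relation("Explanation", "pi1", "pi2")]
print(is_connected(content.keys(), relations))
```

In this toy case the two segments are linked by Explanation, so the check succeeds; adding a third label with no relation to the others would make it fail.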
The language is assigned a dynamic semantic interpretation. The interpretations of rhetorical relations often specify content beyond that given by the compositional or lexical semantics of the utterances they connect. In this way, meaning representations for discourse in SDRT capture those implicatures that arise via assumptions that the discourse is coherent (in other words, that every segment of the discourse is connected to another segment).
SDRT also supplies a so-called glue logic in which one computes the logical form of a discourse, using compositional semantics and non-linguistic information such as real world knowledge as clues. This logic supports default reasoning, since one never has complete and accurate information about the context, including speaker intentions. The axioms of the logic are designed to help the interpreter solve three logically co-dependent tasks in interpretation:
Taken together, SDRT's glue logic and its dynamic semantics for the logical form of discourse supply a theory for computing pragmatically preferred interpretations of discourse.
Unlike prior theories of discourse interpretation within AI, the reasoning architecture is highly modular, in that the glue logic is distinct from the logic in which those logical forms are interpreted. It's also kept separate from the logic of the lexicon, domain knowledge, cognitive states and so on. The glue logic has only restricted access to these information sources in order to maintain computability.
SDRT offers a novel approach to speech acts, where they are treated as anaphoric relations between utterances (anaphoric because the part of the discourse context that the current utterance is related to via the speech act is anaphorically determined). This contrasts with the more usual conception of speech acts as properties of utterances. In fact, each rhetorical relation can be viewed as a type of speech act. The status of rhetorical relations as actions is reflected in their interpretations: unlike other predicates, they transform the input context (a pair of a world and a variable assignment function) to a different output one. The output context ensures that the semantic information that's necessary for the successful performance of the speech act has been accommodated. In this way, SDRT provides a logical account of how people bridge the gap between literal and intended meaning. In essence, it's a formal theory of how the syntax/semantics interface and the semantics/pragmatics interface interact.
Together with my collaborators, I have used SDRT to model a wide range of phenomena where both semantics and pragmatics interact in complex ways, in particular: nominal anaphora, temporal and causal structures in text and dialogue, word sense disambiguation, lexical sense modulations in context, bridging inferences, presuppositions, metonymy, metaphor, questions and responses in dialogue, imperatives, non-sentential fragments, indirect speech acts, agreement and denial, grounding, non-cooperative conversation, and gesture.
Nicholas Asher and I have written a book on SDRT, published by Cambridge University Press. You can buy it from www.amazon.co.uk.
I am currently involved in an ERC-funded project entitled Strategic Conversation. The aim of this project is to model conversation in scenarios where the agents' goals conflict. In such scenarios, the maxims of cooperativity and sincerity that normally form the bedrock for computing discourse interpretation don't necessarily apply. We are therefore modelling implicature by other means, integrating our existing model of discourse coherence and its interaction with semantics with models of human action decision making from game theory and behavioural economics. From a theoretical perspective, this involves extending and refining SDRT with a model of human reasoning involving solution concepts from game theory. We use this extended framework to predict, for instance, when it's safe to treat an implicature as a matter of public record. From a practical perspective, we are collecting a dialogue corpus of agents playing Settlers of Catan: a zero-sum game where players negotiate and trade over restricted resources. This is arguably the first data collection effort for non-cooperative conversations where each utterance is temporally aligned with its game state, in machine-readable form. We intend to use this corpus to develop a symbolic model of a dialogue agent that plays Settlers, and we intend to use machine learning techniques to adapt that agent to handle unseen game states. My main collaborators on this project are Nicholas Asher, Markus Guhe, Oliver Lemon and Verena Rieser.
I have done joint work with Matthew Stone on developing a formal semantic model of the spontaneous and improvised hand gestures that happen in face-to-face conversation. Our model proposes a way to capture the abstract and very incomplete meanings of the gestures that are derivable from just their form. Using SDRT, it also specifies in a logically precise manner how those underspecified meanings are resolved to specific interpretations via contextual information, including in particular the content of the synchronous speech. Our aim with this work is to demonstrate that the methodologies that have already been established for studying the pragmatic interpretation of purely linguistic discourse apply also to gesture, providing a seamless integration of communicative actions with both speech and gesture. In collaboration with Katya Alahverdzhieva, I have also developed a model of how the interpretation of gesture is constrained by its form, its timing relative to speech phrases, and the form of those phrases. This form-meaning mapping is encoded in an HPSG, implemented within the Delphin Framework.
In collaboration with Ann Copestake and Dan Flickinger, I devised a constraint-based approach to constructing underspecified logical forms at the syntax/semantics interface that is more constrained than the lambda calculus. This is implemented in the grammar development environments and parsing and generation platforms provided by the Delphin Framework. More recently, in collaboration with Ann Copestake and Alexander Koller, I have helped to design the syntax, semantics and proof theory for a formal language that is maximally flexible in the type of semantic information that can be left underspecified: it can express not only the standard underspecified information about semantic scope and antecedents to anaphora, but it also allows one to underspecify the arity of predicates, the arguments they take, and the argument position of a variable to a predicate. This makes the language ideal for building semantic components for shallow language processors (from POS taggers to intermediate statistical parsers), where information about syntactic dependencies and/or lexical subcategorisation may be missing. My research on gesture draws on this work on underspecified semantics.
Together with Mirella Lapata and Caroline Sporleder, I have designed, implemented and evaluated statistical models of the discourse structure of narrative text and the temporal order of its events. This project was funded by EPSRC.
To overcome the problem of sparse data in supervised learning, we use a combination of unsupervised learning and supervised learning with automatically labelled training examples that are harvested from massive online resources such as the Web and the BNC. Our models exploit the relationship between rhetorical relations and discourse cue phrases (e.g., "but" indicates Contrast, and "because" indicates Explanation). We use probabilistic modelling to combine multiple sources of linguistic knowledge for estimating discourse structure; the features are combined automatically by training on large corpora such as the BNC. The approach therefore deals with domain-independent narrative text.
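To illustrate the idea of automatically labelled training examples, here is a minimal sketch that assigns a rhetorical relation to a sentence on the basis of its cue phrase. It is not the implementation used in the project: the function name `label_by_cue`, the two-entry cue table and the regular expression are assumptions for the example, and stripping the cue from the output (so that a classifier trained on these pairs cannot simply memorise it) is likewise an assumed design choice.

```python
import re

# Hypothetical cue table, taken from the two examples in the text.
CUE_TO_RELATION = {"but": "Contrast", "because": "Explanation"}


def label_by_cue(sentence):
    """Split a sentence at a known cue phrase.

    Returns (left_clause, right_clause, relation) with the cue removed,
    or None if no cue in the table occurs between two clauses.
    """
    for cue, relation in CUE_TO_RELATION.items():
        # The cue must be a whole word with text on both sides;
        # an optional comma before it is dropped.
        pattern = rf"^(.+?),?\s+{re.escape(cue)}\s+(.+)$"
        match = re.search(pattern, sentence.strip(), flags=re.IGNORECASE)
        if match:
            return match.group(1).strip(), match.group(2).strip(), relation
    return None


print(label_by_cue("Max fell because John pushed him."))
# ('Max fell', 'John pushed him.', 'Explanation')
```

A real system would need to handle cue phrases that are ambiguous between relations, or that have non-discourse uses, which this sketch ignores.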
SDRT informs this work in a number of ways. It provides the basis for more accurate smoothing over sparse data, for instance. SDRT has also provided the basis for selecting and motivating which features to include in the model.
I was PI on a project involving Edinburgh and Stanford, funded by Scottish Enterprise, whose aim was to improve the state of the art in the robust and accurate interpretation of dialogue. Part of this work involved building, in collaboration with Jason Baldridge, the first dialogue parser that automatically learns how to assign a discourse structure to dialogues from the scheduling domain. The discourse structures that are assigned to the dialogues allow one to construct deterministically a logical form for the dialogue in the style of SDRT, with the sentences being assigned their logical forms as chosen through the Redwoods corpus (gold standard) or through existing statistical parse selection models for the English Resource Grammar (ERG). The result is a statistical interpretation model of scheduling dialogues whose output has a truth-conditional interpretation that stems from (a) the ERG (since in the ERG all parses are assigned a compositional semantic interpretation) and (b) the rhetorical relations that appear in the discourse structure. The latter feature allows one to compute aspects of meaning that go beyond the grammar, such as the underlying goal of the dialogues, and to resolve anaphoric terms such as 3pm (e.g., to 3pm on 26/05/04).
We are exploring the ways in which active learning can speed up the annotation process. Within the realm of parsing, we have found that with active learning one needs only half the training data to match the parse selection results obtained when the training examples are chosen randomly.
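As a rough illustration of how active learning chooses what to annotate next, the sketch below ranks unlabelled items by the entropy of a model's predicted distribution over their candidate analyses and returns the most uncertain ones. Uncertainty sampling is an assumption here: it is one common selection strategy, not necessarily the one used in this work. The function names and the toy probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool, predict_probs, batch_size):
    """Pick the `batch_size` items the current model is least sure about.

    pool          -- list of unlabelled items
    predict_probs -- function mapping an item to a list of probabilities,
                     one per candidate analysis (e.g. candidate parses)
    """
    ranked = sorted(pool, key=lambda item: entropy(predict_probs(item)),
                    reverse=True)
    return ranked[:batch_size]


# Toy model output: made-up distributions over candidate parses.
toy_probs = {"s1": [0.9, 0.1], "s2": [0.5, 0.5], "s3": [1.0]}
print(select_for_annotation(list(toy_probs), toy_probs.get, 2))
# ['s2', 's1']
```

In a full loop, the selected items would be annotated, added to the training set, and the model retrained before the next round of selection.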
Following on from work I did almost a decade ago with Ann Copestake on the symbolic representation of lexical information, I have been involved in some work on acquiring lexical semantic information from large corpora, using unsupervised machine learning. This work is done in collaboration with Claire Grover and Mirella Lapata.
We focus on interpretation tasks where the semantic information is implicit. For example, we have modelled the acquisition of semantic relations in compound nouns with deverbal heads in the medical domain and over the BNC. This involves predicting that in "patient arrival" the patient is the subject of the arriving event, whereas in "hospital arrival" the destination of the arriving event is the hospital. We achieve this by unsupervised learning through exploiting meaning paraphrases in the corpus and surface syntactic cues (i.e., we estimate that "patient" is more likely to be the subject of "arrive" on the basis of sentences in the corpus that feature the verb "arrive"). We have demonstrated that these techniques are also useful for interpreting logical metonymies: e.g., estimating that "enjoy the book" means "enjoy reading the book" and "good soup" is a soup that tastes good.
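The intuition behind the compound noun example can be sketched with simple corpus counts. This is a simplified illustration rather than the model we actually built: the function `predict_role`, the role inventory and the counts are all made up for the example, and a realistic system would use smoothed probability estimates rather than raw frequencies.

```python
from collections import Counter


def predict_role(noun, verb, counts, roles):
    """Guess which role `noun` plays in a compound headed by a
    nominalisation of `verb`, using (verb, role, noun) corpus counts.

    Returns the most frequent role, or None if the pair was never seen.
    """
    scores = {role: counts[(verb, role, noun)] for role in roles}
    if sum(scores.values()) == 0:
        return None
    return max(roles, key=lambda role: scores[role])


# Made-up counts standing in for verb-argument tuples from a parsed corpus.
counts = Counter({
    ("arrive", "subject", "patient"): 12,
    ("arrive", "destination", "hospital"): 9,
    ("arrive", "subject", "hospital"): 1,
})
roles = ("subject", "destination")

print(predict_role("patient", "arrive", counts, roles))   # subject
print(predict_role("hospital", "arrive", counts, roles))  # destination
```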
This work was funded by ESRC (project title is Data Intensive Semantics and Pragmatics). More information and access to the software tools that we used is available here.