Michael Auli
I work on improving the syntactic analysis of natural language using Combinatory Categorial Grammar (CCG). Its use in practical, fast and accurate parsing poses many interesting challenges. My research focuses on overcoming these challenges via efficient search, modelling and optimisation.
I am a third year Ph.D student in the Institute for Language, Cognition and Computation (ILLC) at The University of Edinburgh supervised by Philipp Koehn and Adam Lopez. I also had the good fortune to collaborate with Prachya Boonkwan, David Chiang, Stephen Clark, Hieu Hoang, Kevin Knight, Tom Kwiatkowski, Adam Pauls, Mark Steedman, Charles Sutton, and Luke Zettlemoyer.
Statistical models used in natural language processing are becoming ever more complicated which makes exact optimisation and inference often infeasible. I am interested in making these models practical via using approximation methods such as loopy belief propagation.
I am also interested in the empirical relationship between exact and approximation methods. I believe that knowing the effectiveness of approximations can improve our understanding of our models and will allow us to make more informed decisions when designing future models.
News
- I am graduating in spring 2012 and therefore interested in job opportunities.
Publications
-
Training a Log-Linear Parser with Loss Functions via Softmax-Margin.
Michael Auli and Adam Lopez.
In
Proceedings of EMNLP,
July 2011.
[abstract]
Log-linear parsing models are often trained by optimizing likelihood, but we would prefer to optimize for a task-specific metric like F-measure. Softmax-margin is a convex objective for such models that minimizes a bound on expected risk for a given loss function, but its naive application requires the loss to decompose over the predicted structure, which is not true of F-measure. We use softmax-margin to optimize a log-linear CCG parser for a variety of loss functions, and demonstrate a novel dynamic programming algorithm that enables us to use it with F-measure, leading to substantial gains in accuracy on CCGBank. When we embed our loss-trained parser into a larger model that includes supertagging features incorporated via belief propagation, we obtain further improvements and achieve a labelled/unlabelled dependency F-measure of 89.3%/94.0% on gold part-of-speech tags, and 87.2%/92.8% on automatic part-of-speech tags, the best reported results for this task.
-
A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing.
Michael Auli and Adam Lopez.
In
Proceedings of ACL,
June 2011.
[abstract]
Via an oracle experiment, we show that the upper bound on accuracy of a CCG parser is significantly lowered when its search space is pruned using a supertagger, though the supertagger also prunes many bad parses. Inspired by this analysis, we design a single model with both supertagging and parsing features, rather than separating them into distinct models chained together in a pipeline. To overcome the resulting increase in complexity, we experiment with both belief propagation and dual decomposition approaches to inference, the first empirical comparison of these algorithms that we are aware of on a structured natural language processing problem. On CCGbank we achieve a labelled dependency F-measure of 88.8\% on gold POS tags, and 86.7\% on automatic part-of-speeoch tags, the best reported results for this task.
-
Efficient CCG Parsing: A* versus Adaptive Supertagging.
Michael Auli and Adam Lopez.
In
Proceedings of ACL,
June 2011.
[abstract]
We present a systematic comparison and combination of two orthogonal techniques for efficient parsing of Combinatory Categorial Grammar (CCG). First we consider adaptive supertagging, a widely used approximate search technique that prunes most lexical categories from the parser's search space using a separate sequence model. Next we consider several variants on A*, a classic exact search technique which to our knowledge has not been applied to more expressive grammar formalisms like CCG. In addition to standard hardware-independent measures of parser effort we also present what we believe is the first evaluation of A* parsing on the more realistic but more stringent metric of CPU time. By itself, A* substantially reduces parser effort as measured by the number of edges considered during parsing, but we show that for CCG this does not always correspond to improvements in CPU time over a CKY baseline. Combining A* with adaptive supertagging decreases CPU time by 15\% for our best model.
-
CCG-based Models for Statistical Machine Translation.
First-Year PhD Report, University of Edinburgh.
May 2009.
[abstract]
The arguably best performing statistical machine translation systems are based on context-free formalisms or weakly equivalent ones. These models usually use a syn- chronous version of a context-free grammar (SCFG) which we argue is too rigid for the highly ambiguous task of human language translation. This is exacerbated by the fact that the imperfect methods available for aligning parallel texts make extracting an effi- cient grammar very hard. As a result, the context-free grammars extracted are usually very large in size after having already been restricted through a variety of constraints.
We propose to use Combinatorial Categorial Grammar (CCG) for machine trans- lation models. CCG is a lexicalized, mildly-context-sensitive formalism which is very well suited to capture long-distance dependencies that are not addressed very well by most current models. We believe that CCG is very well suited for the task of machine translation due to its ability to represent non-constituents in a syntactic way which fre- quently occur in parallel texts as well as its high derivational flexibility. This allows us to use some of the advantages of non-syntactic phrase-based approaches within a syntactic framework such as a relatively small grammar size compared to context-free- based machine translation grammars.
A number of models leveraging the advantages of CCG are possible, however, our principal goal is to develop a string-to-tree based model which projects CCG on the target side of a synchronous grammar. We intend to apply the vast progress made in monolingual CCG parsing to machine translation. Additionally, we propose to extend CCG to a synchronous grammar (SCCG) as it has been done for other related for- malisms such as tree adjoining grammars. We hope that a SCCG may provide similar derivational flexibility to monolingual CCG which may result in a better model for translational equivalence. -
A Systematic Analysis of Translation Model Search Spaces.
Michael Auli, Adam Lopez, Hieu Hoang, and Philipp Koehn.
In
Proceedings of the Fourth Workshop on Statistical Machine Translation,
March 2009.
[abstract]
Translation systems are complex, and most metrics do little to pinpoint causes of error or isolate system differences. We use a simple technique to discover induction errors, which occur when good translations are absent from model search spaces. Our results show that a common pruning heuristic drastically increases induction error, and also strongly suggest that the search spaces of phrase-based and hierarchical phrase-based models are highly overlapping despite the well known structural differences.
Talks
Integrated CCG Parsing and Supertagging. Invited talk at Carnegie Mellon University and Johns Hopkins University, June 2011.Accurate CCG Parsing with Approximate Language Intersection and Task-specific Optimization. Invited talk at the University of Cambridge, May 2011.
