John Torr

E-mail:
Address:   IF 3.48, School of Informatics
University of Edinburgh
10 Crichton Street
Edinburgh EH8 9AB, UK

I am a PhD candidate in ILCC, School of Informatics, The University of Edinburgh. I am developing a broad-coverage, supervised, statistical parser which uses an extended version of Ed Stabler's (1997) Minimalist Grammars (MGs) formalism. From a formal perspective, MGs are a type of highly succinct lexicalised Multiple Context-Free Grammar (MCFG), and therefore fall into the LCFRS class of mildly context-sensitive formalisms (more expressive than the TAG class, but still parsable in polynomial time). From a theoretical/linguistic perspective, these grammars can be viewed as a rigorous and computationally oriented formalization of many aspects of Chomsky's (1995) Minimalist Program.

The aim of the project is to build a parser and wide-coverage grammar that can capture and exploit the sorts of deep linguistic generalizations discovered by Chomskyan theoretical syntacticians, in order to achieve richer and more accurate syntactic/semantic parsing. In particular, MGs are adept at capturing long-range dependencies (such as wh-movement and topicalization), as well as the constraints that hold over them (such as wh-islands, subject islands, etc.), which can be used to constrain the strong generative capacity of the grammar and therefore the parser's search space. To keep parsing efficient, a statistical model is required so that the parser can focus only on the most likely hypotheses, and this in turn requires training data in the form of a Minimalist treebank.

I have therefore developed the Autobank GUI system (an earlier version is discussed in the EACL paper below) and am currently using it to construct a wide- and deep-coverage grammar of English, and to semi-automatically convert the Penn Treebank into an MGbank. This approach differs from the earlier tree transduction method that I described in the first-year review and MIT presentations. Autobank will soon be available for download from this website, and I hope others will use it either to modify my treebank (no one agrees on much in theoretical linguistics!) or to construct their own from scratch, which can be done in a relatively short amount of time.

I am supervised by Prof. Mark Steedman and Dr. Shay Cohen.


Research Interests

Syntactic and Semantic Parsing, Minimalist Grammars, The Minimalist Program, Combinatory Categorial Grammar, Machine Translation.

Education

PhD student in ILCC, School of Informatics, University of Edinburgh             April 2015 -- Present
MSc Artificial Intelligence (Natural Language Processing), School of Informatics, University of Edinburgh             Sep 2013 -- Jul 2014
BA (Hons), Linguistics, University of Cambridge             Sep 2010 -- Jul 2013

Publications

Presentations


Last updated: 10/06/2015