Application-based Text-to-Text Generation, Research Project Grant, funded by the EPSRC, 06/2006-06/2009.

Principal Investigator: Mirella Lapata
Research Fellow: Trevor Cohn

An emerging area of research in the field of Natural Language Processing (NLP) is text-to-text generation. Text-to-text generation takes naturally occurring texts as input and transforms them into new texts that satisfy constraints such as length or style. Applications that require text-to-text generation include single- and multi-document summarisation, text simplification, sentence compression, and question answering. At the heart of methods developed for text-to-text generation lies the ability to identify and generate paraphrases, i.e., alternative ways of conveying the same information at either the sentence or the document level.

This grant has two aims. The first is to create algorithms and software for collecting corpora appropriate for studying meaning equivalences. The second is to develop and evaluate a summarisation system that incorporates models for identifying and generating paraphrases at the sentence and document level. Summarisation is particularly well suited to studying text rewriting because it involves extracting, possibly regenerating, and ordering information drawn from multiple sources.