|Date||Apr 21, 2014|
|Title||Context models for meaning-preserving machine translation|
|Abstract||The quality of machine translation (MT) has improved dramatically in recent years thanks largely to statistical models trained on large amounts of naturally existing translated text. The resulting translations are often good enough to be useful in some contexts, such as gisting and manual post-editing. However, current systems cannot always be trusted to convey the correct and complete meaning (who did what to whom) of the source language. My work leverages linguistic context to enable MT systems to resolve lexical choice ambiguities, and to accurately preserve the meaning of the source.|
I will first describe Phrase Sense Disambiguation (PSD) models, which improve translation quality using local context. Conventional translation models select translations for words and phrases independently of context. Inspired by word sense disambiguation approaches, PSD uses rich context cues in the source sentence to disambiguate between word and phrase translations proposed by a standard MT system.
I will then turn to the impact of wider document context and investigate whether the "one sense per discourse" hypothesis holds in translation. Using manually post-edited translations, I show that machine translation systems are surprisingly consistent, even when they translate sentences independently from their document context. However, inconsistent translations are often symptoms of translation errors, which suggests that document-level context could benefit translation quality.
|Bio||Marine Carpuat is a Research Scientist at the National Research Council Canada, where she works on multilingual natural language processing and statistical machine translation. Before joining the NRC, Marine was a postdoctoral researcher at Columbia University in New York. She received a PhD in Computer Science from the Hong Kong University of Science & Technology.|