Speaker: Adam Lopez
Date: Mar 24, 2014
Time: 11:00AM - 12:00PM
Location: IF 4.31/4.33
Title: A Formal Model of Semantics-Preserving Translation
Abstract: Statistical machine translation has been very successful, resulting in a thriving industry highlighted by products like Google Translate. Yet translation systems still often fail to capture many linguistic phenomena, because they model translation as simple substitution and permutation of word tokens, sometimes informed by syntax. Formally, these models are probabilistic relations on regular or context-free sets, a poor fit for many of the world's languages. If we are to build translation systems that adequately capture linguistic phenomena, we must model those phenomena. Computational linguists have developed expressive mathematical models of language that exhibit high empirical coverage of annotated language data, correctly predict a variety of important linguistic phenomena in many languages, and can be processed with efficient algorithms. I will describe a new formal model of translation based on one of these formalisms, combinatory categorial grammar (CCG). I will describe a synchronous CCG that generates a relation on sentence pairs with provably equivalent semantics. I will then give a solution for the crucial problem of recognition, the basis of any probabilistic translation algorithm, derived from a view of parsing as language intersection.

Bio: Adam Lopez works on problems at the intersection of computational linguistics, algorithms, formal language theory, and machine learning, with applications to problems in natural language processing, particularly machine translation. He is an assistant research professor at Johns Hopkins University. He has spent time as a visiting scientist at SDL Research (formerly LanguageWeaver), the first company to commercialize statistical machine translation. He was previously a research fellow at the University of Edinburgh, and earned his Ph.D. at the University of Maryland.
