Estimating Word Translation Probabilities from Unrelated Monolingual Corpora Using the EM Algorithm

2000, English, 5 pages, PostScript, .ps.gz. Published at AAAI 2000 in Austin, Texas.

Selecting the right word translation among several options in the lexicon is a core problem for machine translation. We present a novel approach to this problem that can be trained using only unrelated monolingual corpora and a lexicon. By estimating word translation probabilities with the EM algorithm, we extend target language modeling. We construct a word translation model for 3830 German and 6147 English noun tokens, with very promising results.
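
To make the idea concrete, the following is a minimal sketch of how EM might estimate word translation probabilities from a monolingual source corpus, a lexicon, and a target language model. It is one plausible reading of the abstract, not the paper's actual formulation: the function name `em_translation_probs`, the bigram language model, the brute-force enumeration of candidate translations, and all toy words and probabilities are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def em_translation_probs(source_sents, lexicon, bigram, iterations=10, floor=0.01):
    """Estimate p(source | target) with EM (illustrative sketch only).

    source_sents: list of source-language word lists (monolingual corpus)
    lexicon:      source word -> list of candidate target words
    bigram:       (prev_target, target) -> probability; unseen pairs get `floor`
    """
    # Collect, for each target word, the source words it may translate.
    sources_of = defaultdict(set)
    for s, candidates in lexicon.items():
        for e in candidates:
            sources_of[e].add(s)

    # Uniform initialisation of p(s | e).
    t = {(s, e): 1.0 / len(ss) for e, ss in sources_of.items() for s in ss}

    for _ in range(iterations):
        counts = defaultdict(float)

        # E-step: weight every candidate target sequence by
        # language-model probability times translation probability.
        # Brute-force enumeration is only feasible for short sentences.
        for sent in source_sents:
            scored = []
            for cand in product(*(lexicon[s] for s in sent)):
                p, prev = 1.0, "<s>"
                for s, e in zip(sent, cand):
                    p *= bigram.get((prev, e), floor) * t[(s, e)]
                    prev = e
                scored.append((cand, p))
            z = sum(p for _, p in scored)
            if z == 0.0:
                continue
            for cand, p in scored:
                for s, e in zip(sent, cand):
                    counts[(s, e)] += p / z

        # M-step: renormalise expected counts per target word.
        totals = defaultdict(float)
        for (s, e), c in counts.items():
            totals[e] += c
        for (s, e) in t:
            if totals[e] > 0.0:
                t[(s, e)] = counts[(s, e)] / totals[e]

    return t


# Toy data (hypothetical): German nouns with ambiguous English translations.
lexicon = {
    "Fluss": ["river"],
    "Ufer": ["bank", "shore"],
    "Geld": ["money"],
    "Bank": ["bank", "bench"],
}
bigram = {
    ("<s>", "river"): 0.2,
    ("<s>", "money"): 0.2,
    ("river", "bank"): 0.5,
    ("river", "shore"): 0.3,
    ("money", "bank"): 0.6,
}
corpus = [["Fluss", "Ufer"], ["Geld", "Bank"]]

probs = em_translation_probs(corpus, lexicon, bigram)
for (s, e), p in sorted(probs.items()):
    print(f"p({s} | {e}) = {p:.3f}")
```

In this toy setting the language model strongly prefers "money bank" over "money bench", so the estimate for p(Bank | bank) grows relative to p(Ufer | bank). A realistic implementation would replace the exhaustive enumeration with a dynamic-programming E-step and use a language model trained on a large target corpus.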