Translation is an “AI-hard” challenge: it requires preserving both the _meaning_ and the _fluency_ of the text.
STATISTICAL MT
Noisy Channel Model
Language model $P(e)$ -> $e$ -> noisy channel $P(f \mid e)$ -> $f$ -> decoder $\hat{e} = \arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\,P(e)$ (by Bayes' rule; $P(f)$ is constant for a fixed input $f$)
Two components: language model (LM); translation model (TM)
TM: based on word co-occurrences in _parallel texts_
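As a minimal sketch of how the two components combine at decode time, assuming hypothetical `lm_logprob` and `tm_logprob` scoring functions and a precomputed candidate list (none of which are specified in the notes):

```python
def noisy_channel_score(e, f, lm_logprob, tm_logprob):
    """Noisy-channel score: log P(f|e) + log P(e), which ranks
    candidates identically to log P(e|f) since log P(f) is constant."""
    return tm_logprob(f, e) + lm_logprob(e)

def decode(f, candidates, lm_logprob, tm_logprob):
    """e-hat = argmax over candidate translations e of P(f|e) P(e)."""
    return max(candidates,
               key=lambda e: noisy_channel_score(e, f, lm_logprob, tm_logprob))
```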
ALIGNMENT (latent; rarely observed directly)
$P(A \mid E, F)$, with alignment $A = (a_1, \ldots, a_J)$, English sentence $E = (e_1, \ldots, e_I)$, French sentence $F = (f_1, \ldots, f_J)$; each $a_j \in \{0, 1, \ldots, I\}$ links French word $f_j$ to an English position (0 = NULL)
We have to infer the alignments: formulate a probabilistic model and fit it with the EM algorithm
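A toy picture of one alignment (the sentences and indices are illustrative, not from the notes):

```python
# English positions 1..I, with the NULL word at position 0
E = ["NULL", "the", "house", "is", "small"]
# French positions 1..J
F = ["la", "maison", "est", "petite"]
# One alignment: a_j is the English position for French word j
A = [1, 2, 3, 4]  # la->the, maison->house, est->is, petite->small
```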
IBM MODEL 1
Formulate a probabilistic model of translation: $P(F, A \mid E) = \frac{\epsilon}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$
Translation table $t(f \mid e)$: to learn the parameter table $t$, we need the word alignments
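A sketch of how the joint probability factors over positions, reusing the toy $E$, $F$, $A$ above (the `eps` length term and any example `t` values are assumptions):

```python
def model1_joint_prob(F, E, A, t, eps=1.0):
    """P(F, A | E) = eps / (I+1)^J * prod_j t(f_j | e_{a_j});
    E includes the NULL word at position 0."""
    I = len(E) - 1                 # English length, excluding NULL
    J = len(F)
    p = eps / (I + 1) ** J         # uniform prior over alignments
    for j, a_j in enumerate(A):
        p *= t[(F[j], E[a_j])]     # lexical translation probability
    return p
```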
Estimate the model with the EM (expectation-maximization) algorithm (sketched in code after the steps):
- Make an initial guess of the $t$ parameters
- E-step: estimate the alignment posteriors $P(A \mid E, F)$ under the current model
- M-step: re-learn the parameters $t$ from the expected alignments (from step 2)
- Repeat from step 2 until convergence
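A compact sketch of that loop for Model 1 on a toy corpus; variable names are mine, and it relies on the standard Model 1 fact that the alignment posterior factors per French position, so expected counts can be accumulated word by word:

```python
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """corpus: list of (F, E) sentence pairs (E should include NULL).
    Returns the translation table t(f|e) as a dict over (f, e) pairs."""
    # Step 1: initial guess -- uniform t(f|e)
    f_vocab = {f for F, E in corpus for f in F}
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        # Step 2 (E-step): P(a_j = i | F, E) = t(f_j|e_i) / sum_i' t(f_j|e_i')
        for F, E in corpus:
            for f in F:
                norm = sum(t[(f, e)] for e in E)
                for e in E:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # Step 3 (M-step): re-estimate t from the expected counts
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy usage: "la" co-occurs with everything, so EM must disambiguate
corpus = [
    (["la", "maison"], ["NULL", "the", "house"]),
    (["la", "fleur"],  ["NULL", "the", "flower"]),
]
t = train_model1(corpus)
print(t[("maison", "house")])  # climbs toward 1.0 over iterations
```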