Translation is an “AI-hard” challenge: it requires preserving both the _meaning_ and the _fluency_ of the text.
STATISTICAL MT
Noisy Channel Model
Language model $P(e)$ -> $e$ -> noisy channel $P(f \mid e)$ -> $f$ -> decoder $\hat{e} = \arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\,P(e)$ (by Bayes' rule; $P(f)$ is constant for a fixed input $f$)
Two components: language model (LM); translation model (TM)
TM: based on word co-occurrences in _parallel texts_
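As a minimal sketch of how the two components combine at decode time, assuming hypothetical `lm_logprob` and `tm_logprob` scoring functions and a precomputed candidate list (none of which are specified in the notes):

```python
def noisy_channel_score(e, f, lm_logprob, tm_logprob):
    """Noisy-channel score: log P(f|e) + log P(e), which ranks
    candidates identically to log P(e|f) since log P(f) is constant."""
    return tm_logprob(f, e) + lm_logprob(e)

def decode(f, candidates, lm_logprob, tm_logprob):
    """e-hat = argmax over candidate translations e of P(f|e) P(e)."""
    return max(candidates,
               key=lambda e: noisy_channel_score(e, f, lm_logprob, tm_logprob))
```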
ALIGNMENT (latent; rarely observed directly)
$P(A \mid E, F)$, with alignment $A = (a_1, \ldots, a_J)$, English sentence $E = (e_1, \ldots, e_I)$, French sentence $F = (f_1, \ldots, f_J)$; each $a_j \in \{0, 1, \ldots, I\}$ links French word $f_j$ to an English position (0 = NULL)
We have to infer the alignments: formulate a probabilistic model and fit it with the EM algorithm
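A toy picture of one alignment (the sentences and indices are illustrative, not from the notes):

```python
# English positions 1..I, with the NULL word at position 0
E = ["NULL", "the", "house", "is", "small"]
# French positions 1..J
F = ["la", "maison", "est", "petite"]
# One alignment: a_j is the English position for French word j
A = [1, 2, 3, 4]  # la->the, maison->house, est->is, petite->small
```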
IBM MODEL 1
Formulate a probabilistic model of translation: $P(F, A \mid E) = \frac{\epsilon}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$
Translation table $t(f \mid e)$: to learn the parameter table $t$, we need the word alignments
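A sketch of how the joint probability factors over positions, reusing the toy $E$, $F$, $A$ above (the `eps` length term and any example `t` values are assumptions):

```python
def model1_joint_prob(F, E, A, t, eps=1.0):
    """P(F, A | E) = eps / (I+1)^J * prod_j t(f_j | e_{a_j});
    E includes the NULL word at position 0."""
    I = len(E) - 1                 # English length, excluding NULL
    J = len(F)
    p = eps / (I + 1) ** J         # uniform prior over alignments
    for j, a_j in enumerate(A):
        p *= t[(F[j], E[a_j])]     # lexical translation probability
    return p
```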
Estimate the model with the EM (expectation-maximization) algorithm (sketched in code after the steps):
- Make an initial guess of the $t$ parameters
- E-step: estimate the alignment posteriors $P(A \mid E, F)$ under the current model
- M-step: re-learn the parameters $t$ from the expected alignments (from step 2)
- Repeat from step 2 until convergence
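A compact sketch of that loop for Model 1 on a toy corpus; variable names are mine, and it relies on the standard Model 1 fact that the alignment posterior factors per French position, so expected counts can be accumulated word by word:

```python
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """corpus: list of (F, E) sentence pairs (E should include NULL).
    Returns the translation table t(f|e) as a dict over (f, e) pairs."""
    # Step 1: initial guess -- uniform t(f|e)
    f_vocab = {f for F, E in corpus for f in F}
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        # Step 2 (E-step): P(a_j = i | F, E) = t(f_j|e_i) / sum_i' t(f_j|e_i')
        for F, E in corpus:
            for f in F:
                norm = sum(t[(f, e)] for e in E)
                for e in E:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # Step 3 (M-step): re-estimate t from the expected counts
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy usage: "la" co-occurs with everything, so EM must disambiguate
corpus = [
    (["la", "maison"], ["NULL", "the", "house"]),
    (["la", "fleur"],  ["NULL", "the", "flower"]),
]
t = train_model1(corpus)
print(t[("maison", "house")])  # climbs toward 1.0 over iterations
```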