Note: Deep contextualized word representations
Peters et al. - 2018 - Deep contextualized word representations
Introduction
They introduce a new type of deep contextualised word representation that models complex characteristics of word use (e.g. syntax and semantics) and how these uses vary across linguistic contexts. Their word vectors are learned functions of the internal states of a deep bidirectional language model (biLM).
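A minimal sketch of the idea of "learned functions of the internal states": the biLM's per-token layer states are collapsed into one vector via learned scalar weights, in the style of the ELMo weighted combination. The class name `LayerMixer`, the layer count, and all tensor shapes are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class LayerMixer(nn.Module):
    """Collapses the biLM's layer states into one vector per token via learned scalar weights."""
    def __init__(self, num_layers: int):
        super().__init__()
        self.scalars = nn.Parameter(torch.zeros(num_layers))  # per-layer weights, softmax-normalised
        self.gamma = nn.Parameter(torch.ones(1))               # overall scale

    def forward(self, layer_states: torch.Tensor) -> torch.Tensor:
        # layer_states: (num_layers, batch, seq_len, dim) -- internal states of the biLM
        weights = torch.softmax(self.scalars, dim=0).view(-1, 1, 1, 1)
        return self.gamma * (weights * layer_states).sum(dim=0)

# Usage: 3 layers of hidden states for 2 sentences of length 5, hidden size 1024 (assumed)
states = torch.randn(3, 2, 5, 1024)
contextual_vectors = LayerMixer(num_layers=3)(states)  # (2, 5, 1024), one vector per token
```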
Bidirectional language models
Given a sequence of $N$ tokens $(t_1, t_2, \ldots, t_N)$, a forward language model computes the probability of the sequence by modelling the probability of each token $t_k$ given its history:

$$p(t_1, t_2, \ldots, t_N) = \prod_{k=1}^{N} p(t_k \mid t_1, t_2, \ldots, t_{k-1})$$
A backward LM is similar to a forward LM, except it runs over the sequence in reverse, predicting the previous token given the future context:

$$p(t_1, t_2, \ldots, t_N) = \prod_{k=1}^{N} p(t_k \mid t_{k+1}, t_{k+2}, \ldots, t_N)$$
The biLM formulation jointly maximises the log likelihood of the forward and backward directions:

$$\sum_{k=1}^{N} \left( \log p(t_k \mid t_1, \ldots, t_{k-1}; \Theta_x, \overrightarrow{\Theta}_{LSTM}, \Theta_s) + \log p(t_k \mid t_{k+1}, \ldots, t_N; \Theta_x, \overleftarrow{\Theta}_{LSTM}, \Theta_s) \right)$$
The parameters for both the token representation $\Theta_{x}$ and the softmax layer $\Theta_{s}$ are shared between the forward and backward directions, while separate parameters are maintained for the LSTMs in each direction.
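A minimal PyTorch sketch of this joint objective, assuming a toy single-layer LSTM in each direction: the token embedding ($\Theta_x$) and softmax layer ($\Theta_s$) are shared across directions, each direction keeps its own LSTM, and the two per-direction negative log likelihoods are summed. The `BiLM` class, dimensions, and vocabulary size are assumptions for illustration, not the paper's full architecture.

```python
import torch
import torch.nn as nn

class BiLM(nn.Module):
    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)             # shared token representation (Theta_x)
        self.fwd_lstm = nn.LSTM(dim, dim, batch_first=True)    # forward-direction parameters
        self.bwd_lstm = nn.LSTM(dim, dim, batch_first=True)    # backward-direction parameters
        self.softmax = nn.Linear(dim, vocab_size)              # shared softmax layer (Theta_s)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len); returns the negative joint log likelihood to minimise
        x = self.embed(tokens)
        # forward LM: predict t_k from t_1 .. t_{k-1}
        fwd_h, _ = self.fwd_lstm(x[:, :-1])
        fwd_loss = nn.functional.cross_entropy(
            self.softmax(fwd_h).reshape(-1, self.softmax.out_features),
            tokens[:, 1:].reshape(-1))
        # backward LM: predict t_k from t_{k+1} .. t_N (run over the reversed sequence)
        bwd_h, _ = self.bwd_lstm(x.flip(1)[:, :-1])
        bwd_loss = nn.functional.cross_entropy(
            self.softmax(bwd_h).reshape(-1, self.softmax.out_features),
            tokens.flip(1)[:, 1:].reshape(-1))
        return fwd_loss + bwd_loss

# Usage: one backward pass on a random token batch (batch 4, length 10, vocab 100 assumed)
model = BiLM(vocab_size=100, dim=32)
loss = model(torch.randint(0, 100, (4, 10)))
loss.backward()
```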