Paper

Sequence to Sequence Learning with Neural Networks


RNN

Given a sequence of inputs $(x_1,...x_T)$, a standard RNN computes a sequence of outputs $(y_1,...,y_T)$ by iterating the following equation:

$$ h_t=\text{sigm}(W^{hx}x_t+W^{hh}h_{t-1}) \\ y_t=W^{yh}h_t $$
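The recurrence above can be sketched directly in NumPy. This is a minimal illustration, not the paper's implementation; all dimensions, weight values, and names (`rnn_forward`, `W_hx`, etc.) are illustrative assumptions.

```python
import numpy as np

def sigm(z):
    # elementwise logistic sigmoid
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, W_hx, W_hh, W_yh, h0):
    """Iterate h_t = sigm(W_hx x_t + W_hh h_{t-1}); y_t = W_yh h_t."""
    h = h0
    ys = []
    for x in xs:
        h = sigm(W_hx @ x + W_hh @ h)
        ys.append(W_yh @ h)
    return ys, h

# toy dimensions and random weights (hypothetical)
rng = np.random.default_rng(0)
d_x, d_h, d_y, T = 3, 4, 2, 5
xs = [rng.standard_normal(d_x) for _ in range(T)]
W_hx = rng.standard_normal((d_h, d_x))
W_hh = rng.standard_normal((d_h, d_h))
W_yh = rng.standard_normal((d_y, d_h))
ys, hT = rnn_forward(xs, W_hx, W_hh, W_yh, np.zeros(d_h))
```

Note that each $y_t$ depends only on $x_1,\dots,x_t$, which is why a plain RNN cannot map an input sequence to an output sequence of a different length.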

LSTM

The LSTM is known to learn problems with long-range temporal dependencies, so an LSTM may succeed in this setting.

The goal of the LSTM is to estimate the conditional probability $p(y_1,...,y_{T'}|x_1,...,x_T)$ where $(x_1,...,x_T)$ is an input sequence and $(y_1,...y_{T'})$ is its corresponding output sequence whose length $T'$ may differ from $T$.
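The conditional probability factorizes over output steps: $p(y_1,\dots,y_{T'}|x_1,\dots,x_T)=\prod_{t=1}^{T'} p(y_t\,|\,v,y_1,\dots,y_{t-1})$, where $v$ is a fixed-size encoding of the input. A toy sketch of that factorization, with an invented linear-softmax decoder (everything here — `seq_log_prob`, the weight shapes, the zero start embedding — is an assumption for illustration, not the paper's model):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def seq_log_prob(v, targets, W, E):
    """log p(y_1..y_T' | v) = sum_t log p(y_t | v, y_<t).
    Toy decoder: logits depend on the encoding v and the previous
    token's embedding (a stand-in for the decoder's hidden state)."""
    log_p = 0.0
    prev = np.zeros(E.shape[1])  # start-of-sequence embedding (zeros)
    for y in targets:
        logits = W @ np.concatenate([v, prev])
        probs = softmax(logits)
        log_p += np.log(probs[y])
        prev = E[y]
    return log_p

# toy sizes: encoding dim 4, embedding dim 3, vocabulary of 6 tokens
rng = np.random.default_rng(1)
d_v, d_e, V = 4, 3, 6
W = rng.standard_normal((V, d_v + d_e))
E = rng.standard_normal((V, d_e))
v = rng.standard_normal(d_v)
lp = seq_log_prob(v, [2, 0, 5], W, E)
```

The key point the sketch shows: because each factor conditions only on $v$ and the previously emitted tokens, the output length $T'$ is decoupled from the input length $T$.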

(omitted)

The model

The researchers’ models differ from the above description in three important ways: they use two different LSTMs, one for the input sequence and one for the output sequence; they use deep LSTMs (four layers), which significantly outperform shallow ones; and they reverse the order of the words of the input sentence, which they found extremely valuable.
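The third point, input reversal, is simple to state in code. A minimal sketch (the function name is mine); only the source sentence is reversed, the target stays in order, so the first source words end up close to the first target words:

```python
def reverse_source(tokens):
    """Reverse the source sentence before feeding it to the encoder.
    The target sentence is left in its original order."""
    return tokens[::-1]

rev = reverse_source(["a", "b", "c"])
# rev == ["c", "b", "a"]
```

This shortens the path between early source words and the early target words they correspond to, which the paper credits with making optimization easier.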

Seq2Seq with RNNs
