The state of RNN research and its development trends; variants of LSTM

Training algorithms, such as Back-Propagation Through Time (BPTT), Real-Time Recurrent Learning (RTRL), and the Extended Kalman Filter (EKF), as well as the vanishing gradient problem.

  1. A detailed introduction to Long Short-Term Memory (LSTM);
  2. A detailed introduction to Clockwork RNNs (CW-RNNs), whose recurrent modules run at different clock rates;
  3. An implementation of RNNs in Python with Theano, covering several common RNN models.

GRU - gated recurrent unit

The GRU combines the forget and input gates into a single "update gate." It also merges the cell state and hidden state, and makes some other changes. The resulting model is simpler than a standard LSTM and has been growing increasingly popular.
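The description above corresponds to the following single-step update. This is a minimal NumPy sketch: the parameter names are illustrative, and bias terms are omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU step. The update gate z is the merged forget/input gate:
    it blends the previous hidden state with a candidate state. The
    reset gate r controls how much of h_prev feeds the candidate."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde          # no separate cell state
```

Note that there is only one state vector `h`, in contrast to the LSTM's separate cell and hidden states.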


In the paper "Sliced Recurrent Neural Networks," Zeping Yu and Gongshen Liu of Shanghai Jiao Tong University propose a new architecture, the Sliced Recurrent Neural Network (SRNN). The SRNN achieves parallelization by slicing the input sequence into multiple subsequences, and it can obtain high-level information through multiple layers without requiring extra parameters.
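A minimal sketch of the slicing idea, assuming a two-level structure with plain tanh RNNs (the parameter names are illustrative, not taken from the paper): the bottom level processes each subsequence independently, so those runs can execute in parallel, and a top-level RNN then combines the per-slice summaries.

```python
import numpy as np

def rnn_last_state(xs, W, U):
    """Run a plain tanh RNN over xs of shape (T, d); return the final state."""
    h = np.zeros(U.shape[0])
    for x in xs:
        h = np.tanh(W @ x + U @ h)
    return h

def srnn_forward(x, W_in, W_rec, U, n_slices):
    """Two-level SRNN sketch: split the (T, d) sequence into n_slices
    subsequences, run the bottom RNN over each (these are independent
    and thus parallelizable), then run a top RNN over the per-slice
    final states to get a single sequence representation."""
    T, d = x.shape
    assert T % n_slices == 0, "T must divide evenly for this sketch"
    slices = x.reshape(n_slices, T // n_slices, d)
    # bottom level: one independent pass per slice
    slice_states = np.stack([rnn_last_state(s, W_in, U) for s in slices])
    # top level: combine the slice summaries
    return rnn_last_state(slice_states, W_rec, U)
```

The sequential bottleneck shrinks from T steps to roughly T / n_slices + n_slices steps, which is the source of the speedup.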

SRU - simple recurrent unit
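The SRU (Lei et al., "Simple Recurrent Units for Highly Parallelizable Recurrence") makes its gates depend only on the current input, so the heavy matrix multiplications can be batched over the whole sequence; only a cheap elementwise recurrence remains sequential. A simplified sketch, assuming the highway connection uses the raw input (so input and hidden dimensions match) and omitting the paper's peephole-style terms:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_forward(X, W, Wf, bf, Wr, br):
    """Simplified SRU over a sequence X of shape (T, d).
    All matrix multiplies are done up front for the whole sequence;
    the loop contains only elementwise operations on the cell state c."""
    XW = X @ W.T                       # batched input transform
    F = sigmoid(X @ Wf.T + bf)         # forget gates, from x_t only
    R = sigmoid(X @ Wr.T + br)         # reset/highway gates, from x_t only
    c = np.zeros(W.shape[0])
    H = np.empty_like(XW)
    for t in range(X.shape[0]):
        c = F[t] * c + (1 - F[t]) * XW[t]             # elementwise recurrence
        H[t] = R[t] * np.tanh(c) + (1 - R[t]) * X[t]  # highway output
    return H
```

Because the gates never read the previous hidden state, the three matrix products parallelize across time like a feed-forward layer.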



Architecture Search

An LSTM initialized with a large positive forget-gate bias
outperformed both the basic LSTM and the GRU!

Gradients flowing through the cell state vanish when the forget gate f is close to 0, since each step multiplies the gradient by f. A large positive bias keeps f close to 1, especially at the start of training, so the cell state (and its gradient) is preserved over long spans.

Jozefowicz et al., "An Empirical Exploration of Recurrent Network Architectures," ICML 2015