Controlling Gradients: Initialization, Batch Normalization, ResNets and Gated RNNs
Slides
Problems
References on Controlling Gradients
Colah's blog on RNNs
Stanford CS231n lecture slides on RNNs