LSTM Recurrent Neural Networks


Recurrent neural networks (RNNs) are essential to NLP in 2016. I’m always trying to improve my knowledge in this area, both for my work at Agolo and for my own edification. This post collects some resources that have helped me better understand RNNs and, specifically, Long Short-Term Memory (LSTM) networks.

“Understanding LSTM Networks” by Christopher Olah is an incredibly good introductory resource. In fact, all of his blog posts are worth reading: he has a clear writing style and uses illuminating illustrations.
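To give a taste of where the post ends up: the LSTM cell Olah builds toward comes down to a handful of equations (in his notation, where $\sigma$ is the logistic sigmoid, $[h_{t-1}, x_t]$ denotes concatenation, and $*$ is elementwise multiplication):

$$
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate cell state} \\
C_t &= f_t * C_{t-1} + i_t * \tilde{C}_t && \text{new cell state} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t * \tanh(C_t) && \text{new hidden state}
\end{aligned}
$$

Olah’s diagrams walk through these one gate at a time, which is exactly what makes the post so effective.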

For a more concrete example with code snippets, I’d recommend reading the official TensorFlow tutorial on recurrent neural networks. You don’t actually have to follow along on your machine to get value out of it. It helps if you’re familiar with Python, but it’s a very easy language to read.
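The tutorial leans on TensorFlow’s built-in LSTM cell, but the computation inside a single step is small enough to write out by hand. Here’s a minimal NumPy sketch of one LSTM step matching the equations above; the function name, weight shapes, and the trick of computing all four gate pre-activations in one matrix multiply are my own choices, not the tutorial’s code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*hidden, hidden+input); b has shape (4*hidden,).
    All four gate pre-activations come from a single matrix multiply, then get split.
    """
    hidden = h_prev.shape[0]
    z = np.dot(W, np.concatenate([h_prev, x])) + b
    f = sigmoid(z[0 * hidden:1 * hidden])        # forget gate
    i = sigmoid(z[1 * hidden:2 * hidden])        # input gate
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])  # candidate cell state
    o = sigmoid(z[3 * hidden:4 * hidden])        # output gate
    c = f * c_prev + i * c_tilde                 # new cell state
    h = o * np.tanh(c)                           # new hidden state
    return h, c

# Toy usage: one step with random weights.
rng = np.random.RandomState(0)
input_dim, hidden_dim = 8, 16
W = 0.1 * rng.randn(4 * hidden_dim, hidden_dim + input_dim)
b = np.zeros(4 * hidden_dim)
h, c = lstm_step(rng.randn(input_dim), np.zeros(hidden_dim), np.zeros(hidden_dim), W, b)
print(h.shape, c.shape)  # (16,) (16,)
```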

“The Unreasonable Effectiveness of Recurrent Neural Networks” by Andrej Karpathy is a detailed but easy-to-read introduction to RNNs. It comes with a working implementation of a character-level RNN, albeit in Lua. It’s also a gold standard in technical writing: you can learn a lot by analyzing how he explains technical concepts in some detail while keeping the piece interesting and maintaining forward momentum.
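The core idea in Karpathy’s post is character-level language modeling: feed one character in, get a probability distribution over the next character out, sample from it, and repeat. Here’s a rough Python sketch of that sampling loop with a vanilla RNN cell; every name and weight below is illustrative, not taken from his code:

```python
import numpy as np

def sample_chars(seed_ix, n, vocab_size, params, rng):
    """Generate n character indices from a vanilla RNN, starting from seed_ix."""
    Wxh, Whh, Why, bh, by = params
    h = np.zeros(Whh.shape[0])               # hidden state, carried across steps
    x = np.zeros(vocab_size)
    x[seed_ix] = 1.0                         # one-hot encoding of the seed character
    out = []
    for _ in range(n):
        h = np.tanh(np.dot(Wxh, x) + np.dot(Whh, h) + bh)  # RNN state update
        logits = np.dot(Why, h) + by
        p = np.exp(logits - logits.max())
        p /= p.sum()                         # softmax over the next character
        ix = rng.choice(vocab_size, p=p)     # sample (not argmax) for variety
        x = np.zeros(vocab_size)
        x[ix] = 1.0
        out.append(ix)
    return out

# Toy usage with random (untrained) weights over a 50-character vocabulary.
rng = np.random.RandomState(0)
hidden_dim, vocab_size = 32, 50
params = (0.1 * rng.randn(hidden_dim, vocab_size),
          0.1 * rng.randn(hidden_dim, hidden_dim),
          0.1 * rng.randn(vocab_size, hidden_dim),
          np.zeros(hidden_dim), np.zeros(vocab_size))
print(sample_chars(0, 20, vocab_size, params, rng))
```

With trained weights, mapping the sampled indices back to characters produces the surprisingly coherent text the post is famous for.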

Finally, “Speech Recognition with Deep Recurrent Neural Networks” by Graves, Mohamed, and Hinton gives a technical, detailed treatment of RNNs. It also demonstrates a practical application of RNNs, speech recognition, with state-of-the-art results.

I’m a relative newcomer to deep learning. As I get deeper into it, I’ll update this post with more resources I find helpful.