Chapter 7. Recurrent Neural Networks and Language Models
The neural network architectures we discussed in the previous chapters take fixed-size input and produce fixed-size output. This chapter lifts that constraint by introducing Recurrent Neural Networks (RNNs). RNNs let us handle sequences of variable length by defining a recurrence relation over these sequences (hence the name), where each step's state is computed from the previous state and the current input, as sketched below.
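The following is a minimal sketch of that recurrence in plain NumPy. The weight names (W, U), sizes, and tanh activation here are illustrative assumptions for a vanilla RNN cell, not the API of any particular library:

```python
import numpy as np

state_size, input_size = 4, 3  # illustrative sizes, chosen arbitrarily
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(state_size, state_size))  # state-to-state weights
U = rng.normal(scale=0.1, size=(state_size, input_size))  # input-to-state weights

def rnn_step(s_prev, x):
    """One step of the recurrence s_t = tanh(W s_{t-1} + U x_t):
    the new state depends on the previous state and the current input."""
    return np.tanh(W @ s_prev + U @ x)

# Because the same step is reused at every position, the network can
# process a sequence of any length with the same fixed set of weights:
sequence = [rng.normal(size=input_size) for _ in range(5)]
s = np.zeros(state_size)
for x in sequence:
    s = rnn_step(s, x)
print(s)  # the final state summarizes the whole variable-length sequence
```

Note that the loop, not the weights, grows with the sequence: this weight sharing across time steps is what makes variable-length input possible.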
The ability to process arbitrary sequences of input makes RNNs applicable to natural language processing (NLP) and speech recognition tasks. In fact, RNNs are theoretically applicable to any problem, since they have been proven to be Turing complete: in principle, they can simulate any program that a regular computer can compute. For example, Google's DeepMind has proposed a model called the Differentiable Neural Computer, which can learn how to execute simple algorithms, such as sorting.
In this chapter, we will cover the following topics:
- Recurrent neural networks
- Language modeling
- Sequence to...