Stacking multiple LSTM layers
Just as we can increase the depth of neural networks or CNNs, we can increase the depth of RNNs. In this recipe, we apply a three-layer-deep LSTM to improve our Shakespearean language generation.
Getting ready
We can increase the depth of recurrent neural networks by stacking them on top of each other. Essentially, we take the output sequence of one RNN layer and feed it into the next layer as its input sequence.
To get an idea of how this might work for just two layers, see the following diagram:

Figure 5: In the preceding diagram, we have extended the one-layer RNNs so that they have two layers. For the original one-layer versions, see the diagrams in the introduction to the previous chapter. The left architecture illustrates a way of using a multi-layered RNN to predict a single output from a sequence of inputs. The right architecture shows a way of using a multi-layered RNN to predict a sequence of outputs, feeding each output back in as the next input.
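
To make the stacking concrete in code, here is a minimal sketch of the two-layer case using the Keras LSTM layer; the layer sizes, sequence length, and vocabulary size are illustrative assumptions rather than values from this recipe. The key point is `return_sequences=True` on the lower layer, so that its full output sequence becomes the input sequence of the layer above it:

```python
import tensorflow as tf

# Illustrative (assumed) dimensions; the recipe's actual values may differ.
num_units = 128   # hidden units per LSTM layer
vocab_size = 65   # e.g. number of distinct characters
seq_len = 50      # length of each input sequence

model = tf.keras.Sequential([
    # Lower LSTM layer: return_sequences=True emits an output at every
    # timestep, which becomes the input sequence of the layer above.
    tf.keras.layers.LSTM(num_units, return_sequences=True,
                         input_shape=(seq_len, vocab_size)),
    # Upper LSTM layer consumes the lower layer's output sequence.
    tf.keras.layers.LSTM(num_units, return_sequences=True),
    # Project each timestep back to vocabulary logits.
    tf.keras.layers.Dense(vocab_size),
])
model.summary()
```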
TensorFlow allows easy implementation of multiple...
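
As a rough sketch of how such a stack can be wired up with the TensorFlow 1.x RNN-cell API (an assumption about the API in use here), MultiRNNCell() wraps a list of individual cells so that they behave as one deep cell; the layer count matches this recipe, but the sizes and placeholder shapes below are illustrative assumptions:

```python
import tensorflow as tf  # assumes a TensorFlow 1.x environment

# Illustrative (assumed) dimensions.
num_layers = 3    # a three-layer-deep LSTM, as in this recipe
num_units = 128   # hidden units per layer
batch_size = 100
seq_len = 50
embedding_size = 64

# One LSTM cell per layer, stacked into a single deep cell.
cells = [tf.nn.rnn_cell.LSTMCell(num_units) for _ in range(num_layers)]
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

# Embedded character inputs: [batch, time, embedding_size].
inputs = tf.placeholder(tf.float32, [batch_size, seq_len, embedding_size])

# dynamic_rnn unrolls the stacked cell over time; `outputs` holds the
# top layer's output at every timestep, `final_state` one state per layer.
outputs, final_state = tf.nn.dynamic_rnn(stacked_cell, inputs,
                                         dtype=tf.float32)
```

Because the stacked cell exposes the same cell interface as a single LSTM cell, the rest of the graph (loss computation, sampling, and so on) can generally stay unchanged when moving from one layer to several.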