LSTM
LSTMs are a fundamental advance in RNNs because they introduce long-term dependencies into the cells. Each unrolled cell carries two distinct state lines: one holding long-term state, and the other representing short-term memory.
Between steps, the long-term line forgets less important information and adds filtered information from short-term events, carrying it forward into future steps.
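The two state lines and the forget-then-add update can be sketched in NumPy. This is a minimal, illustrative LSTM step (not a production implementation); the parameter names `W`, `U`, and `b`, and the stacking of the four gates into one matrix, are assumptions for compactness:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the parameters of the forget (f),
    input (i), candidate (g), and output (o) gates along the first axis."""
    z = W @ x + U @ h_prev + b           # shape (4 * hidden,)
    n = h_prev.shape[0]
    f = sigmoid(z[0:n])                  # forget gate: what to drop from c
    i = sigmoid(z[n:2 * n])              # input gate: what new info to admit
    g = np.tanh(z[2 * n:3 * n])          # candidate values from the present
    o = sigmoid(z[3 * n:4 * n])          # output gate: what to expose as h
    c = f * c_prev + i * g               # long-term line: forget, then add
    h = o * np.tanh(c)                   # short-term line (hidden state)
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in))
U = rng.standard_normal((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):  # unroll over 5 time steps
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```

Note that only the hidden state `h` is squashed by `tanh`; the cell state `c` can grow, which is what lets the long-term line carry information across many steps.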
LSTMs are very versatile in their possible applications, and together with GRUs (which we will explain later) they are the most commonly employed recurrent models. Let's break an LSTM down into its components to better understand how it works.
The gate and multiplier operation
LSTMs pursue two fundamental goals: remembering important things from the present, and slowly forgetting unimportant things from the past. What kind of mechanism can we use to apply this kind of filtering? It's called the gate operation.
The gate operation basically takes a multivariate input vector and a filter vector, which...
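A quick sketch of the gate idea: a sigmoid squashes each filter entry into (0, 1), and an element-wise multiply then lets each component of the input pass through in proportion to its filter value. The concrete numbers below are hypothetical, chosen only to show the open/half-open/closed cases:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical values for illustration: a candidate information vector
# and the raw gate pre-activations (in practice both are learned).
values = np.array([0.8, -0.5, 1.2])
gate_logits = np.array([4.0, 0.0, -4.0])

filter_vec = sigmoid(gate_logits)   # each entry squashed into (0, 1)
gated = filter_vec * values         # element-wise "multiplier" operation

print(filter_vec)  # entries near 1 let information through, near 0 block it
print(gated)
```

Because the filter is differentiable everywhere, the network can learn how far to open each gate by ordinary backpropagation.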