Intuition and justification of generative models
So far, we've used neural networks as discriminative models. This simply means that given input data, a discriminative model will map it to a certain label (in other words, a classification). A typical example is the classification of MNIST images in 1 of 10 digit classes, where the neural network maps the input data features (pixel intensities) to the digit label. We can also say this in another way, a discriminative model gives us the probability of

(class), given

(input)

. In the MNIST case, this is the probability of the digit, given the pixel intensities of the image.
On the other hand, a generative model learns the distribution of the classes. You can think of it as the opposite of what the discriminative model does. Instead of predicting the class probability,

, given certain input features, it tries to predict the probability of the input features, given a class,

-

. For example, a generative model will be able to create an image...