Different layers of CNNs
A typical CNN architecture consists of multiple layers, each performing a different task, as shown in the preceding diagram. In this section, we are going to go through them in detail and see how connecting them in a particular way led to such a breakthrough in computer vision.
Input layer
This is the first layer in any CNN architecture. All the subsequent convolution and pooling layers expect the input to be in a specific format. The input variables will be tensors with the following shape:
[batch_size, image_width, image_height, channels]
Here:
batch_size is the number of images in each mini-batch, that is, in each random sample drawn from the original training set when applying stochastic gradient descent.
image_width is the width of the input images to the network.
image_height is the height of the input images to the network.
channels is the number of color channels of the input images. This number could be 3 for RGB images or 1 for grayscale (single-channel) images.
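To make the shape concrete, here is a minimal sketch in Python with NumPy that draws one mini-batch from a stand-in training set. The helper name sample_batch, the array sizes, and the random data are assumptions made purely for illustration and do not come from any particular framework. The axis order follows the shape given above; some frameworks place the height axis before the width axis, so check the convention of the library you use.

```python
import numpy as np


def sample_batch(images, batch_size, rng):
    """Draw a random mini-batch (without replacement) from a 4-D image array.

    `images` is assumed to be laid out as
    [num_images, image_width, image_height, channels].
    """
    indices = rng.choice(images.shape[0], size=batch_size, replace=False)
    return images[indices]


rng = np.random.default_rng(0)

# Stand-in training set (illustrative only): 100 random single-channel
# images of size 28 x 28.
training_set = rng.random((100, 28, 28, 1), dtype=np.float32)

# One mini-batch, as it would be fed to the input layer.
batch = sample_batch(training_set, batch_size=32, rng=rng)

n, width, height, channels = batch.shape
print(batch.shape)  # (32, 28, 28, 1)
```

Only the first axis changes when the batch size changes; the remaining three axes describe a single image and stay fixed for a given network.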
For example, consider our famous MNIST dataset...