Convolutional neural networks
A convolutional neural network (CNN), also known as a ConvNet, is a special type of neural network that is used extensively in computer vision. Its applications range from enabling vision in self-driving cars to automatically tagging friends in your Facebook pictures. CNNs make use of the spatial information in an image, that is, how neighboring pixels are arranged, to recognize its content. But how do they really work? How can a neural network recognize these images? Let's go through this step by step.
A CNN typically consists of three major types of layers:
- Convolutional layer
- Pooling layer
- Fully connected layer
Convolutional layer
When we feed an image as input, it is converted to a matrix of pixel values. For a standard 8-bit image, these pixel values range from 0 to 255, and the dimensions of this matrix are [image height x image width x number of channels]. If the input is a 64 x 64 color image, then the pixel matrix dimension would be 64 x 64 x 3, where the 3 refers to the number of channels. A grayscale image has 1 channel and color images have 3 channels (RGB...
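To make the height x width x channels layout concrete, here is a minimal sketch in plain Python. It is an illustration only: the helper names `make_image` and `image_shape` are hypothetical, and nested lists stand in for the array type that an image library would normally return.

```python
# Minimal sketch: an image as a height x width x channels grid of pixel values.
# make_image and image_shape are hypothetical helpers written for this example.

def make_image(height, width, channels, fill=0):
    """Build a height x width x channels image with every value set to `fill`."""
    if not 0 <= fill <= 255:
        raise ValueError("8-bit pixel values must lie in the range 0 to 255")
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]

def image_shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

color = make_image(64, 64, 3)   # color image: 3 channels (red, green, blue)
gray = make_image(64, 64, 1)    # grayscale image: 1 channel

color[0][0] = [255, 0, 0]       # set the top-left pixel to pure red

print(image_shape(color))       # (64, 64, 3)
print(image_shape(gray))        # (64, 64, 1)
print(64 * 64 * 3)              # 12288 values in the color image
```

Each pixel of the color image holds three values, one per channel, so a 64 x 64 color image carries three times as many numbers as a grayscale image of the same size.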