Training a CNN
In the previous section, we have seen how to construct a CNN and apply different operations on its different layers. Now when it comes to training a CNN, it is much trickier as it needs a lot of considerations to control those operations such as applying appropriate activation function, weight and bias initialization, and of course, using optimizers intelligently.
There are also some advanced considerations such as hyperparameter tuning for optimized too. However, that will be discussed in the next section. We first start our discussion with weight and bias initialization.
Weight and bias initialization
One of the most common initialization techniques in training a DNN is random initialization. The idea of using random initialization is just sampling each weight from a normal distribution of the input dataset with low deviation. Well, a low deviation allows you to bias the network towards the simple 0 solutions.
But what does it mean? The thing is that, the initialization can...