Chapter 4, Object Recognition Using Neural Networks and Supervised Learning
- This is an exercise for the student. You should see different loss curves develop as the activation function is changed. Some activations will not learn at all (the curve stays flat at the same level and the results look random, since no learning is taking place), while others will learn faster or slower. A sketch of such a comparison follows below.
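As an illustrative sketch only (not the book's code), the following trains the same small Keras model with several activation functions on a synthetic two-class dataset and compares the resulting loss curves; the dataset, layer sizes, and epoch count are all arbitrary assumptions.

```python
# Sketch: compare loss curves as the activation function is changed.
# Assumes TensorFlow/Keras is installed; all sizes are arbitrary.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 4)).astype("float32")   # synthetic inputs
y = (x.sum(axis=1) > 0).astype("float32")         # synthetic labels

curves = {}
for act in ["relu", "tanh", "sigmoid", "linear"]:
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation=act),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    history = model.fit(x, y, epochs=20, verbose=0)
    curves[act] = history.history["loss"]          # one loss curve per activation

for act, losses in curves.items():
    print(f"{act:8s} epoch 1 loss {losses[0]:.3f} -> epoch 20 loss {losses[-1]:.3f}")
```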
- Refer to Figure 3 in the chapter. The artificial neuron has a number of inputs, a set of weights (one for each input), a bias, an activation function, and a set of outputs. A minimal sketch follows below.
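A minimal NumPy sketch of one such neuron, with the parts named above; the sigmoid activation and all values are illustrative assumptions, not from the chapter.

```python
# Sketch: one artificial neuron with inputs, weights, a bias,
# and an activation function, matching the parts listed above.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, squashed by the activation.
    return sigmoid(np.dot(weights, inputs) + bias)

x = np.array([0.5, -1.2, 0.3])   # one input per connection
w = np.array([0.8, 0.1, -0.4])   # one weight per input
b = 0.2                          # bias shifts the firing threshold
print(neuron(x, w, b))           # the neuron's output
```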
- Both have multiple inputs and multiple outputs: each accepts inputs, performs some processing, and produces an output. Both use some form of activation to determine when to “fire,” that is, when to produce an output.
- The natural neuron is an analog device that can handle many levels or degrees of input, with none of the simple on/off binary representation of the computer neuron. Natural neurons use chemical pathways that become easier to activate the more they are used, which is the learning function of the neuron; this is simulated by the weights in an artificial neuron. The natural neuron has an axon, a connecting body that extends out to outputs that can be quite a distance from the nerve inputs. Natural neurons are randomly connected to other neurons, while artificial neurons are connected in regular patterns.
- The first layer is the input layer; it has one neuron for each input to the network.
- The last layer of an ANN is the output layer and must have the same number of neurons as there are outputs. A sketch showing both layer-sizing rules follows below.
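As an illustrative sketch (the feature and class counts here are assumed for demonstration), a Keras model obeying both rules might look like this:

```python
# Sketch: the first layer is sized to the number of inputs,
# and the last layer is sized to the number of outputs.
import tensorflow as tf

NUM_INPUTS = 10   # e.g., ten measured features per sample (assumed)
NUM_OUTPUTS = 3   # e.g., three classes to recognize (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_INPUTS,)),           # one slot per network input
    tf.keras.layers.Dense(16, activation="relu"),  # hidden layer (size arbitrary)
    tf.keras.layers.Dense(NUM_OUTPUTS, activation="softmax"),  # one neuron per output
])
model.summary()
```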
- Loss functions are the error functions of an ANN: they compare the expected output of the network with its actual output. The two loss functions below are sketched in code after the cross-entropy item.
- Mean square loss: The most commonly used loss function. It is the mean of the squared distances between the actual output and the expected output (MSE = (1/n) Σ(yᵢ − ŷᵢ)²), so squaring amplifies the error the farther the output is from the desired solution.
- Cross-entropy: Also called log loss. Used mostly for classification networks such as CNNs. As the predicted probability for the correct class approaches 1 (no error), the cross-entropy (XE) slowly decreases toward zero; as the prediction diverges from the true label, XE increases rapidly. Two types of cross-entropy are binary cross-entropy (on/off, used for yes/no questions) and sigmoid cross-entropy, which can handle multiple, non-exclusive classes.
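A minimal NumPy sketch of both loss functions described above; the variable names and the clipping constant are illustrative assumptions.

```python
# Sketch: mean squared error and binary cross-entropy, as described above.
import numpy as np

def mean_squared_error(expected, actual):
    # Squaring the distance amplifies errors far from the desired output.
    return np.mean((expected - actual) ** 2)

def binary_cross_entropy(expected, predicted, eps=1e-12):
    # Clip to avoid log(0); the loss shrinks slowly as predictions approach
    # the true label and grows rapidly as they diverge from it.
    p = np.clip(predicted, eps, 1.0 - eps)
    return -np.mean(expected * np.log(p) + (1 - expected) * np.log(1 - p))

y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8, 0.4])
print(mean_squared_error(y_true, y_pred))    # small distances, small loss
print(binary_cross_entropy(y_true, y_pred))  # penalizes the 0.4 prediction hardest
```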
- You are probably “overfitting”: your sample size is too small, or your network is not wide or deep enough. Comparing the training loss against a held-out validation loss, as in the sketch below, helps diagnose which case you are in.
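As a hedged diagnostic sketch (the data, model, and sizes are placeholder assumptions): training loss that keeps falling while validation loss rises points to overfitting, while both losses staying high suggests the network lacks capacity.

```python
# Sketch: use a validation split to compare training vs. validation loss.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 8)).astype("float32")   # placeholder data
y = (x[:, 0] > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
history = model.fit(x, y, epochs=30, validation_split=0.2, verbose=0)

# Training loss falling while validation loss rises => likely overfitting.
# Both losses stuck high => the network may be too small (underfitting).
print("final train loss:", history.history["loss"][-1])
print("final val loss:  ", history.history["val_loss"][-1])
```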