Linear classification
Let's consider a generic linear classification problem with two classes. The following graph shows an example:

Bidimensional scenario for a linear classification problem
Our goal is to find an optimal hyperplane that separates the two classes. Multi-class problems are normally handled with a one-vs-all strategy, which reduces them to a set of binary problems, so the discussion can focus on binary classification only. Suppose we have the following dataset made up of n m-dimensional samples:
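In symbols, assuming the samples are indexed from 1 to n, the dataset can be written as follows (the index name i is a choice made here for illustration):

```latex
X = \{x_1, x_2, \ldots, x_n\} \quad \text{where } x_i \in \mathbb{R}^m
```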

This dataset is associated with the following target set:
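A conventional way to write the target set, assuming one label per sample and either a binary or a bipolar encoding, is:

```latex
Y = \{y_1, y_2, \ldots, y_n\} \quad \text{where } y_i \in \{0, 1\} \text{ or } y_i \in \{-1, +1\}
```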

Generally, there are two equivalent options: binary outputs, with labels in {0, 1}, and bipolar outputs, with labels in {-1, 1}. Different algorithms are based on one or the other without any substantial difference. The choice is normally made to simplify the computation and has no impact on the results.
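The equivalence of the two encodings can be illustrated with a minimal sketch: an affine map converts binary labels to bipolar ones and back without losing information. The function names and the sample labels are hypothetical, chosen only for this example.

```python
def binary_to_bipolar(labels):
    """Map labels in {0, 1} to {-1, +1} using y -> 2y - 1."""
    return [2 * y - 1 for y in labels]


def bipolar_to_binary(labels):
    """Map labels in {-1, +1} to {0, 1} using y -> (y + 1) / 2."""
    return [(y + 1) // 2 for y in labels]


# Hypothetical binary targets for four samples.
y_binary = [0, 1, 1, 0]
y_bipolar = binary_to_bipolar(y_binary)

# Converting back recovers the original labels.
y_roundtrip = bipolar_to_binary(y_bipolar)
```

Because the mapping is invertible, a dataset labeled in one convention can be passed to an algorithm that expects the other after a one-line conversion.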
We can now define a weight vector made of m continuous components:
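Written out, with one real-valued weight per feature (the index name j is an assumption made here), the weight vector is:

```latex
w = (w_1, w_2, \ldots, w_m) \quad \text{where } w_j \in \mathbb{R}
```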

We can also define the quantity z:
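As a minimal sketch, assume z is the dot product of the weight vector and a sample, z = w_1 x_1 + ... + w_m x_m, with no separate bias term (a bias can be absorbed by appending a constant 1 to every sample). The function name, weights, and samples below are hypothetical values chosen for illustration.

```python
def hyperplane_value(w, x):
    """Return z = w . x for a single sample x (no separate bias term)."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same number of components")
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


# Hypothetical weight vector and two samples with m = 2 features.
w = [1.0, -1.0]
samples = [[2.0, 0.5], [0.5, 2.0]]

# One z value per sample; the two samples fall on opposite sides of w . x = 0.
z_values = [hyperplane_value(w, x) for x in samples]
```

With these values the two samples produce z values of opposite sign, meaning they lie on opposite sides of the hyperplane w . x = 0.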

If x is a variable, z is the value determined by the hyperplane equation. Therefore, in a bipolar scenario, if...