SVMs with scikit-learn
To allow the model to have a more flexible separating hyperplane, all scikit-learn implementations are based on a simple variant that includes so-called slack variablesζi in the function to minimize (sometimes called C-Support Vector Machines):

In this case, the constraints become the following:

The introduction of slack variables allows us to create a flexible margin so that some vectors belonging to a class can also be found in the opposite part of the hyperspace and can be included in the model training. The strength of this flexibility can be set using the C parameter. Small values (close to zero) bring about very hard margins, while values greater than or equal to 1 allow more and more flexibility (while alsoincreasing the misclassification rate). Equivalently, when C → 0, the number of support vectors is minimized, while larger C values increase it.
The right choice ofCis not immediate, but the best value can be found automatically by using a grid search, as seen...