Depending on your goal, different metrics can be used to measure the performance of a model. When dealing with regression, a commonly used metric is the Mean Squared Error (MSE), which is the average of the squared differences between the true and predicted values. The lower the MSE, the closer the predictions are to the true values.
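However, in a classification problem, it doesn't make sense to use this metric, especially for multi-class problems: class labels are categories rather than quantities, so the numerical difference between two labels has no meaning.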
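To make this concrete, here is a minimal sketch showing that the MSE computed on class labels depends on how the classes happen to be numbered. The labels, the `mse` helper, and the `rename` mapping are all made up for illustration and do not come from the dataset used in this chapter.

```python
def mse(y_true, y_pred):
    """Average of the squared differences between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical three-class problem; the integers are only names for the classes.
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]  # a single mistake: class 1 predicted as class 2

# Give the same classes different integer codes (an arbitrary choice).
rename = {0: 0, 1: 10, 2: 5}
y_true_renamed = [rename[label] for label in y_true]
y_pred_renamed = [rename[label] for label in y_pred]

mse_original = mse(y_true, y_pred)
mse_renamed = mse(y_true_renamed, y_pred_renamed)

print(mse_original)  # 0.2
print(mse_renamed)   # 5.0
```

The predictions are identical in both cases, with one mistake out of five, yet the MSE changes from 0.2 to 5.0 simply because the classes were renumbered.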
The first indicator we may want to check when running a classifier is the accuracy, A, which is defined as the number of observations classified correctly divided by the total number of observations.
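The definition can be checked by hand on a small example. The sketch below uses made-up string labels, not the data from this chapter, to show that accuracy only needs an equality test between the true and predicted labels.

```python
# Hypothetical true and predicted labels for five observations.
y_true = ["cat", "dog", "bird", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

# Count the observations where the prediction matches the true label.
n_correct = sum(t == p for t, p in zip(y_true, y_pred))

# Accuracy A = correctly classified observations / total observations.
accuracy = n_correct / len(y_true)

print(f"{n_correct} of {len(y_true)} correct, A = {accuracy:.2f}")
# 4 of 5 correct, A = 0.80
```

Because only equality is tested, the result does not depend on how the classes are encoded.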
We can compute the accuracy with scikit-learn using the following:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
Here, y_pred was computed for the test sample using our fitted classifier:
y_pred = clf.predict(X_test)
This function will tell us we have an overall accuracy of 66%, which is not a terrific score for...