Examining logistic regression errors with a confusion matrix
Getting ready
Import and view the confusion matrix for the logistic regression we constructed:
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred, labels=[1, 0])

array([[27, 27],
       [12, 88]])
We passed three arguments to the confusion_matrix function:
y_test: The test target set
y_pred: Our logistic regression predictions
labels: The class labels in the order used to index the matrix; listing 1 first makes it the positive class
The labels = [1,0] argument means that the positive class is 1 and the negative class is 0. In the medical context, we found while exploring the Pima Indians diabetes dataset that class 1 represents patients who tested positive for diabetes.
Here is the confusion matrix again, in pandas DataFrame form:
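A minimal sketch of how such a DataFrame could be built from the array above. The counts are copied from the output of confusion_matrix; the row and column names are illustrative assumptions, not necessarily the headers used in the original table.

```python
import pandas as pd

# Counts copied from the array printed earlier in this recipe,
# produced with labels=[1, 0] (positive class listed first).
cm = [[27, 27],
      [12, 88]]

# Rows are actual classes and columns are predicted classes.
# These names are illustrative assumptions.
cm_df = pd.DataFrame(
    cm,
    index=["Actual 1 (diabetes)", "Actual 0 (no diabetes)"],
    columns=["Predicted 1", "Predicted 0"],
)

print(cm_df)
```

In practice, the nested list would be replaced by the array returned from confusion_matrix(y_test, y_pred, labels=[1, 0]).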

How to do it...
Reading the confusion matrix
The small array of numbers has the following meaning:
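As a rough illustration, the four cells can be unpacked by name. This sketch assumes scikit-learn's convention that rows are actual classes and columns are predicted classes, both in the order given by labels = [1,0]; the variable names are our own.

```python
# Counts copied from the array printed earlier, produced with labels=[1, 0].
cm = [[27, 27],
      [12, 88]]

# First row: actual class 1; second row: actual class 0.
# First column: predicted class 1; second column: predicted class 0.
(tp, fn), (fp, tn) = cm

total = tp + fn + fp + tn
# Correct classifications sit on the diagonal (upper-left to lower-right).
accuracy = (tp + tn) / total

print(f"True positives  (actual 1, predicted 1): {tp}")
print(f"False negatives (actual 1, predicted 0): {fn}")
print(f"False positives (actual 0, predicted 1): {fp}")
print(f"True negatives  (actual 0, predicted 0): {tn}")
print(f"Accuracy: {accuracy:.3f}")
```

Reading the cells this way shows that the model missed half of the actual diabetes cases (27 of 54), something the accuracy score alone does not reveal.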

The confusion matrix tells us more about what occurred during classification than the accuracy score alone. The diagonal elements from upper-left to lower-right are correct classifications. There...