Using SGD for classification
Stochastic gradient descent (SGD) is a fundamental technique for fitting regression models, and it extends naturally to classification: the same optimization procedure works for both, only the cost function changes.
Getting ready
In regression, we minimized a cost function that penalized bad predictions on a continuous scale; for classification, we'll minimize a cost function that penalizes mistakes across two (or more) discrete classes.
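To make the contrast concrete, here is a minimal sketch (not from the recipe itself) comparing a regression-style squared error with the hinge loss, the margin-based classification loss that `SGDClassifier` uses by default. The labels, scores, and the `hinge_loss` helper are illustrative assumptions:

```python
# Regression cost: penalizes on a continuous scale, e.g. squared error.
y_true, y_pred = 3.0, 2.5
squared_error = (y_true - y_pred) ** 2  # 0.25

# Classification cost: with labels in {-1, +1} and a raw decision
# score f(x), the hinge loss is zero for confidently correct
# predictions and grows linearly past the margin.
def hinge_loss(y, score):
    return max(0.0, 1.0 - y * score)

print(hinge_loss(+1, 2.0))   # correct and outside the margin: 0.0
print(hinge_loss(+1, -0.5))  # wrong side of the boundary: 1.5
```

The key point is that the classification loss only cares about which side of the decision boundary (and margin) an example falls on, not about matching a continuous target exactly.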
How to do it...
- First, let's create some very basic data:
from sklearn import datasets

X, y = datasets.make_classification(n_samples=500)
- Split the data into training and testing sets:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)
- Instantiate and train the classifier:
from sklearn import linear_model

sgd_clf = linear_model.SGDClassifier()

# As usual, we'll fit the model:
sgd_clf.fit(X_train, y_train)
- Measure the performance on the test set:
from sklearn.metrics import accuracy_score

accuracy_score(y_test, sgd_clf.predict(X_test))
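An accuracy number is easier to judge against a trivial baseline. One way to sketch this, using scikit-learn's `DummyClassifier` (not part of the original recipe; the `random_state` values are illustrative):

```python
from sklearn import datasets, linear_model
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = datasets.make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# The SGD classifier versus a model that always predicts the majority class.
sgd_clf = linear_model.SGDClassifier(random_state=0).fit(X_train, y_train)
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

print(accuracy_score(y_test, sgd_clf.predict(X_test)))
print(accuracy_score(y_test, baseline.predict(X_test)))
```

On a balanced two-class problem like this one, the baseline sits near 0.5, so a trained classifier should land well above it.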