Regression metrics
Cross-validation with a regression metric is straightforward with scikit-learn. Either import a score function from sklearn.metrics and wrap it with the make_scorer function, or create a custom scorer for a particular data science problem.
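Both routes can be sketched briefly. The example below is illustrative only: the synthetic data from make_regression, the LinearRegression model, and the helper max_abs_error are assumptions introduced here, not part of the recipe. It wraps a built-in metric with make_scorer, builds a scorer from a hand-written metric, and passes each to cross_val_score.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for a real regression problem (an assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Route 1: wrap a built-in metric. greater_is_better=False marks the metric
# as a loss, so the resulting scorer returns its negative.
mae_scorer = make_scorer(mean_absolute_error, greater_is_better=False)


# Route 2: a hypothetical custom metric, the largest absolute residual.
def max_abs_error(y_true, y_pred):
    return np.max(np.abs(np.asarray(y_true) - np.asarray(y_pred)))


max_err_scorer = make_scorer(max_abs_error, greater_is_better=False)

model = LinearRegression()
mae_scores = cross_val_score(model, X, y, cv=5, scoring=mae_scorer)
max_err_scores = cross_val_score(model, X, y, cv=5, scoring=max_err_scorer)

print("Negated MAE per fold:", mae_scores)
print("Negated max error per fold:", max_err_scores)
```

Because both metrics are losses, the scores come back negated: values closer to zero indicate a better fit.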
Getting ready
Load a dataset with a continuous target, so that a regression metric applies. We will load the Boston housing dataset and split it into training and test sets:
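# Note: load_boston was removed in scikit-learn 1.2, so this listing needs an older release.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split, cross_val_score

# Load the features and the target (median house value)
boston = load_boston()
X = boston.data
y = boston.target

# Hold out 20% of the rows as a test set; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)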
We do not know much about the dataset yet, so we can try a quick randomized hyperparameter search using a high-variance algorithm, k-nearest neighbors:
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import RandomizedSearchCV

knn_reg = KNeighborsRegressor()
param_dist = {'n_neighbors': list(range(3, 20, 1))}
rs = RandomizedSearchCV(knn_reg,param_dist...
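For orientation, a randomized search of this kind could be run end to end roughly as follows. This is a sketch under assumptions, not the call used in the recipe: the synthetic data and the n_iter, cv, scoring, and random_state values are all chosen here for illustration, and the variable name search is hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic regression data (an assumption; the recipe uses Boston housing).
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)

# Same candidate values for n_neighbors as in the recipe.
param_dist = {'n_neighbors': list(range(3, 20, 1))}

# Assumed settings: sample 5 candidates, score each with 5-fold
# cross-validation on negated mean absolute error.
search = RandomizedSearchCV(
    KNeighborsRegressor(),
    param_dist,
    n_iter=5,
    cv=5,
    scoring='neg_mean_absolute_error',
    random_state=7,
)
search.fit(X_train, y_train)

best_k = search.best_params_['n_neighbors']
best_score = search.best_score_
print("Best n_neighbors:", best_k)
print("Best cross-validated score (negated MAE):", best_score)
```

The scoring string selects a built-in regression scorer; a scorer built with make_scorer could be passed in its place.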