Using dummy estimators to compare results
This recipe is about creating dummy estimators. They aren't the pretty or exciting part of modeling, but they give you a worthwhile reference point for the model you'll eventually build.
Getting ready
In this recipe, we'll perform the following tasks:
- Create some random data.
- Fit the various dummy estimators.
We'll perform these two steps for regression data and classification data.
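For the classification side of these two steps, a minimal sketch might look like the following. It assumes the `most_frequent` strategy of `DummyClassifier`; the variable names and the `random_state=0` seed are illustrative choices, not taken from the recipe.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier

# Random classification data; the fixed seed is an assumption made here
# so that the sketch gives the same data on every run.
X_clf, y_clf = make_classification(random_state=0)

# A baseline that always predicts the most common training label.
clf_baseline = DummyClassifier(strategy="most_frequent").fit(X_clf, y_clf)
clf_predictions = clf_baseline.predict(X_clf)

# Every prediction is the single majority label.
majority_label = np.bincount(y_clf).argmax()
assert (clf_predictions == majority_label).all()

# The accuracy of this baseline is the share of the majority class.
print(clf_baseline.score(X_clf, y_clf))
```

Any real classifier should beat this accuracy; if it doesn't, that usually points to a problem with the features or the evaluation setup.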
How to do it...
- First, we'll create the random data and fit a dummy regressor to it:
from sklearn.datasets import make_regression, make_classification
X, y = make_regression()
from sklearn import dummy
dumdum = dummy.DummyRegressor()
dumdum.fit(X, y)
DummyRegressor(constant=None, quantile=None, strategy='mean')
- By default, the estimator predicts the mean of the training target values and outputs that same value for every sample:
dumdum.predict(X)[:5]
>array([-25.0450033, -25.0450033, -25.0450033, -25.0450033, -25.0450033])
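To confirm that the default strategy really is just the training mean, a small check along these lines can help. This is a sketch, not part of the recipe; the `random_state=0` seed and the variable names are assumptions added so the run is repeatable.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor

# Fixed seed (an assumption for this sketch) so the data is the same each run.
X, y = make_regression(random_state=0)

# strategy="mean" is the default; it is spelled out here for clarity.
baseline = DummyRegressor(strategy="mean").fit(X, y)
predictions = baseline.predict(X)

# Every prediction equals the mean of the training targets.
assert np.allclose(predictions, y.mean())

# R^2 of the mean baseline on its own training data is zero up to rounding,
# which is why it is a natural floor for comparing regression models.
print(baseline.score(X, y))
```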
There are two other strategies we can try. We can predict a supplied constant (refer to constant=None...