Using KNN for regression
Regression is covered elsewhere in the book, but we might also want to run a regression on pockets of the feature space. We can think of our dataset as being generated by several distinct data processes; if that's the case, training only on similar data points is a good idea.
Getting ready
Our old friend, regression, can be used in the context of neighborhoods. Regression is obviously a supervised technique, so we'll use K-Nearest Neighbors (KNN) regression rather than an unsupervised method such as k-means. In KNN regression, the prediction for a point is built from the K closest points in the feature space, rather than from the entire space as in ordinary regression.
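To make the idea concrete before we touch real data, here is a minimal sketch of KNN regression written by hand: the prediction for a query point is just the mean target value of its K nearest training points. The toy data and the `knn_predict` helper are illustrative inventions, not part of the recipe's dataset.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict y for a single point x by averaging the targets
    of its k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()

# Toy data: y depends on which "pocket" of the space x lies in.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y = np.array([1.0, 1.1, 0.9, 10.0, 10.2, 9.8])

# A query near the first pocket is averaged only with that pocket's
# points, so the second pocket's very different targets never leak in.
print(knn_predict(X, y, np.array([0.15])))
```

Notice that the prediction is local by construction: a global linear regression on this toy data would be pulled toward both pockets at once.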
How to do it…
For this recipe, we'll use the iris dataset. If we want to predict something such as the petal width for each flower, restricting ourselves to flowers of the same iris species can potentially give us better results. The KNN regression won't cluster by species, but we'll work under the assumption that the Xs will be close for flowers of the same species, in this case, the petal length.
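The setup above can be sketched with scikit-learn's `KNeighborsRegressor`; the choice of the first three columns as features and `n_neighbors=10` are assumptions for illustration, not values fixed by the recipe.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsRegressor

iris = load_iris()
X = iris.data[:, :3]   # sepal length, sepal width, petal length
y = iris.data[:, 3]    # petal width is the regression target

# Each prediction is the mean petal width of the 10 nearest flowers
# in the feature space, so points of the same species dominate.
knn = KNeighborsRegressor(n_neighbors=10)
knn.fit(X, y)
preds = knn.predict(X)

print("mean absolute error:", np.abs(preds - y).mean())
```

Because neighbors of an iris tend to belong to the same species, the local average behaves like a per-species regression without ever using the species labels.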