Computing with mixed distance functions
When dealing with data observations that have multiple features, we should be aware that features can be scaled differently, on different scales. In this recipe, we will account for that to improve our housing value predictions.
Getting ready
It is important to extend the nearest-neighbor algorithm, to take into account variables that are scaled differently. In this example, we will illustrate how to scale the distance function for different variables. Specifically, we will scale the distance function as a function of the feature variance.
The key to weighting the distance function is to use a weight matrix. The distance function, written with matrix operations, becomes the following formula:

Here, A is a diagonal weight matrix that we will use to scale the distance metric for each feature.
In this recipe, we will try to improve our MSE on the Boston housing value dataset. This dataset is a great example of features that are on different scales, and the...