Finding the closest object in the feature space
Sometimes the simplest approach is to measure the distance between two objects. We just need to choose a distance metric, compute the pairwise distances, and compare the results with what we expect.
Getting ready
A lower-level utility in scikit-learn is sklearn.metrics.pairwise. It contains several functions for computing distances between the vectors in a matrix X, or between the vectors in X and those in Y. This can be useful for information retrieval. For example, given a set of customers with attributes X, we might want to take a reference customer and find the customers closest to it.
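As a minimal sketch of the reference-customer idea, the snippet below uses pairwise_distances on a small made-up matrix. The customer attributes (age and visits per month) and their values are assumptions chosen purely for illustration, and the default Euclidean metric is used.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Hypothetical customer attributes: [age, visits per month]
X = np.array([
    [25.0, 2.0],
    [27.0, 3.0],
    [45.0, 10.0],
    [52.0, 1.0],
])

# Distances between every pair of rows in X (Euclidean by default)
D = pairwise_distances(X)

# Distances from one reference customer (row 0) to every row in X
ref_index = 0
reference = X[[ref_index]]  # keep it 2-D: shape (1, 2)
d_ref = pairwise_distances(reference, X).ravel()

# Sort by distance and drop the reference itself to get its nearest neighbour
order = np.argsort(d_ref)
order = order[order != ref_index]
closest = order[0]

print(D.shape)
print(closest, d_ref[closest])
```

Passing a second array computes distances between the rows of the first and the rows of the second, which is why the reference customer is kept as a one-row matrix rather than a flat vector.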
In fact, we might want to rank customers by similarity, as measured by a distance function. The quality of that similarity measure depends on the choice of feature space as well as on any transformations we apply to it. We'll walk through several different scenarios of measuring distance.
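To illustrate how a transformation of the feature space can change a similarity ranking, here is a small sketch that ranks customers by distance to a reference, first on raw features and then after standardization. The data (age and annual income) is invented for the example, and StandardScaler is just one possible transformation, not necessarily the one a given problem calls for.

```python
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.preprocessing import StandardScaler

# Hypothetical customers: [age, annual income]; row 0 is the reference
X = np.array([
    [30.0, 50000.0],
    [31.0, 58000.0],
    [60.0, 51000.0],
    [45.0, 54000.0],
])

def rank_by_distance(features, ref_index=0):
    """Return row indices ordered from nearest to farthest from the reference row."""
    d = pairwise_distances(features[[ref_index]], features).ravel()
    return np.argsort(d)

# On raw features, income dominates because its scale is much larger than age
raw_rank = rank_by_distance(X)

# After standardization, both features contribute on a comparable scale
X_scaled = StandardScaler().fit_transform(X)
scaled_rank = rank_by_distance(X_scaled)

print("raw ranking:   ", raw_rank)
print("scaled ranking:", scaled_rank)
```

With this toy data the two rankings differ: on the raw features the customer with the most similar income comes first, while after scaling a customer who is moderately close on both age and income moves ahead.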
How to do it...
We will use the pairwise_distances function...