Similarity measures
From the rating matrix in the previous section, we see that every user can be represented as a j-dimensional vector, where j is the number of items and the kth dimension denotes the rating that user gave to the kth item. For instance, let 1 denote a like, -1 a dislike, and 0 no rating. User B can then be represented as (0, 1, -1, -1). Similarly, every item can be represented as an i-dimensional vector, where i is the number of users and the kth dimension denotes the rating given to that item by the kth user. The video games item is therefore represented as (1, -1, 0, 0, -1).
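As a minimal sketch of this representation, the rating matrix can be stored as a list of rows, one per user, so that a user vector is a row and an item vector is a column. Only user B's row and the video games column come from the text. Everything else here is an assumption made for illustration: the remaining ratings, the placeholder item names, the ordering of users as A to E, the position of video games as the last column, and the helper names user_vector and item_vector.

```python
# Encoding from the text: 1 = like, -1 = dislike, 0 = no rating.
users = ["A", "B", "C", "D", "E"]  # assumed ordering of the five users
items = ["item 1", "item 2", "item 3", "video games"]  # placeholder names; column position assumed

# Rows are users, columns are items.
# Only row B and the "video games" column are taken from the text;
# all other values are hypothetical filler.
ratings = [
    [1, 0, -1, 1],    # A
    [0, 1, -1, -1],   # B
    [-1, 1, 0, 0],    # C
    [1, 0, 1, 0],     # D
    [0, -1, 1, -1],   # E
]


def user_vector(user):
    """Return the j-dimensional vector of ratings given by one user (a row)."""
    return tuple(ratings[users.index(user)])


def item_vector(item):
    """Return the i-dimensional vector of ratings received by one item (a column)."""
    col = items.index(item)
    return tuple(row[col] for row in ratings)


print(user_vector("B"))            # (0, 1, -1, -1)
print(item_vector("video games"))  # (1, -1, 0, 0, -1)
```

Any two user vectors have the same length (one entry per item), and any two item vectors have the same length (one entry per user), which is what allows a similarity score to be computed between them.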
We have already computed a similarity score for vectors of the same dimension when we built our content-based recommendation engine. In this section, we will look at other similarity measures and revisit the cosine similarity score in the context of those measures.
Euclidean distance
The Euclidean distance can be defined as the length of the line segment joining the two data points plotted on an n-dimensional Cartesian...