The first example – the k-means clustering algorithm
The k-means clustering algorithm is a clustering algorithm to group a set of items not previously classified into a predefined number of k clusters. It's very popular within the data mining and machine learning world to organize and classify data in an unsupervised way.
Each item is normally defined by a vector of characteristics or attributes. All the items have the same number of attributes. Each cluster is also defined by a vector with the same number of attributes that represents all the items classified into that cluster. This vector is named the centroid. For example, if the items are defined by numeric vectors, the clusters are defined by the mean of the items classified into that cluster.
Basically, the algorithm has four steps:
Initialization: In the first step, you have to create the initial vectors that represent the K clusters. Normally, you will initialize those vectors randomly.
Assignment: Then, you classify each item into...