Clustering data with the k-means method
k-means clustering is a flat clustering technique: it produces a single partition of the data into k clusters. Hierarchical clustering does not require the user to fix the number of clusters at the beginning, whereas the k-means method requires k to be chosen before the algorithm runs. In exchange, k-means clustering is much faster than hierarchical clustering, because constructing a hierarchical tree is very time consuming. In this recipe, we will demonstrate how to perform k-means clustering on the customer dataset.
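The difference in when the number of clusters is chosen can be sketched in a few lines. This is only an illustration under stated assumptions: the toy matrix is synthetic stand-in data, not the customer dataset, and the names toy, km, hc, and hc_labels are hypothetical.

```r
# Synthetic stand-in data (an assumption; not the recipe's customer dataset):
# two well-separated groups of 20 points each in two dimensions
set.seed(22)
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2)
)

# k-means: the number of clusters must be passed to the call itself
km <- kmeans(toy, centers = 2)

# Hierarchical clustering: build the whole tree first,
# then decide on the number of clusters when cutting it
hc <- hclust(dist(toy))
hc_labels <- cutree(hc, k = 2)

# Cross-tabulate the two sets of labels
table(kmeans = km$cluster, hclust = hc_labels)
```

Note that kmeans returns one flat set of labels, while the hclust tree can be cut again with a different k without being rebuilt.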
Getting ready
In this recipe, we will continue to use the customer dataset as the input data source to perform k-means clustering.
How to do it...
Perform the following steps to cluster the customer dataset with the k-means method:
- First, you can use kmeans to cluster the customer data:
> set.seed(22)
> fit = kmeans(customer, 4)
> fit
Output
K-means clustering with 4 clusters of sizes 8, 11, 16, 25
Cluster means:
...
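Printing the fitted object shows everything at once, but the individual pieces can also be read directly. The following is a minimal sketch, assuming a synthetic stand-in matrix (stand_in is a hypothetical name and random data, not the customer dataset), so its numbers will not match the output above.

```r
set.seed(22)

# Hypothetical stand-in for the customer data: 60 rows, 4 scaled numeric columns
stand_in <- scale(matrix(rnorm(240), ncol = 4))

# Same call shape as in the recipe: data first, then the number of clusters
fit <- kmeans(stand_in, 4)

fit$size           # number of observations in each cluster
fit$centers        # one row of variable means per cluster
head(fit$cluster)  # cluster labels of the first few observations
fit$tot.withinss   # total within-cluster sum of squares
```

Because kmeans starts from randomly chosen centers, calling set.seed beforehand, as the recipe does, keeps the result reproducible between runs.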