Density-based Clustering with the DBSCAN Algorithm
My previous post was about grid-based clustering. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) takes another approach, called density-based clustering. It grows regions whose density is above a provided threshold into clusters, and it can discover clusters of arbitrary shape.
DBSCAN Implementation
First, the data is loaded and a distance matrix is calculated from the data points.
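As a rough illustration of the distance matrix step, here is a minimal sketch in plain Python. It assumes the points are tuples of numbers and that Euclidean distance is the metric; the post does not say which metric is used, and the helper name distance_matrix and the sample points are made up for this example.

```python
import math

def distance_matrix(points):
    """Return an n x n list of pairwise Euclidean distances (assumed metric)."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so compute each pair once and mirror it.
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = d
            dist[j][i] = d
    return dist

# Hypothetical sample data, not taken from the post.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist = distance_matrix(points)
```

Storing the full matrix costs O(n^2) memory, which is fine for small data sets but is usually replaced by a spatial index for larger ones.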
The algorithm visits every data point and finds its neighbours. If the neighbourhood is dense enough, meaning it contains at least a minimum number of points within a given radius, then the cluster is expanded to include those points as well.
So data points that are close enough to each other are included in the same cluster.
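The visiting and expanding steps described above can be sketched roughly as follows. This is not the code behind this post, only a small illustration under some assumptions: the names dbscan, eps (neighbourhood radius) and min_pts (density threshold) are chosen for this example, a point counts as its own neighbour, and noise is labelled -1.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not processed yet

def dbscan(dist, eps, min_pts):
    """Cluster points given a precomputed distance matrix; returns one label per point."""
    n = len(dist)
    labels = [UNVISITED] * n
    cluster_id = 0

    def region_query(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if dist[i][j] <= eps]

    for i in range(n):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        # Point i is dense enough: start a new cluster and expand it.
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a dense point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is dense too, keep growing
        cluster_id += 1
    return labels

# Hypothetical sample data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
dist = [[math.dist(p, q) for q in points] for p in points]
labels = dbscan(dist, eps=1.5, min_pts=3)
print(labels)
```

With these sample values the two groups should end up in separate clusters and the isolated point should be marked as noise.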
Joining dense regions is similar to the approach taken in grid-based clustering. The difference is that here the clusters can take arbitrary shapes.