-->

Density-based Clustering with DBSCAN Algorithm

dev java, machine_learning

My previous post was about grid-based clustering. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) takes a different approach, called density-based clustering. It grows regions whose density is above a user-provided threshold into clusters, which lets it discover clusters of arbitrary shape.

DBSCAN Implementation

First, the data is loaded and a distance matrix is calculated from the data points, holding the distance between every pair of points.
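To make the distance matrix step concrete, here is a minimal sketch. It assumes the data points are plain `double[]` coordinate arrays and that Euclidean distance is the metric; the post does not say which metric is used, and the class and method names (`DistanceMatrix`, `distance`, `compute`) are made up for illustration.

```java
class DistanceMatrix {

    // Euclidean distance between two points of equal dimension (assumed metric).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Symmetric n x n matrix: entry [i][j] is the distance between points i and j.
    static double[][] compute(double[][] points) {
        int n = points.length;
        double[][] dist = new double[n][n];
        for (int i = 0; i < n; i++) {
            // The diagonal stays 0.0; only the upper triangle is computed and then mirrored.
            for (int j = i + 1; j < n; j++) {
                double d = distance(points[i], points[j]);
                dist[i][j] = d;
                dist[j][i] = d;
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        double[][] points = { {0, 0}, {3, 4}, {6, 8} };
        double[][] dist = compute(points);
        System.out.println(dist[0][1]); // prints 5.0
        System.out.println(dist[0][2]); // prints 10.0
    }
}
```

Precomputing the matrix costs memory quadratic in the number of points, but it turns every later neighbour lookup into a simple scan over one row.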

DBSCAN 1

The algorithm visits every data point and finds its neighbours. If the neighbourhood is dense enough, then the cluster is expanded to include those points as well.
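The visit-and-expand loop described above can be sketched roughly as follows. This is not the code behind this post, just a compact version of the textbook DBSCAN procedure over a precomputed distance matrix. The parameter names `eps` (neighbourhood radius) and `minPts` (minimum number of neighbours, counting the point itself) are the conventional ones and are assumptions here, as are the class and method names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class DbscanSketch {

    static final int UNVISITED = 0;
    static final int NOISE = -1;

    // Indices of all points within eps of point p (p itself is included, at distance 0).
    static List<Integer> regionQuery(double[][] dist, int p, double eps) {
        List<Integer> neighbours = new ArrayList<>();
        for (int q = 0; q < dist.length; q++) {
            if (dist[p][q] <= eps) {
                neighbours.add(q);
            }
        }
        return neighbours;
    }

    // Returns one label per point: 1, 2, ... for clusters, NOISE (-1) for outliers.
    static int[] cluster(double[][] dist, double eps, int minPts) {
        int n = dist.length;
        int[] labels = new int[n]; // every entry starts as UNVISITED
        int clusterId = 0;
        for (int p = 0; p < n; p++) {
            if (labels[p] != UNVISITED) continue;
            List<Integer> neighbours = regionQuery(dist, p, eps);
            if (neighbours.size() < minPts) {
                labels[p] = NOISE; // may still be claimed later as a border point
                continue;
            }
            clusterId++; // p is dense enough: start a new cluster and expand it
            labels[p] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbours);
            while (!queue.isEmpty()) {
                int q = queue.poll();
                if (labels[q] == NOISE) labels[q] = clusterId; // border point joins the cluster
                if (labels[q] != UNVISITED) continue;
                labels[q] = clusterId;
                List<Integer> qNeighbours = regionQuery(dist, q, eps);
                if (qNeighbours.size() >= minPts) {
                    queue.addAll(qNeighbours); // q is dense too: keep expanding through it
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Toy 1-D data: two tight groups and one isolated point.
        double[] xs = {0, 1, 2, 10, 11, 12, 50};
        double[][] dist = new double[xs.length][xs.length];
        for (int i = 0; i < xs.length; i++) {
            for (int j = 0; j < xs.length; j++) {
                dist[i][j] = Math.abs(xs[i] - xs[j]);
            }
        }
        System.out.println(Arrays.toString(cluster(dist, 1.5, 2)));
        // prints [1, 1, 1, 2, 2, 2, -1]
    }
}
```

In the toy run, the points at 0, 1 and 2 chain together into one cluster even though 0 and 2 are farther apart than `eps`, because each is reachable through the dense point at 1. The isolated point at 50 has no neighbours and is labelled as noise.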

DBSCAN 3

So data points that are close enough to each other are included in the same cluster.

DBSCAN 2

Joining dense regions is similar to the approach taken in grid clustering. The difference is that this way the clusters can have arbitrary shapes.

Resources