src.clustering.dbscan_clusterer
Classes
- class src.clustering.dbscan_clusterer.DBScanClusterer(**kwargs)
- Author:
Alberto M. Esmorís Pena
DBScan clustering on the structure space \(\pmb{X} \in \mathbb{R}^{m \times n_x}\). It supports filtering by discrete categorical values (e.g., classifications): a separate DBScan is computed on each subset of the Euclidean space that contains only the points belonging to a given cluster. In this context, classes and categorical predictions are also considered clusters.
More formally, let \(\pmb{x_{i*}} \in \mathbb{R}^{n_x}\) be a point in the structure space, with \(y_i \in \mathbb{Z}_{\geq 0}\) the integer that represents the cluster to which point \(i\) belongs.
This DBScan clustering component can be applied once to all points \(\pmb{X} \in \mathbb{R}^{m \times n_x}\). Alternatively, it can be applied \(K \in \mathbb{Z}_{>1}\) times, once per cluster label. In this case, consider \(\pmb{X_1} \in \mathbb{R}^{m_1 \times n_x}, \ldots, \pmb{X_K} \in \mathbb{R}^{m_K \times n_x}\) as the \(K\) structure spaces, and compute one DBScan on each of them. The \(m_k\) rows of \(\pmb{X_k} \in \mathbb{R}^{m_k \times n_x}\) must represent the set of points \(\bigl\{\pmb{x_{j*}} : y_j = k\bigr\}\).
- Variables:
precluster_name (str or None) – The name of the attribute to be considered as the precluster. If None, then all points will be considered at once instead of partitioned by previous clusters.
precluster_domain (list or tuple of str) – The domain of the precluster, i.e., the precluster labels to be considered. If not given, then any unique precluster label will be considered.
min_points (int) – The minimum number of points that must lie in the neighborhood for its center point to be considered a core (kernel) point.
radius – The radius of the neighborhood (typically a spherical neighborhood) for spatial queries.
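The partition by precluster described above can be sketched as follows. This is a minimal illustration, not the component's actual code: it assumes scikit-learn's `DBSCAN` as a stand-in backend, maps `radius` to `eps` and `min_points` to `min_samples`, and the helper name `dbscan_by_precluster` and its return format are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def dbscan_by_precluster(X, y, radius, min_points, domain=None):
    """Run one DBSCAN per precluster label (hypothetical helper).

    Returns a dict that maps each precluster label k to a pair:
    (indices of the points with y == k, local DBSCAN labels for them).
    """
    # Without an explicit domain, consider every unique precluster label
    labels = np.unique(y) if domain is None else domain
    result = {}
    for k in labels:
        idx = np.flatnonzero(y == k)  # rows of X that form X_k
        if idx.size == 0:
            continue  # no point carries this label
        # scikit-learn convention: -1 is noise, 0, 1, ... are clusters
        local = DBSCAN(eps=radius, min_samples=min_points).fit_predict(X[idx])
        result[k] = (idx, local)
    return result


# Class 0 has two distant groups; class 1 overlaps the first group of class 0
X = np.array([
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.1, 5.0],
    [0.05, 0.05, 0.0], [0.05, 0.0, 0.05], [0.0, 0.05, 0.05],
])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])
result = dbscan_by_precluster(X, y, radius=0.3, min_points=3)
```

Because each label is clustered on its own subset, the class 1 points are not merged with the nearby class 0 points, which is what would happen if all points were clustered at once (`precluster_name` set to None).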
- static extract_clustering_args(spec)
Extract the arguments to initialize/instantiate a DBScanClusterer from a keyword specification.
- Parameters:
spec – The keyword specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a DBScanClusterer.
- __init__(**kwargs)
Initialize an instance of DBScanClusterer.
- Parameters:
kwargs – The attributes of the DBScanClusterer that will also be passed to the parent.
- fit(pcloud)
The DBScanClusterer does not require any fit at all. See Clusterer and Clusterer.fit().
- cluster(pcloud)
Apply DBScan clustering to the given point cloud.
See Clusterer and Clusterer.cluster().
- do_dbscan(X, c, cluster_idx)
Compute a density-based spatial clustering of applications with noise (DBSCAN).
- Parameters:
X (np.ndarray) – The input structure space.
c (np.ndarray) – The vector of point-wise cluster labels for the points in X.
cluster_idx – The cluster index for the first cluster.
- Returns:
The least cluster index greater than the highest cluster index assigned to any point.
- Return type:
int
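The index bookkeeping implied by `cluster_idx` and the return value can be sketched as below. This is only an illustration under stated assumptions: noise is assumed to be labeled -1, local labels are assumed to run from 0 upward, and the helper `assign_global_labels` with its `idx` and `local` parameters is hypothetical, not part of the documented API.

```python
import numpy as np


def assign_global_labels(c, idx, local, cluster_idx):
    """Write local cluster labels into c with a global offset (hypothetical).

    c: point-wise cluster labels for all points, updated in place.
    idx: indices of the points that were just clustered.
    local: their local labels (-1 for noise, 0..n-1 for clusters; assumed).
    cluster_idx: the global index to use for the first new cluster.
    Returns the least index greater than any index assigned so far.
    """
    idx = np.asarray(idx)
    local = np.asarray(local)
    clustered = local >= 0  # noise points keep their current label in c
    c[idx[clustered]] = local[clustered] + cluster_idx
    n_clusters = int(local.max()) + 1 if clustered.any() else 0
    return cluster_idx + n_clusters


# Chain two runs so that cluster indices never collide
c = np.full(9, -1)  # -1 marks "no cluster" (assumed convention)
next_idx = assign_global_labels(c, [0, 1, 2, 3, 4, 5], [0, 0, 0, 1, 1, -1], 0)
next_idx = assign_global_labels(c, [6, 7, 8], [0, 0, 0], next_idx)
```

Feeding each returned index into the next call keeps the labels from different preclusters disjoint: here the second run starts at index 2 because the first run used indices 0 and 1.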