src.clustering.postproc.cluster_selector

Classes

ClusterSelector(**kwargs)

class src.clustering.postproc.cluster_selector.ClusterSelector(**kwargs)
Author:

Alberto M. Esmoris Pena

Clustering post-processor that filters the clusters: each filter either discards the clusters that satisfy a relational condition or preserves only those that satisfy it.

See ClusteringPostProcessor.

Variables:

filters (list of dict) –

The list with the specification of the preserve and discard actions with their requirements. The structure of each dictionary in the list is as follows:

attribute

The attribute for which the relational condition/requirement is specified. Supported attributes are:

"number_of_points"

The number of points in the cluster.

"surface_area"

The surface area in the \((x, y)\) plane of each cluster understood as the area of the convex hull that contains the cluster.

"volume"

The volume of each cluster understood as the volume of the 3D convex hull that contains the cluster.

"surface_density"

The number of points divided by the surface area.

"volume_density"

The number of points divided by the volume.

"x_length"

The difference between the max and min points of the cluster along the \(x\)-axis.

"y_length"

The difference between the max and min points of the cluster along the \(y\)-axis.

"z_length"

The difference between the max and min points of the cluster along the \(z\)-axis.

relational

The relational operator that determines whether the condition/requirement is satisfied. Supported relationals are "not_equals" \(x \neq y\), "equals" \(x = y\), "less_than" \(x < y\), "less_than_or_equal_to" \(x \leq y\), "greater_than" \(x > y\), "greater_than_or_equal_to" \(x \geq y\), "in" \(x \in S\), "not_in" \(x \notin S\), and "inside" \(x \in [a, b] \subset \mathbb{R}\).

target

The target value for the right-hand side of the relational. It can be an integer, a float, or a list. Lists are used for the "in", "not_in", and "inside" relationals. For "inside", the list must have exactly two elements, i.e., the interval bounds \(a\) and \(b\).

action

Either "preserve" to keep those clusters that satisfy the relational condition or "discard" to discard clusters that satisfy the relational condition.
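As an illustration of the structure described above, the following is a minimal sketch of a filters list. The concrete thresholds are made up, and validate_filter is a hypothetical helper written for this example only (it is not part of ClusterSelector). Requiring a scalar target for the non-list relationals is an assumption.

```python
# Names taken from the documented lists of supported attributes and relationals.
SUPPORTED_ATTRIBUTES = {
    "number_of_points", "surface_area", "volume", "surface_density",
    "volume_density", "x_length", "y_length", "z_length",
}
SUPPORTED_RELATIONALS = {
    "not_equals", "equals", "less_than", "less_than_or_equal_to",
    "greater_than", "greater_than_or_equal_to", "in", "not_in", "inside",
}
LIST_RELATIONALS = {"in", "not_in", "inside"}


def validate_filter(f):
    """Hypothetical check that a filter dictionary follows the documented layout."""
    if f.get("attribute") not in SUPPORTED_ATTRIBUTES:
        return False
    if f.get("relational") not in SUPPORTED_RELATIONALS:
        return False
    if f.get("action") not in ("preserve", "discard"):
        return False
    target = f.get("target")
    if f["relational"] in LIST_RELATIONALS:
        if not isinstance(target, list):
            return False
        # "inside" needs exactly the two interval bounds [a, b].
        if f["relational"] == "inside" and len(target) != 2:
            return False
        return True
    # Assumption: the remaining relationals take a single int or float.
    # bool is rejected explicitly because it is a subclass of int.
    return isinstance(target, (int, float)) and not isinstance(target, bool)


# Illustrative specification (threshold values are invented for the example).
filters = [
    {   # Keep only the clusters with at least 50 points.
        "attribute": "number_of_points",
        "relational": "greater_than_or_equal_to",
        "target": 50,
        "action": "preserve",
    },
    {   # Drop the clusters whose vertical extent lies in [0, 0.5].
        "attribute": "z_length",
        "relational": "inside",
        "target": [0.0, 0.5],
        "action": "discard",
    },
]
```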

__init__(**kwargs)

Initialize a ClusterSelector post-processor.

See ClusteringPostProcessor.__init__().

Parameters:

kwargs – The key-word arguments for the initialization of the ClusterSelector.

__call__(clusterer, pcloud, out_prefix=None)

Post-process the given point cloud with clusters to discard those clusters that do not pass the requested filters.

Parameters:
  • clusterer (Clusterer) – The clusterer that generated the clusters.

  • pcloud (PointCloud) – The point cloud to be post-processed.

  • out_prefix (str or None) – The output prefix in case path expansion must be applied.

Returns:

The post-processed point cloud.

Return type:

PointCloud

determine_attributes()

Determine the attributes that must be computed for each cluster from the filters specification.

Returns:

A dictionary-like look-up table whose keys are the names of the attributes that must be computed and whose values are the column indices of those attributes in the cluster-wise feature space matrix.

Return type:

dict
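One way such a look-up table could be built is sketched below. The function name determine_attributes_sketch is hypothetical, and assigning column indices in order of first appearance is an assumption; the documentation only states that the values are indices into the feature space matrix.

```python
def determine_attributes_sketch(filters):
    """Map each attribute named in the filters to a column index (sketch)."""
    alut = {}
    for f in filters:
        name = f["attribute"]
        if name not in alut:
            # Assumption: the next free column is given to each new attribute.
            alut[name] = len(alut)
    return alut


filters = [
    {"attribute": "number_of_points", "relational": "greater_than",
     "target": 10, "action": "preserve"},
    {"attribute": "z_length", "relational": "less_than",
     "target": 0.5, "action": "discard"},
    # A repeated attribute does not need a second column.
    {"attribute": "number_of_points", "relational": "less_than",
     "target": 10000, "action": "preserve"},
]
alut = determine_attributes_sketch(filters)
```

With this sketch, an attribute used by several filters is computed only once.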

compute_attributes(alut, X, c, c_dom)

Compute the attributes for each cluster.

Parameters:
  • alut (dict) – The attribute’s look-up table as generated by ClusterSelector.determine_attributes().

  • X (np.ndarray) – The structure space matrix representing the point cloud \(\pmb{X} \in \mathbb{R}^{m \times 3}\).

  • c (np.ndarray) – The vector of point-wise cluster labels \(\pmb{c} \in \mathbb{R}^{m}\).

  • c_dom (np.ndarray) – The cluster-wise vector of cluster labels \(\pmb{c}_{\text{dom}} \in \mathbb{R}^{n_c}\).

Returns:

The cluster-wise feature space matrix \(\pmb{F} \in \mathbb{R}^{n_c \times n_f}\) for \(n_c \in \mathbb{Z}_{>0}\) clusters and \(n_f \in \mathbb{Z}_{>0}\) attributes.

Return type:

np.ndarray
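The following sketch shows how a cluster-wise feature matrix with one row per cluster and one column per attribute could be assembled. It is not the class implementation: compute_attributes_sketch and the attribute_functions dictionary are hypothetical names introduced here, and only two attributes are covered.

```python
import numpy as np


def number_of_points(X, c, c_dom):
    # Count the points carrying each cluster label.
    return np.array([np.count_nonzero(c == k) for k in c_dom], dtype=float)


def z_length(X, c, c_dom):
    # Max minus min of the z coordinate (third column) inside each cluster.
    return np.array([X[c == k, 2].max() - X[c == k, 2].min() for k in c_dom])


def compute_attributes_sketch(alut, X, c, c_dom, attribute_functions):
    """Fill F (n_c x n_f) column by column following the look-up table."""
    F = np.zeros((len(c_dom), len(alut)), dtype=float)
    for name, col in alut.items():
        F[:, col] = attribute_functions[name](X, c, c_dom)
    return F


X = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 2.0],
    [5.0, 5.0, 1.0],
    [6.0, 5.0, 1.5],
    [5.0, 6.0, 4.0],
    [9.0, 9.0, 9.0],   # non-clustered point (label -1)
])
c = np.array([0, 0, 1, 1, 1, -1])
c_dom = np.array([0, 1])
alut = {"number_of_points": 0, "z_length": 1}
F = compute_attributes_sketch(
    alut, X, c, c_dom,
    {"number_of_points": number_of_points, "z_length": z_length},
)
```

The sketch assumes every label in c_dom has at least one point; an empty cluster would make the max/min calls fail.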

compute_selection_mask(c, c_dom, alut, F)

Compute the selection mask where True means the cluster must be preserved and False means the cluster must be discarded.

Parameters:
  • c (np.ndarray) – The point-wise cluster labels.

  • c_dom (np.ndarray) – The cluster labels.

  • alut (dict) – The look-up table for the cluster-wise attributes/features as computed by ClusterSelector.determine_attributes().

  • F (np.ndarray) – The feature space matrix of the clusters.

Returns:

The cluster-wise selection mask (True means the cluster must be kept, False means it must be discarded).

Return type:

np.ndarray
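A selection mask of this kind could be derived as sketched below. The sketch takes the filters explicitly and omits c and c_dom, so its signature differs from the method's. Combining the filters with a logical AND (a cluster is kept only if it passes every filter) is an assumption, as are the names RELATIONALS and compute_selection_mask_sketch.

```python
import numpy as np

# One vectorized predicate per documented relational (x: attribute values, t: target).
RELATIONALS = {
    "not_equals": lambda x, t: x != t,
    "equals": lambda x, t: x == t,
    "less_than": lambda x, t: x < t,
    "less_than_or_equal_to": lambda x, t: x <= t,
    "greater_than": lambda x, t: x > t,
    "greater_than_or_equal_to": lambda x, t: x >= t,
    "in": lambda x, t: np.isin(x, t),
    "not_in": lambda x, t: ~np.isin(x, t),
    "inside": lambda x, t: (x >= t[0]) & (x <= t[1]),  # closed interval [a, b]
}


def compute_selection_mask_sketch(filters, alut, F):
    """Return a cluster-wise boolean mask (True = keep, False = discard)."""
    mask = np.ones(F.shape[0], dtype=bool)
    for f in filters:
        x = F[:, alut[f["attribute"]]]
        satisfied = RELATIONALS[f["relational"]](x, f["target"])
        if f["action"] == "preserve":
            mask &= satisfied    # keep only clusters that satisfy the condition
        else:
            mask &= ~satisfied   # "discard": drop clusters that satisfy it
    return mask


alut = {"number_of_points": 0, "z_length": 1}
F = np.array([[10.0, 0.2], [80.0, 3.0], [120.0, 0.4]])
filters = [
    {"attribute": "number_of_points", "relational": "greater_than_or_equal_to",
     "target": 50, "action": "preserve"},
    {"attribute": "z_length", "relational": "inside",
     "target": [0.0, 0.5], "action": "discard"},
]
mask = compute_selection_mask_sketch(filters, alut, F)
```

In this example the first cluster fails the "preserve" filter and the third one matches the "discard" filter, so only the second cluster is kept.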

apply_selection_mask(clusterer, pcloud, c, c_dom, mask)

Apply the selection mask to discard those clusters that do not meet the given requirements. The preserved clusters are updated to have sequential indices as cluster labels (starting at zero, with \(-1\) representing non-clustered points).

Parameters:
  • clusterer (Clusterer) – The clusterer that generated the clusters.

  • pcloud (PointCloud) – The point cloud that must be updated.

  • c (np.ndarray) – The point-wise cluster labels.

  • c_dom (np.ndarray) – The cluster labels.

  • mask (np.ndarray of bool) – The cluster-wise boolean mask where True means the cluster must be preserved and False means it must be discarded.

Returns:

The updated point cloud and the new domain of the clusters.
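The relabeling described above can be sketched on plain label arrays. This is only an illustration: relabel_sketch is a hypothetical function, and it does not touch the PointCloud or Clusterer objects that the method itself receives.

```python
import numpy as np


def relabel_sketch(c, c_dom, mask):
    """Discard masked-out clusters and renumber the kept ones as 0, 1, 2, ..."""
    # Start with every point marked as non-clustered (-1).
    new_c = np.full(c.shape, -1, dtype=int)
    kept = c_dom[mask]
    for new_label, old_label in enumerate(kept):
        new_c[c == old_label] = new_label
    new_c_dom = np.arange(len(kept))
    return new_c, new_c_dom


c = np.array([0, 0, 1, 2, 2, -1, 1])   # point-wise labels, -1 = non-clustered
c_dom = np.array([0, 1, 2])            # cluster labels
mask = np.array([True, False, True])   # cluster 1 must be discarded
new_c, new_c_dom = relabel_sketch(c, c_dom, mask)
```

Points of the discarded cluster join the non-clustered points, and the former cluster 2 becomes cluster 1 so that the labels stay sequential.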

compute_number_of_points(X, c, c_dom)

Compute the number of points in each cluster.

See ClusterSelector.compute_attributes().

compute_surface_area(X, c, c_dom)

Compute the area of the convex hull in the \((x, y)\) plane that contains each cluster.

See ClusterSelector.compute_attributes().

compute_volume(X, c, c_dom)

Compute the volume of the 3D convex hull that contains each cluster.

See ClusterSelector.compute_attributes().

compute_surface_density(X, c, c_dom)

Compute the number of points in the cluster divided by the area of the convex hull in the \((x, y)\) plane that contains each cluster.

See ClusterSelector.compute_attributes().
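The surface area and surface density definitions can be illustrated without external dependencies. The sketch below uses Andrew's monotone chain algorithm for the 2D convex hull and the shoelace formula for its area; this technique is chosen for the example and is not necessarily how ClusterSelector computes the hull. All function names are hypothetical.

```python
def convex_hull_area_2d(points):
    """Area of the convex hull of 2D points (monotone chain + shoelace)."""
    pts = sorted(set((float(x), float(y)) for x, y in points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    hull = lower[:-1] + upper[:-1]   # counter-clockwise hull vertices

    # Shoelace formula over the hull polygon.
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def surface_area_sketch(X, c, c_dom):
    # Project each cluster onto the (x, y) plane by dropping z.
    return [
        convex_hull_area_2d([(p[0], p[1]) for p, label in zip(X, c) if label == k])
        for k in c_dom
    ]


def surface_density_sketch(X, c, c_dom):
    # Number of points divided by the hull area (assumes non-degenerate hulls).
    areas = surface_area_sketch(X, c, c_dom)
    counts = [sum(1 for label in c if label == k) for k in c_dom]
    return [n / a for n, a in zip(counts, areas)]


X = [
    (0, 0, 0), (2, 0, 1), (2, 2, 5), (0, 2, 3), (1, 1, 2),   # cluster 0
    (10, 10, 0), (14, 10, 0), (10, 13, 0),                   # cluster 1
]
c = [0, 0, 0, 0, 0, 1, 1, 1]
c_dom = [0, 1]
areas = surface_area_sketch(X, c, c_dom)
densities = surface_density_sketch(X, c, c_dom)
```

Cluster 0 is a 2 by 2 square with one interior point and cluster 1 is a right triangle with legs 4 and 3. A cluster whose projection is collinear has zero area, so the density sketch would need a guard for that case.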

compute_volume_density(X, c, c_dom)

Compute the number of points in the cluster divided by the volume of the 3D convex hull that contains each cluster.

See ClusterSelector.compute_attributes().

compute_x_length(X, c, c_dom)

Compute the difference between the max and min values along the \(x\)-axis.

See ClusterSelector.compute_attributes().

compute_y_length(X, c, c_dom)

Compute the difference between the max and min values along the \(y\)-axis.

See ClusterSelector.compute_attributes().

compute_z_length(X, c, c_dom)

Compute the difference between the max and min values along the \(z\)-axis.

See ClusterSelector.compute_attributes().
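The three axis lengths share the same computation and differ only in the column of \(\pmb{X}\) that is read. A minimal sketch follows, where axis_length_sketch is a hypothetical helper and the columns of X are assumed to be ordered as \((x, y, z)\).

```python
import numpy as np


def axis_length_sketch(X, c, c_dom, axis):
    """Max minus min of one coordinate inside each cluster (axis: 0=x, 1=y, 2=z)."""
    return np.array([
        X[c == k, axis].max() - X[c == k, axis].min()
        for k in c_dom
    ])


X = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 2.0],
    [5.0, 5.0, 1.0],
    [6.0, 5.0, 1.5],
    [5.0, 6.0, 4.0],
])
c = np.array([0, 0, 1, 1, 1])
c_dom = np.array([0, 1])
x_length = axis_length_sketch(X, c, c_dom, axis=0)
y_length = axis_length_sketch(X, c, c_dom, axis=1)
z_length = axis_length_sketch(X, c, c_dom, axis=2)
```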