src.clustering.postproc.cluster_selector
Classes
- class src.clustering.postproc.cluster_selector.ClusterSelector(**kwargs)
- Author:
Alberto M. Esmoris Pena
Clustering post-processor that filters the clusters, i.e., it discards those that do not satisfy the specified requirements or preserves only those that satisfy the given requirements.
- Variables:
filters (list of dict) –
The list with the specification of the preserve and discard actions with their requirements. The structure of each dictionary in the list is as follows:
- –
attribute The attribute for which the relational condition/requirement is specified. Supported attributes are:
- –
"number_of_points" The number of points in the cluster.
- –
"surface_area" The surface area in the \((x, y)\) plane of each cluster understood as the area of the convex hull that contains the cluster.
- –
"volume" The volume of each cluster understood as the volume of the 3D convex hull that contains the cluster.
- –
"surface_density" The number of points divided by the surface area.
- –
"volume_density" The number of points divided by the volume.
- –
"x_length" The difference between the max and min points of the cluster along the \(x\)-axis.
- –
"y_length" The difference between the max and min points of the cluster along the \(y\)-axis.
- –
"z_length" The difference between the max and min points of the cluster along the \(z\)-axis.
- –
- –
relational The relational governing whether the condition/requirement is satisfied or not. Supported relationals are
"not_equals" \(x \neq y\), "equals" \(x = y\), "less_than" \(x < y\), "less_than_or_equal_to" \(x \leq y\), "greater_than" \(x > y\), "greater_than_or_equal_to" \(x \geq y\), "in" \(x \in S\), "not_in" \(x \notin S\), and "inside" \(x \in [a, b] \subset \mathbb{R}\).
- –
target The target value for the right-hand side of the relational. It can be either an integer, a float, or a list. Lists are used for the
"in", "not_in", and "inside" relationals; for "inside", the list must have exactly two elements.
- –
action Either
"preserve" to keep those clusters that satisfy the relational condition or "discard" to discard clusters that satisfy the relational condition.
- –
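As an illustration of the structure described above, the following sketch builds a filter list and evaluates a single relational against a target. The concrete filter values, the `RELATIONALS` table, and the `satisfies` helper are hypothetical names introduced here for illustration; they are not part of the documented API, and treating the "inside" interval as closed follows the \([a, b]\) notation above.

```python
import operator

# Hypothetical filter list following the documented structure: each dict
# carries "attribute", "relational", "target", and "action".
filters = [
    {"attribute": "number_of_points",
     "relational": "greater_than_or_equal_to",
     "target": 50, "action": "preserve"},
    {"attribute": "z_length", "relational": "inside",
     "target": [2.0, 30.0], "action": "preserve"},
    {"attribute": "surface_density", "relational": "less_than",
     "target": 1.5, "action": "discard"},
]

# Illustrative mapping from relational names to predicates (x, target) -> bool.
RELATIONALS = {
    "not_equals": operator.ne,
    "equals": operator.eq,
    "less_than": operator.lt,
    "less_than_or_equal_to": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equal_to": operator.ge,
    "in": lambda x, s: x in s,
    "not_in": lambda x, s: x not in s,
    "inside": lambda x, ab: ab[0] <= x <= ab[1],  # closed interval [a, b]
}


def satisfies(value, relational, target):
    """Return True if `value` satisfies `relational` against `target`."""
    if relational == "inside" and len(target) != 2:
        raise ValueError('"inside" needs a target with exactly two elements')
    return RELATIONALS[relational](value, target)
```

For instance, `satisfies(5.0, "inside", [2.0, 30.0])` holds, whereas `satisfies(1.0, "inside", [2.0, 30.0])` does not.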
- __init__(**kwargs)
Initialize a ClusterSelector post-processor.
See
ClusteringPostProcessor.__init__().
- Parameters:
kwargs – The keyword arguments for the initialization of the ClusterSelector.
- __call__(clusterer, pcloud, out_prefix=None)
Post-process the given point cloud with clusters to discard those clusters that do not pass the requested filters.
- Parameters:
clusterer (Clusterer) – The clusterer that generated the clusters.
pcloud (PointCloud) – The point cloud to be post-processed.
out_prefix (str or None) – The output prefix in case path expansion must be applied.
- Returns:
The post-processed point cloud.
- Return type:
PointCloud
- determine_attributes()
Determine the attributes that must be computed for each cluster from the filters specification.
- Returns:
A dictionary-like look-up table whose keys are the names of the attributes that must be computed and whose values are the indices of those attributes in the cluster-wise feature space matrix.
- Return type:
dict
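A minimal sketch of how such a look-up table could be derived from a filter list is given below. It assumes that each distinct attribute receives one column, numbered in order of first appearance; the source does not state the actual ordering, and the standalone function and sample filters are hypothetical.

```python
def determine_attributes(filters):
    """Map each attribute named in `filters` to a column index.

    Assumption: columns are assigned in order of first appearance.
    """
    alut = {}
    for f in filters:
        name = f["attribute"]
        if name not in alut:  # an attribute used by several filters gets one column
            alut[name] = len(alut)
    return alut


# Hypothetical filters: "number_of_points" appears twice but maps to one column.
example_filters = [
    {"attribute": "number_of_points", "relational": "greater_than",
     "target": 10, "action": "preserve"},
    {"attribute": "z_length", "relational": "less_than",
     "target": 40.0, "action": "preserve"},
    {"attribute": "number_of_points", "relational": "greater_than",
     "target": 100000, "action": "discard"},
]
example_alut = determine_attributes(example_filters)
```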
- compute_attributes(alut, X, c, c_dom)
Compute the attributes for each cluster.
- Parameters:
alut (dict) – The attribute look-up table as generated by ClusterSelector.determine_attributes().
X (np.ndarray) – The structure space matrix representing the point cloud \(\pmb{X} \in \mathbb{R}^{m \times 3}\).
c (np.ndarray) – The vector of point-wise cluster labels \(\pmb{c} \in \mathbb{R}^{m}\).
c_dom (np.ndarray) – The cluster-wise vector of cluster labels \(\pmb{c}_{\text{dom}} \in \mathbb{R}^{n_c}\).
- Returns:
The cluster-wise feature space matrix \(\pmb{F} \in \mathbb{R}^{n_c \times n_f}\) for \(n_c \in \mathbb{Z}_{>0}\) clusters and \(n_f \in \mathbb{Z}_{>0}\) attributes.
- Return type:
np.ndarray
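The following sketch shows one way the cluster-wise matrix \(\pmb{F}\) could be assembled with NumPy. It is written as a standalone function, covers only the point count and the axis-aligned lengths (the convex-hull attributes are omitted), and the `COMPUTERS` table is a hypothetical helper rather than the library's implementation.

```python
import numpy as np

# Hypothetical per-cluster computations; P holds the points of one cluster.
COMPUTERS = {
    "number_of_points": lambda P: P.shape[0],
    "x_length": lambda P: P[:, 0].max() - P[:, 0].min(),
    "y_length": lambda P: P[:, 1].max() - P[:, 1].min(),
    "z_length": lambda P: P[:, 2].max() - P[:, 2].min(),
}


def compute_attributes(alut, X, c, c_dom):
    """Return F with one row per cluster and one column per attribute."""
    F = np.zeros((len(c_dom), len(alut)))
    for i, label in enumerate(c_dom):
        P = X[c == label]  # points belonging to the i-th cluster
        for name, j in alut.items():
            F[i, j] = COMPUTERS[name](P)
    return F


# Toy data: two clusters (labels 0 and 1) with three and two points.
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 2.0], [0.0, 1.0, 5.0],
              [3.0, 3.0, 1.0], [4.0, 4.0, 1.5]])
c = np.array([0, 0, 0, 1, 1])
c_dom = np.array([0, 1])
F = compute_attributes({"number_of_points": 0, "z_length": 1}, X, c, c_dom)
```

Here the first row of `F` describes cluster 0 (three points, extent 5 along \(z\)) and the second row describes cluster 1 (two points, extent 0.5 along \(z\)).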
- compute_selection_mask(c, c_dom, alut, F)
Compute the selection mask where True means the cluster must be preserved and False means the cluster must be discarded.
- Parameters:
c (np.ndarray) – The point-wise cluster labels.
c_dom (np.ndarray) – The cluster labels.
alut (dict) – The look-up table for the cluster-wise attributes/features as computed by ClusterSelector.determine_attributes().
F (np.ndarray) – The feature space matrix of the clusters.
- Returns:
The cluster-wise selection mask (True means the cluster must be kept, False means it must be discarded).
- Return type:
np.ndarray
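A possible way to turn the filters into a cluster-wise mask is sketched below. The source does not specify how several filters are combined, so this sketch assumes that a cluster is kept only if it satisfies every "preserve" filter and no "discard" filter. The function name, its signature (filters passed explicitly), and the sample values are hypothetical.

```python
import numpy as np

# Vectorized counterparts of the documented relationals (illustrative).
VECTOR_RELATIONALS = {
    "not_equals": lambda x, t: x != t,
    "equals": lambda x, t: x == t,
    "less_than": lambda x, t: x < t,
    "less_than_or_equal_to": lambda x, t: x <= t,
    "greater_than": lambda x, t: x > t,
    "greater_than_or_equal_to": lambda x, t: x >= t,
    "in": lambda x, t: np.isin(x, t),
    "not_in": lambda x, t: ~np.isin(x, t),
    "inside": lambda x, t: (x >= t[0]) & (x <= t[1]),
}


def selection_mask(filters, alut, F):
    """Return a boolean vector with one entry per cluster (True = keep)."""
    mask = np.ones(F.shape[0], dtype=bool)
    for f in filters:
        x = F[:, alut[f["attribute"]]]  # attribute column for all clusters
        ok = VECTOR_RELATIONALS[f["relational"]](x, f["target"])
        # Assumed combination rule: logical AND over all filters.
        mask &= ok if f["action"] == "preserve" else ~ok
    return mask


alut = {"number_of_points": 0, "z_length": 1}
F = np.array([[120.0, 5.0], [10.0, 12.0], [300.0, 40.0]])
filters = [
    {"attribute": "number_of_points",
     "relational": "greater_than_or_equal_to",
     "target": 50, "action": "preserve"},
    {"attribute": "z_length", "relational": "greater_than",
     "target": 30.0, "action": "discard"},
]
mask = selection_mask(filters, alut, F)
```

With these values only the first cluster survives: the second has too few points and the third is taller than the discard threshold.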
- apply_selection_mask(clusterer, pcloud, c, c_dom, mask)
Apply the selection mask to discard those clusters that do not meet the given requirements. The preserved clusters are updated to have sequential indices as cluster labels (starting at zero, with \(-1\) representing non-clustered points).
- Parameters:
pcloud (PointCloud) – The point cloud that must be updated.
c (np.ndarray) – The point-wise cluster labels.
c_dom (np.ndarray) – The cluster labels.
mask (np.ndarray of bool) – The cluster-wise boolean mask where True means the cluster must be preserved and False means it must be discarded.
- Returns:
The updated point cloud and the new domain of the clusters.
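The relabeling described above can be sketched on plain label arrays as follows. This standalone `relabel` function is hypothetical; it leaves out the update of the point cloud and the clusterer, which the source does not detail.

```python
import numpy as np


def relabel(c, c_dom, mask):
    """Drop masked-out clusters and renumber the kept ones from zero.

    Points of discarded clusters, and points already labeled -1,
    end up with the label -1.
    """
    c = np.asarray(c)
    kept = np.asarray(c_dom)[np.asarray(mask, dtype=bool)]
    new_c = np.full(c.shape, -1, dtype=int)
    for new_label, old_label in enumerate(kept):
        new_c[c == old_label] = new_label
    new_dom = np.arange(len(kept))
    return new_c, new_dom


# Four clusters (0..3) plus one non-clustered point; cluster 1 is discarded.
new_c, new_dom = relabel(
    c=[0, 0, 1, 2, 2, -1, 3],
    c_dom=[0, 1, 2, 3],
    mask=[True, False, True, True],
)
```

In this example the old labels 2 and 3 become 1 and 2, so the preserved clusters remain sequential.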
- compute_number_of_points(X, c, c_dom)
Compute the number of points in each cluster.
- compute_surface_area(X, c, c_dom)
Compute the area of the convex hull in the \((x, y)\) plane that contains each cluster.
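To make the definition concrete, the sketch below computes the area of the \((x, y)\) convex hull of a set of 3D points using Andrew's monotone chain algorithm followed by the shoelace formula. This is an illustrative pure-Python stand-in, not the library's implementation, which the source does not show; all function names here are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area_xy(points_xyz):
    """Area of the convex hull of the points projected onto the (x, y) plane."""
    hull = convex_hull_2d([(p[0], p[1]) for p in points_xyz])
    twice_area = 0.0
    for i in range(len(hull)):
        x0, y0 = hull[i]
        x1, y1 = hull[(i + 1) % len(hull)]
        twice_area += x0 * y1 - x1 * y0  # shoelace term
    return abs(twice_area) / 2.0


# A 2 x 2 square footprint with one interior point; z values are ignored.
cluster = [(0, 0, 0), (2, 0, 1), (2, 2, 5), (0, 2, 3), (1, 1, 9)]
area = hull_area_xy(cluster)
```

Note that clusters whose points are collinear in the \((x, y)\) plane have zero area under this definition, so any density derived from it would need a guard against division by zero.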
- compute_volume(X, c, c_dom)
Compute the volume of the 3D convex hull that contains each cluster.
- compute_surface_density(X, c, c_dom)
Compute the number of points in the cluster divided by the area of the convex hull in the \((x, y)\) plane that contains each cluster.
- compute_volume_density(X, c, c_dom)
Compute the number of points in the cluster divided by the volume of the 3D convex hull that contains each cluster.
- compute_x_length(X, c, c_dom)
Compute the difference between the max and min values along the \(x\)-axis.
- compute_y_length(X, c, c_dom)
Compute the difference between the max and min values along the \(y\)-axis.
- compute_z_length(X, c, c_dom)
Compute the difference between the max and min values along the \(z\)-axis.