src.mining.recount_miner

Classes

RecountMiner(**kwargs)

class src.mining.recount_miner.RecountMiner(**kwargs)
Author:

Alberto M. Esmoris Pena

Recount miner. See Miner.

The recount miner considers each point \(\pmb{x}_{i*}\) in the point cloud and finds its neighborhood \(\mathcal{N}\), which is either its k-nearest neighbors (knn) or a spherical neighborhood. Let \(j\) index the points in the neighborhood. A given feature \(f\) (or reference value \(y\), e.g., a classification label) can then be used to filter the points, e.g., by selecting the points \(\pmb{x}_{j*} \in \mathcal{N}\) whose feature value satisfies \(f_j > \tau\) for a given threshold \(\tau\). Finally, the points in the filtered neighborhood can be counted in terms of absolute and relative frequency, and also with respect to the surface or the volume of the neighborhood (given by the radius of the spherical neighborhood or, for knn neighborhoods, by the distance to the furthest neighbor).
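The filter-then-count idea can be sketched with plain NumPy. This is a minimal illustration, not the RecountMiner implementation: the function name `recount_knn`, the brute-force neighbor search, and the single `f > tau` filter are assumptions made for the example.

```python
import numpy as np

def recount_knn(X, f, k, tau):
    """Count, for each point, the knn neighbors whose feature exceeds tau.

    Hypothetical helper for illustration only (not part of the library).
    """
    # Pairwise squared distances (brute force, only sensible for tiny inputs)
    D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Indices of the k nearest neighbors of each point (the point itself included)
    I = np.argsort(D, axis=1, kind="stable")[:, :k]
    # Boolean mask of the neighbors that pass the filter f_j > tau
    mask = f[I] > tau
    abs_freq = mask.sum(axis=1)   # absolute frequency
    rel_freq = abs_freq / k       # relative frequency wrt the unfiltered neighborhood
    return abs_freq, rel_freq

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
f = np.array([0.1, 0.9, 0.8, 0.2])
abs_freq, rel_freq = recount_knn(X, f, k=2, tau=0.5)
print(abs_freq, rel_freq)
```

For real point clouds a spatial index (such as the KDTree mentioned in compute_recount) would replace the brute-force distance matrix.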

static extract_miner_args(spec)

Extract the arguments to initialize/instantiate a RecountMiner from a key-word specification.

Parameters:

spec – The key-word specification containing the arguments.

Returns:

The arguments to initialize/instantiate a RecountMiner.

__init__(**kwargs)

Initialize an instance of RecountMiner.

The neighborhood definition and feature names (fnames) are always assigned during initialization. The default neighborhood is a knn neighborhood with \(k=16\).

Parameters:

kwargs (dict) – The attributes for the RecountMiner that will also be passed to the parent.

mine(pcloud)

Mine recounts on filtered neighborhoods from the given point cloud.

Parameters:

pcloud – The point cloud to be mined.

Returns:

The point cloud extended with recounts.

Return type:

PointCloud

get_recount_names_from_filter(f)

Obtain the new feature names generated by the filter.

Parameters:

f (dict) – The filter specification.

Returns:

The names of the new features generated by the filter.

Return type:

list

fname_to_feature_index(fname)

Obtain the feature index corresponding to the given fname.

Parameters:

fname (str) – The name of the feature whose index must be found.

Returns:

The index of the feature with the given name.

Return type:

int

compute_recount(X, F, kdt, neighborhood_f, neighborhood_radius, X_chunk, F_chunk, chunk_idx)

Compute the recounts for a given chunk.

Parameters:
  • X – The structure space matrix (i.e., the matrix of coordinates).

  • F – The feature space matrix (i.e., the matrix of features).

  • kdt – The KDTree representing the entire point cloud.

  • neighborhood_f – The function to extract neighborhoods for the points in the chunk.

  • neighborhood_radius – The radius of the spherical neighborhood or None to be computed from the points (e.g., for knn neighborhoods).

  • X_chunk – The structure space matrix of the chunk.

  • F_chunk – The feature space matrix of the chunk.

  • chunk_idx – The index of the chunk.

Returns:

The recount features computed for the chunk.

Return type:

np.ndarray

compute_filter(f, X, F, X_sub, I, r)

Compute the given filter on the neighborhoods of a given chunk.

Parameters:
  • f (dict) – The specification of the filter to be computed.

  • X (np.ndarray) – The matrix of coordinates representing the input point cloud.

  • F (np.ndarray) – The matrix of features representing the input point cloud.

  • X_sub (np.ndarray) – The matrix of coordinates representing the subchunk whose recount features must be computed.

  • I (list of list of int) – The list of lists of indices such that the i-th list contains the indices of the points in X that belong to the neighborhood of the i-th point in X_sub.

  • r (float or None) – The radius of the spherical neighborhood, None if it must be computed from the points in the neighborhood.

Returns:

The recount features for the points in X_sub.

Return type:

np.ndarray

recount_absolute_frequency(F)

Count the number of points.

recount_relative_frequency(F, total_pts)

The number of points after filtering divided by the total number of points before filtering.

recount_surface_density(F, X2D, x, r)

The number of points after filtering divided by the area of the neighborhood.

If the neighborhood is a spherical one with radius \(r\) then the area will be given by \(\pi r^2\). If the neighborhood is a knn one then the area will be given by \(\pi \left(\dfrac{d^*}{2}\right)^2\), where \(d^*\) is the distance between the \((x, y)\) coordinates of the center point and the furthest one.
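The two area rules above can be sketched as follows. This is a hedged example under stated assumptions: the helper name `surface_density` and its `n_filtered` argument (the count of points that survived the filter) are hypothetical, and `X2D` is taken to hold the \((x, y)\) coordinates of the neighborhood points.

```python
import numpy as np

def surface_density(n_filtered, X2D, x, r=None):
    """Filtered point count divided by the neighborhood area.

    Hypothetical helper for illustration only (not part of the library).
    """
    if r is None:
        # knn case: d* is the 2D distance from the center to the furthest point,
        # and the area is pi * (d*/2)^2 as described above
        d_star = np.max(np.linalg.norm(X2D - x, axis=1))
        r = d_star / 2.0
    return n_filtered / (np.pi * r ** 2)

X2D = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
x = np.array([0.0, 0.0])
knn_density = surface_density(3, X2D, x)            # d* = 2, so the area is pi
sphere_density = surface_density(3, X2D, x, r=2.0)  # area is 4 * pi
```

The sketch does not guard against a zero area (e.g., a knn neighborhood where all points coincide with the center).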

recount_volume_density(F, X, x, r)

The number of points after filtering divided by the volume of the neighborhood.

If the neighborhood is a spherical one with radius \(r\) then the volume will be given by \(\dfrac{4}{3}\pi r^3\). If the neighborhood is a knn one then the volume will be given by \(\dfrac{4}{3}\pi \left(\dfrac{d^*}{2}\right)^3\), where \(d^*\) is the distance between the \((x, y, z)\) coordinates of the center point and the furthest one. When using a cylinder, the radius is used to compute the area of the base and the volume is computed from the vertical boundaries of the cylinder as \(\pi r^2 (z^*-z_*)\), where \(z_*\) is the min vertical coordinate and \(z^*\) is the max vertical coordinate.

Note that, for cylindrical neighborhoods, if there is no difference between the max and the min vertical coordinate, then the maximum integer will be returned, effectively avoiding a division by zero.
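A minimal sketch of the volume rules, including the guard for flat cylinders, might look like this. The helper name `volume_density`, the `cylinder` flag, and the use of `sys.maxsize` as "the maximum integer" are assumptions for illustration; the sphere case uses the standard volume \(\frac{4}{3}\pi r^3\).

```python
import sys
import numpy as np

def volume_density(n_filtered, X, x, r=None, cylinder=False):
    """Filtered point count divided by the neighborhood volume.

    Hypothetical helper for illustration only (not part of the library).
    """
    if cylinder:
        # Cylinder height from the min and max vertical coordinates
        height = X[:, 2].max() - X[:, 2].min()
        if height == 0:
            # Assumed stand-in for "the maximum integer": avoids dividing by zero
            return sys.maxsize
        return n_filtered / (np.pi * r ** 2 * height)
    if r is None:
        # knn case: half the distance from the center to the furthest point
        r = np.max(np.linalg.norm(X - x, axis=1)) / 2.0
    return n_filtered / (4.0 / 3.0 * np.pi * r ** 3)

X = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [1.0, 0.0, 1.0]])
x = np.array([0.0, 0.0, 0.0])
cyl = volume_density(4, X, x, r=1.0, cylinder=True)        # height 2, volume 2 * pi
flat = volume_density(4, X[:, :3] * [1.0, 1.0, 0.0], x, r=1.0, cylinder=True)
knn = volume_density(4, X, x)                              # d* = 2, radius 1
```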

recount_vertical_segments(F, z, num_segments)

The number of vertical segments along a vertical cylinder that contain at least one point.
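One way to count occupied segments is to bin the vertical coordinates and count the distinct bins. The sketch below assumes equal-height segments spanning the min and max \(z\) of the given points; the actual segment boundaries used by the miner are not stated here, and the helper name is hypothetical.

```python
import numpy as np

def count_occupied_segments(z, num_segments):
    """Number of equal-height vertical segments containing at least one point.

    Hypothetical helper for illustration only (not part of the library).
    """
    z = np.asarray(z, dtype=float)
    if z.size == 0:
        return 0
    z_min, z_max = z.min(), z.max()
    if z_max == z_min:
        # All points share one height, so exactly one segment is occupied
        return 1
    # Map each height to a segment index in [0, num_segments - 1]
    idx = np.floor((z - z_min) / (z_max - z_min) * num_segments).astype(int)
    idx = np.clip(idx, 0, num_segments - 1)  # the top boundary goes to the last segment
    return int(np.unique(idx).size)

occupied = count_occupied_segments([0.0, 0.1, 0.9, 1.0], num_segments=4)
```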

static apply_conditions(I, F, conditions)

Apply the conditions to filter out all the points that do not satisfy one or more of them.

Parameters:
  • I (list) – The indices for the current neighborhood.

  • F (np.ndarray) – The matrix of features.

  • conditions (list) – The conditions to be applied.

Returns:

The indices of the current neighborhood that satisfy the conditions.

Return type:

list

static apply_condition(f, condition)

Check whether the condition is satisfied for each given point.

Parameters:
  • f – The feature vector where the condition must be checked.

  • condition – The specification of the condition to be checked.

Returns:

The mask with True for points that satisfy the condition, False otherwise.

Return type:

np.ndarray of bool
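The relationship between the two static methods can be sketched as a per-condition boolean mask combined with a logical AND. The condition format used here (the keys `feature_index`, `op`, and `target`) is a hypothetical stand-in, since the real specification is not described in this section.

```python
import operator
import numpy as np

# Hypothetical comparison names mapped to elementwise operators
OPS = {
    "less_than": operator.lt,
    "greater_than": operator.gt,
    "equals": operator.eq,
    "not_equals": operator.ne,
}

def apply_condition(f, condition):
    """Boolean mask: True where the feature vector f satisfies the condition."""
    return OPS[condition["op"]](f, condition["target"])

def apply_conditions(I, F, conditions):
    """Keep only the neighborhood indices that satisfy every condition."""
    I = np.asarray(I, dtype=int)
    mask = np.ones(I.size, dtype=bool)
    for cond in conditions:
        # Evaluate the condition on the neighborhood points only
        mask &= apply_condition(F[I, cond["feature_index"]], cond)
    return I[mask].tolist()

F = np.array([[0.2, 1.0], [0.8, 2.0], [0.9, 1.0], [0.4, 2.0]])
conditions = [
    {"feature_index": 0, "op": "greater_than", "target": 0.5},
    {"feature_index": 1, "op": "equals", "target": 1},
]
kept = apply_conditions([0, 1, 2], F, conditions)
mask = apply_condition(F[:, 0], conditions[0])
```

Here only the point with index 2 passes both conditions, so `kept` holds that single index.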

get_decorated_fnames()

Obtain the names of the recount features.

Returns:

List with the names of the recount features.

Return type:

list of str