src.utils.ctransf.distance_reclassifier

Classes

DistanceReclassifier(**kwargs)

class src.utils.ctransf.distance_reclassifier.DistanceReclassifier(**kwargs)
Author:

Alberto M. Esmoris Pena

Class to transform the classifications (or predictions) into another set of classes depending on distance and conditional filters.

See ClassTransformer.

Variables:
  • reclassifications (list of dict) –

    List with the specifications of the reclassification operations, given in the order in which they must be applied. Each reclassification is a dict with the following format:

    source_classes

    List with the names of the input classes. There must be as many strings in the list as classes. The \(i\)-th string corresponds with the \(i\)-th input class.

    target_class

    List with the names of the output classes. It consists of an arbitrary number of output classes such that the \(i\)-th string corresponds to the \(i\)-th output class.

    conditions

    A list of dictionaries such that each dictionary defines a conditional filtering. It can be null (None).

    value_name

    The name of the feature involved in the relational condition.

    condition_type

    The relational operator governing the condition. It can be either "not_equals" (\(\neq\)), "equals" (\(=\)), "less_than" (\(<\)), "less_than_or_equal_to" (\(\leq\)), "greater_than" (\(>\)), "greater_than_or_equal_to" (\(\geq\)), "in" (\(\in\)), or "not_in" (\(\notin\)).

    value_target

    The RHS of the relational, with the LHS being the value from the point cloud.

    action

    Whether to "preserve" or "discard" the points that satisfy the condition.

    distance_filters

    A list of dictionaries such that each dictionary defines a distance-based filter. It can be null (None).

    metric

    Either the "euclidean" \(\lVert\pmb{x}_{k*}-\pmb{x}_{i*}\rVert\) or the "manhattan" \(\sum_{j=1}^{n_x}{\lvert x_{kj} - x_{ij}\rvert}\) distance.

    components

    A list with the components involved in the distance metric. For example, ["x", "y", "z"] will lead to 3D distances while ["z"] would lead to distances considering the \(z\) coordinate only.

    knn

    The dictionary specifying how to compute the k-nearest neighbors neighborhood. For \(k=1\) the distance will be computed with respect to the closest neighbor. For \(k>1\) the distance will be computed with respect to the centroid of the neighborhood.

    coordinates

    The coordinates to consider to compute the distance between neighbors. Most typical cases are ["x", "y", "z"] for 3D knn and ["x", "y"] for 2D knn.

    max_distance

    The maximum supported distance. If given (i.e., not null/None), those neighbors farther than this value will be excluded from the requested \(k\)-nearest neighbors neighborhood.

    k

    The number of neighbors in the neighborhood.

    source_classes

    The classes (in the input system of classes) of the points that must be considered in the neighborhood. If not given, then all points will be considered in the neighborhood for distance computations.

    filter_type

    The relational operator governing the distance filter. It can be either "not_equals" (\(\neq\)), "equals" (\(=\)), "less_than" (\(<\)), "less_than_or_equal_to" (\(\leq\)), "greater_than" (\(>\)), "greater_than_or_equal_to" (\(\geq\)), "in" (\(\in\)), "not_in" (\(\notin\)), or "inside" (\(x \in [a, b] \subset \mathbb{R}\)). Note that "inside" is a special type that requires both 1) \(x \geq a\) and 2) \(x \leq b\).

    filter_target

    The RHS of the distance relational, with the LHS being the computed distance.

    action

    Whether to "preserve" or "discard" the points that satisfy the distance condition.

  • nthreads (int) – The number of threads to be used for parallel computations. Note that -1 means as many threads as available cores.
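To make the format above more concrete, a single reclassification could be written as the Python dictionary below. This is only a sketch: the class names ("vegetation", "ground", "lowveg"), the feature name ("linearity"), and every numeric threshold are made-up values. Writing target_class as a list follows the description above, and giving filter_target as a pair [a, b] for the "inside" type is an assumption.

```python
# Hypothetical specification; names and thresholds are illustrative only.
reclassification = {
    "source_classes": ["vegetation"],
    "target_class": ["lowveg"],
    "conditions": [
        {
            "value_name": "linearity",      # assumed feature name
            "condition_type": "less_than",
            "value_target": 0.5,
            "action": "preserve",
        }
    ],
    "distance_filters": [
        {
            "metric": "euclidean",
            "components": ["z"],            # vertical distance only
            "knn": {
                "coordinates": ["x", "y"],  # 2D neighborhood search
                "max_distance": None,       # no distance cut
                "k": 1,                     # closest neighbor
                "source_classes": ["ground"],
            },
            "filter_type": "inside",
            "filter_target": [0.0, 0.3],    # assumed [a, b] encoding
            "action": "preserve",
        }
    ],
}

# Reclassifications are listed in the order they must be applied.
reclassifications = [reclassification]
nthreads = -1  # as many threads as available cores
```

Read this way, the example selects vegetation points with low linearity whose height above the closest ground point (in 2D) lies within 0.3 units.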

static extract_ctransf_args(spec)

Extract the arguments to initialize/instantiate a DistanceReclassifier.

Parameters:

spec – The key-word specification containing the arguments.

Returns:

The arguments to initialize/instantiate a DistanceReclassifier.

__init__(**kwargs)

Initialize/instantiate a DistanceReclassifier.

Parameters:

kwargs – The attributes for the DistanceReclassifier.

transform(y, X=None, F=None, fnames=None, out_prefix=None)

The fundamental transformation logic defining the distance reclassifier.

Parameters:
  • y (np.ndarray) – The vector of classes (either reference classifications or predictions).

  • X (np.ndarray) – The structure space matrix representing the input point cloud whose classes must be transformed.

  • F (np.ndarray) – The feature space matrix representing the input point cloud whose classes must be transformed.

  • fnames (list of str) – The list with the name for each considered feature.

  • out_prefix – See class_transformer.ClassTransformer.transform()

Returns:

The transformed vector of classes.

Return type:

np.ndarray
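The core idea of the transformation can be sketched with NumPy as follows. This is a simplified illustration, not the actual implementation: the function reclassify_sketch and its arguments are hypothetical, and it assumes that the input and output classes share one integer index system, which the real class does not necessarily do.

```python
import numpy as np


def reclassify_sketch(y, source_idx, target_idx, extra_mask=None):
    """Assign target_idx to the points whose class is in source_idx.

    extra_mask stands in for the result of the conditional and
    distance-based filters (True for points that remain selected).
    """
    y_out = y.copy()
    # Start from the points that belong to any of the source classes.
    mask = np.isin(y, source_idx)
    # Narrow the selection with the filters, if any.
    if extra_mask is not None:
        mask &= extra_mask
    y_out[mask] = target_idx
    return y_out


y = np.array([0, 1, 1, 2])
passes_filters = np.array([True, True, False, True])
y_new = reclassify_sketch(y, source_idx=[1], target_idx=3,
                          extra_mask=passes_filters)
```

Only the second point changes class here, because the third point belongs to the source class but is rejected by the filters.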

transform_pcloud(pcloud, out_prefix=None)

See class_transformer.ClassTransformer.transform()

apply_conditions(reclassification, mask, X, F, flut=None, yinlut=None, youtlut=None)

Update the point-wise mask of selected points applying the conditional filters in the reclassification specification.

Parameters:
  • reclassification (dict) – The reclassification specification.

  • mask (np.ndarray of bool) – The point-wise mask of selected points (True if selected, False otherwise).

  • X (np.ndarray) – The structure space matrix.

  • F (np.ndarray) – The feature space matrix.

  • flut (dict) – The feature look-up table. Keys are feature names, values are corresponding integer row-indices in F.

  • yinlut (dict) – The source (input) classes look-up table. Keys are class names, values are the corresponding indices.

  • youtlut (dict) – The target (output) classes look-up table. Keys are class names, values are the corresponding indices.

Returns:

The point-wise boolean mask (also updated in place).

Return type:

np.ndarray of bool
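One way to picture a single conditional filter is the NumPy sketch below. The names RELATIONALS and apply_condition_sketch are hypothetical helpers, not part of the documented API. The sketch also assumes that "preserve" keeps only the points satisfying the condition and that "discard" removes them from the current selection.

```python
import operator

import numpy as np

# Hypothetical mapping from condition_type strings to element-wise relationals.
RELATIONALS = {
    "not_equals": operator.ne,
    "equals": operator.eq,
    "less_than": operator.lt,
    "less_than_or_equal_to": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equal_to": operator.ge,
    "in": lambda v, t: np.isin(v, t),
    "not_in": lambda v, t: ~np.isin(v, t),
}


def apply_condition_sketch(mask, values, condition):
    """Update the boolean mask in place for one condition dictionary."""
    relational = RELATIONALS[condition["condition_type"]]
    # LHS: values from the point cloud, RHS: value_target.
    satisfied = relational(values, condition["value_target"])
    if condition["action"] == "preserve":
        mask &= satisfied    # keep only the points satisfying the condition
    elif condition["action"] == "discard":
        mask &= ~satisfied   # drop the points satisfying the condition
    else:
        raise ValueError(f"Unexpected action: {condition['action']}")
    return mask


values = np.array([0.1, 0.7, 0.4, 0.9])  # made-up feature values
mask = np.ones(values.shape[0], dtype=bool)
condition = {
    "value_name": "linearity",  # assumed feature name
    "condition_type": "less_than",
    "value_target": 0.5,
    "action": "preserve",
}
result = apply_condition_sketch(mask, values, condition)
```

Because the update uses in-place operators, the returned array and the mask passed in are the same object, matching the "also updated in place" note above.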

apply_distance_filters(reclassification, mask, X, F, y, flut=None, yinlut=None, youtlut=None)

Update the point-wise mask of selected points applying the distance-based filters in the reclassification specification.

The DistanceReclassifier.compute_centroids() assists this method with the parallel computation of the centroids.

Parameters:
  • reclassification (dict) – The reclassification specification.

  • mask (np.ndarray of bool) – The point-wise mask of selected points (True if selected, False otherwise).

  • X (np.ndarray) – The structure space matrix.

  • F (np.ndarray) – The feature space matrix.

  • y (np.ndarray) – The vector of point-wise source classes.

  • flut (dict) – The feature look-up table. Keys are feature names, values are corresponding integer row-indices in F.

  • yinlut (dict) – The source (input) classes look-up table. Keys are class names, values are the corresponding indices.

  • youtlut (dict) – The target (output) classes look-up table. Keys are class names, values are the corresponding indices.

Returns:

The point-wise boolean mask (also updated in place).

Return type:

np.ndarray of bool
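The distance computation behind such a filter can be sketched with plain NumPy as shown below. This is an illustration under several assumptions: knn_component_distance is a hypothetical helper, the neighbor search is brute force rather than whatever spatial index the real method uses, max_distance is ignored, and columns are addressed by integer index instead of by name.

```python
import numpy as np


def knn_component_distance(X, X_src, k, knn_cols, comp_cols,
                           metric="euclidean"):
    """Distance from each point in X to its closest source point (k=1)
    or to the centroid of its k nearest source points (k>1)."""
    # Pairwise Euclidean distances on the knn coordinates.
    diff = X[:, knn_cols][:, None, :] - X_src[:, knn_cols][None, :, :]
    D = np.linalg.norm(diff, axis=2)
    # Indices of the k nearest source points for each query point.
    idx = np.argsort(D, axis=1)[:, :k]
    # For k=1 the mean is the neighbor itself; for k>1 it is the centroid.
    ref = X_src[idx].mean(axis=1)
    # Measure the distance only on the requested components.
    delta = X[:, comp_cols] - ref[:, comp_cols]
    if metric == "euclidean":
        return np.linalg.norm(delta, axis=1)
    if metric == "manhattan":
        return np.abs(delta).sum(axis=1)
    raise ValueError(f"Unexpected metric: {metric}")


# Made-up data with columns (x, y, z).
ground = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 1.0]])
points = np.array([[0.1, 0.0, 0.2], [9.9, 0.0, 3.0]])

# 2D neighborhood on (x, y), distance measured on z only.
d = knn_component_distance(points, ground, k=1,
                           knn_cols=[0, 1], comp_cols=[2])

# "inside" filter with [a, b] = [0.0, 0.3] and action "preserve".
a, b = 0.0, 0.3
mask = np.ones(points.shape[0], dtype=bool)
mask &= (d >= a) & (d <= b)
```

In this example the first point is 0.2 above its closest ground point and stays selected, while the second is 2.0 above and is filtered out.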

static compute_centroids(knnX_chunk)

Method with the logic for the parallel computation of centroids during the application of distance filters.

See DistanceReclassifier.apply_distance_filters().

Parameters:

knnX_chunk – The chunk of neighborhoods (i.e., the structure space matrix for each neighborhood) whose centroid must be computed.

Returns:

The centroids for the neighborhoods in the given chunk.

Return type:

np.ndarray
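A centroid computation of this kind might look like the sketch below. It assumes that a chunk is a 3D array with shape (neighborhoods, neighbors, coordinates), which the text above does not state. The function name compute_centroids_sketch is hypothetical, and the chunks are processed sequentially here rather than in parallel.

```python
import numpy as np


def compute_centroids_sketch(knnX_chunk):
    """Centroid of each neighborhood in the chunk.

    Assumed layout: knnX_chunk[i, j] holds the coordinates of the
    j-th neighbor in the i-th neighborhood.
    """
    return np.mean(knnX_chunk, axis=1)


# Two made-up neighborhoods, each with two neighbors in 3D.
knnX = np.array([
    [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]],
    [[1.0, 0.0, 0.0], [1.0, 4.0, 0.0]],
])

# Split into chunks, process each one, and stack the results in order.
chunks = np.array_split(knnX, 2)
centroids = np.vstack([compute_centroids_sketch(c) for c in chunks])
```

Since each neighborhood is independent of the others, the chunks can be handled in any order as long as the results are stacked back in the original order.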

determine_fnames()

Determine the names of the features involved in the distance reclassification.

Returns:

The names of the features necessary for the distance reclassification.

Return type:

list of str
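A plausible way to gather such names is to walk the conditions of every reclassification and collect each value_name once, as sketched below. This is an assumption about the behavior, not a description of the actual method: determine_fnames_sketch is a hypothetical function, and whether names that refer to coordinates or classes are excluded in practice is not stated above.

```python
def determine_fnames_sketch(reclassifications):
    """Collect the distinct value_name entries in order of appearance."""
    fnames = []
    for reclassification in reclassifications:
        # conditions may be missing or null (None).
        for condition in reclassification.get("conditions") or []:
            name = condition["value_name"]
            if name not in fnames:
                fnames.append(name)
    return fnames


# Made-up specifications with illustrative feature names.
specs = [
    {"conditions": [{"value_name": "linearity"},
                    {"value_name": "planarity"}]},
    {"conditions": None},
    {"conditions": [{"value_name": "linearity"}]},
]
names = determine_fnames_sketch(specs)
```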