src.eval.classification_uncertainty_evaluator

Classes

ClassificationUncertaintyEvaluator(**kwargs)

class src.eval.classification_uncertainty_evaluator.ClassificationUncertaintyEvaluator(**kwargs)

Author:: Alberto M. Esmoris Pena

Class to evaluate classification-like predictions to analyze their uncertainty.

Variables:

class_names (list) – The name for each class.
include_probabilities (bool) – Whether to include the probabilities in the resulting evaluation (True) or not (False).
probability_eps – The value used in place of zero, to avoid NaNs when computing the logarithms of the likelihoods/probabilities. If it is exactly zero, then zeroes will not be replaced.
include_weighted_entropy (bool) – Whether to include the weighted entropy in the resulting evaluation (True) or not (False).
include_clusters (bool) – Whether to include the cluster-wise entropies in the resulting evaluation (True) or not (False).
weight_by_predictions (bool) – Whether to compute the weighted entropy considering the predictions instead of the reference labels (True) or not (False, by default).
num_clusters (int) – Governs how many clusters must be built when the cluster-wise entropies must be computed.
clustering_max_iters (int) – How many iterations are allowed (at most) for the cluster algorithm to converge.
clustering_batch_size (int) – How many points to consider per batch at each iteration of the clustering algorithm. More points imply a more accurate clustering. However, they also imply a greater computational cost, and thus a longer execution time.
clustering_entropy_weights (bool) – Whether to use point-wise entropy as the sample weights for the clustering (True) or not (False).
clustering_reduce_function (str) – Which function to use to reduce the entropy values in a given cluster to a single one. Either ‘mean’, ‘median’, ‘Q1’ (first quartile), ‘Q3’ (third quartile), ‘min’, or ‘max’.
gaussian_kernel_points (int) – How many points to consider when computing the Gaussian kernel density estimations. Note that this argument has a great impact on the time required to generate the plots.
report_path (str) – The generated point cloud-like report will be exported to the file pointed by the report path.
plot_path (str) – The generated plots will be stored at the directory pointed by the plot path.
ignore_classes (list of str) – The list of classes that must be ignored when computing the evaluations. In other words, those points that are labeled (not predicted) as one of the ignored classes will not be considered when calculating the evaluation metrics.

static extract_eval_args(spec)

Extract the arguments to initialize/instantiate a ClassificationUncertaintyEvaluator from a key-word specification.

Parameters:: spec – The key-word specification containing the arguments.
Returns:: The arguments to initialize/instantiate a ClassificationUncertaintyEvaluator.

__init__(**kwargs)

Initialize/instantiate a ClassificationUncertaintyEvaluator.

Parameters:: kwargs – The attributes for the ClassificationUncertaintyEvaluator.

eval(Zhat, X=None, y=None, yhat=None, F=None)

Evaluate the uncertainty of the given predictions.

Parameters:

Zhat (np.ndarray) – Predicted class probabilities.

Variables:

X (np.ndarray) – The matrix with the coordinates of the points.
y (np.ndarray) – The point-wise classes (reference).
yhat (np.ndarray) – The point-wise classes (predictions).
F (np.ndarray) – The features matrix (it is necessary to compute cluster-wise entropies).

Returns:

The evaluation of the classification’s uncertainty.

Return type:

ClassificationUncertaintyEvaluation

__call__(pcloud, **kwargs)

Evaluate with extra logic that is convenient for pipeline-based execution.

See evaluator.Evaluator.eval().

Parameters:

pcloud (PointCloud) – The point cloud whose predicted probabilities must be computed to determine the uncertainty measurements.
model (Model) – The model that computed the predictions.

compute_pwise_entropy(Zhat)

Compute the point-wise Shannon’s entropy for the given predicted probabilities.

Let \(\pmb{Z} \in \mathbb{R}^{m \times n_c}\) be a matrix representing the predicted probabilities for \(m\) points assuming \(n_c\) classes. The point-wise Shannon entropy for point i \(e_{i}\) can be defined as:

\[e_i = - \sum_{j=1}^{n_c}{z_{ij} \log_{2}(z_{ij})}\]

Parameters:: Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
Returns:: A vector of point-wise Shannon’s entropies such that the component i is the entropy corresponding to the point i.
Return type:: np.ndarray
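The formula above can be sketched with NumPy as follows. This is a minimal illustration, not the evaluator's actual implementation: the function name `pwise_entropy`, the default `eps` value, and the use of `np.where` to replace exact zeroes (mimicking what `probability_eps` is described to do) are assumptions.

```python
import numpy as np


def pwise_entropy(Zhat, eps=1e-9):
    """Hypothetical helper: point-wise Shannon entropy (base 2) per row."""
    Z = np.asarray(Zhat, dtype=float)
    if eps != 0:
        # Assumed behavior: replace exact zeroes so log2 does not yield NaNs
        Z = np.where(Z == 0, eps, Z)
    # e_i = -sum_j z_ij * log2(z_ij)
    return -np.sum(Z * np.log2(Z), axis=1)


# Three points, two classes
Zhat = np.array([[0.5, 0.5], [1.0, 0.0], [0.25, 0.75]])
E = pwise_entropy(Zhat)
```

A uniform two-class prediction such as `[0.5, 0.5]` yields the maximum entropy of one bit, while a fully confident prediction yields an entropy close to zero.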

compute_weighted_entropy(Zhat, y=None, yhat=None)

Compute the weighted point-wise Shannon’s entropy for the given predicted probabilities.

The weighted Shannon’s entropy is the point-wise Shannon’s entropy but weighting each probability by the frequency of the class with respect to some reference labels \(\pmb{y} \in \mathbb{Z}^{m}\) for \(n_c\) different classes. When the expected reference labels of a point cloud (i.e., its classification, self.y) are available, they will be considered. When they are not available, or the weight_by_predictions flag is true, the predicted labels will be considered for the weights instead.

The weights can be represented through a vector \(\pmb{w} \in \mathbb{R}^{n_c}\). Let \(m\) be the number of points and \(m_j\) be the number of points belonging to class j. Then, the components of the weights vector can be defined as:

\[w_j = 1 - \dfrac{m_j}{m}\]

When using these weights, the less frequent classes will be more significant than the more frequent classes. The weighted point-wise entropy will be computed as follows:

\[e_i = - \sum_{j=1}^{n_c}{w_j z_{ij} \log_{2}(z_{ij})}\]

See classification_uncertainty_evaluator.ClassificationUncertaintyEvaluator.compute_pwise_entropy().

Parameters:: Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
Returns:: A vector of weighted point-wise Shannon’s entropies such that the component i is the entropy corresponding to the point i.
Return type:: np.ndarray
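As a rough sketch of the two formulas above, the weights and the weighted entropy can be computed with NumPy. The function name `weighted_entropy`, the `eps` handling, and the assumption that labels are integer class indices in \([0, n_c)\) are illustrative choices, not taken from the evaluator's code.

```python
import numpy as np


def weighted_entropy(Zhat, labels, eps=1e-9):
    """Hypothetical helper: class-frequency weighted point-wise entropy."""
    Z = np.asarray(Zhat, dtype=float)
    labels = np.asarray(labels, dtype=int)
    if eps != 0:
        # Assumed behavior: replace exact zeroes to avoid NaNs in log2
        Z = np.where(Z == 0, eps, Z)
    n_c = Z.shape[1]
    # m_j: number of points per class, w_j = 1 - m_j / m
    counts = np.bincount(labels, minlength=n_c)
    w = 1.0 - counts / labels.size
    # e_i = -sum_j w_j * z_ij * log2(z_ij)
    return -np.sum(w * Z * np.log2(Z), axis=1)


# Four points, two classes; class 0 is three times more frequent than class 1
Zhat = np.full((4, 2), 0.5)
labels = np.array([0, 0, 0, 1])
Ew = weighted_entropy(Zhat, labels)
```

Here the weights are \(w_0 = 0.25\) and \(w_1 = 0.75\), so the rarer class contributes more to each point's weighted entropy.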

compute_cluster_wise_entropy(E, F=None)

Compute the cluster-wise Shannon’s entropy for the given predicted point-wise entropies and features.

KMeans is run on batches of self.clustering_batch_size points, for at most self.clustering_max_iters iterations, to extract self.num_clusters clusters in the feature space. If self.clustering_entropy_weights is True, then the KMeans will scale the contribution of each point by its associated point-wise entropy. Finally, all the points belonging to the same cluster are assigned the same cluster-wise entropy, which is obtained by reducing the entropies in the cluster through the self.crf function.

Parameters:

E – The point-wise Shannon’s entropies \(\pmb{E} \in \mathbb{R}^{m \times 1}\).
F – The feature matrix \(\pmb{F} \in \mathbb{R}^{m \times n_f}\).

Returns:

A vector of point-wise cluster labels and a vector of cluster-wise Shannon’s entropies (one cluster-wise entropy per point).

Return type:

tuple
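The reduction step described above can be sketched as follows, assuming the cluster labels have already been obtained from a clustering of the feature space (the clustering itself is omitted). The names `REDUCERS` and `cluster_wise_entropy` are hypothetical; the mapping from reduce function names to NumPy calls is an assumption based on the options listed for clustering_reduce_function.

```python
import numpy as np

# Assumed mapping from the documented reduce function names to NumPy calls
REDUCERS = {
    'mean': np.mean,
    'median': np.median,
    'Q1': lambda e: np.quantile(e, 0.25),
    'Q3': lambda e: np.quantile(e, 0.75),
    'min': np.min,
    'max': np.max,
}


def cluster_wise_entropy(E, labels, reduce='mean'):
    """Hypothetical helper: give every point its cluster's reduced entropy."""
    E = np.asarray(E, dtype=float).ravel()
    labels = np.asarray(labels)
    crf = REDUCERS[reduce]
    Ec = np.empty_like(E)
    for c in np.unique(labels):
        mask = labels == c
        # All points in cluster c share the same reduced entropy value
        Ec[mask] = crf(E[mask])
    return Ec


# Five points in two precomputed clusters
E = np.array([0.1, 0.3, 0.8, 1.0, 0.6])
labels = np.array([0, 0, 1, 1, 1])
Ec_mean = cluster_wise_entropy(E, labels, reduce='mean')
Ec_max = cluster_wise_entropy(E, labels, reduce='max')
```

The output has one value per point, so it can be stored alongside the point-wise entropies in a point cloud-like report.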

compute_class_ambiguity(Zhat)

Compute a naive point-wise class ambiguity measurement.

Let \(\pmb{Z} \in \mathbb{R}^{m \times n_c}\) be a matrix representing the predicted probabilities for \(m\) points assuming \(n_c\) classes. The point-wise class ambiguity for point i \(a_{i}\) can be defined as:

\[a_i = 1 - z^{*}_{i} + z^{**}_{i}\]

Where \(z^{*}_{i}\) is the highest prediction for point i and \(z^{**}_{i}\) is the second highest prediction for point i.

Parameters:: Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
Returns:: A vector of point-wise class ambiguities such that the component i is the class ambiguity corresponding to the point i.
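A minimal NumPy sketch of the ambiguity formula above, assuming at least two classes. The function name `class_ambiguity` and the use of a full row-wise sort to find the two highest probabilities are illustrative choices, not necessarily how the evaluator computes it.

```python
import numpy as np


def class_ambiguity(Zhat):
    """Hypothetical helper: a_i = 1 - z*_i + z**_i for each row of Zhat."""
    Z = np.asarray(Zhat, dtype=float)
    # After an ascending sort, the last two columns hold the two highest values
    top2 = np.sort(Z, axis=1)[:, -2:]
    z_best, z_second = top2[:, 1], top2[:, 0]
    return 1.0 - z_best + z_second


# A confident prediction and a tie between the two best classes
Zhat = np.array([[0.9, 0.1, 0.0], [0.4, 0.4, 0.2]])
A = class_ambiguity(Zhat)
```

The ambiguity approaches zero when one class dominates and reaches one when the two best classes are tied.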

eval_args_from_state(state)

Obtain the arguments to call the ClassificationUncertaintyEvaluator from the current pipeline’s state.

Parameters:: state (SimplePipelineState) – The pipeline’s state.
Returns:: The dictionary of arguments for calling ClassificationUncertaintyEvaluator.
Return type:: dict