eval package
Subpackages
Submodules
eval.advanced_classification_evaluation module
- class eval.advanced_classification_evaluation.AdvancedClassificationEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a classification in an advanced way. See AdvancedClassificationEvaluator.
- Variables:
evals (list of ClassificationEvaluation) – The evaluation of the classification for each filter.
domain_name – See AdvancedClassificationEvaluator.
num_points (int) – The number of points without filtering (i.e., the original number of points).
num_fpoints (np.ndarray of int) – The number of points for each evaluation (i.e., the points preserved after applying each filter).
- __init__(**kwargs)
Initialize/instantiate an AdvancedClassificationEvaluation.
- Parameters:
kwargs – The attributes for the AdvancedClassificationEvaluation.
- report(**kwargs)
Transform the AdvancedClassificationEvaluation into an AdvancedClassificationReport.
See AdvancedClassificationReport.
- Returns:
The AdvancedClassificationReport representing the AdvancedClassificationEvaluation.
- Return type:
- can_report()
See Evaluation and Evaluation.can_report().
- plot(**kwargs)
Transform the AdvancedClassificationEvaluation into an AdvancedClassificationPlot.
See AdvancedClassificationPlot.
- Parameters:
kwargs – The key-word arguments for the plot.
- Returns:
The AdvancedClassificationPlot representing the AdvancedClassificationEvaluation.
- Return type:
- can_plot()
See Evaluation and Evaluation.can_plot().
eval.advanced_classification_evaluator module
- class eval.advanced_classification_evaluator.AdvancedClassificationEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate classification-like predictions against expected/reference classes in an advanced way, i.e., by computing many evaluations considering different filters.
The arguments of the AdvancedClassificationEvaluator include those of the ClassificationEvaluator. Only those that are introduced by the AdvancedClassificationEvaluator are documented here. See ClassificationEvaluator for details on the common attributes.
See also AdvancedClassificationEvaluation.
- Variables:
evaluator (ClassificationEvaluator) – The ClassificationEvaluator used to compute the filter-wise classification evaluations.
filters (list of dict) – List of filters such that one evaluation will be carried out for each filter. An example of a filter is given below:
    {
        "name": "pwe0_1",
        "x": 0.1,
        "conditions": [
            {
                "value_name": "classification",
                "condition_type": "not_equals",
                "value_target": 2,
                "action": "discard"
            },
            {
                "value_name": "PointWiseEntropy",
                "condition_type": "less_than_or_equal_to",
                "value_target": 0.1,
                "action": "preserve"
            }
        ]
    }
The filter above will filter out all the points that are classified as class 2 (the third class, since class indices start at zero) and then it will consider only those that have a point-wise entropy \(\leq 1/10\). The value of the “x” attribute will be used in the figures to represent nodes along the \(x\)-axis and in the output CSV (report) to identify the evaluation to which each row corresponds.
domain_name (str) – The name of the variable that constitutes the domain of the advanced evaluation (\(x\)-axis name).
- static extract_eval_args(spec)
Extract the arguments to initialize/instantiate an AdvancedClassificationEvaluator from a key-word specification.
See ClassificationEvaluator.extract_eval_args().
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate an AdvancedClassificationEvaluator.
- __init__(**kwargs)
Initialize/instantiate an AdvancedClassificationEvaluator.
- Parameters:
kwargs – The attributes for the AdvancedClassificationEvaluator.
- eval(yhat, y=None, fnames=None, F=None)
Evaluate predicted classes (\(\hat{y}\)) against expected/reference classes (\(y\)) one time for each given filter.
- __call__(pcloud, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See
Evaluator.eval().
- apply_filter(f, yhat, y=None, fnames=None, F=None)
Apply the given filter on the predictions and reference classes to extract the subset of predicted classes that must be evaluated.
- Parameters:
f (dict) – The filter to be applied.
yhat (np.ndarray) – The predicted classes to be filtered.
y (np.ndarray) – The reference classes to be filtered.
fnames (list of str) – The names of the features in \(\pmb{F}\).
F (np.ndarray) – The feature space matrix representing the point cloud to be evaluated.
- Returns:
Return the domain node (x), filtered predictions, and filtered reference classes as a 3-tuple.
- Return type:
tuple
- apply_condition(mask, cond, yhat, y=None, fnames=None, F=None, name=None)
Apply the given condition to update the input mask (in place).
- Parameters:
mask (np.ndarray of bool) – Boolean mask to be updated. It must specify True for points that must be considered for the evaluation, False otherwise.
cond (dict) – The condition specification.
yhat (np.ndarray) – The predictions.
y (np.ndarray) – The expected/reference classes.
fnames (list of str) – The names of the features in the feature space matrix \(\pmb{F}\).
F (np.ndarray) – The feature space matrix.
name (str) – The name of the filter to which the condition belongs.
- get_fnames_from_filters()
Obtain the names of the features that must be considered by the filters.
- Returns:
List with the names of the features that must be considered by the filters.
- Return type:
list of str
- eval_args_from_state(state)
Obtain the arguments to call the AdvancedClassificationEvaluator from the current pipeline’s state.
- Parameters:
state (SimplePipelineState) – The pipeline’s state.
- Returns:
The dictionary of arguments for calling AdvancedClassificationEvaluator.
- Return type:
dict
eval.classification_evaluation module
- class eval.classification_evaluation.ClassificationEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a classification. See ClassificationEvaluator.
- Variables:
class_names – See ClassificationEvaluator.
ignore_classes – See ClassificationEvaluator.
metric_names – See ClassificationEvaluator.
class_metric_names – See ClassificationEvaluator.
yhat_count (np.ndarray) – The count of cases per predicted label.
y_count (np.ndarray) – The count of cases per expected label (real class distribution).
conf_mat (np.ndarray) – The confusion matrix where rows are the expected or true labels and columns are the predicted labels.
conf_mat_norm_type (str or None) – The type of normalization strategy to be applied to the confusion matrix when plotting it. Either None or a string from [“row”, “col”, “full”].
metric_scores (np.ndarray) – The score for each metric, i.e., metric_scores[i] is the computed score corresponding to metric_names[i].
class_metric_scores (np.ndarray) – The class-wise scores for each metric. class_metric_scores[i][j] is the metric i calculated for the class j.
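The three normalization strategies named for conf_mat_norm_type can be illustrated with a short NumPy sketch. The helper name normalize_confusion_matrix is hypothetical, and the interpretation of “row”, “col”, and “full” (divide by row sums, column sums, and the total count, respectively) is an assumption that is not confirmed by this reference.

```python
import numpy as np


def normalize_confusion_matrix(conf_mat, norm_type=None):
    """Hypothetical helper: normalize a confusion matrix (rows = expected,
    columns = predicted). Assumes "row" divides by row sums, "col" by column
    sums, and "full" by the total number of cases."""
    cm = np.asarray(conf_mat, dtype=float)
    if norm_type is None:
        return cm
    if norm_type == "row":
        denom = cm.sum(axis=1, keepdims=True)
    elif norm_type == "col":
        denom = cm.sum(axis=0, keepdims=True)
    elif norm_type == "full":
        denom = np.array([[cm.sum()]])
    else:
        raise ValueError(f"Unexpected normalization type: {norm_type}")
    # Cells whose denominator is zero are left as zero instead of NaN
    return np.divide(cm, denom, out=np.zeros_like(cm), where=denom > 0)


cm = [[3, 1], [2, 4]]
print(normalize_confusion_matrix(cm, "row"))
```

With the “row” strategy each row sums to one, so the diagonal can be read as a class-wise recall under the stated row/column convention.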
- __init__(**kwargs)
Initialize/instantiate a ClassificationEvaluation.
- Parameters:
kwargs – The attributes for the ClassificationEvaluation.
- report(**kwargs)
Transform the ClassificationEvaluation into a ClassificationReport.
See ClassificationReport.
- Returns:
The ClassificationReport representing the ClassificationEvaluation.
- Return type:
- can_report()
See Evaluation and evaluation.Evaluation.can_report().
- plot(**kwargs)
Transform the ClassificationEvaluation into a ClassificationPlot.
See ClassificationPlot.
- Parameters:
kwargs – The key-word arguments for the plot.
- Returns:
The ClassificationPlot representing the ClassificationEvaluation.
- Return type:
- can_plot()
See Evaluation and evaluation.Evaluation.can_plot().
eval.classification_evaluator module
- class eval.classification_evaluator.ClassificationEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate classification-like predictions against expected/reference classes.
- Variables:
metrics (list of str) – The name of the metrics for overall evaluation.
class_metrics (list) – The name of the metrics for the class-wise evaluation.
class_names (list) – The name for each class.
metricf (list) – The functions to compute each metric. The metric \(j\) will be computed as \(f_j(y, \hat{y})\).
class_metricf (list) – The function to compute class-wise metrics. The metric \(j\) for class \(i\) will be computed as \(f_j(y, \hat{y}, i)\).
report_path (str) – The path to write the global evaluation report.
class_report_path (str) – The path to write the class-wise evaluation report.
confusion_matrix_report_path (str) – The path to write the confusion matrix report.
confusion_matrix_plot_path (str) – The path to write the plot representing the confusion matrix.
class_distribution_report_path (str) – The path to write the class distribution report.
class_distribution_plot_path (str) – The path to write the plot representing the class distribution.
ignore_classes (list of str) – The list of classes that must be ignored when computing the evaluations. In other words, those points that are labeled (not predicted) as one of the ignored classes will not be considered when calculating the evaluation metrics.
ignore_predictions (bool) – Whether to ignore the classes also in the predictions (False) or not (True, default). Note that, in the general case, it is not recommended to ignore the classes also in the predictions.
nthreads (int) – How many threads to use to compute the metrics in parallel (i.e., one thread per metric at most). Note that -1 means using as many threads as available cores.
- static extract_eval_args(spec)
Extract the arguments to initialize/instantiate a ClassificationEvaluator from a key-word specification.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a ClassificationEvaluator.
- __init__(**kwargs)
Initialize/instantiate a ClassificationEvaluator.
- Parameters:
kwargs – The attributes for the ClassificationEvaluator.
- eval(yhat, y=None, **kwargs)
Evaluate predicted classes (\(\hat{y}\)) against expected/reference classes (\(y\)).
- Parameters:
yhat – The predictions to be evaluated.
y – The expected/reference values to evaluate the predictions against.
- Returns:
The evaluation of the classification.
- Return type:
- __call__(x, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See
evaluator.Evaluator.eval().
- static metrics_from_names(names)
Obtain a list of metrics that can be evaluated for vectors of classes y (expected), and yhat (predicted).
- Parameters:
names – The names of the metrics. Currently supported metrics are Overall Accuracy “OA”, Precision “P”, Recall “R”, F1 score “F1”, Intersection over Union “IoU”, weighted Precision “wP”, weighted Recall “wR”, weighted F1 score “wF1”, weighted Intersection over Union “wIoU”, Matthews Correlation Coefficient “MCC”, and Cohen’s Kappa score “Kappa”.
- Returns:
List of metrics such that metric_i(y, yhat) can be invoked.
- Return type:
list
- static class_metrics_from_names(names)
Obtain a list of class-wise metrics that can be evaluated for vectors of classes \(y\) (expected), and \(\hat{y}\) (predicted).
- Parameters:
names – The names of the class-wise metrics. Currently supported Precision “P”, Recall “R”, F1 score “F1”, and Intersection over Union “IoU”.
- Returns:
List of metrics such that metric_i(y, yhat) can be invoked.
- Return type:
list
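The class-wise metrics can be pictured as functions \(f_j(y, \hat{y}, i)\), as stated for class_metricf. The sketch below is one possible name-to-function mapping written with NumPy only. All function names and the CLASS_METRICS dictionary are hypothetical, and returning zero for an empty denominator is an assumption, not the documented behavior.

```python
import numpy as np


def _counts(y, yhat, i):
    """True positives, false positives and false negatives for class i."""
    y, yhat = np.asarray(y), np.asarray(yhat)
    tp = np.count_nonzero((y == i) & (yhat == i))
    fp = np.count_nonzero((y != i) & (yhat == i))
    fn = np.count_nonzero((y == i) & (yhat != i))
    return tp, fp, fn


def class_precision(y, yhat, i):
    tp, fp, _ = _counts(y, yhat, i)
    return tp / (tp + fp) if tp + fp > 0 else 0.0


def class_recall(y, yhat, i):
    tp, _, fn = _counts(y, yhat, i)
    return tp / (tp + fn) if tp + fn > 0 else 0.0


def class_f1(y, yhat, i):
    tp, fp, fn = _counts(y, yhat, i)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def class_iou(y, yhat, i):
    tp, fp, fn = _counts(y, yhat, i)
    denom = tp + fp + fn
    return tp / denom if denom > 0 else 0.0


# Hypothetical lookup table keyed by the metric names listed above
CLASS_METRICS = {"P": class_precision, "R": class_recall,
                 "F1": class_f1, "IoU": class_iou}


def class_metrics_from_names_sketch(names):
    """Return one callable f(y, yhat, i) per requested metric name."""
    return [CLASS_METRICS[name] for name in names]


y = np.array([0, 0, 1, 1, 2, 2])
yhat = np.array([0, 1, 1, 1, 2, 0])
for name, f in zip(["P", "R"], class_metrics_from_names_sketch(["P", "R"])):
    print(name, [f(y, yhat, i) for i in range(3)])
```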
- static get_indices_from_names(all_names, target_names)
Find the indices of the target names with respect to all names.
- Parameters:
all_names (list) – The list of all names.
target_names (list) – The list of target names.
- Returns:
The indices of target_names in the all_names list
- Return type:
np.ndarray of int
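A minimal sketch of the lookup described for get_indices_from_names follows. The function name indices_from_names is hypothetical, and raising an error for an unknown target name is an assumption about the intended behavior.

```python
import numpy as np


def indices_from_names(all_names, target_names):
    """Hypothetical helper: position of each target name in all_names.
    list.index raises ValueError when a target name is not present."""
    return np.array([all_names.index(name) for name in target_names], dtype=int)


class_names = ["ground", "vegetation", "building"]
print(indices_from_names(class_names, ["building", "ground"]))
```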
- static find_ignore_mask(y, indices)
Find the boolean ignore mask (True to ignore, False otherwise).
For any point whose class (given by y) matches any target index (given by indices) a true must be stored in the corresponding element of the boolean mask.
- Parameters:
y (np.ndarray) – A vector of point-wise classes.
indices (np.ndarray) – The class indices to search for.
- Returns:
The boolean mask specifying what points must be ignored.
- Return type:
np.ndarray of bool
- static remove_indices(y, mask)
Preserve only those points for which the boolean mask is False (True means it must be ignored).
- Parameters:
y – The points to be filtered by the mask.
mask – The mask defining the removal filter.
- Returns:
The input points without the ignored ones.
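The ignore mask and the removal step described for find_ignore_mask and remove_indices can be sketched with NumPy as follows. The function names ignore_mask and drop_ignored are hypothetical, and using np.isin is one possible way to build the mask, not necessarily the one used by the package.

```python
import numpy as np


def ignore_mask(y, indices):
    """True for points whose class is one of the ignored class indices."""
    return np.isin(np.asarray(y), np.asarray(indices))


def drop_ignored(y, mask):
    """Keep only the points where the mask is False."""
    return np.asarray(y)[~np.asarray(mask, dtype=bool)]


y = np.array([0, 2, 1, 2, 3])
mask = ignore_mask(y, [2, 3])
print(mask)
print(drop_ignored(y, mask))
```

The same mask can be applied to both the reference and the predicted classes so that the two vectors stay aligned point by point.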
eval.classification_uncertainty_evaluation module
- class eval.classification_uncertainty_evaluation.ClassificationUncertaintyEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating the uncertainty for a given classification. See ClassificationUncertaintyEvaluator.
- Variables:
class_names (list) – The name for each class.
X (np.ndarray) – The matrix with the coordinates of the points.
y (np.ndarray) – The point-wise classes (reference).
yhat (np.ndarray) – The point-wise classes (predictions).
Zhat (np.ndarray) – Predicted class probabilities.
pwise_entropy (np.ndarray) – The point-wise Shannon’s entropy.
weighted_entropy (np.ndarray) – The weighted Shannon’s entropy.
cluster_labels (list or np.ndarray) – The point-wise labels identifying the cluster (in the context of the cluster-wise entropy) to which each point belongs.
cluster_wise_entropy (np.ndarray) – The cluster-wise Shannon’s entropy.
class_ambiguity (np.ndarray) – The point-wise class ambiguity.
gaussian_kernel_points (int) – How many points to consider for the Gaussian kernel density estimation.
- __init__(**kwargs)
Initialize/instantiate a ClassificationUncertaintyEvaluation.
- Parameters:
kwargs – The attributes for the ClassificationUncertaintyEvaluation.
- report(**kwargs)
Transform the ClassificationUncertaintyEvaluation into a ClassificationUncertaintyReport.
See ClassificationUncertaintyReport.
- Returns:
The ClassificationUncertaintyReport representing the ClassificationUncertaintyEvaluation.
- Return type:
ClassificationUncertaintyReport
- can_report()
See Evaluation and evaluation.Evaluation.can_report().
- plot(**kwargs)
Transform the ClassificationUncertaintyEvaluation into a ClassificationUncertaintyPlot.
See ClassificationUncertaintyPlot.
- Parameters:
kwargs – The key-word arguments for the plot.
- Returns:
The ClassificationUncertaintyPlot representing the ClassificationUncertaintyEvaluation.
- Return type:
- can_plot()
See Evaluation and evaluation.Evaluation.can_plot().
eval.classification_uncertainty_evaluator module
- class eval.classification_uncertainty_evaluator.ClassificationUncertaintyEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate classification-like predictions to analyze their uncertainty.
- Variables:
class_names (list) – The name for each class.
include_probabilities (bool) – Whether to include the probabilities in the resulting evaluation (True) or not (False).
probability_eps – The value representing the zero, to avoid NaNs when computing the logarithms of the likelihoods/probabilities. If it is exactly zero, then the zeroes will not be replaced by this value.
include_weighted_entropy (bool) – Whether to include the weighted entropy in the resulting evaluation (True) or not (False).
include_clusters (bool) – Whether to include the cluster-wise entropies in the resulting evaluation (True) or not (False).
weight_by_predictions (bool) – Whether to compute the weighted entropy considering the predictions instead of the reference labels (True) or not (False, by default).
num_clusters (int) – Governs how many clusters must be built when the cluster-wise entropies must be computed.
clustering_max_iters (int) – How many iterations are allowed (at most) for the clustering algorithm to converge.
clustering_batch_size (int) – How many points to consider per batch at each iteration of the clustering algorithm. More points imply a more accurate clustering. However, they also imply a greater computational cost, and thus a longer execution time.
clustering_entropy_weights (bool) – Whether to use point-wise entropy as the sample weights for the clustering (True) or not (False).
clustering_reduce_function (str) – What function to use to reduce the entropy values in a given cluster to a single one. Either ‘mean’, ‘median’, ‘Q1’ (first quartile), ‘Q3’ (third quartile), ‘min’, or ‘max’.
gaussian_kernel_points (int) – How many points to consider to compute the Gaussian kernel density estimations. Note that this argument has a great impact on the time required to generate the plots.
report_path (str) – The generated point cloud-like report will be exported to the file pointed by the report path.
plot_path (str) – The generated plots will be stored at the directory pointed by the plot path.
ignore_classes (list of str) – The list of classes that must be ignored when computing the evaluations. In other words, those points that are labeled (not predicted) as one of the ignored classes will not be considered when calculating the evaluation metrics.
- static extract_eval_args(spec)
Extract the arguments to initialize/instantiate a ClassificationUncertaintyEvaluator from a key-word specification.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a ClassificationUncertaintyEvaluator.
- __init__(**kwargs)
Initialize/instantiate a ClassificationUncertaintyEvaluator.
- Parameters:
kwargs – The attributes for the ClassificationUncertaintyEvaluator.
- eval(Zhat, X=None, y=None, yhat=None, F=None)
Evaluate the uncertainty of the given predictions.
- Parameters:
Zhat (np.ndarray) – Predicted class probabilities.
- Variables:
X (np.ndarray) – The matrix with the coordinates of the points.
y (np.ndarray) – The point-wise classes (reference).
yhat (np.ndarray) – The point-wise classes (predictions).
F (np.ndarray) – The features matrix (it is necessary to compute cluster-wise entropies).
- Returns:
The evaluation of the classification’s uncertainty.
- Return type:
- __call__(pcloud, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See evaluator.Evaluator.eval().
- Parameters:
pcloud (PointCloud) – The point cloud whose predicted probabilities must be computed to determine the uncertainty measurements.
model (Model) – The model that computed the predictions.
- compute_pwise_entropy(Zhat)
Compute the point-wise Shannon’s entropy for the given predicted probabilities.
Let \(\pmb{Z} \in \mathbb{R}^{m \times n_c}\) be a matrix representing the predicted probabilities for \(m\) points assuming \(n_c\) classes. The point-wise Shannon entropy \(e_{i}\) for point i can be defined as:
\[e_i = - \sum_{j=1}^{n_c}{z_{ij} \log_{2}(z_{ij})}\]
- Parameters:
Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
- Returns:
A vector of point-wise Shannon’s entropies such that the component i is the entropy corresponding to the point i.
- Return type:
np.ndarray
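The point-wise entropy formula can be sketched in a few lines of NumPy. The function name pointwise_entropy and the default eps value are hypothetical. Replacing exact zeros with eps follows the description of probability_eps, but the concrete replacement rule used here is an assumption.

```python
import numpy as np


def pointwise_entropy(Zhat, eps=1e-9):
    """e_i = -sum_j z_ij * log2(z_ij), one value per row (point) of Zhat."""
    Z = np.asarray(Zhat, dtype=float)
    if eps != 0:
        # Assumed rule: exact zeros are replaced by eps to avoid log2(0)
        Z = np.where(Z == 0, eps, Z)
    return -np.sum(Z * np.log2(Z), axis=1)


Zhat = np.array([
    [0.25, 0.25, 0.25, 0.25],  # uniform over 4 classes: maximum entropy
    [0.50, 0.50, 0.00, 0.00],  # two equally likely classes
    [1.00, 0.00, 0.00, 0.00],  # fully confident prediction
])
print(pointwise_entropy(Zhat))
```

For \(n_c\) classes the entropy ranges from 0 (a fully confident prediction) to \(\log_2(n_c)\) (a uniform prediction), which is 2 bits in the four-class example.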
- compute_weighted_entropy(Zhat, y=None, yhat=None)
Compute the weighted point-wise Shannon’s entropy for the given predicted probabilities.
The weighted Shannon’s entropy is the point-wise Shannon’s entropy but weighting each probability by the frequency of the class with respect to some reference labels \(\pmb{y} \in \mathbb{Z}^{m}\) for \(n_c\) different classes. When the expected reference labels of a point cloud (i.e., the classification, self.y) are available, they will be considered. Otherwise, when they are not available or the weight_by_predictions flag is true, the predicted labels will be considered for the weights.
The weights can be represented through a vector \(\pmb{w} \in \mathbb{R}^{n_c}\). Let \(m\) be the number of points and \(m_j\) be the number of points belonging to class j. Then, the components of the weights vector can be defined as:
\[w_j = 1 - \dfrac{m_j}{m}\]
When using these weights, the less frequent classes will be more significant than the more frequent classes. The weighted point-wise entropy will be computed as follows:
\[e_i = - \sum_{j=1}^{n_c}{w_j z_{ij} \log_{2}(z_{ij})}\]
See classification_uncertainty_evaluator.ClassificationUncertaintyEvaluator.compute_pwise_entropy().
- Parameters:
Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
- Returns:
A vector of weighted point-wise Shannon’s entropies such that the component i is the entropy corresponding to the point i.
- Return type:
np.ndarray
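A NumPy sketch of the weighted entropy, with weights \(w_j = 1 - m_j/m\) derived from a vector of labels, is given below. The function name weighted_entropy and the eps handling are hypothetical, and the labels are assumed to be non-negative integer class indices in \([0, n_c)\).

```python
import numpy as np


def weighted_entropy(Zhat, labels, eps=1e-9):
    """e_i = -sum_j w_j * z_ij * log2(z_ij), with w_j = 1 - m_j / m."""
    Z = np.asarray(Zhat, dtype=float)
    m, n_c = Z.shape
    # m_j: number of points labeled as class j (assumes labels in [0, n_c))
    counts = np.bincount(np.asarray(labels, dtype=int), minlength=n_c)
    w = 1.0 - counts / m
    if eps != 0:
        Z = np.where(Z == 0, eps, Z)  # assumed rule to avoid log2(0)
    return -np.sum(w * Z * np.log2(Z), axis=1)


Zhat = np.full((4, 2), 0.5)      # every point is maximally uncertain
labels = np.array([0, 0, 0, 1])  # class 0 is three times more frequent
print(weighted_entropy(Zhat, labels))
```

In this example the weights are \(w_0 = 0.25\) and \(w_1 = 0.75\), so the rarer class contributes three times as much to each entropy term.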
- compute_cluster_wise_entropy(E, F=None)
Compute the cluster-wise Shannon’s entropy for the given predicted point-wise entropies and features.
A KMeans is computed on batches with self.clustering_batch_size points up to a maximum of self.clustering_max_iters iterations to extract self.num_clusters clusters on the feature space. If self.clustering_entropy_weights is True, then the KMeans will scale the contribution of each point considering its associated point-wise entropy. Finally, all the points belonging to the same cluster will have the same cluster-wise entropy, which is obtained by reducing the entropies in the cluster through the self.crf function.
- Parameters:
E – The point-wise Shannon’s entropies \(\pmb{E} \in \mathbb{R}^{m \times 1}\).
F – The feature matrix \(\pmb{F} \in \mathbb{R}^{m \times n_f}\).
- Returns:
A vector of point-wise cluster labels and a vector of cluster-wise Shannon’s entropies (one cluster-wise entropy per point).
- Return type:
tuple
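The final reduction step, where every point receives the reduced entropy of its cluster, can be sketched independently of the clustering itself. The sketch assumes the cluster labels have already been computed (for instance by a batch-based KMeans). The function name reduce_entropy_by_cluster and the REDUCERS table are hypothetical and only mirror the options listed for clustering_reduce_function.

```python
import numpy as np

# Hypothetical table mirroring the documented reduce options
REDUCERS = {
    "mean": np.mean,
    "median": np.median,
    "Q1": lambda v: np.quantile(v, 0.25),
    "Q3": lambda v: np.quantile(v, 0.75),
    "min": np.min,
    "max": np.max,
}


def reduce_entropy_by_cluster(E, labels, reduce="mean"):
    """Assign to each point the reduced entropy of the cluster it belongs to."""
    E = np.asarray(E, dtype=float).ravel()
    labels = np.asarray(labels)
    f = REDUCERS[reduce]
    out = np.empty_like(E)
    for c in np.unique(labels):
        in_cluster = labels == c
        out[in_cluster] = f(E[in_cluster])
    return out


E = np.array([0.1, 0.3, 0.8, 1.0])
labels = np.array([0, 0, 1, 1])
print(reduce_entropy_by_cluster(E, labels, "mean"))
```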
- compute_class_ambiguity(Zhat)
Compute a naive point-wise class ambiguity measurement.
Let \(\pmb{Z} \in \mathbb{R}^{m \times n_c}\) be a matrix representing the predicted probabilities for \(m\) points assuming \(n_c\) classes. The point-wise class ambiguity \(a_{i}\) for point i can be defined as:
\[a_i = 1 - z^{*}_{i} + z^{**}_{i}\]
where \(z^{*}_{i}\) is the highest prediction for point i and \(z^{**}_{i}\) is the second highest prediction for point i.
- Parameters:
Zhat (np.ndarray) – The matrix of point-wise predicted probabilities.
- Returns:
A vector of point-wise class ambiguities such that the component i is the class ambiguity corresponding to the point i.
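The class ambiguity formula only needs the two largest probabilities of each row, which can be obtained without a full sort. The sketch below uses np.partition for that purpose. The function name class_ambiguity is hypothetical, and at least two classes are assumed.

```python
import numpy as np


def class_ambiguity(Zhat):
    """a_i = 1 - z*_i + z**_i, using the two highest probabilities per point."""
    Z = np.asarray(Zhat, dtype=float)
    # After partitioning at -2, the last two columns hold the second highest
    # and the highest value of each row, in that order
    top2 = np.partition(Z, -2, axis=1)[:, -2:]
    return 1.0 - top2[:, 1] + top2[:, 0]


Zhat = np.array([
    [0.90, 0.05, 0.05],  # clear winner: low ambiguity
    [0.50, 0.50, 0.00],  # tie between two classes: maximum ambiguity
])
print(class_ambiguity(Zhat))
```

The measurement equals 1 when the two most likely classes are tied and approaches 0 as the highest probability approaches 1.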
- eval_args_from_state(state)
Obtain the arguments to call the ClassificationUncertaintyEvaluator from the current pipeline’s state.
- Parameters:
state (SimplePipelineState) – The pipeline’s state.
- Returns:
The dictionary of arguments for calling ClassificationUncertaintyEvaluator.
- Return type:
dict
eval.evaluation module
- exception eval.evaluation.EvaluationException(message='')
Bases:
Exception
- Author:
Alberto M. Esmoris Pena
Class for exceptions related to evaluations. See VL3DException and EvaluationException.
- __init__(message='')
- class eval.evaluation.Evaluation(**kwargs)
Bases:
object
- Author:
Alberto M. Esmoris Pena
Class for evaluation results. See Evaluator.
- Variables:
problem_name (str) – The name of the evaluated problem.
- __init__(**kwargs)
Initialize/instantiate an Evaluation.
- Parameters:
kwargs – The attributes for the Evaluation.
- report(**kwargs)
Transform the evaluation into a report.
By default, this method is not supported. Depending on the evaluation subclass, this method might be overridden to provide automatic report generation.
- Returns:
The report representing the evaluation.
- Return type:
- can_report()
Check whether the evaluation object can generate a report or not.
- Returns:
True if a report can be generated, False otherwise.
- plot(**kwargs)
Transform the evaluation into a plot (or many plots).
By default, this method is not supported. Depending on the evaluation subclass, this method might be overridden to provide automatic plot generation.
- Returns:
The plot representing the evaluation.
- Return type:
- can_plot()
Check whether the evaluation object can generate a plot or not.
- Returns:
True if a plot can be generated, False otherwise.
eval.evaluator module
- exception eval.evaluator.EvaluatorException(message='')
Bases:
Exception
- Author:
Alberto M. Esmoris Pena
Class for exceptions related to evaluators. See VL3DException and EvaluationException.
- __init__(message='')
- class eval.evaluator.Evaluator(**kwargs)
Bases:
object
- Author:
Alberto M. Esmoris Pena
Class for evaluation operations. See Evaluation.
- Variables:
problem_name (str) – The name of the problem that is being evaluated.
- __init__(**kwargs)
Initialize/instantiate an Evaluator.
- Parameters:
kwargs – The attributes for the Evaluator.
- abstractmethod eval(x, **kwargs)
Evaluate something and yield an evaluation.
- Parameters:
x – The input to be evaluated.
- Returns:
Evaluation.
- Return type:
- __call__(x, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See
evaluator.Evaluator.eval().
eval.kfold_evaluation module
- class eval.kfold_evaluation.KFoldEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a kfold procedure. See KFoldEvaluator.
- Variables:
problem_name – See Evaluator.
metric_names (list or tuple) – The name for each metric, i.e., the name of each component in mu or sigma vectors.
mu (np.ndarray vector-like) – The vector of means such that each component represents the mean value of an evaluation metric used to assess the k-folding procedure.
sigma (np.ndarray vector-like) – The vector of standard deviations such that each component represents the standard deviation of an evaluation metric used to assess the k-folding procedure.
Q (np.ndarray matrix-like) – The matrix of quantiles such that the components of each column vector represent the quantiles of an evaluation metric used to assess the k-folding procedure.
- __init__(**kwargs)
Initialize/instantiate a KFoldEvaluation.
- Parameters:
kwargs – The attributes for the KFoldEvaluation.
- report(**kwargs)
Transform the KFoldEvaluation into a KFoldReport.
See KFoldReport.
- Returns:
The KFoldReport representing the KFoldEvaluation.
- Return type:
- can_report()
See Evaluation and evaluation.Evaluation.can_report().
- plot(**kwargs)
Transform the KFoldEvaluation into a KFoldPlot.
See KFoldPlot.
- Keyword Arguments:
path (str) – The path to store the plot.
show (bool) – Boolean flag to handle whether to show the plot (True) or not (False).
- Returns:
The KFoldPlot representing the KFoldEvaluation.
- Return type:
- can_plot()
See Evaluation and evaluation.Evaluation.can_plot().
eval.kfold_evaluator module
- class eval.kfold_evaluator.KFoldEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate kfold procedures. See KFoldEvaluation.
- Variables:
quantile_cuts (list or tuple or np.ndarray) – The cut points defining the quantiles. By default, they represent the quartiles, i.e., 1/4, 2/4, 3/4.
- __init__(**kwargs)
Initialize/instantiate a KFoldEvaluator.
- Parameters:
kwargs – The attributes for the KFoldEvaluator.
- eval(X, **kwargs)
Evaluate the results of a k-folding procedure.
- Parameters:
X – The matrix of quantitative evaluations. Each row must represent a fold and each column an evaluation metric. Thus, X[i][j] is the j-th evaluation metric on the i-th fold.
- Returns:
The evaluation of the k-folding.
- Return type:
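The statistics stored in a KFoldEvaluation (mu, sigma, Q) can be sketched from the fold-by-metric matrix X with NumPy. The function name kfold_summary is hypothetical. Using the population standard deviation (NumPy's default, ddof=0) is an assumption, as this reference does not state which estimator is used.

```python
import numpy as np


def kfold_summary(X, quantile_cuts=(0.25, 0.5, 0.75)):
    """Summarize a matrix where X[i][j] is the j-th metric on the i-th fold."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)     # one mean per metric
    sigma = X.std(axis=0)   # assumption: population standard deviation
    # One row per cut point and one column per metric, so that each column
    # holds the quantiles of a single metric
    Q = np.quantile(X, quantile_cuts, axis=0)
    return mu, sigma, Q


X = np.array([
    [0.80, 0.60],  # fold 1: two metrics
    [0.90, 0.70],  # fold 2
    [1.00, 0.80],  # fold 3
])
mu, sigma, Q = kfold_summary(X)
print(mu, sigma, Q, sep="\n")
```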
eval.rand_forest_evaluation module
- class eval.rand_forest_evaluation.RandForestEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a trained random forest. See RandForestEvaluator.
- Variables:
problem_name – See Evaluator.
fnames (list or tuple) – The name for each feature.
importance (np.ndarray) – The normalized importance of each feature in [0, 1].
permutation_importance_mean (np.ndarray) – The normalized mean permutation importance of each feature in [0, 1].
permutation_importance_stdev (np.ndarray) – The standard deviation of the normalized permutation importance of each feature.
trees (list) – The list of trees representing the estimators of the random forest.
- __init__(**kwargs)
Initialize/instantiate a RandForestEvaluation.
- Parameters:
kwargs – The attributes for the RandForestEvaluation.
- report(**kwargs)
Transform the RandForestEvaluation into a RandForestReport.
See RandForestReport.
- Returns:
The RandForestReport representing the RandForestEvaluation.
- Return type:
- can_report()
See Evaluation and evaluation.Evaluation.can_report().
- plot(**kwargs)
Transform the RandForestEvaluation into a RandForestPlot.
See RandForestPlot.
- Keyword Arguments:
path (str) – The path to store the plot.
show (bool) – Boolean flag to handle whether to show the plot (True) or not (False).
- Returns:
The RandForestPlot representing the RandForestEvaluation.
- Return type:
- can_plot()
See Evaluation and evaluation.Evaluation.can_plot().
eval.rand_forest_evaluator module
- class eval.rand_forest_evaluator.RandForestEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate trained random forest models. See RandomForestClassificationModel.
- Variables:
num_decision_trees (int) – How many estimators to consider when plotting the decision trees. Zero means none at all, n means consider n decision trees, and -1 means consider all the decision trees.
compute_permutation_importance – Whether to also compute the permutation importance (True) or not (False).
- __init__(**kwargs)
Initialize/instantiate a RandForestEvaluator.
- Parameters:
kwargs – The attributes for the RandForestEvaluator.
- eval(model, X=None, y=None, **kwargs)
Evaluate a trained random forest model.
- Parameters:
model (RandomForestClassificationModel) – The random forest model.
X – The matrix of point-wise features. Rows are points, columns are features.
y – The vector of classes. The component i represents the class for the point i.
- Returns:
The evaluation of the trained random forest.
- Return type:
eval.raster_grid_evaluation module
- class eval.raster_grid_evaluation.RasterGridEvaluation(**kwargs)
Bases:
Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a point cloud by transforming it to a convenient raster representation.
- Variables:
X (np.ndarray) – The matrix of coordinates representing the evaluated point cloud.
Fgrids (list) – A list of grids representing the grids of features from the evaluated point cloud.
onames (list) – The output name for each grid of features.
crs (str) – The coordinate reference system (CRS).
xres (float) – The cell size along the x-axis.
yres (float) – The cell size along the y-axis.
- __init__(**kwargs)
Initialize/instantiate a RasterGridEvaluation.
- Parameters:
kwargs – The attributes for the RasterGridEvaluation.
- can_plot(**kwargs)
See Evaluation and evaluation.Evaluation.can_plot().
- plot(**kwargs)
Transform the evaluation into a plot (or many plots), each representing a raster-like 2D grid, typically in GeoTiff format.
See RasterGridPlot.
- Parameters:
kwargs – The key-word arguments for the plot.
- Returns:
The RasterGridPlot representing the RasterGridEvaluation.
- Return type:
eval.raster_grid_evaluator module
- class eval.raster_grid_evaluator.RasterGridEvaluator(**kwargs)
Bases:
Evaluator
- Author:
Alberto M. Esmoris Pena
Class to generate a raster-like 2D grid evaluating a given point cloud.
- Variables:
plot_path (str) – The path to write the raster.
fnames (list of str) – The name of the features to be considered.
grids (list of dict) – The many grid specifications.
crs (str) – The coordinate reference system (CRS).
xres (float) – The cell size along the x-axis.
yres (float) – The cell size along the y-axis.
grid_iter_step (int) – How many rows at most must be considered per iteration when generating the raster-like grid.
reverse_rows (bool) – Whether to reverse the rows of the grid (True) or not (False).
radius_expr (str) – An expression defining the radius of the neighborhood centered on each cell. In this expression, “l” represents the greatest cell size, i.e., \(\max \; \{\mathrm{xres}, \mathrm{yres}\}\).
nthreads (int) – How many threads must be used for the parallel computation of the grids of features. By default, a single thread is used (1). Also, note that -1 means as many threads as available cores.
- static extract_eval_args(spec)
Extract the arguments to initialize/instantiate a RasterGridEvaluator from a key-word specification.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a RasterGridEvaluator.
- Return type:
dict
- __init__(**kwargs)
Initialize/instantiate a RasterGridEvaluator.
- Parameters:
kwargs – The attributes for the RasterGridEvaluator.
- eval(pcloud)
Evaluate the point cloud as a raster-like 2D grid.
- Parameters:
pcloud (PointCloud) – The point cloud to be evaluated.
- Returns:
The raster grid obtained after computing the evaluation.
- Return type:
- __call__(pcloud, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See evaluator.Evaluator.eval().
- Parameters:
pcloud – The point cloud that must be evaluated through raster-like grid analysis.
- digest_grids(X, F, fnames)
Generate the grids of features for the requested grid specifications.
- Parameters:
X (np.ndarray) – The structure space matrix, i.e., matrix of point-wise coordinates.
F (np.ndarray) – The feature space matrix, i.e., matrix of point-wise features.
fnames (list of str) – The feature names, i.e., the name for each column of the feature space matrix F.
- Returns:
The generated grids of features.
- Return type:
list of np.ndarray
- static digest_grid(X, Xi, kdt, radius, fnames, F, grid, width, height)
Generate the grid of features for a given grid specification.
Internally:
- I – The list of neighborhoods, where each neighborhood is represented as a list of indices.
- n_rows – How many rows are being considered in the chunk to be digested.
- Parameters:
X – The structure space representing the original point cloud or the globally filtered point cloud.
Xi – The points representing a chunk of the grid.
kdt – The KDTree representing the original structure space (X).
radius – The radius for the spherical neighborhoods centered at each cell of the grid.
fnames – The name for each column (feature) in the matrix F.
F – The matrix of features.
grid – The grid specification to be digested.
width – The width of the grid in number of cells.
height – The height of the grid in number of cells.
- Returns:
The generated grid of features.
- Return type:
np.ndarray
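The neighborhood list I described above can be pictured with the following minimal sketch. It uses a brute-force distance check as a stand-in for the KDTree radius query, and the function name cell_neighborhoods and the toy coordinates are assumptions for illustration only.

```python
import numpy as np


def cell_neighborhoods(X, Xi, radius):
    """Brute-force stand-in for a KDTree radius query (sketch only).

    Returns one list of point indices per cell center in Xi.
    """
    I = []
    for c in Xi:
        # Distance from every point to the cell center, using the
        # coordinates that the cell centers define
        d = np.linalg.norm(X[:, :Xi.shape[1]] - c, axis=1)
        I.append(np.flatnonzero(d <= radius).tolist())
    return I


X = np.array([[0.1, 0.1], [0.2, 0.0], [1.9, 2.1], [5.0, 5.0]])  # Points
Xi = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]])  # Cell centers
I = cell_neighborhoods(X, Xi, radius=0.5)
```

Note that neighborhoods can have different sizes and that a cell with no nearby points yields an empty list.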
- static digest_standard_grid(X, Xi, kdt, radius, fnames, F, grid, height)
Generate a grid of features for a given grid specification in the standard way. This method provides RasterGridEvaluator.digest_grid() with the standard logic to handle the general case.
- eval_args_from_state(state)
Obtain the arguments to call the RasterGridEvaluator from the current pipeline’s state.
- Parameters:
state (SimplePipelineState) – The pipeline’s state.
- Returns:
The dictionary of arguments for calling RasterGridEvaluator.
- Return type:
dict
- static reduce_mean(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the mean value in the neighborhood.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – The value that must be searched when using a binary mask, recount, or relative recount strategy.
th – The threshold of target cases that must be inside the cell to consider a \(1\) for the binary mask.
cache – The cache with pre-computed information needed for the reduction. Note that it might be None if it is not needed by the reduce strategy. Its type also depends on the strategy.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
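The following minimal sketch illustrates a neighborhood-wise reduction such as the mean, median, min, or max. The helper reduce_with and the decision to return NaN for empty neighborhoods are assumptions for this example; the actual methods may handle empty cells differently.

```python
import numpy as np


def reduce_with(fun, F, I, k):
    """Hypothetical helper: reduce the features of the k-th neighborhood."""
    idx = np.asarray(I[k], dtype=int)
    if idx.size == 0:  # Assumption: empty cells yield NaN
        return np.full(F.shape[1], np.nan)
    return fun(F[idx], axis=0)  # One value per feature (column)


F = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]])  # 3 points, 2 features
I = [[0, 1], [0, 1, 2], []]  # Neighborhoods with different sizes
mean_cell0 = reduce_with(np.mean, F, I, 0)
median_cell1 = reduce_with(np.median, F, I, 1)
max_cell1 = reduce_with(np.max, F, I, 1)
empty_cell2 = reduce_with(np.min, F, I, 2)
```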
- static reduce_median(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the median value in the neighborhood.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_min(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the min value in the neighborhood.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_max(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the max value in the neighborhood.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_binary_mask(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction yields a zero (0) if there are not enough points in the neighborhood matching the target, one (1) otherwise.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel (callable) – The relational (as a binary function) that must be applied to compare the values against the target.
target – The value that must be matched.
th – The threshold of target cases that must be inside the cell to consider a \(1\) for the binary mask.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
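A binary mask reduction of this kind can be sketched as follows. The function name binary_mask_sketch is hypothetical, and treating "enough points" as a count greater than or equal to th is an assumption; the actual comparison used by reduce_binary_mask may differ.

```python
import numpy as np


def binary_mask_sketch(F, I, k, rel, target, th, cache=None):
    """Sketch: 1 where at least th neighbors satisfy rel(value, target)."""
    idx = np.asarray(I[k], dtype=int)
    matches = rel(F[idx], target)  # Boolean matrix (neighbors x features)
    count = np.count_nonzero(matches, axis=0)  # Matches per feature
    return (count >= th).astype(int)  # Assumption: threshold is inclusive


F = np.array([[2], [2], [5], [2]])  # E.g., a single classification column
I = [[0, 1, 2], [2, 3]]
mask0 = binary_mask_sketch(F, I, 0, np.equal, target=2, th=2)
mask1 = binary_mask_sketch(F, I, 1, np.equal, target=2, th=2)
```

Here np.equal plays the role of the relational rel; any binary function that broadcasts over arrays (e.g., np.greater) could be passed instead.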
- static reduce_binary_mask_nofeats(F, I, k, rel, target, th, cache)
Reduce the given points on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction yields a zero (0) if there are not enough points in the neighborhood, one (1) otherwise.
- Parameters:
F – Not considered at all.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – The threshold for the number of points that must be inside the cell to consider a \(1\) for the binary mask.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_recount(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the number of points inside the cell matching the given target.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel (callable) – The relational (as a binary function) that must be applied to compare the values against the target.
target – The value that must be matched.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_recount_nofeats(F, I, k, rel, target, th, cache)
Reduce the given points on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the number of points inside the cell.
- Parameters:
F – Not considered at all.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static reduce_relative_recount(F, I, k, rel, target, th, cache)
Reduce the given features (F) on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the number of points inside the cell matching the given target, normalized by the number of points inside the cell.
- Parameters:
F (np.ndarray) – The feature space matrix for \(m\) points with \(n_f\) features. Note that here \(n_f\) does not represent the dimensionality of the original point cloud but the one where only the requested features are considered.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel (callable) – The relational (as a binary function) that must be applied to compare the values against the target.
target – The value that must be matched.
th – Not considered at all.
cache – Not considered at all.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
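The recount and relative recount reductions can be illustrated with the minimal sketch below. The function names are hypothetical, and returning zeros for an empty neighborhood is an assumption made to avoid a division by zero.

```python
import numpy as np


def recount_sketch(F, I, k, rel, target):
    """Sketch: number of neighbors satisfying rel(value, target)."""
    idx = np.asarray(I[k], dtype=int)
    return np.count_nonzero(rel(F[idx], target), axis=0)


def relative_recount_sketch(F, I, k, rel, target):
    """Sketch: recount normalized by the number of points in the cell."""
    n = len(I[k])
    if n == 0:  # Assumption: empty cells yield zero
        return np.zeros(F.shape[1])
    return recount_sketch(F, I, k, rel, target) / n


F = np.array([[2], [2], [5], [2]])
I = [[0, 1, 2], [3], []]
count0 = recount_sketch(F, I, 0, np.equal, 2)
ratio0 = relative_recount_sketch(F, I, 0, np.equal, 2)
ratio1 = relative_recount_sketch(F, I, 1, np.equal, 2)
ratio2 = relative_recount_sketch(F, I, 2, np.equal, 2)
```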
- static reduce_relative_recount_nofeats(F, I, k, rel, target, th, cache)
Reduce the given points on the given neighborhoods (I) for the k-th cell of the raster that corresponds with the k-th neighborhood. The reduction takes the number of points inside the cell, normalized by the max number of points inside any neighborhood.
- Parameters:
F – Not considered at all.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
k (int) – The index representing the cell that corresponds with the \(k\)-th neighborhood.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
cache (int) – The max number of points inside the most populated neighborhood.
- Returns:
The vector after the reduction (reductions to scalars are understood as vectors with 1 component).
- Return type:
np.ndarray or list
- static precompute_max_neighbors(F, I, rel, target, th)
Pre-compute the number of points in the most populated neighborhood.
- Parameters:
F – Not considered at all.
I (list of list of int) – List of neighborhoods. Each \(i\)-th neighborhood \(I_i\) is a list (i.e., mutable tuple) representing the indices of the points in the neighborhood. Note that each list inside the list can have a different number of neighbors.
rel – Not considered at all.
target – Not considered at all.
th – Not considered at all.
- Returns:
Max number of points inside the most populated neighborhood.
- Return type:
int
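The role of the pre-computed cache can be sketched as follows: the maximum neighborhood size is computed once and then reused when reducing every cell. The function names and the handling of a zero-valued cache are assumptions for this example.

```python
def max_neighbors_sketch(I):
    """Sketch: size of the most populated neighborhood (0 if there is none)."""
    return max((len(Ii) for Ii in I), default=0)


def relative_recount_nofeats_sketch(I, k, cache):
    """Sketch: points in the k-th cell relative to the most populated cell."""
    if cache == 0:  # Assumption: avoid dividing by zero
        return [0.0]
    return [len(I[k]) / cache]


I = [[0, 1, 2, 3], [4], []]
cache = max_neighbors_sketch(I)  # Computed once for all cells
values = [relative_recount_nofeats_sketch(I, k, cache) for k in range(len(I))]
```

Computing the maximum once avoids traversing every neighborhood again for each cell.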
- apply_global_filter(X, F, pcloud)
Filter the given structure and feature spaces at a global level, i.e., for all rasters and not only for a particular one.
- Parameters:
X (np.ndarray) – The structure space matrix, i.e., matrix of point-wise coordinates.
F (np.ndarray or None) – The feature space matrix, i.e., matrix of point-wise features.
pcloud (PointCloud) – The point cloud being filtered. Note that the point cloud object will not be filtered. Nonetheless, its data is necessary to apply the filter (e.g., non-evaluated features might be considered by the filter).
- Returns:
The filtered structure and feature spaces.
- Return type:
tuple of np.ndarray
- static filter(val, cond, mask)
Apply the given conditional filter to the given mask.
- Parameters:
val (np.ndarray) – The vector of values considered by the filter.
cond (dict) – The condition specification governing the filter. See the conditions argument of PointCloudSampler.filter_support_points() for further details.
mask (np.ndarray) – The boolean mask representing the composition of conditions that defines the full filter, at its current state.
- Returns:
The updated boolean filter mask.
- Return type:
np.ndarray
eval.regression_evaluation module
- class eval.regression_evaluation.RegressionEvaluation(**kwargs)
Bases: Evaluation
- Author:
Alberto M. Esmoris Pena
Class representing the result of evaluating a regression. See RegressionEvaluator.
- Variables:
X – See RegressionEvaluator.
header – See RegressionEvaluator.
cases – See RegressionEvaluator.
quantities – See RegressionEvaluator.
errors – See RegressionEvaluator.
metrics – See RegressionEvaluator.
outers – See RegressionEvaluator.
distribution – See RegressionEvaluator.
- __init__(**kwargs)
Initialize/instantiate a RegressionEvaluation.
- Parameters:
kwargs – The attributes for the RegressionEvaluation.
- report(**kwargs)
Transform the RegressionEvaluation into a RegressionReport.
See RegressionReport.
- Returns:
The RegressionReport representing the RegressionEvaluation.
- Return type:
- can_report(**kwargs)
See Evaluation and evaluation.Evaluation.can_report().
- plot(**kwargs)
Transform the RegressionEvaluation into a RegressionPlot.
See RegressionPlot.
- Parameters:
kwargs – The key-word arguments for the plot.
- Returns:
The RegressionPlot representing the RegressionEvaluation.
- Return type:
- can_plot(**kwargs)
See Evaluation and evaluation.Evaluation.can_plot().
eval.regression_evaluator module
- class eval.regression_evaluator.RegressionEvaluator(**kwargs)
Bases: Evaluator
- Author:
Alberto M. Esmoris Pena
Class to evaluate regression-like predictions against expected/reference values.
Internally, the regression evaluator works with key-value maps governing the information. They are described below:
- quantities – The keys are feature names (from the original point cloud), the values are vectors with the point-wise features in the components.
- errors – The keys are the case names (renamed when cases_renames has been specified), the values are dictionaries such that:
  - fname_ref – The name of the reference feature in the point cloud.
  - fname_pred – The name of the predicted feature in the point cloud.
  - error – A dictionary whose keys are the error names ("e", "se", "ae") and whose values are vectors of point-wise errors.
- metrics – The keys are the case names (renamed when cases_renames has been specified), the values are dictionaries whose keys are metric names and whose values are metric values.
- outers – The keys are the case names (renamed when cases_renames has been specified), the values are dictionaries whose keys are the names of the outer features and whose values are subdictionaries. These subdictionaries have as keys the error names ("e", "se", "ae") and whose values are correlation dictionaries. Each correlation dictionary has keys corresponding to correlation measurements ("pearson", "spearman") and whose values are tuples such that (name, correlation value, p-value).
- distribution – The keys are the case names (renamed when cases_renames has been specified), the values are subdictionaries. Each subdictionary has the estimator name as a key (including a special case "ref" that is not associated to the estimation/predicted value but to the reference value), the values are the vectors with the quantiles (typically percentiles).
- Variables:
metrics (list of str) – The name of the metrics for overall evaluation.
cases (list of list of str) – A list whose elements are lists of at least two elements. The first element of each inner list is a string that gives the name of the reference attribute and the successive one give the name of the prediction attributes.
cases_renames (list of list of str) – A list whose elements are lists of strings for the elements of the inner lists in cases (ignoring the first one). They will be used to rename the metrics on a case-wise basis.
outer_correlations (dict of dict of dict) –
A dictionary whose keys are the names of the cases (considering cases_renames if given) and whose values are other dictionaries. The child dictionaries have as keys the name of a feature in the point cloud that is called the outer feature. The value of each child dictionary is another dictionary of three entries:
    {
        "metrics": ["E", "SE", "AE"],
        "correlations": ["pearson", "spearman"],
        "frenames": [
            "case_feat_E_r", "case_feat_E_rho",
            "case_feat_SE_r", "case_feat_SE_rho",
            "case_feat_AE_r", "case_feat_AE_rho"
        ]
    }
outlier_filter (str or None) – The outlier filter to be applied, if any. It can be either "stdev" (to filter values that lie more than k times the standard deviation away from the mean), "iqr" (to filter values that are outside [Q1-kIQR, Q3+kIQR], where k is a given value and IQR means interquartile range), "topk", "botk", "extremek" (to filter the k best cases, the k worst cases, or both simultaneously), "topp", "botp", "extremep" (to filter the p percentage of best cases, worst cases, or both simultaneously).
outlier_param (float or int) – The parameter, for those outlier filters that are based on a parameter (e.g., "stdev", "iqr", "topk", "botp", etc.).
regression_report_path (str) – The path to write the report with the regression evaluation metrics.
outer_report_path (str) – The path to write the report with the correlation of the regression metrics and given features.
distribution_report_path (str) – The path to write the report describing the distribution of each attribute (includes references and predictions).
regression_pcloud_path (str) – The path to write the point cloud with the point-wise regression metrics.
regression_plot_path (str) – The path where the plot with the regression evaluation will be written. It represents the error on the y-axis and the predictions on the x-axis.
regression_hist2d_path (str) – The path where the 2D histogram with the regression evaluation will be written. It represents the error on the y-axis and the predictions on the x-axis.
residual_plot_path (str) – Like the regression_plot_path but the x-axis represents the references instead of the predictions.
residual_hist2d_path (str) – Like the regression_hist2d_path but the x-axis represents the references instead of the predictions.
scatter_plot_path (str) – The path where the plot with the references on the x-axis and the predictions on the y-axis will be written.
scatter_hist2d_path (str) – The path where the 2D histogram with the references on the x-axis and the predictions on the y-axis will be written.
qq_plot_path (str) – The path where the plot with the quantiles of the references on the x-axis and the quantiles of the predictions on the y-axis will be written.
summary_plot_path (str) – The path where the plot summarizing the mean and standard deviation for each error measurement for each case will be written.
nthreads (int) – How many threads must be used to compute the metrics in parallel (i.e., one thread per metric at most). Note that -1 means using as many threads as available cores.
metric_funs (list of callable) – The functions (callables) that compute the metrics. They are derived from the requested metrics (names).
- static extract_eval_args(spec)
Extract the arguments to initialize/instantiate a RegressionEvaluator from a key-word specification.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a RegressionEvaluator.
- __init__(**kwargs)
Initialize/instantiate a RegressionEvaluator.
- Parameters:
kwargs – The attributes for the RegressionEvaluator.
- eval(pcloud)
Evaluate the predicted values (\(\hat{y}\)) against expected/reference values (\(y\)).
- Parameters:
pcloud (PointCloud) – The point cloud to be evaluated.
- Returns:
The evaluation of the regression/s.
- Return type:
- __call__(pcloud, **kwargs)
Evaluate with extra logic that is convenient for pipeline-based execution.
See Evaluator.eval().
- Parameters:
pcloud – The point cloud that must be evaluated with respect to the specified regressions and references.
- generate_metric_functions()
Generate a list of callables that receive as input the reference values and the predictions to yield an aggregated error or correlation measurement. Additionally, the functions receive a third argument known as the calculus space (cs) where intermediate values can be cached to avoid redundant computations.
- Returns:
List of callables to compute error and correlation measurements from an input vector of reference values and another input vector with the predicted values. Besides, the third argument provides a cache known as calculus space (cs) to avoid redundant computations.
- Return type:
list of callable
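The idea of a shared calculus space can be illustrated with the minimal sketch below, where the squared error is computed once and reused by two metric functions. Representing cs as a plain dict, the cache keys, and the function names are assumptions for this example; they do not describe the actual internals of RegressionEvaluator.

```python
import numpy as np


def se_cached(y, yhat, cs):
    """Sketch: squared error, computed once and stored in cs."""
    if "se" not in cs:
        cs["se"] = (y - yhat) ** 2
    return cs["se"]


def mse_sketch(y, yhat, cs):
    """Sketch: mean squared error, reusing the cached squared error."""
    if "mse" not in cs:
        cs["mse"] = float(np.mean(se_cached(y, yhat, cs)))
    return cs["mse"]


def rmse_sketch(y, yhat, cs):
    """Sketch: root mean squared error, reusing the cached MSE."""
    return float(np.sqrt(mse_sketch(y, yhat, cs)))


y = np.array([1.0, 2.0, 3.0, 4.0])     # References
yhat = np.array([1.0, 2.0, 5.0, 4.0])  # Predictions
cs = {}  # Assumption: the calculus space is a dict shared across metrics
metric_funs = [mse_sketch, rmse_sketch]
values = [f(y, yhat, cs) for f in metric_funs]
```

Because every function has the same (y, yhat, cs) signature, the list of callables can be evaluated uniformly regardless of which metrics were requested.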
- determine_error_measurements()
Determine what error measurements must be computed depending on the requested metrics and outer correlations.
- Returns:
The value for the member attribute self.error_measurements, which is a list with the names of the necessary error measurements.
- Return type:
list
- extract_quantities(pcloud)
Extract the quantities from the point cloud.
- Parameters:
pcloud (PointCloud) – The point cloud from where the quantities must be extracted.
- Returns:
A dictionary with the quantities. The keys are the names of the quantitative features, the values are the vectors where each component yields the value of the feature for a point.
- Return type:
dict
- quantify_error(quantities)
Quantify the error, squared error or absolute error as necessary.
- Returns:
The dictionary with the errors. The keys correspond to the names of the regression values with a subdictionary for each. The keys for each subdictionary are "e" for the error (raw difference between expected and predicted), "se" for the squared error, and "ae" for the absolute error. Any error can be given as a np.ndarray with the error values or None if the error measurement is not necessary.
- Return type:
dict
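The three error measurements can be sketched as follows. The function name, the needed argument, and the sign convention (reference minus prediction) are assumptions for this example; the documentation only states that "e" is the raw difference between expected and predicted values.

```python
import numpy as np


def quantify_error_sketch(y, yhat, needed=("e", "se", "ae")):
    """Sketch: point-wise errors, with None for unnecessary measurements."""
    e = y - yhat  # Assumption: reference minus prediction
    return {
        "e": e if "e" in needed else None,
        "se": e ** 2 if "se" in needed else None,
        "ae": np.abs(e) if "ae" in needed else None,
    }


y = np.array([1.0, 2.0, 3.0])     # References
yhat = np.array([1.5, 2.0, 2.0])  # Predictions
errors = quantify_error_sketch(y, yhat, needed=("e", "ae"))
```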
- compute_metrics(quantities, errors)
Compute the aggregated error and correlation metrics.
- Returns:
The dictionary with metrics. The keys correspond to the name of each case and the values are subdictionaries. Each subdictionary has as key the name of the metric and as value the metric itself.
- Return type:
dict
- compute_outer_correlations(quantities, errors)
Compute correlation metrics with respect to a third variable that is not necessarily involved in the error computation. For example, check how the norm of the estimated gradient correlates with the error in the estimated Gaussian curvature.
- Returns:
The dictionary with metrics. The keys correspond to the name of each case and the values are subdictionaries. Each subdictionary has as keys the names of the outer feature whose correlation must be computed. The values of the subdictionary are dictionaries whose keys give the name for the error type and whose values are pairs (correlation, p-value) with the correlation value and the associated p-value.
- Return type:
dict
- compute_qq_distribution(quantities)
Compute the QQ distribution considering the percentiles for each requested case.
- Returns:
The dictionary with the QQ distributions. The keys correspond to the name of the case’s reference. The values are a subdictionary with a special key "ref" that contains the percentiles of the reference values. The other keys in the subdictionary are the names of the features whose percentiles were computed.
- Return type:
dict
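A percentile-based QQ distribution for a single case can be sketched as shown below. The function name qq_sketch, the choice of percentiles 1 to 99, and the example feature name "model" are assumptions for illustration only.

```python
import numpy as np


def qq_sketch(y, preds, q=None):
    """Sketch: percentiles of the references and of each prediction."""
    if q is None:
        q = np.arange(1, 100)  # Assumption: percentiles 1, 2, ..., 99
    dist = {"ref": np.percentile(y, q)}  # Special key for the references
    for name, yhat in preds.items():
        dist[name] = np.percentile(yhat, q)
    return dist


y = np.arange(101.0)        # References: 0, 1, ..., 100
preds = {"model": y + 1.0}  # Hypothetical prediction with a constant bias
dist = qq_sketch(y, preds)
```

Plotting dist["ref"] against dist["model"] gives the QQ plot; points above the identity line indicate that the predictions are shifted towards greater values.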
- static me_fun(y, yhat, cs)
Compute the Mean Error
- static mse_fun(y, yhat, cs)
Compute the Mean Squared Error
- static rmse_fun(y, yhat, cs)
Compute the Root Mean Squared Error
- static mae_fun(y, yhat, cs)
Compute the Mean Absolute Error
- static maxe_fun(y, yhat, cs)
Compute the Max Error
- static maxse_fun(y, yhat, cs)
Compute the Max Squared Error
- static rmaxse_fun(y, yhat, cs)
Compute the Root Max Squared Error
- static maxae_fun(y, yhat, cs)
Compute the Max Absolute Error
- static mine_fun(y, yhat, cs)
Compute the Min Error
- static minse_fun(y, yhat, cs)
Compute the Min Squared Error
- static rminse_fun(y, yhat, cs)
Compute the Root Min Squared Error
- static minae_fun(y, yhat, cs)
Compute the Min Absolute Error
- static mee_fun(y, yhat, cs)
Compute the Median Error
- static mese_fun(y, yhat, cs)
Compute the Median Squared Error
- static rmese_fun(y, yhat, cs)
Compute the Root Median Squared Error
- static meae_fun(y, yhat, cs)
Compute the Median Absolute Error
- static deve_fun(y, yhat, cs)
Compute the Standard Deviation of the Error
- static devse_fun(y, yhat, cs)
Compute the Standard Deviation of the Squared Error
- static rdevse_fun(y, yhat, cs)
Compute the Root of the Standard Deviation of the Squared Error
- static devae_fun(y, yhat, cs)
Compute the Standard Deviation of the Absolute Error
- static rangee_fun(y, yhat, cs)
Compute the Range of the Error
- static rangese_fun(y, yhat, cs)
Compute the Range of the Squared Error
- static rrangese_fun(y, yhat, cs)
Compute the Root of the Range of the Squared Error
- static rangeae_fun(y, yhat, cs)
Compute the Range of the Absolute Error
- static q1e_fun(y, yhat, cs)
Compute the First Quartile of Error
- static q1se_fun(y, yhat, cs)
Compute the First Quartile of Squared Error
- static rq1se_fun(y, yhat, cs)
Compute the Root of the First Quartile of Squared Error
- static q1ae_fun(y, yhat, cs)
Compute the First Quartile of Absolute Error
- static q3e_fun(y, yhat, cs)
Compute the Third Quartile of Error
- static q3se_fun(y, yhat, cs)
Compute the Third Quartile of Squared Error
- static rq3se_fun(y, yhat, cs)
Compute the Root of the Third Quartile of Squared Error
- static q3ae_fun(y, yhat, cs)
Compute the Third Quartile of Absolute Error
- static ske_fun(y, yhat, cs)
Compute the Skewness of Error
- static skse_fun(y, yhat, cs)
Compute the Skewness of Squared Error
- static skae_fun(y, yhat, cs)
Compute the Skewness of Absolute Error
- static kue_fun(y, yhat, cs)
Compute the Kurtosis of Error
- static kuse_fun(y, yhat, cs)
Compute the Kurtosis of Squared Error
- static kuae_fun(y, yhat, cs)
Compute the Kurtosis of Absolute Error
- static pearson_fun(y, yhat, cs)
Compute the Pearson correlation coefficient.
- Parameters:
y – The references.
yhat – The predictions.
cs – The computing space.
- Returns:
The Pearson correlation coefficient between the references and the predictions.
- static spearman_fun(y, yhat, cs)
Compute the Spearman correlation coefficient.
- Parameters:
y – The references.
yhat – The predictions.
cs – The computing space.
- Returns:
The Spearman correlation coefficient between the references and the predictions.
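The difference between both coefficients can be seen in the minimal NumPy sketch below. It does not compute p-values, and its rank computation does not average ties, so it is only an approximation of a full Spearman implementation; the function names are hypothetical.

```python
import numpy as np


def pearson_sketch(y, yhat):
    """Sketch: Pearson correlation coefficient (no p-value)."""
    return float(np.corrcoef(y, yhat)[0, 1])


def spearman_sketch(y, yhat):
    """Sketch: Pearson correlation of the ranks (ties are NOT averaged)."""
    def rank(v):
        return np.argsort(np.argsort(v))  # Position of each value when sorted
    return pearson_sketch(rank(y), rank(yhat))


y = np.array([1.0, 2.0, 3.0, 4.0])
yhat = y ** 3  # Monotonic but non-linear relationship
r = pearson_sketch(y, yhat)
rho = spearman_sketch(y, yhat)
```

For a monotonic but non-linear relationship the rank-based coefficient reaches one while the Pearson coefficient stays below it.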
- filter_outliers(e, se, ae)
Filter the outlier values from the given errors.
- Parameters:
e (np.ndarray or None) – The error (raw differences).
se (np.ndarray or None) – The squared error.
ae (np.ndarray or None) – The absolute error.
- Returns:
A tuple with the three error vectors without the outliers.
- Return type:
tuple
- stdev_outlier_filter(e, se, ae)
Apply a filter based on removing those values lying outside the interval centered at the mean and governed by k times the standard deviation.
- iqr_outlier_filter(e, se, ae)
Apply a filter based on removing those values lying outside the interval governed by the first and third quartiles expanded k times the interquartile range.
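The standard deviation and interquartile range filters can be sketched as boolean masks over a single error vector. The function names, the inclusive bounds, and the use of the population standard deviation are assumptions for this example; k stands for the filter parameter.

```python
import numpy as np


def stdev_mask(v, k):
    """Sketch: keep values within k standard deviations of the mean."""
    mu, sigma = np.mean(v), np.std(v)
    return np.abs(v - mu) <= k * sigma


def iqr_mask(v, k):
    """Sketch: keep values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(v, [25, 75])
    iqr = q3 - q1
    return (v >= q1 - k * iqr) & (v <= q3 + k * iqr)


e = np.array([-0.2, 0.1, 0.0, 0.3, -0.1, 9.0])  # One clear outlier
keep_iqr = iqr_mask(e, k=1.5)
keep_std = stdev_mask(e, k=1.0)
e_filtered = e[keep_iqr]
```

Working with a mask makes it possible to apply the same selection to several error vectors so that their components stay aligned.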
- topk_outlier_filter(e, se, ae)
Apply a filter based on removing the k highest errors.
- botk_outlier_filter(e, se, ae)
Apply a filter based on removing the k lowest errors.
- extremek_outlier_filter(e, se, ae)
Apply a filter based on removing the k lowest and highest errors.
- topp_outlier_filter(e, se, ae)
Apply a filter based on removing the p percentage of highest errors.
- botp_outlier_filter(e, se, ae)
Apply a filter based on removing the p percentage of lowest errors.
- extremep_outlier_filter(e, se, ae)
Apply a filter based on removing the p percentage of highest and lowest errors.
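The rank-based filters can be sketched with sorting indices, as shown below for a single error vector. The function names are hypothetical, and interpreting p as a percentage in [0, 100] that is truncated to a whole number of points is an assumption.

```python
import numpy as np


def topk_mask(v, k):
    """Sketch: mask that drops the k highest values of v."""
    mask = np.ones(len(v), dtype=bool)
    if k > 0:
        mask[np.argsort(v)[-k:]] = False
    return mask


def botk_mask(v, k):
    """Sketch: mask that drops the k lowest values of v."""
    mask = np.ones(len(v), dtype=bool)
    mask[np.argsort(v)[:k]] = False
    return mask


def topp_mask(v, p):
    """Sketch: drop the p percent highest values (p assumed in [0, 100])."""
    return topk_mask(v, int(len(v) * p / 100))


ae = np.array([0.1, 0.5, 0.2, 3.0, 0.05])
without_top1 = ae[topk_mask(ae, 1)]
without_extreme1 = ae[topk_mask(ae, 1) & botk_mask(ae, 1)]  # "extremek"
without_top40p = ae[topp_mask(ae, 40)]
```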
- eval_args_from_state(state)
Obtain the arguments to call the RegressionEvaluator from the current pipeline’s state.
- Parameters:
state (SimplePipelineState) – The pipeline’s state.
- Returns:
The dictionary of arguments for calling RegressionEvaluator.
- Return type:
dict
- has_report_paths()
Check whether at least one report path has been specified.
- Returns:
True if at least one report path has been specified, False otherwise.
- Return type:
bool
- has_plot_paths()
Check whether at least one plot path has been specified.
- Returns:
True if at least one plot path has been specified, False otherwise.
- Return type:
bool
Module contents
- author:
Alberto M. Esmoris Pena
The eval package contains the logic to evaluate the data and the models.