src.model.random_forestpp_classification_model
Classes
- class src.model.random_forestpp_classification_model.RandomForestPPClassificationModel(**kwargs)
- Author:
Alberto M. Esmoris Pena
Random Forest model for classification using the C++ VL3D++ backend. Uses the optimized C++ RandomForest implementation with pre-sorted indices, inline Gini, incremental entropy, global presort, and OpenMP-parallel tree training.
This model is a drop-in replacement for RandomForestClassificationModel. Pipeline specifications using "train": "RandomForestClassifier" can switch to the C++ backend by changing to "train": "RandomForestPPClassifier". The model_args keys are compatible with sklearn naming.
See ClassificationModel and Model.
- Variables:
model_args (dict) – The arguments for the C++ Random Forest.
model (capsule) – The C++ RandomForest object (held via py::capsule).
importance_report_path (str) – Path to the file to store the report.
decision_plot_path (str) – Path to the file to store decision tree plots.
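The backend switch described in the class summary can be sketched as a small change to a pipeline specification. Only the two "train" values come from this page. The surrounding layout, the placement of model_args, the chosen keys (n_estimators, criterion), and the helper switch_to_cpp_backend are hypothetical and serve only as illustration.

```python
import copy

# Hypothetical pipeline component. Only the "train" values are documented;
# the model_args keys below are illustrative sklearn-style names.
sklearn_spec = {
    "train": "RandomForestClassifier",
    "model_args": {"n_estimators": 100, "criterion": "entropy"},
}


def switch_to_cpp_backend(spec):
    """Return a copy of spec that requests the C++ backend (hypothetical helper)."""
    new_spec = copy.deepcopy(spec)
    if new_spec.get("train") == "RandomForestClassifier":
        new_spec["train"] = "RandomForestPPClassifier"
    return new_spec


cpp_spec = switch_to_cpp_backend(sklearn_spec)
print(cpp_spec["train"])
```

The model_args dictionary is carried over unchanged, which is the point of the sklearn-compatible naming. Whether every sklearn argument is honored by the C++ backend is not stated here and should be checked separately.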
- CRITERION_MAP = {'entropy': 0, 'gini': 1, 'hellinger': 3, 'log_loss': 2}
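A minimal sketch of how such a map might be used to translate a criterion name into the integer code passed to the backend. The dictionary values are copied from the attribute above. The helper criterion_code and its error message are assumptions, not part of the documented API.

```python
# Values copied from the documented class attribute.
CRITERION_MAP = {'entropy': 0, 'gini': 1, 'hellinger': 3, 'log_loss': 2}


def criterion_code(name):
    """Translate a criterion name into its integer code (hypothetical helper)."""
    try:
        return CRITERION_MAP[name]
    except KeyError:
        valid = ", ".join(sorted(CRITERION_MAP))
        raise ValueError(
            f"Unknown criterion {name!r}; expected one of: {valid}"
        ) from None


print(criterion_code("gini"))
```

Note that the codes do not follow the alphabetical order in which the dictionary is printed: 'log_loss' maps to 2 and 'hellinger' to 3.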
- static extract_model_args(spec)
Extract the arguments to initialize/instantiate a RandomForestPPClassificationModel from a key-word specification.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a RandomForestPPClassificationModel
- __init__(**kwargs)
Initialize an instance of RandomForestPPClassificationModel.
- Parameters:
kwargs – The attributes for the RandomForestPPClassificationModel that will also be passed to the parent.
- prepare_model()
Prepare the C++ Random Forest from model_args.
- Returns:
The prepared model capsule. Note it is also assigned as the model attribute of the object/instance.
- training(X, y, info=True)
The fundamental training logic to train the C++ random forest classifier.
See ClassificationModel and Model. Also see model.Model.training().
- on_training_finished(X, y, yhat=None)
See
model.Model.on_training_finished().
- get_feature_importances()
Compute MDI (Mean Decrease in Impurity) feature importances.
- Returns:
Array of shape (n_features,) summing to 1.0.
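Because the returned importances sum to 1.0, each value can be read as the fraction of the total impurity decrease attributed to that feature. The sketch below ranks a plain list of such values. The feature names, the numbers, and the helper rank_importances are made up for illustration and do not come from the model.

```python
import math


def rank_importances(importances, fnames):
    """Pair importances with feature names, sorted from most to least important."""
    if len(importances) != len(fnames):
        raise ValueError("importances and fnames must have the same length")
    # Normalized MDI importances are expected to sum to 1.0.
    if not math.isclose(sum(importances), 1.0, abs_tol=1e-6):
        raise ValueError("importances are expected to sum to 1.0")
    return sorted(zip(fnames, importances), key=lambda pair: pair[1], reverse=True)


# Illustrative values only; in practice they would come from
# get_feature_importances() on a trained model.
example_importances = [0.5, 0.125, 0.375]
example_fnames = ["linearity", "planarity", "verticality"]

ranking = rank_importances(example_importances, example_fnames)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```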
- get_permutation_importances(X, y, n_repeats=5)
Compute permutation feature importance.
- Parameters:
X – Feature matrix (M x F).
y – True labels (M).
n_repeats – Number of shuffle repeats per feature.
- Returns:
Array of shape (n_features, 2): col 0 = mean, col 1 = std.
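The general idea of permutation importance can be sketched in pure Python: shuffle one feature column at a time, measure how much the score drops, and report the mean and standard deviation over n_repeats shuffles. This is a generic illustration, not the C++ implementation. The use of accuracy as the score, the population standard deviation, the seed argument, and the predict callable are all assumptions.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importances(predict, X, y, n_repeats=5, seed=0):
    """Return one [mean, std] pair of score drops per feature.

    predict: callable mapping a list of rows to a list of labels (assumed).
    X: list of M rows, each a list of F feature values.
    y: list of M true labels.
    """
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    n_features = len(X[0])
    result = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        result.append([statistics.mean(drops), statistics.pstdev(drops)])
    return result


# Toy classifier that only looks at feature 0, so feature 1 cannot matter.
def toy_predict(rows):
    return [1 if row[0] > 0.5 else 0 for row in rows]


X_toy = [[0.1, 5.0], [0.9, 3.0], [0.2, 7.0], [0.8, 1.0]]
y_toy = [0, 1, 0, 1]
importances = permutation_importances(toy_predict, X_toy, y_toy, n_repeats=5)
```

In the toy example, shuffling feature 1 never changes a prediction, so its mean and standard deviation are both zero, while feature 0 can only show a non-negative drop.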
- save_model(path)
Save the trained model to a binary file.
- Parameters:
path – Output file path.
- load_model(path)
Load a trained model from a binary file.
- Parameters:
path – Input file path.