Configuration

The VirtuaLearn3D (VL3D) framework uses YAML configuration files to define default values and internal parameters for its subsystems. These files are located in the config/ directory and are loaded at startup into the global dictionary VL3DCFG (see src.main.main_config). Framework components look up their defaults in this dictionary, so editing a configuration file changes the behavior of every pipeline that relies on those defaults.

The configuration files are not pipeline specifications. Pipelines are defined in JSON files (see Pipelines). The YAML configuration files provide the default values that pipeline components fall back to when a parameter is not explicitly set in the JSON specification.

How configuration is loaded

At startup, main_config_init reads the YAML files listed below from config/ and stores the parsed contents of each file under its own key of the VL3DCFG dictionary:

File	Key	Purpose
`config/io.yml`	`IO`	Input/output defaults (LAS version, memory proxy thresholds).
`config/eval.yml`	`EVAL`	Evaluator defaults (raster grid, classification, uncertainty).
`config/report.yml`	`REPORT`	Report defaults (receptive field report toggles).
`config/mining.yml`	`MINING`	Data mining defaults (geometric, height, smooth features).
`config/model.yml`	`MODEL`	Model defaults (RF, deep learning, architectures, receptive fields).
`config/test.yml`	`TEST`	Test suite selection (which tests to run).

Components access their defaults through a two-level lookup. For example, VL3DCFG['MODEL']['TransfOctoRFClassificationModel'] returns the default dictionary for the TransfOctoRF model, while VL3DCFG['IO']['PointCloudArraysFactory'] returns the LAS output defaults. When a JSON pipeline specification provides an explicit value for a parameter, it takes precedence over the YAML default.
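The lookup and the precedence rule can be sketched as follows. This is an illustration only: VL3DCFG is replaced by a hand-written stand-in dictionary, and resolve_params is a hypothetical helper, not a function of the framework.

```python
# Stand-in for the global dictionary; the real one is filled from config/*.yml.
VL3DCFG = {
    "IO": {
        "PointCloudArraysFactory": {"las_version": "1.4", "las_point_format": 6},
    },
}


def resolve_params(section, component, spec):
    """Merge YAML defaults with explicit values from a JSON specification.

    Hypothetical helper: explicit values in `spec` win over the defaults.
    """
    defaults = VL3DCFG[section][component]  # two-level lookup
    return {**defaults, **spec}             # later keys override earlier ones


# The specification only sets the point format; the version falls back
# to the YAML default.
params = resolve_params("IO", "PointCloudArraysFactory", {"las_point_format": 7})
print(params)
```

The merge builds a new dictionary, so the defaults stored in the stand-in are left untouched for the next component that reads them.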

In addition, config/logging.yml is loaded separately by src.main.main_logger to configure Python’s logging system and is not part of VL3DCFG.

io.yml – Input/Output

Controls how point clouds are read and written, and when in-memory data is offloaded to disk.

MemToFileProxy

Configures the automatic memory-to-file proxy that dumps point cloud data to temporary files when system memory usage exceeds a threshold.

– mem_check_threshold: Fraction of total system memory in use (0 to 1) above which the proxy is triggered. Default: 0.34.
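As a rough illustration of what the threshold means, the check below compares the used fraction of memory against mem_check_threshold. The function name, its arguments, and the strict comparison are assumptions made for this sketch; the proxy's actual check is not shown in this document.

```python
def exceeds_mem_threshold(used_bytes, total_bytes, mem_check_threshold=0.34):
    """Return True when the used fraction of memory is above the threshold.

    Hypothetical helper; whether the real check is strict (>) is an assumption.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes > mem_check_threshold


GIB = 1024 ** 3
low_usage = exceeds_mem_threshold(8 * GIB, 32 * GIB)    # 0.25 of memory in use
high_usage = exceeds_mem_threshold(16 * GIB, 32 * GIB)  # 0.50 of memory in use
print(low_usage, high_usage)
```

With the default of 0.34, a machine with 32 GiB of memory would cross the threshold at roughly 10.9 GiB in use.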

PointCloudArraysFactory

Controls the LAS output format when writing point clouds from arrays.

– las_version: LAS file version string. Default: "1.4".
– las_point_format: LAS point record format ID. Default: 6.

eval.yml – Evaluators

Provides default parameters for the evaluation components.

RasterGridEvaluator

Defaults for the raster grid evaluator that produces GeoTIFF outputs from point clouds.

– grid_iter_step: Number of grid cells processed per iteration. Default: 1024.
– reverse_rows: Whether to flip rows so the output matches geographic conventions. Default: true.
– radius_expr: Expression for the interpolation radius as a function of cell size \(l\). Default: "sqrt(2)*l/2".
– nthreads: Number of parallel threads. Default: 1.
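The default radius_expr, sqrt(2)*l/2, is half the diagonal of a square cell of side l, so a disk of that radius centered on a cell reaches the cell's corners. The sketch below checks this numerically. It hard-codes the default expression in a plain function; how the evaluator parses the expression string is not covered here.

```python
import math


def default_radius(cell_size):
    """Evaluate the default radius_expr, sqrt(2)*l/2, for cell size l."""
    return math.sqrt(2) * cell_size / 2


l = 0.5  # cell size, in the units of the point cloud
r = default_radius(l)
# Distance from the center of a square cell to any of its corners.
corner_distance = math.hypot(l / 2, l / 2)
print(r, corner_distance)
```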

RandForestEvaluator

Defaults for Random Forest evaluation utilities.

– num_decision_trees: Number of trees to plot. 0 disables plotting. Default: 0.
– compute_permutation_importance: Whether to compute permutation-based feature importance. Default: true.
– max_tree_depth: Maximum depth for decision tree visualization. Default: 5.

KFoldEvaluator

Defaults for k-fold cross-validation.

– quantile_cuts: List of quantiles to report. Default: [0.25, 0.5, 0.75].
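To make the meaning of quantile_cuts concrete, the sketch below computes the default cuts over a set of per-fold scores. The scores are made up, and linear interpolation between order statistics is an assumption; the interpolation method used by the evaluator is not stated here.

```python
def quantile(values, q):
    """Quantile of `values` at fraction q in [0, 1], using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    data = sorted(values)
    pos = q * (len(data) - 1)        # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    frac = pos - lo
    return data[lo] * (1 - frac) + data[hi] * frac


quantile_cuts = [0.25, 0.5, 0.75]             # the default from eval.yml
fold_scores = [0.80, 0.82, 0.85, 0.88, 0.90]  # made-up scores for five folds
cuts = [quantile(fold_scores, q) for q in quantile_cuts]
print(cuts)
```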

ClassificationEvaluator

Defaults for classification evaluation, including confusion matrix formatting (colormap, font sizes, borders).

ClassificationUncertaintyEvaluator

Defaults for uncertainty analysis, including probability export, entropy, and optional clustering-based uncertainty decomposition.

report.yml – Reports

Controls which columns are included when exporting receptive field point clouds.

ReceptiveFieldsReport

– include_entropies: Include per-point entropy in exported receptive fields. Default: true.
– include_likelihoods: Include softmax probabilities. Default: true.
– include_predictions: Include predicted class labels. Default: true.
– include_references: Include ground-truth labels. Default: true.
– include_success: Include the binary success mask (prediction matches reference). Default: true.
– include_features: Include the NN input features. Default: true.

mining.yml – Data Mining

Default parameters for feature extraction from point clouds.

Top-level keys

– structure_space_bits: Precision for coordinate storage. Default: 32.
– feature_space_bits: Precision for feature storage. Default: 32.

GeomFeatsMiner

Defaults for the Python geometric features miner.

– radius: Neighborhood radius. Default: 0.3.
– fnames: Default feature names to compute. Default: ["linearity", "planarity", "sphericity"].
– nthreads: Number of threads (-1 for all cores). Default: -1.
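For orientation, the three default feature names are commonly defined from the eigenvalues of the covariance matrix of a point's neighborhood. The sketch below uses those common definitions; whether GeomFeatsMiner uses exactly these formulas is an assumption, and eigen_features is a hypothetical helper.

```python
def eigen_features(eigenvalues):
    """Common eigenvalue-based features for a 3D neighborhood.

    Uses the usual definitions with l1 >= l2 >= l3 (assumed, not taken
    from the framework's source).
    """
    l1, l2, l3 = sorted(eigenvalues, reverse=True)
    if l1 <= 0:
        raise ValueError("largest eigenvalue must be positive")
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
    }


# Eigenvalues of a made-up neighborhood covariance matrix, in any order.
feats = eigen_features([1.0, 4.0, 2.0])
print(feats)
```

Under these definitions the three values always sum to 1, which makes them easy to compare across neighborhoods of different scale.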

GeomFeatsMinerPP

Defaults for the C++ geometric features miner. Includes neighborhood specification (type, radius, k), IDW parameters, Tikhonov regularization, and second-order estimation strategy.

HeightFeatsMiner

Defaults for the height features miner. Includes chunk sizes, cylindrical neighborhood parameters, and default features (["floor_distance"]).

NaiveChangeMiner

Defaults for the naive change detection miner.

– min_distance: Minimum distance threshold for change detection. Default: 0.01.

model.yml – Models

The largest configuration file, covering machine learning models, deep learning architectures, and receptive fields. Parameters are organized by component class name.

All models

Top-level keys define precision for the numeric spaces:

– structure_space_bits: Coordinate precision (32 or 64). Default: 64.
– feature_space_bits: Feature precision (32 or 64). Default: 32.
– classification_space_bits: Label precision (8, 16, or 32). Default: 8.
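The precision keys trade memory for numeric range. As a back-of-the-envelope sketch, the function below estimates the size of the coordinate array alone, assuming three coordinates per point stored in a dense array with no overhead. The helper name and the point count are illustrative.

```python
def coords_bytes(num_points, structure_space_bits, dims=3):
    """Bytes needed for a dense (num_points x dims) coordinate array."""
    return num_points * dims * structure_space_bits // 8


n = 10_000_000  # illustrative point count
bytes_64 = coords_bytes(n, 64)
bytes_32 = coords_bytes(n, 32)
print(bytes_64, bytes_32)
```

For ten million points this gives 240 MB at 64 bits and 120 MB at 32 bits, before any features or labels are counted.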

Machine learning

RandomForestClassificationModel

– importance_report_permutation: Use permutation importance instead of impurity-based importance. Default: true.
– decision_plot_trees: Number of trees to plot. 0 disables plotting. Default: 0.
– decision_plot_max_depth: Maximum depth for plotted trees. Default: 5.

TransfOctoRFClassificationModel

– predict_batch_size: Number of centroids per outer prediction batch (CPU gather). Default: 8192.
– predict_inner_batch_size: Number of centroids per inner GPU batch (predict_on_batch calls). Default: 128.
– predict_chunked_knn_threshold: When the centroid count exceeds this threshold, KNN is computed per batch via a persistent C++ octree instead of materializing the full neighbors array. This prevents out-of-memory (OOM) errors on large point clouds. Default: 500000.
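How the three parameters interact can be sketched by counting batches. The code below does no prediction at all. It only shows how a centroid count splits into outer and inner batches, and when the chunked KNN path would be chosen. plan_prediction and batches are hypothetical helpers, and the real model may organize its loops differently.

```python
def batches(n, size):
    """Yield (start, stop) index pairs covering range(n) in chunks of `size`."""
    for start in range(0, n, size):
        yield start, min(start + size, n)


def plan_prediction(num_centroids, predict_batch_size=8192,
                    predict_inner_batch_size=128,
                    predict_chunked_knn_threshold=500000):
    """Count outer batches and inner GPU calls for a given centroid count."""
    outer_batches = 0
    inner_calls = 0
    for start, stop in batches(num_centroids, predict_batch_size):
        outer_batches += 1
        # Each outer batch is split again into inner GPU batches.
        inner_calls += sum(1 for _ in batches(stop - start, predict_inner_batch_size))
    return {
        "outer_batches": outer_batches,
        "inner_calls": inner_calls,
        "chunked_knn": num_centroids > predict_chunked_knn_threshold,
    }


plan = plan_prediction(20000)
print(plan)
```

With the defaults, 20000 centroids yield three outer batches (8192, 8192, and 3616 centroids) and stay below the chunked KNN threshold.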

Deep learning

Top-level deep learning keys:

– dl_write_keras: Save the full Keras model (.keras file) after training. Default: true.
– dl_write_weights: Save model weights alongside the Keras file. Default: true.
– dl_memsafe_training: Enable memory-safe training mode (reduces peak memory at the cost of speed). Default: false.

SimpleDLModelHandler

Default handler parameters shared across deep learning models:

– run_eagerly: Run Keras in eager mode (debugging only). Default: false.
– training_epochs: Default number of training epochs. Default: 100.
– batch_size: Default training batch size. Default: 16.
– checkpoint_monitor: Metric monitored by the checkpoint callback. Default: "loss".
– fit_verbose / predict_verbose: Verbosity for model.fit and model.predict. Default: "auto".
– skip_fit_on_zero_epochs: Skip training entirely when training_epochs is 0. Default: true.
– tensorboard_log_dir: Directory for TensorBoard logs. null disables TensorBoard logging.
– tensorboard_histogram_frequency: How often to log weight histograms (epochs). Default: 0.

Architecture

Default arguments for keras.utils.plot_model when exporting architecture diagrams (shapes, dtypes, layer names, DPI, etc.).

The file also includes defaults for specific architectures such as PointNet, PointNetPwiseClassif, RBFNet, RBFNetPwiseClassif, ConvAutoencPwiseClassif, and SpConv3DPwiseClassif. Each section specifies kernel initializers, feature dimensions, skip-link strategies, batch normalization momentum, and other architecture-specific parameters.

ReceptiveField

– structure_space_bits: Coordinate precision for receptive fields. Default: 32.
– max_classes_per_reduction: Maximum classes for label reduction in receptive fields. Default: 16.

ReceptiveFieldHierarchicalSG

– check_discretization_rounding_error: Validate discretization accuracy. Default: true.

test.yml – Tests

Controls which test suites are executed when running venv/bin/python vl3d.py --test. Tests are organized by category (data mining, clustering, input/output, deep learning, pipeline, C++). Each entry under a category names a test class. Commented entries (prefixed with #) are disabled; uncommented entries are active.

For example, to run only TransfOctoRFTest and the C++ tests VL3DPPBindingTest and VL3DPPBackendTest, leave just these entries uncommented:

ErrorTestSuites:
  Deep learning:
    - TransfOctoRFTest
  C++:
    - VL3DPPBindingTest
    - VL3DPPBackendTest

Test suites can also be placed under WarningTestSuites. They are still run, but their errors do not fail the overall test run.
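The difference between the two groups can be sketched with a small function that decides whether a run passes. The dictionary mirrors the YAML layout shown above; overall_result and the test name SomeIOTest are hypothetical, and the real test runner is not shown in this document.

```python
def overall_result(test_cfg, failed_tests):
    """Return True if the run passes.

    Only failures of tests listed under ErrorTestSuites fail the run;
    failures under WarningTestSuites are tolerated.
    """
    error_tests = {
        name
        for tests in (test_cfg.get("ErrorTestSuites") or {}).values()
        for name in (tests or [])
    }
    return not (error_tests & set(failed_tests))


test_cfg = {
    "ErrorTestSuites": {
        "Deep learning": ["TransfOctoRFTest"],
        "C++": ["VL3DPPBindingTest", "VL3DPPBackendTest"],
    },
    "WarningTestSuites": {
        "Input/output": ["SomeIOTest"],  # hypothetical test name
    },
}

warning_only = overall_result(test_cfg, ["SomeIOTest"])
error_failed = overall_result(test_cfg, ["TransfOctoRFTest", "SomeIOTest"])
print(warning_only, error_failed)
```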

logging.yml – Logging

Configures Python’s logging system using the standard logging.config dictionary schema. This file is loaded by src.main.main_logger, not through VL3DCFG.

The default setup defines:

– Formatter formatter_datetime: formats each record as [YYYY-MM-DD HH:MM:SS] (LEVEL): message.
– Handler handler_console: prints to stdout at DEBUG level.
– Handler handler_file: rotating file handler writing to virtualearn3d.log (8 MB per file, 32 backups).
– Logger logger_vl3d: the framework logger, attached to both handlers at DEBUG level.

To change the log level for console output (e.g., show only warnings), edit the level key under handler_console:

handlers:
  handler_console:
    level: WARNING
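The effect of this change can be tried in isolation with Python's logging.config.dictConfig. The dictionary below is a reduced reconstruction based on the description above, with only the console handler. The format string, the handler class, and the use of logger_vl3d as the logger name are assumptions, not a copy of config/logging.yml.

```python
import logging
import logging.config

LOGGING_CFG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "formatter_datetime": {
            # Assumed format matching "[YYYY-MM-DD HH:MM:SS] (LEVEL): message".
            "format": "[%(asctime)s] (%(levelname)s): %(message)s",
            "datefmt": "%Y-%m-%d %H:%M:%S",
        },
    },
    "handlers": {
        "handler_console": {
            "class": "logging.StreamHandler",
            "level": "WARNING",  # raised from DEBUG to hide routine messages
            "formatter": "formatter_datetime",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "logger_vl3d": {"level": "DEBUG", "handlers": ["handler_console"]},
    },
}

logging.config.dictConfig(LOGGING_CFG)
logger = logging.getLogger("logger_vl3d")
logger.info("filtered out by the console handler")
logger.warning("printed to stdout")
```

The logger itself stays at DEBUG, so a file handler attached to it would still receive every record. Only the console output is reduced.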

Summary

File	VL3DCFG key	Description
`config/io.yml`	`IO`	LAS output format, memory proxy threshold.
`config/eval.yml`	`EVAL`	Evaluator defaults (raster, RF, k-fold, classification, uncertainty).
`config/report.yml`	`REPORT`	Receptive field report column toggles.
`config/mining.yml`	`MINING`	Feature miner defaults (geometric, height, change detection).
`config/model.yml`	`MODEL`	Model, architecture, and receptive field defaults.
`config/test.yml`	`TEST`	Active/inactive test suites.
`config/logging.yml`	(separate)	Python logging configuration (console + rotating file).