src.tests.torf_training_tools_test

Classes

TORFTrainingToolsTest()

class src.tests.torf_training_tools_test.TORFTrainingToolsTest

Author:: Alberto M. Esmoris Pena

Test that the TORF NN supports the three standard VL3D training tools:

Transfer learning via the new TORFTransferHandler, using the same JSON specification as the standard DLTransferHandler. Two configurations are covered: full transfer (no translator) and partial transfer with default_to_null semantics that resets a specific layer.
Freezing layers during training driven by DLFreezeTrainingExecutor, with weight invariance checks on the frozen layer and a counter-check that an unfrozen layer’s weights did change.
Continue-training a previously serialized TORF model: a pickle round-trip preserves the NN weights, and a subsequent train_nn call reuses the loaded handler instead of re-initializing the network.

Also includes a schedule-preparation parity check that runs the same fixtures as FreezeTrainingTest against the new executor and confirms identical output, guaranteeing the refactor preserved the legacy behavior.

__init__()

Basic configuration for any VL3D test.

Parameters:: name (str) – Test name

run(): Run all subtests in sequence. Restores logging levels even on failure (matches the pattern used by TransfOctoRFTest).

check_executor_matches_legacy_schedule()

Replay one of the schedules from FreezeTrainingTest through DLFreezeTrainingExecutor.compute_schedule() and confirm the returned (op_epoch, op, lrs) matches an explicit, hand-derived expected value. Also confirms that the wrapper DLTrainingHandler.prepare_freeze_training() is wired so it delegates to the executor (same result).

Returns:: True if the schedule preparation is correct.
Return type:: bool

check_executor_trailing_tail()

Regression test for the schedule bug where a training_interval whose end-point is strictly less than training_epochs would cause the executor to silently skip the trailing unfrozen segment. Drives DLFreezeTrainingExecutor on a mock NN and confirms that the recorded fit_fn calls cover ALL training epochs.

Returns:: True if the trailing tail is trained.
Return type:: bool
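
The failure mode described above can be pictured with a small stand-alone sketch. The helper `split_into_segments` and its `interval_ends` argument are hypothetical names introduced only for illustration; they are not the executor's API. The sketch shows why a trailing segment has to be appended when the last interval ends before `training_epochs`.

```python
def split_into_segments(training_epochs, interval_ends):
    """Return (start, end) epoch ranges that cover [0, training_epochs).

    interval_ends is a sorted list of epoch indices at which a
    freeze/unfreeze operation takes place (hypothetical representation).
    """
    segments = []
    start = 0
    for end in interval_ends:
        end = min(end, training_epochs)
        if end > start:
            segments.append((start, end))
            start = end
    # Trailing tail: without this branch, the epochs after the last
    # interval end would silently never be trained.
    if start < training_epochs:
        segments.append((start, training_epochs))
    return segments


# Record how many epochs a fit function would be asked to train.
epochs_per_call = [end - start for start, end in split_into_segments(10, [3, 6])]
```

With these inputs the last interval ends at epoch 6, so the tail covering epochs 6 to 10 is the segment a buggy schedule would drop. An empty `interval_ends` list yields a single full-length segment.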

check_transfer_full()

Train a source TORF model (lazy, shared with check_transfer_default_to_null()), save its NN to a .keras file, and train a target TORF model that transfers every layer from the source via nn_transfer_weights with default_to_null=false and an empty translator (so the layer-name match is identity).

Confirms a representative dense layer’s weights match the source after a zero-epoch training pass (the transfer happens inside the target’s _fit flow before any optimizer step).

Returns:: True if the transferred weights match the source.
Return type:: bool

check_transfer_default_to_null()

Same as check_transfer_full() but the translator maps the out layer to None so that target layer is not transferred. Confirms the out layer’s weights differ from the source while an arbitrary other transferred layer matches.

Unlike check_transfer_full(), this subtest bypasses the TORF preprocessing pipeline (RF training, KNN build, HDF5 IO) and invokes the transfer mechanism directly on a freshly built target architecture. The complementary subtest check_transfer_full() still exercises the full _fit wiring end-to-end, so the layer-translation behavior here is tested at unit level without losing wiring coverage.

Returns:: True if the partial transfer behaves as expected.
Return type:: bool
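
The difference between the two transfer subtests can be sketched with plain dictionaries. Everything below is an assumption made for illustration: the function `transfer_weights`, the dictionary-based models, and in particular the reading of `default_to_null` (target layers missing from the translator are skipped when it is true, and matched by identical name when it is false). It is not the TORFTransferHandler implementation.

```python
def transfer_weights(source, target, translator=None, default_to_null=False):
    """Copy weights from source to target by layer name.

    source, target: dicts mapping layer name -> list of weights.
    translator: dict mapping target layer name -> source layer name,
        or None to leave that target layer untouched.
    default_to_null: assumed semantics; when True, target layers absent
        from the translator are skipped, otherwise they are matched to
        the source layer with the same name.
    """
    translator = translator or {}
    transferred = []
    for name in target:
        if name in translator:
            src_name = translator[name]
        else:
            src_name = None if default_to_null else name
        if src_name is None or src_name not in source:
            continue  # the layer keeps its own initialization
        target[name] = list(source[src_name])
        transferred.append(name)
    return transferred


source = {"dense_1": [1.0, 2.0], "out": [3.0]}

# Full transfer: empty translator, identity name matching.
full_target = {"dense_1": [0.0, 0.0], "out": [0.0]}
full_done = transfer_weights(source, full_target)

# Partial transfer: the "out" layer is mapped to None and keeps its init.
partial_target = {"dense_1": [0.0, 0.0], "out": [0.0]}
partial_done = transfer_weights(source, partial_target, translator={"out": None})
```

In the partial case `dense_1` ends up equal to the source while `out` differs from it, which mirrors the two assertions the subtest makes.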

check_freeze_during_training()

Drive DLFreezeTrainingExecutor with a mock NN to confirm that:

The requested layer’s trainable flag is set to False at the start of the segment whose op marks it frozen.
The flag remains False for the duration of that segment (i.e., the executor does not flip it back mid-segment).
At the end of the schedule the executor unfreezes every layer (trainable = True).

The contract that trainable = False actually freezes the gradient update is a Keras invariant that we trust and do not need to re-test here. Driving with a mock NN avoids ~3-4 s of Keras training per run and keeps this subtest sub-millisecond.

Returns:: True if the executor mutates trainability as expected.
Return type:: bool
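
A mock-driven check of this kind might look like the sketch below. The function `run_freeze_schedule`, the `(epochs, frozen_layer_names)` segment format, and the `SimpleNamespace` mock are hypothetical and only stand in for the real executor and NN; the point is that a recording fit function can snapshot the `trainable` flags seen by each segment.

```python
from types import SimpleNamespace


def run_freeze_schedule(nn, segments, fit_fn):
    """Run segments given as (epochs, frozen_layer_names) pairs."""
    for epochs, frozen in segments:
        for layer in nn.layers:
            layer.trainable = layer.name not in frozen
        fit_fn(nn, epochs)
    # Leave the network fully trainable once the schedule is done.
    for layer in nn.layers:
        layer.trainable = True


nn = SimpleNamespace(layers=[
    SimpleNamespace(name="hidden", trainable=True),
    SimpleNamespace(name="out", trainable=True),
])
trace = []


def recording_fit(model, epochs):
    # Snapshot the flags exactly as the fit call of this segment sees them.
    flags = {layer.name: layer.trainable for layer in model.layers}
    trace.append((epochs, flags))


run_freeze_schedule(nn, [(2, {"hidden"}), (3, set())], recording_fit)
```

No gradient step is ever taken here, which is what keeps such a check fast: only the flag bookkeeping is exercised.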

check_continue_training()

Pickle round-trip a trained TORF model and resume training: the loaded NN must be reused (not re-initialized) so the weights carry over across the boundary.

Skips the real 1-epoch fit (which alone costs ~500 ms to 1 s through the full preprocessing + sequencer pipeline) and replaces it with a deterministic set_weights() mutation that makes the snapshot distinguishable from the build-time random init. The contracts being tested (weights survive pickling, and the prepare_nn_handler reuse path preserves them) do not depend on the snapshot being the product of training.

Returns:: True if the continue-training path reuses the loaded NN.
Return type:: bool
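
The reuse contract can be illustrated without any Keras model. `TinyModel` and its `prepare_nn` method are hypothetical stand-ins, not the TORF classes. The sketch pickles the state dictionary rather than the instance, a choice made here only so the example does not depend on how the class is imported.

```python
import pickle


class TinyModel:
    """Hypothetical stand-in for a model that owns its NN weights."""

    def __init__(self):
        self.nn_weights = None
        self.init_calls = 0

    def prepare_nn(self):
        # Reuse path: build fresh weights only if none were loaded.
        if self.nn_weights is None:
            self.nn_weights = [0.0, 0.0, 0.0]
            self.init_calls += 1
        return self.nn_weights

    def __getstate__(self):
        return {"nn_weights": self.nn_weights}

    def __setstate__(self, state):
        self.nn_weights = state["nn_weights"]
        self.init_calls = 0


model = TinyModel()
model.prepare_nn()
# Deterministic mutation standing in for a real training epoch.
model.nn_weights = [0.5, -1.25, 2.0]

# Round-trip the state through pickle and rebuild the object from it.
state = pickle.loads(pickle.dumps(model.__getstate__()))
restored = TinyModel.__new__(TinyModel)
restored.__setstate__(state)

resumed_weights = restored.prepare_nn()
```

If `prepare_nn` re-initialized unconditionally, `resumed_weights` would fall back to the zero init and `init_calls` on the restored object would be 1 instead of 0.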

check_no_double_transfer_after_reload()

Regression test (minimal): a TORF model configured with nn_transfer_weights must, after a pickle/unpickle round-trip, carry a transfer handler marked as already executed, so a subsequent _fit does NOT re-apply the transfer over the restored trained weights. Verified two ways:

transfer_handler.transfer_count == 1 after reload (the direct contract that guards the parent transfer() against a second firing).
Calling transfer_handler.transfer() on the reloaded handler does not change the model’s weights.

Skips RF and NN training entirely — the bug lives in TransfOctoRFClassificationModel.__setstate__(), not in the training loop, so wiring the architecture by hand is enough to exercise the fix in well under a second.

Returns:: True if the contract holds after reload.
Return type:: bool
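
A minimal sketch of the guarded-transfer idea follows, under the assumption that the handler tracks a `transfer_count` and that this counter must survive serialization. The class `OnceOnlyTransfer` is hypothetical and does not reproduce the real handler or `__setstate__` logic.

```python
import pickle


class OnceOnlyTransfer:
    """Hypothetical handler that applies a weight transfer at most once."""

    def __init__(self, source_weights):
        self.source_weights = list(source_weights)
        self.transfer_count = 0

    def transfer(self, weights):
        if self.transfer_count > 0:
            return weights  # already executed: leave the weights alone
        self.transfer_count += 1
        return list(self.source_weights)

    def __getstate__(self):
        return {"source_weights": self.source_weights,
                "transfer_count": self.transfer_count}

    def __setstate__(self, state):
        self.source_weights = state["source_weights"]
        # Keep the counter so a reload does not re-arm the transfer.
        self.transfer_count = state["transfer_count"]


handler = OnceOnlyTransfer([1.0, 2.0])
weights = handler.transfer([0.0, 0.0])
trained = [w + 0.5 for w in weights]  # stand-in for training progress

state = pickle.loads(pickle.dumps(handler.__getstate__()))
reloaded = OnceOnlyTransfer.__new__(OnceOnlyTransfer)
reloaded.__setstate__(state)
after_reload = reloaded.transfer(trained)
```

Resetting the counter to zero in `__setstate__` would make `after_reload` equal to the source weights again, discarding the trained values. That is the kind of regression the two checks above are meant to catch.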

check_overwrite_pretrained_model_same_spec()

Regression test: re-specifying the same nn_transfer_weights on a continue-training pass must NOT silently reset the transfer_count and trigger another transfer over the already-trained weights. The orchestrator deep-compares the new spec against the saved one and only rebuilds the handler when the spec actually changed.

The contract being tested is purely about TransfOctoRFClassificationModel.overwrite_pretrained_model() propagating updates to self.nn_handler.transfer_handler. No Keras model is needed — substituting a strict stand-in for nn_handler keeps the subtest at a few milliseconds (vs ~800 ms when the actual TORF architecture is built).

The stand-in uses a sealed attribute set so that any future change to overwrite_pretrained_model that starts touching a new nn_handler attribute fails loudly here instead of silently passing — see _StrictHandlerStandIn.

Returns:: True if a same-spec pass preserves the handler, a different-spec pass rebuilds it, and omitting the key is a no-op.
Return type:: bool
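
One way to obtain a sealed attribute set in Python is `__slots__`. The sketch below is an assumption about how such a stand-in could be written, not the actual _StrictHandlerStandIn. `apply_transfer_spec`, the `build` callback, and the spec keys `path` and `translator` are likewise hypothetical.

```python
import copy


class StrictStandIn:
    """Hypothetical stand-in exposing only the attributes listed in __slots__."""

    __slots__ = ("transfer_handler",)

    def __init__(self, transfer_handler):
        self.transfer_handler = transfer_handler


def apply_transfer_spec(handler, saved_spec, new_spec, build):
    """Rebuild the transfer handler only when the spec really changed."""
    # None models an omitted key; dict equality is a deep comparison.
    if new_spec is None or new_spec == saved_spec:
        return saved_spec
    handler.transfer_handler = build(new_spec)
    return new_spec


saved = {"path": "source.keras", "translator": {"out": None}}
handler = StrictStandIn(transfer_handler="original")


def build(spec):
    return "rebuilt"


# Same spec (an equal but distinct copy) and omitted key: handler is kept.
apply_transfer_spec(handler, saved, copy.deepcopy(saved), build)
apply_transfer_spec(handler, saved, None, build)
kept = handler.transfer_handler

# Different spec: handler is rebuilt.
apply_transfer_spec(handler, saved, dict(saved, translator={}), build)
rebuilt = handler.transfer_handler
```

Because of `__slots__`, reading or writing any attribute other than `transfer_handler` raises `AttributeError`, so code that starts relying on a new attribute fails immediately.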

check_executor_rejects_none_spec()

Regression test for the I-3 hardening: the public static src.model.deeplearn.handle.dl_freeze_training_executor.DLFreezeTrainingExecutor.compute_schedule() must reject freeze_training=None with the package’s typed src.model.deeplearn.deep_learning_exception.DeepLearningException, rather than with the opaque TypeError that iterating over None (for x in None) would raise.

The test uses a callable stub fit_fn (instances of _StubFitFn track every call), and verifies two properties:

Negative: executor.run(...) with freeze_training=None raises DeepLearningException and does NOT call the stub (the executor fails fast, with no partial state mutation).
Positive: the same stub class drives the executor successfully on a valid spec, recording the expected per-segment epoch counts. This confirms the stub pattern itself is sound and that the None-rejection path did not leave any executor-side residue.

Returns:: True if both negative and positive paths behave as expected.
Return type:: bool
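
The stub pattern and the fail-fast check can be sketched as follows. `SpecError` stands in for the package exception, and `run_schedule` with its `{"epochs": ...}` segment format is a hypothetical simplification of the executor; only the shape of the negative and positive paths is meant to carry over.

```python
class SpecError(Exception):
    """Hypothetical typed error standing in for the package exception."""


class StubFitFn:
    """Callable stub that records the epochs of every call."""

    def __init__(self):
        self.calls = []

    def __call__(self, epochs):
        self.calls.append(epochs)


def run_schedule(freeze_training, fit_fn):
    # Validate before touching any state so a bad spec fails fast.
    if freeze_training is None:
        raise SpecError("freeze_training must be a list, got None")
    for segment in freeze_training:
        fit_fn(segment["epochs"])


# Negative path: typed error, and the stub is never called.
negative_stub = StubFitFn()
try:
    run_schedule(None, negative_stub)
    rejected = False
except SpecError:
    rejected = True

# Positive path: the same stub class records per-segment epoch counts.
positive_stub = StubFitFn()
run_schedule([{"epochs": 2}, {"epochs": 3}], positive_stub)
```

Without the explicit check, the `for` loop would raise a `TypeError` about `None` not being iterable, which says nothing about which configuration key was wrong.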

check_fit_guards_against_missing_optimizer()

Regression test for the I-1 hardening: TransfOctoRFHandler._fit must self-protect against the post-__setstate__ state where the handler has compiled = True but arch.nn.optimizer is None. Without the guard, freeze-training with an explicit initial_learning_rate would crash on KerasUtils.set_learning_rate() (which dereferences the optimizer).

Patches arch.nn.fit to a no-op stub so the test exercises the guard + freeze-executor path (including the set_learning_rate access on the re-attached optimizer) without paying for a real gradient step. The actual training is incidental to the contract under test.

Returns:: True if the optimizer is re-attached before the freeze executor runs and the per-segment fit hook is called the expected number of times.
Return type:: bool

check_executor_can_rerun_with_same_instance()

Regression test for the I-3 latent invariant: a DLFreezeTrainingExecutor instance carries no per-run mutable state, so calling run() twice with the same spec on different mock NNs must produce the same observable trace AND must not mutate the stored self.freeze_training specification (deep-equality preserved across both calls).

The deep-copy snapshot catches a future regression that normalizes the spec in-place on the first run, which would still produce identical observable traces (and thus pass a weaker test) but corrupt the executor’s state for any future caller inspecting executor.freeze_training.

Returns:: True if both runs produce identical traces, the final trainable flags are restored, and the executor’s spec is unchanged.
Return type:: bool
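
The deep-copy snapshot technique can be shown on a toy executor. The `Executor` class, its spec layout, and the sorting step are assumptions chosen to give the sketch something to normalize; the real executor may not normalize anything.

```python
import copy


class Executor:
    """Hypothetical executor that keeps the spec it was built with."""

    def __init__(self, freeze_training):
        self.freeze_training = freeze_training

    def run(self):
        # Normalize on a copy; the stored spec is left untouched.
        normalized = [
            dict(segment, layers=sorted(segment["layers"]))
            for segment in self.freeze_training
        ]
        return [(seg["epochs"], tuple(seg["layers"])) for seg in normalized]


executor = Executor([{"epochs": 2, "layers": ["out", "hidden"]}])
snapshot = copy.deepcopy(executor.freeze_training)

first_trace = executor.run()
second_trace = executor.run()
spec_unchanged = executor.freeze_training == snapshot
```

An implementation that sorted `segment["layers"]` in place would still return identical traces on both runs, but `spec_unchanged` would become `False`. Comparing traces alone would miss that.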

check_optimizer_state_preserved_with_flag()

Regression test for M-1: the opt-in preserve_optimizer_state=True flag must keep the NN optimizer’s state intact across the __getstate__ / __setstate__ boundary. Two pieces of state are verified end-to-end:

optimizer.iterations — the LR-scheduler counter. Without preservation, cosine/exponential decays would restart each reload.
One of Adam’s first-moment slot variables (m) — the actual continue-training payload. Without preservation, Adam restarts with zero moments and convergence stalls.

A hypothetical future regression that preserved iterations but corrupted the moment slots would be caught by the second check.

Returns:: True if both the iteration counter AND the sentinel slot value survive the save/reload cycle.
Return type:: bool
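
To see why the moment slots matter, consider a scalar Adam update written out by hand. This is a textbook-style sketch in pure Python, not Keras code; the function `adam_step`, the quadratic objective, and the hyperparameter values are chosen only for illustration. It compares the next step taken with the saved state against the step taken after the state has been zeroed.

```python
import math


def adam_step(x, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    t += 1
    m = b1 * m + (1 - b1) * grad          # first moment
    v = b2 * v + (1 - b2) * grad * grad   # second moment
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v, t


def grad(x):
    return 2.0 * x  # gradient of f(x) = x**2


x, m, v, t = 1.0, 0.0, 0.0, 0
for _ in range(3):
    x, m, v, t = adam_step(x, grad(x), m, v, t)

# Continue with the saved state versus restart with zeroed state.
x_preserved, *_ = adam_step(x, grad(x), m, v, t)
x_reset, *_ = adam_step(x, grad(x), 0.0, 0.0, 0)
```

The two results differ because the preserved moments still carry the earlier, larger gradients. The sketch only shows that the update changes when state is lost; how much that affects convergence depends on the model and schedule.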

check_executor_empty_spec()

Regression test: an empty freeze_training: [] list must degrade gracefully to “no freezing” (a single full-length fit call) instead of raising IndexError on the schedule bookkeeping arrays.

Returns:: True if the executor falls back to a single full fit.
Return type:: bool

build_small_model(seed=42, nn_transfer_weights=None, nn_freeze_training=None, epochs=2, tiny=False)

Build a small TORF model for the smoke tests. The configuration mirrors a minimal variant of TransfOctoRFTest.

Parameters:

seed – Random seed.
nn_transfer_weights – Optional transfer-learning spec.
nn_freeze_training – Optional freeze-training spec.
epochs – Training epochs.
tiny – When True, picks the smallest architecture that still builds (n_h=4, single 4-unit SMLP layer, K=2). Use for subtests that do not actually train (e.g., pickle-roundtrip checks) so the Keras graph build cost is minimal (~200 ms instead of ~900 ms).

Returns:: The configured TORF model, not yet fitted.
Return type:: TransfOctoRFClassificationModel

static make_data(seed)

Build small linearly separable RF + NN datasets.

Returns:: Tuple (X_rf, y_rf, X_nn, y_nn) with 4 features per sample and 2 classes.
Return type:: tuple of np.ndarray