src.model.deeplearn.sequencer.dl_sparse_concat_sequencer
Classes
- class src.model.deeplearn.sequencer.dl_sparse_concat_sequencer.DLSparseConcatSequencer(X, y, batch_size, **kwargs)
- Author:
Alberto M. Esmoris Pena
A deep learning sequencer that feeds receptive fields into a sparse 3D convolutional neural network by concatenating per-batch-element receptive fields into a single global tensor.
Each batch produces:
A single feature tensor \(\pmb{F} \in \mathbb{R}^{(1 + \Sigma_k R_{k0}) \times n_f}\) where row 0 is the shared ground row and rows \([1 + \mathrm{off}_{k0}, R_{k0} + \mathrm{off}_{k0}]\) hold the level-0 features of receptive field \(k\) (with \(\mathrm{off}_{k0} = \sum_{j < k} R_{j0}\) being the cumulative number of active cells from preceding receptive fields). Inactive cells are never present.
A submanifold neighbor table \(\pmb{S}_t \in \mathbb{Z}^{(1 + \Sigma_k R_{kt}) \times (2 w_t + 1)^3}\) per depth \(t \in [0, t^*)\). Row 0 is the global ground row (entries all zero, so looking it up returns the ground row of \(\pmb{F}\)). Rows \(v \in [1 + \mathrm{off}_{kt}, R_{kt} + \mathrm{off}_{kt}]\), with \(\mathrm{off}_{kt} = \sum_{j < k} R_{jt}\), hold the one-based sequential indices of the cells in the submanifold convolutional window centered on the \(v\)-th active cell, offset so that the indices reference the correct row in the concatenated feature tensor.
A downsampling neighbor table \(\pmb{D}_t \in \mathbb{Z}^{(1 + \Sigma_k R_{k(t+1)}) \times (w_t^{D})^3}\) per depth \(t \in [0, t^* - 1)\). Indices point into depth \(t\)’s row space; row indices are at depth \(t + 1\).
An upsampling neighbor table \(\pmb{U}_t \in \mathbb{Z}^{(1 + \Sigma_k R_{kt}) \times (w_t^{U})^3}\) per depth \(t \in [0, t^* - 1)\). Indices point into depth \(t + 1\)’s row space; row indices are at depth \(t\).
Every batch is statically padded along its row axis so every batch this sequencer emits presents the same shape to the model. This is what eliminates tf.function retracing on the forward pass. Concretely, the row axis of each emitted tensor is 1 + pad_R_per_t[t] instead of the variable 1 + Σ_k R_{kt} of the per-batch real cells. F is padded with zero rows; the neighbor tables S / D / U are padded with all-zero rows so the downstream tf.gather fetches the ground row and the gather + matmul outputs zero features for padded cells. Labels are padded with class 0 and the loss / metric receives a matching sample_weight vector that is 1.0 over the real cells and 0.0 over the padded tail (and also over real cells whose label is in ignore_labels).
Every batch also emits per-depth real-cell masks M_t at the end of the input list. Each mask is a 1-D boolean tensor of length pad_R_per_t[t] with True over the real cells and False over the padded tail. The SpConv architecture forwards these masks to every MaskedBatchNormalization so the batch statistics ignore the padded zero rows and the running mean / variance converge to the real-cell distribution (not biased by the padding ratio).
Predictions come back stacked at n_batches × pad_R_per_t[0] rows; post_process_output() strips the padded tail of every batch and splits the real-cell rows back into per-receptive-field arrays using the cumulative-offset bookkeeping computed at prepare_data() time.
The sequencer expects the input list to be the HierarchicalSGPreProcessorPP pyout extended with the dense neighbor tables emitted by the C++ pre-processor. Specifically, X[0] is Fout, X[6] is S, X[7] is D, X[8] is U.
- Variables:
total_elems (int) – Total number of receptive fields in the dataset.
max_depth (int) – Hierarchy depth \(t^*\).
R_per_rf_t (np.ndarray) – Number of active cells per receptive field per depth. Shape (total_elems, max_depth).
cum_offset_per_rf_t (np.ndarray) – Cumulative active-cell counts per depth, used to derive per-batch offsets. Shape (total_elems + 1, max_depth); cum_offset_per_rf_t[k, t] is the row position in the global depth-\(t\) tensor at which receptive field \(k\)’s first active cell sits.
pad_R_per_t (np.ndarray) – Per-depth static pad budget = sum of the top batch_size values of R_per_rf_t[:, t]. Bounds the worst-case batch sum regardless of which RFs the random shuffle groups together. Shape (max_depth,).
ignore_labels (np.ndarray or None) – Optional 1-D array of label values whose cells are masked out of the training loss / metric (sample_weight = 0). None disables masking.
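The claim in the class summary that all-zero neighbor rows yield zero features for padded cells can be checked with a small NumPy sketch. It is an illustration only, not the library's code: the toy sizes, the all-zero ground row of F, and the bias-free linear map W are assumptions, and NumPy fancy indexing stands in for tf.gather.

```python
import numpy as np

# Hypothetical toy sizes: 3 real cells, 2 padded rows, 2 features, 3 neighbors.
n_real, n_pad, n_f, n_nbr = 3, 2, 2, 3

# Feature tensor: row 0 is the ground row (assumed all zeros), rows 1..3 hold
# real cells, rows 4..5 are the zero-padded tail.
F = np.zeros((1 + n_real + n_pad, n_f), dtype=np.float32)
F[1:1 + n_real] = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Neighbor table with one-based indices; 0 means "no neighbor" (ground row).
S = np.zeros((1 + n_real + n_pad, n_nbr), dtype=np.int32)
S[1] = [0, 1, 2]
S[2] = [1, 2, 3]
S[3] = [2, 3, 0]
# Rows 0, 4 and 5 stay all-zero: the ground row and the padded tail.

# Gather the neighbor features, flatten each window, apply a bias-free map.
rng = np.random.default_rng(0)
W = rng.normal(size=(n_nbr * n_f, 4)).astype(np.float32)
gathered = F[S]                          # shape (rows, n_nbr, n_f)
out = gathered.reshape(len(S), -1) @ W   # shape (rows, 4)

# Padded rows only ever gathered the zero ground row, so they map to zero.
padded_rows_are_zero = bool(np.all(out[1 + n_real:] == 0.0))
```

A bias term or a non-zero ground row would break this property, which is presumably why the masks M_t are still needed for the batch-normalization statistics.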
- __init__(X, y, batch_size, **kwargs)
Initialize the sequencer. See DLAbstractSequencer.__init__() for the base contract.
ignore_labels (optional kwarg, list of int): labels that should be excluded from the training loss / metric. Real cells whose label is in this list get sample_weight = 0.0 so they do not contribute to gradient updates. This is the cleanest way to keep noisy “unclassified” or domain-irrelevant cells out of training without dropping them entirely from the receptive field.
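A minimal sketch of how ignore_labels could translate into sample weights, assuming labels arrive as a flat integer array. The label values are hypothetical and the use of np.isin is an illustrative choice, not necessarily what the sequencer does internally.

```python
import numpy as np

# Hypothetical real-cell labels of one batch and the labels to ignore.
y = np.array([0, 2, 5, 2, 1], dtype=np.int32)
ignore_labels = np.array([2], dtype=np.int32)

# Start with full weight on every real cell.
sample_weight = np.ones(y.shape, dtype=np.float32)

# Zero the weight of every cell whose label is in ignore_labels.
# None disables masking, as described for the ignore_labels variable.
if ignore_labels is not None:
    sample_weight[np.isin(y, ignore_labels)] = 0.0
```

The ignored cells stay in the receptive field, so they still take part in the convolutions; only their contribution to the loss and the metrics is removed.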
- property batch_size
- set_input_data(X, y)
Bind the input data and (re)compute per-RF active-cell counts and cumulative offsets. See DLAbstractSequencer.set_input_data().
- Parameters:
X – The input data. Expected to be the HierarchicalSGPreProcessorPP pyout extended with the dense neighbor tables: a list of length 9 where X[0] is Fout and X[6:9] are S, D, U.
y – Reference labels as a list of per-RF 1-D arrays of shape (R_k0,).
- prepare_data()
Compute per-RF active-cell counts, cumulative offsets, and the static per-depth pad sizes from the dense neighbor tables. Called every time the input data is rebound (e.g., when DLOfflineSequencer loads a new point cloud).
pad_R_per_t[t] is the smallest depth-\(t\) row count large enough to accommodate any batch this sequencer can possibly emit. It is computed as the sum of the top batch_size values of R_per_rf_t[:, t], which bounds the worst-case batch sum regardless of which RFs the random shuffle groups together. Every batch is then padded to this fixed shape, eliminating per-step tf.function retracing.
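The bookkeeping described above can be sketched in NumPy as follows. The counts are made up, and the sort-based top-k selection is one possible way to obtain the sum of the top batch_size values; the actual implementation may differ.

```python
import numpy as np

# Hypothetical active-cell counts: 4 receptive fields, 2 depths.
R_per_rf_t = np.array([[5, 2], [9, 4], [3, 1], [7, 3]], dtype=np.int64)
batch_size = 2
total_elems, max_depth = R_per_rf_t.shape

# Cumulative offsets with a leading zero row, shape (total_elems + 1, max_depth).
# cum_offset_per_rf_t[k, t] is where RF k's first active cell sits at depth t.
cum_offset_per_rf_t = np.zeros((total_elems + 1, max_depth), dtype=np.int64)
cum_offset_per_rf_t[1:] = np.cumsum(R_per_rf_t, axis=0)

# Static pad budget per depth: sum of the top batch_size counts in each column.
k = min(batch_size, total_elems)
top_k = np.sort(R_per_rf_t, axis=0)[-k:]   # k largest counts per depth
pad_R_per_t = top_k.sum(axis=0)            # shape (max_depth,)
```

Because any batch holds at most batch_size receptive fields, its per-depth cell count can never exceed the sum of the batch_size largest counts, whichever receptive fields the shuffle groups together.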
- getitem_training(idx)
See DLAbstractSequencer.getitem_training().
Returns a 3-tuple (batch_X, batch_y, sample_weight). Static-shape padding adds masked rows to every batch so all batches present an identical input shape to the model (avoiding tf.function retracing). sample_weight is 1.0 for the real cells of the batch and 0.0 for the padded tail; the loss and the metrics therefore ignore the padded rows entirely.
- getitem_predict(idx)
- on_epoch_end_training()
- init_random_indices()
- apply_random_indices()
See DLAbstractSequencer.apply_random_indices(). Reorders every per-RF list in self.X and self.y according to self.Irandom, then recomputes the cumulative offsets.
- extract_input_batch(start_idx, end_idx)
Build the concatenated input tensor list for the receptive fields in [start_idx, end_idx). The output ordering matches SpConv3DPwiseClassif.build_input():
[F, S_0, S_1, ..., S_{t*-1}, D_0, ..., D_{t*-2}, U_0, ..., U_{t*-2}]
Every tensor is padded along its row axis so that all batches produced by this sequencer share an identical shape, which lets tf.function reuse its cached trace instead of retracing. F is padded with 0.0 rows; the neighbor tables (S, D, U) are padded with all-zero rows so that a downstream tf.gather fetches the ground row of F and the downstream gather + matmul outputs zeros for the padded cells. post_process_output() and the sample weight returned by extract_reference_batch() strip the padded rows out of the predictions and the loss respectively.
Every output tensor is allocated once at its final padded shape and per-RF blocks are written into it with an in-place + offset * (entry != 0) expression, avoiding the double-allocation pattern of “concat then pad” and the per-RF np.where round-trip.
- Returns:
A list of int32 / float32 numpy arrays ready to be fed into Keras’s predict_on_batch() / train_on_batch().
- Return type:
list
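The single-allocation write described above might look like the following sketch for one depth of the submanifold table. The per-RF tables, the window width of 3, and the pad budget are hypothetical; only the + offset * (entry != 0) expression is taken from the description.

```python
import numpy as np

# Hypothetical per-RF tables with one-based local indices (0 = no neighbor).
S_rf = [
    np.array([[0, 1, 2], [1, 2, 0]], dtype=np.int32),             # RF 0: 2 cells
    np.array([[0, 1, 2], [1, 2, 3], [2, 3, 0]], dtype=np.int32),  # RF 1: 3 cells
]
pad_R = 7  # assumed static pad budget for this depth

# Allocate once at the final padded shape. Row 0 (ground row) and the padded
# tail are never written, so they stay all-zero.
S = np.zeros((1 + pad_R, 3), dtype=np.int32)

offset = 0
for block in S_rf:
    n = block.shape[0]
    # Shift only the non-zero entries so that 0 keeps pointing at the ground row.
    S[1 + offset:1 + offset + n] = block + offset * (block != 0)
    offset += n
```

Multiplying the offset by the boolean mask leaves the zero entries untouched without needing np.where, and writing into a preallocated array avoids building an intermediate concatenated tensor before padding.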
- extract_reference_batch(start_idx, end_idx)
Flat concatenation of per-RF labels over the batch interval, padded along the row axis to pad_R_per_t[0] so that the loss / metric receives a tensor of the same shape on every step. Returns (y_padded, sample_weight) where sample_weight is 1.0 for the real-cell rows and 0.0 for the padded tail.
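As a rough illustration of the returned pair, assuming two receptive fields with made-up labels and a level-0 pad budget of 7 (the variable names below are not taken from the source):

```python
import numpy as np

# Hypothetical per-RF label arrays for one batch and the level-0 pad budget.
y_rf = [np.array([1, 0], dtype=np.int32), np.array([2, 2, 1], dtype=np.int32)]
pad_R0 = 7

n_real = sum(len(y) for y in y_rf)

# Labels: real cells first, then a tail padded with class 0.
y_padded = np.zeros(pad_R0, dtype=np.int32)
y_padded[:n_real] = np.concatenate(y_rf)

# Weights: 1.0 over the real cells, 0.0 over the padded tail.
sample_weight = np.zeros(pad_R0, dtype=np.float32)
sample_weight[:n_real] = 1.0
```

The padded tail carries class 0 only as a placeholder; its zero weight keeps it out of the loss and the metrics.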
- post_process_output(z_rf)
Split the flat model output back into a per-receptive-field list. Called by the model handler after stacking every batch’s output into a single (n_batches * pad_R_0, num_classes) tensor; i.e., each batch contributes pad_R_per_t[0] rows but only the first sum(R_k0) rows of each batch slice carry real predictions. This method strips the padded tail off every batch and then splits the resulting flat tensor by the per-RF cumulative offsets stored at prepare_data() time.
Note
This method assumes batches were processed in sequential idx = 0, 1, ..., n_batches - 1 order and that the sequencer state (self.X, cum_offset_per_rf_t) has not been mutated between the prediction loop and this call. VL3D’s prediction handler satisfies both invariants by building a fresh sequencer for the prediction pass; callers that reuse a sequencer across train and predict must avoid triggering a shuffle in between.
- Parameters:
z_rf (np.ndarray) – The stacked per-batch model output.
- Returns:
A list of per-RF arrays, each of shape (R_k0, num_classes).
- Return type:
list of np.ndarray
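The strip-and-split step can be sketched as below, under the same sequential-order assumption stated in the note. The counts, batch size, pad budget, and the synthetic output z are all hypothetical, and the loop is an illustration rather than the method's actual body.

```python
import numpy as np

# Hypothetical level-0 active-cell counts per RF and batch configuration.
R0 = np.array([2, 3, 1, 2])
batch_size, pad_R0, num_classes = 2, 6, 3
n_batches = int(np.ceil(len(R0) / batch_size))

# Synthetic stacked model output: pad_R0 rows per batch.
z = np.arange(n_batches * pad_R0 * num_classes, dtype=np.float32)
z = z.reshape(n_batches * pad_R0, num_classes)

per_rf = []
for b in range(n_batches):
    counts = R0[b * batch_size:(b + 1) * batch_size]
    n_real = int(counts.sum())
    # Keep only the real rows at the head of this batch's slice.
    real = z[b * pad_R0:b * pad_R0 + n_real]
    # Split the real rows at the cumulative per-RF boundaries.
    per_rf.extend(np.split(real, np.cumsum(counts)[:-1]))
```

Each entry of per_rf then has shape (R_k0, num_classes), with the receptive fields in the same order in which the batches were processed.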