src.model.deeplearn.layer.downsampling_spconv3d_layer
Classes
- class src.model.deeplearn.layer.downsampling_spconv3d_layer.DownsamplingSpConv3DLayer(*args, **kwargs)
- Author:
Alberto M. Esmoris Pena
Downsampling sparse 3D convolution from hierarchy depth \(t\) to depth \(t + 1\), driven by a dense downsampling neighbor table emitted by the C++ pre-processor and concatenated across receptive fields by DLSparseConcatSequencer.
The layer’s call expects two inputs:
\(\pmb{F} \in \mathbb{R}^{(1 + R_t) \times n_f}\) — the concatenated per-cell features at depth \(t\), with row 0 reserved as the shared ground row.
\(\pmb{D} \in \mathbb{Z}^{(1 + R_{t+1}) \times w_D^3}\) — the downsampling neighbor table. Row 0 is the ground marker. Row \(v \in [1, R_{t+1}]\) contains the one-based indices in depth-\(t\) row space of the cells in the \(w_D^3\)-cell downsampling window whose min vertex is the \(v\)-th active cell at depth \(t + 1\). Entries equal to 0 mark inactive neighbors (ground row).
The convolution is then:
\[\pmb{G}_{v*} = \sum_{p = 0}^{w_D^3 - 1} \pmb{F}_{\pmb{D}_{v p} *} \pmb{W}_{p}\]
Implemented as tf.einsum('ijk,jkm->im', tf.gather(F, D[1:]), W), binding the kernel-position axis j between gathered features and weights so each kernel position has its own weight slice. The output has shape \((R_{t+1}, n_g)\) (no ground row) when consumed by the fused encoding layer, or \((1 + R_{t+1}, n_g)\) when consumed by the standalone layer-by-layer architecture.
- Variables:
wD (int) – Full downsampling window edge length. The window covers \(w_D^3\) cells.
f (int) – Number of convolutional filters / kernel positions. Equals \(w_D^3\) by construction.
nf (int) – Input feature dimension at depth \(t\).
ng (int) – Output feature dimension at depth \(t + 1\).
W (tf.Variable) – Kernel of shape \((f, n_f, n_g)\).
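The sum over kernel positions and its einsum form can be checked against each other on a toy input. The following is a minimal NumPy sketch, not the layer's TensorFlow code: the sizes are arbitrary, and the ground row of F is assumed to be all zeros so that inactive neighbors contribute nothing.

```python
import numpy as np

rng = np.random.default_rng(0)
R_t, R_t1, wD, nf, ng = 5, 2, 2, 3, 4   # illustrative sizes only
f = wD ** 3                             # number of kernel positions

# F: row 0 is the ground row (assumed all-zero), rows 1..R_t are active cells.
F = np.vstack([np.zeros((1, nf)), rng.normal(size=(R_t, nf))])
# D: row 0 is the ground marker; other entries are one-based rows of F, 0 = inactive.
D = np.vstack([np.zeros((1, f), dtype=int),
               rng.integers(0, R_t + 1, size=(R_t1, f))])
W = rng.normal(size=(f, nf, ng))

# Explicit form of the documented sum over kernel positions p.
G_loop = np.zeros((R_t1, ng))
for v in range(1, R_t1 + 1):
    for p in range(f):
        G_loop[v - 1] += F[D[v, p]] @ W[p]

# Vectorized form mirroring tf.einsum('ijk,jkm->im', tf.gather(F, D[1:]), W).
G = np.einsum('ijk,jkm->im', F[D[1:]], W)
```

Both forms yield the ground-row-free \((R_{t+1}, n_g)\) output; axis j of the gathered tensor pairs each window position with its own slice of W.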
- __init__(wD, f, nf, ng, built_W=False, W_initializer=None, W_regularizer=None, W_constraint=None, **kwargs)
See Layer and layer.Layer.__init__().
- build(dim_in)
Build the convolutional kernel weights.
- call(inputs, training=False, mask=False)
Apply the downsampling convolution.
- Parameters:
inputs – inputs[0] is \(\pmb{F}\); inputs[1] is the downsampling neighbor table \(\pmb{D}\).
- Returns:
(1 + R_{t+1}, n_g) tensor with row 0 reserved as the ground row and rows \(v \in [1, R_{t+1}]\) holding the convolved features at depth \(t + 1\).
- Return type:
tf.Tensor
- static down_spconv3d_on_elem(F, D_active, W)
Pad-based variant of the downsampling convolution.
- Parameters:
F – Padded depth-\(t\) features (1 + R_t, n_f), with the ground row at index 0.
D_active – Downsampling neighbor index table (R_{t+1}, w_D^3) with values in [0, R_t]; value 0 fetches the ground row.
W – Kernel (w_D^3, n_f, n_g).
- Returns:
(R_{t+1}, n_g), with no ground row.
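A small worked example may help make the role of the ground row concrete. This is a NumPy sketch under stated assumptions: `down_spconv3d_on_elem_np` is a hypothetical stand-in for the TensorFlow method, and the ground row is assumed to be produced by zero padding.

```python
import numpy as np

def down_spconv3d_on_elem_np(F, D_active, W):
    """Hypothetical NumPy stand-in for the pad-based variant."""
    # F: (1 + R_t, nf), D_active: (R_t1, wD^3) in [0, R_t], W: (wD^3, nf, ng)
    gathered = F[D_active]          # (R_t1, wD^3, nf); index 0 fetches the ground row
    return np.einsum('ijk,jkm->im', gathered, W)

F_active = np.arange(6, dtype=float).reshape(3, 2)   # R_t = 3, nf = 2
F = np.pad(F_active, ((1, 0), (0, 0)))               # prepend a zero ground row
D_active = np.array([[1, 0, 3, 0, 0, 2, 0, 0]])      # R_t1 = 1, wD = 2
W = np.ones((8, 2, 1))                               # ng = 1, all-ones kernel

G = down_spconv3d_on_elem_np(F, D_active, W)         # G == [[15.]]
```

With an all-ones kernel the output is the sum of every gathered feature: the three active neighbors contribute 1 + 9 + 5 = 15, while the five zero entries fetch the ground row and add nothing.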
- static down_spconv3d_on_elem_active(F_active, D_active, W)
Active-form variant. Operates directly on the R_t-row active-cell view of F and avoids the tf.pad allocation that the pad-based down_spconv3d_on_elem() would require.
- Parameters:
F_active – Depth-\(t\) features (R_t, n_f), with no ground row.
D_active – Downsampling neighbor index table (R_{t+1}, w_D^3) with values in [0, R_t]; 0 is the “missing neighbor” sentinel.
W – Kernel (w_D^3, n_f, n_g).
- Returns:
(R_{t+1}, n_g).
See SubmanifoldSpConv3DLayer.spconv3d_on_elem_active() for the equivalence proof and the memory rationale. The underlying matmul is the reshape-then-matmul form so cuBLAS receives a clean GEMM rather than the Reshape → Transpose → BatchMatMul → Reshape chain that tf.einsum would lower to.
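The equivalence between the active form and the pad-based form can be illustrated numerically. The sketch below is a NumPy approximation, not the library's code: the function name is hypothetical, and handling the sentinel by shifting indices and masking the gathered rows is one plausible way to avoid the padded copy, assumed here for illustration.

```python
import numpy as np

def down_spconv3d_active_np(F_active, D_active, W):
    """Hypothetical NumPy sketch of the active-form variant."""
    real = D_active > 0                          # True where a neighbor exists
    idx = np.where(real, D_active - 1, 0)        # zero-based rows of F_active
    gathered = F_active[idx] * real[..., None]   # zero out missing neighbors
    R_out, K, nf = gathered.shape
    # Reshape-then-matmul: a single (R_out, K*nf) x (K*nf, ng) product.
    return gathered.reshape(R_out, K * nf) @ W.reshape(K * nf, -1)

rng = np.random.default_rng(1)
R_t, R_t1, K, nf, ng = 6, 3, 8, 4, 5             # illustrative sizes, K = wD^3
F_active = rng.normal(size=(R_t, nf))
D_active = rng.integers(0, R_t + 1, size=(R_t1, K))
W = rng.normal(size=(K, nf, ng))

G_active = down_spconv3d_active_np(F_active, D_active, W)

# Pad-based reference: prepend a zero ground row, then gather and contract.
F_padded = np.pad(F_active, ((1, 0), (0, 0)))
G_padded = np.einsum('ijk,jkm->im', F_padded[D_active], W)
```

Flattening the kernel-position and feature axes together on both operands turns the contraction into one 2D matrix product, which is the shape of computation the reshape-then-matmul remark refers to.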
- static down_spconv3d_on_idx_real(F_active, idx, real, W)
Variant of down_spconv3d_on_elem_active() taking pre-resolved idx and real. See SubmanifoldSpConv3DLayer.spconv3d_on_idx_real() for the rationale.
The (idx_D, real_D) cache is per-table: D has a different shape (R_{t+1} rows, wD^nx columns) and different contents from the submanifold S_active, so it cannot be shared with the submanifold path. Feeding the submanifold’s (idx_S, real_S) into this kernel would index the wrong space and silently corrupt the downsampling output.
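As a rough illustration of why the pair is resolved per table, the sketch below derives (idx_D, real_D) once from D and reuses it across calls. Everything here is an assumption made for illustration: `resolve_idx_real` and `down_spconv3d_on_idx_real_np` are hypothetical names, and treating real as a boolean mask with idx as the zero-based shifted index is a guess at the semantics, which this page does not spell out.

```python
import numpy as np

def resolve_idx_real(D_active):
    """Hypothetical helper: split a one-based table into (idx, real)."""
    real = D_active > 0                     # mask of existing neighbors
    idx = np.where(real, D_active - 1, 0)   # zero-based rows; 0 where missing
    return idx, real

def down_spconv3d_on_idx_real_np(F_active, idx, real, W):
    """Hypothetical NumPy stand-in consuming the pre-resolved pair."""
    gathered = F_active[idx] * real[..., None]
    return gathered.reshape(idx.shape[0], -1) @ W.reshape(-1, W.shape[-1])

rng = np.random.default_rng(2)
R_t, R_t1, K, nf, ng = 7, 3, 8, 2, 4        # illustrative sizes, K = wD^3
D_active = rng.integers(0, R_t + 1, size=(R_t1, K))
W = rng.normal(size=(K, nf, ng))

idx_D, real_D = resolve_idx_real(D_active)  # resolved once for this table D

# The same pair is reused for any feature tensor with R_t rows.
F_a = rng.normal(size=(R_t, nf))
F_b = rng.normal(size=(R_t, nf))
G_a = down_spconv3d_on_idx_real_np(F_a, idx_D, real_D, W)
G_b = down_spconv3d_on_idx_real_np(F_b, idx_D, real_D, W)
```

Because idx_D has one row per depth-\(t + 1\) cell and holds depth-\(t\) row indices, a pair resolved from a table with a different shape or index space would not line up with W or F_active, which is the failure mode the paragraph above warns about.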
- get_config()
Return necessary data to serialize the layer.