src.model.deeplearn.layer.downsampling_spconv3d_layer

Classes

DownsamplingSpConv3DLayer(*args, **kwargs)

class src.model.deeplearn.layer.downsampling_spconv3d_layer.DownsamplingSpConv3DLayer(*args, **kwargs)

Author: Alberto M. Esmoris Pena

Downsampling sparse 3D convolution from hierarchy depth \(t\) to depth \(t + 1\), driven by a dense downsampling neighbor table emitted by the C++ pre-processor and concatenated across receptive fields by DLSparseConcatSequencer.

The layer’s call expects two inputs:

\(\pmb{F} \in \mathbb{R}^{(1 + R_t) \times n_f}\) — the concatenated per-cell features at depth \(t\), with row 0 reserved as the shared ground row.
\(\pmb{D} \in \mathbb{Z}^{(1 + R_{t+1}) \times w_D^3}\) — the downsampling neighbor table. Row 0 is the ground marker. Row \(v \in [1, R_{t+1}]\) contains the one-based indices in depth-\(t\) row space of the cells in the \(w_D^3\)-cell downsampling window whose min vertex is the \(v\)-th active cell at depth \(t + 1\). Entries equal to 0 mark inactive neighbors (ground row).

The convolution is then:

\[\pmb{G}_{v*} = \sum_{p = 0}^{w_D^3 - 1} \pmb{F}_{\pmb{D}_{v p} *} \pmb{W}_{p}\]

This is implemented as tf.einsum('ijk,jkm->im', tf.gather(F, D[1:]), W). The gather yields a tensor of shape \((R_{t+1}, w_D^3, n_f)\), and the einsum binds the kernel-position axis j between the gathered features and the weights, so each kernel position has its own weight slice. The output has shape \((R_{t+1}, n_g)\) (no ground row) when consumed by the fused encoding layer, or \((1 + R_{t+1}, n_g)\) when consumed by the standalone layer-by-layer architecture.
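The equivalence between the sum over kernel positions and the gather-then-einsum form can be checked on a toy example. The sketch below uses NumPy in place of TensorFlow (np.einsum and integer-array indexing stand in for tf.einsum and tf.gather); the sizes, the random data, and the all-zero ground row are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): R_t cells at depth t, R_t1 cells at depth t + 1.
R_t, R_t1, wD, nf, ng = 5, 3, 2, 4, 6
K = wD ** 3  # kernel positions per downsampling window

# F: (1 + R_t, nf); row 0 is the ground row (assumed all-zero here).
F = rng.standard_normal((1 + R_t, nf))
F[0] = 0.0

# D: (1 + R_t1, K); row 0 is the ground marker, 0 entries mark inactive neighbors.
D = rng.integers(0, R_t + 1, size=(1 + R_t1, K))
D[0] = 0

# W: one (nf, ng) weight slice per kernel position.
W = rng.standard_normal((K, nf, ng))

# Gather-then-einsum form: gathered has shape (R_t1, K, nf).
gathered = F[D[1:]]
G_einsum = np.einsum('ijk,jkm->im', gathered, W)

# Literal evaluation of the sum over kernel positions p, one output row at a time.
G_loop = np.zeros((R_t1, ng))
for v in range(R_t1):
    for p in range(K):
        G_loop[v] += F[D[1 + v, p]] @ W[p]
```

Both G_einsum and G_loop have shape (R_t1, ng) and agree up to floating-point rounding, which is the property the einsum subscripts are meant to encode.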

Variables:

wD (int) – Full downsampling window edge length. The window covers \(w_D^3\) cells.
f (int) – Number of convolutional filters / kernel positions. Equals \(w_D^3\) by construction.
nf (int) – Input feature dimension at depth \(t\).
ng (int) – Output feature dimension at depth \(t + 1\).
W (tf.Variable) – Kernel of shape \((f, n_f, n_g)\).

__init__(wD, f, nf, ng, built_W=False, W_initializer=None, W_regularizer=None, W_constraint=None, **kwargs): See Layer and layer.Layer.__init__().

build(dim_in): Build the convolutional kernel weights.

call(inputs, training=False, mask=False)

Apply the downsampling convolution.

Parameters: inputs – inputs[0] is \(\pmb{F}\); inputs[1] is the downsampling neighbor table \(\pmb{D}\).
Returns: \((1 + R_{t+1}, n_g)\) tensor with row 0 reserved as the ground row and rows \(v \in [1, R_{t+1}]\) holding the convolved features at depth \(t + 1\).
Return type: tf.Tensor

static down_spconv3d_on_elem(F, D_active, W)

Pad-based variant of the downsampling convolution.

Parameters:

F – Padded depth-\(t\) features of shape \((1 + R_t, n_f)\), with the ground row at index 0.
D_active – Downsampling neighbor index table of shape \((R_{t+1}, w_D^3)\) with values in \([0, R_t]\); value 0 fetches the ground row.
W – Kernel of shape \((w_D^3, n_f, n_g)\).

Returns:

Tensor of shape \((R_{t+1}, n_g)\), with no ground row.
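A minimal NumPy sketch of the pad-based variant follows. It is not the layer's TensorFlow code: the function name down_spconv3d_on_elem_np, the toy values, and the assumption that the ground row is all zeros are introduced here for illustration only.

```python
import numpy as np

def down_spconv3d_on_elem_np(F, D_active, W):
    """NumPy stand-in (hypothetical name) for the pad-based variant.

    F is (1 + R_t, nf) with the ground row at index 0,
    D_active is (R_t1, K) with values in [0, R_t],
    W is (K, nf, ng).
    """
    gathered = F[D_active]  # (R_t1, K, nf); index 0 fetches the ground row
    return np.einsum('ijk,jkm->im', gathered, W)  # (R_t1, ng)

# Three active cells at depth t with two features each.
F_active = np.arange(1.0, 7.0).reshape(3, 2)

# Prepend the ground row (assumed all-zero) so index 0 is a valid gather target.
F = np.pad(F_active, ((1, 0), (0, 0)))

# Two output cells, wD = 2 so each window has 8 kernel positions.
D_active = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0],
    [3, 2, 0, 0, 1, 0, 0, 0],
])

# All-ones kernel with a single output feature, so G sums the gathered features.
W = np.ones((8, 2, 1))

G = down_spconv3d_on_elem_np(F, D_active, W)
```

With an all-ones kernel, each output row is just the sum of the features of its active neighbors, which makes it easy to see that entries equal to 0 contribute nothing.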

static down_spconv3d_on_elem_active(F_active, D_active, W)

Active-form variant. Operates directly on the R_t-row active-cell view of F and avoids the tf.pad allocation that the pad-based down_spconv3d_on_elem() would require.

Parameters:

F_active – Depth-\(t\) features of shape \((R_t, n_f)\), with no ground row.
D_active – Downsampling neighbor index table of shape \((R_{t+1}, w_D^3)\) with values in \([0, R_t]\); 0 is the “missing neighbor” sentinel.
W – Kernel of shape \((w_D^3, n_f, n_g)\).

Returns:

Tensor of shape \((R_{t+1}, n_g)\).

See SubmanifoldSpConv3DLayer.spconv3d_on_elem_active() for the equivalence proof and the memory rationale. The contraction is written as a reshape followed by a single matmul, so cuBLAS receives one plain GEMM rather than the Reshape → Transpose → BatchMatMul → Reshape chain that tf.einsum would lower to.
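The following NumPy sketch illustrates one way an active-form computation can match the pad-based result without materializing a padded copy of the features. The masking strategy (shift indices by one, clamp the sentinel to row 0, zero the masked rows) is an assumption about how the sentinel might be handled, not a transcription of the layer's code, and down_active_np is a hypothetical name.

```python
import numpy as np

def down_active_np(F_active, D_active, W):
    """Active-form sketch (hypothetical): F_active is (R_t, nf), no ground row."""
    K, nf, ng = W.shape
    present = D_active > 0                        # True where a neighbor exists
    rows = np.maximum(D_active - 1, 0)            # zero-based rows; sentinel clamped to row 0
    gathered = F_active[rows] * present[..., None]  # (R_t1, K, nf), missing neighbors zeroed
    # Reshape-then-matmul: a single (R_t1, K * nf) @ (K * nf, ng) product.
    return gathered.reshape(-1, K * nf) @ W.reshape(K * nf, ng)

rng = np.random.default_rng(1)
R_t, R_t1, K, nf, ng = 6, 4, 8, 3, 5  # toy sizes, K = wD ** 3 with wD = 2
F_active = rng.standard_normal((R_t, nf))
D_active = rng.integers(0, R_t + 1, size=(R_t1, K))
W = rng.standard_normal((K, nf, ng))

G_active = down_active_np(F_active, D_active, W)

# Pad-based reference: prepend a zero ground row and gather with the raw indices.
F_padded = np.pad(F_active, ((1, 0), (0, 0)))
G_padded = np.einsum('ijk,jkm->im', F_padded[D_active], W)
```

The two results agree because flattening the (kernel position, feature) axes of both operands in the same row-major order turns the double contraction into an ordinary matrix product.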

static down_spconv3d_on_idx_real(F_active, idx, real, W)

Variant of down_spconv3d_on_elem_active() taking pre-resolved idx and real. See SubmanifoldSpConv3DLayer.spconv3d_on_idx_real() for the rationale.

The (idx_D, real_D) cache is per-table: D has a different shape (\(R_{t+1}\) rows, \(w_D^3\) columns) and different contents from the submanifold S_active, so the cache cannot be shared with the submanifold path. Feeding the submanifold’s (idx_S, real_S) into this kernel would index the wrong space and silently corrupt the downsampling output.
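To make the per-table caching concrete, the sketch below resolves a table once and reuses the result across several feature tensors. The meaning given to idx (zero-based row indices) and real (a mask of existing neighbors) is an assumption, since this page does not define them; resolve_idx_real and down_on_idx_real_np are hypothetical NumPy helpers, not part of the layer's API.

```python
import numpy as np

def resolve_idx_real(D_active):
    """Hypothetical helper: split a one-based table into (idx, real)."""
    real = D_active > 0                    # mask of existing neighbors
    idx = np.where(real, D_active - 1, 0)  # zero-based rows; sentinel clamped to row 0
    return idx, real

def down_on_idx_real_np(F_active, idx, real, W):
    """Convolve using an (idx, real) pair resolved from this specific table."""
    K, nf, ng = W.shape
    gathered = F_active[idx] * real[..., None]
    return gathered.reshape(-1, K * nf) @ W.reshape(K * nf, ng)

rng = np.random.default_rng(2)
R_t, R_t1, K, nf, ng = 5, 3, 8, 2, 4  # toy sizes, K = wD ** 3 with wD = 2
D_active = rng.integers(0, R_t + 1, size=(R_t1, K))
W = rng.standard_normal((K, nf, ng))

# Resolved once for the downsampling table D.
idx_D, real_D = resolve_idx_real(D_active)

# The same cached pair serves every feature tensor that shares this table.
F_a = rng.standard_normal((R_t, nf))
F_b = rng.standard_normal((R_t, nf))
G_a = down_on_idx_real_np(F_a, idx_D, real_D, W)
G_b = down_on_idx_real_np(F_b, idx_D, real_D, W)
```

Because idx_D and real_D inherit the shape of D_active, a pair resolved from a table with a different row count or index space would produce outputs for the wrong set of cells, which is why the cache is kept separate per table.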

get_config(): Return necessary data to serialize the layer.

classmethod from_config(config)

Deserialize a layer from given specification.

Parameters: config – The dictionary specifying how to deserialize the layer.
Returns: The deserialized layer.
Return type: Layer or derived