src.model.deeplearn.layer.grouping_point_net_layer

Classes

GroupingPointNetLayer(*args, **kwargs)

class src.model.deeplearn.layer.grouping_point_net_layer.GroupingPointNetLayer(*args, **kwargs)
Author:

Alberto M. Esmoris Pena

A grouping PointNet layer receives batches of \(R\) points with \(n\) features each (typically, \(n=n_x+n_f\), i.e., the sum of the dimensionalities of the input structure and feature spaces), and \(\kappa\) known neighbors in the same space. These inputs are used to compute an output feature space of \(R\) points with \(D_{\text{out}}\) features each. In doing so, the indexing tensor \(\mathcal{N} \in \mathbb{Z}^{K \times R \times \kappa}\) is used to link \(\kappa\) neighbors to each of the \(R\) input points in each of the \(K\) input batches.

The layer is defined by a matrix \(\pmb{H} \in \mathbb{R}^{D_{\text{out}} \times n}\) that governs the weights of the SharedMLP (equivalently, a unitary 1DConv, i.e., window size one and stride one), and a (matrix, vector) pair governing the weights and the bias of the classical MLP (the typical Dense layer in Keras):

\[( \pmb{\Gamma} \in \mathbb{R}^{D_{\text{out}} \times D_{\text{out}}}, \pmb{\gamma} \in \mathbb{R}^{D_{\text{out}}} )\]

The grouping PointNet layer applies a PointNet-like operator:

\[\left( \gamma \circ \operatorname*{MAX}_{1 \leq i \leq \kappa} \right)\left( \left\{h(\pmb{p}_{i*})\right\} \right)\]

This operator is applied to each of the neighborhoods, for each point in the batch, for each batch in the input. Here, \(h\) denotes the SharedMLP governed by \(\pmb{H}\), \(\gamma\) denotes the MLP governed by \((\pmb{\Gamma}, \pmb{\gamma})\), and \(\pmb{p}_{i*} \in \mathbb{R}^{n}\) is the vector representation of the \(i\)-th point inside a given group (the neighborhood of a given input point). Note that \(\operatorname{MAX}\) is the component-wise max, as explained in the PointNet paper (https://arxiv.org/abs/1612.00593).
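The operator above can be sketched for a single neighborhood with NumPy. This is only an illustration under stated assumptions: the weights are random, both activations are taken as the identity, and the helper names `h` and `gamma_mlp` are hypothetical, not part of the layer's API.

```python
import numpy as np

rng = np.random.default_rng(0)
kappa, n, d_out = 4, 5, 3  # neighbors, input features, output features

P = rng.normal(size=(kappa, n))          # one neighborhood; rows are the p_i* vectors
H = rng.normal(size=(d_out, n))          # SharedMLP weights
Gamma = rng.normal(size=(d_out, d_out))  # MLP weights
gamma = rng.normal(size=d_out)           # MLP bias


def h(p):
    # SharedMLP applied to a single point (identity activation assumed)
    return H @ p


def gamma_mlp(v):
    # Classical MLP (Dense-like) applied to the pooled vector
    return v @ Gamma + gamma


# Component-wise max over the kappa points of the group
pooled = np.max(np.stack([h(p) for p in P]), axis=0)
y = gamma_mlp(pooled)
print(y.shape)  # (3,)
```

Because the same \(\pmb{H}\) is shared by all points, the per-point loop is equivalent to the single matrix product `(P @ H.T).max(axis=0)`.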

__init__(dim_out, H_activation=None, H_initializer=None, H_regularizer=None, H_constraint=None, gamma_activation=None, gamma_kernel_initializer=None, gamma_kernel_regularizer=None, gamma_kernel_constraint=None, gamma_bias_enabled=True, gamma_bias_initializer=None, gamma_bias_regularizer=None, gamma_bias_constraint=None, **kwargs)

See Layer and layer.Layer.__init__().

build(dim_in)

Build the \(\pmb{H} \in \mathbb{R}^{D_{\text{out}} \times n}\) matrix representing the kernel or weights of the 1DConv or SharedMLP, also the \(\pmb{\Gamma} \in \mathbb{R}^{D_{\text{out}} \times D_{\text{out}}}\) representing the weights of the MLP, and \(\pmb{\gamma} \in \mathbb{R}^{D_{\text{out}}}\) representing its bias (if requested).

See Layer and layer.Layer.build().

call(inputs, training=False, mask=False)

Compute the \(\mathcal{Y} \in \mathbb{R}^{K \times R \times D_{\text{out}}}\) output feature space.

First, the structure and feature spaces are concatenated to compose the full point cloud matrix \(\pmb{P} = [\pmb{X} | \pmb{F}]\) for each neighborhood in each receptive field in the batch. Then, these \(\pmb{P} \in \mathbb{R}^{\kappa \times n}\) matrices are convolved through a SharedMLP (Unitary-1DConv) such that \((\pmb{P}\pmb{H}^\intercal) \in \mathbb{R}^{\kappa \times D_{\text{out}}}\).

Then, the component-wise max is computed to achieve a vector representation of each point \(\pmb{p}_{i*}^* \in \mathbb{R}^{D_{\text{out}}},\, i=1,\ldots,R\). These representations can be arranged as row-wise vectors in a matrix \(\pmb{P}^{*} \in \mathbb{R}^{R \times D_{\text{out}}}\).

Finally, an MLP is applied to the \(\pmb{P}^*\) matrix. This can be done without bias:

\[\pmb{Y} = \pmb{P}^{*} \pmb{\Gamma}\]

Or it can be computed with bias:

\[\pmb{Y} = (\pmb{P}^{*} \pmb{\Gamma}) \oplus \pmb{\gamma}\]

Where \(\oplus\) is the broadcast sum typical in machine learning contexts. More concretely, it is a sum of the \(\pmb{\gamma} \in \mathbb{R}^{D_{\text{out}}}\) vector along the rows of the \((\pmb{P}^* \pmb{\Gamma}) \in \mathbb{R}^{R \times D_{\text{out}}}\) matrix.
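As a small worked example of the broadcast sum, NumPy's `+` on an \(R \times D_{\text{out}}\) matrix and a \(D_{\text{out}}\) vector adds the vector to every row. The values below are made up for illustration.

```python
import numpy as np

R, d_out = 3, 2
P_star_Gamma = np.arange(R * d_out, dtype=float).reshape(R, d_out)  # [[0, 1], [2, 3], [4, 5]]
gamma = np.array([10.0, 20.0])

# Broadcast sum: gamma is added to each of the R rows
Y = P_star_Gamma + gamma
print(Y)
# [[10. 21.]
#  [12. 23.]
#  [14. 25.]]
```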

The output tensor is simply the \(K\) \(\pmb{Y}\) matrices stacked along the outermost (batch) axis.

Parameters:

inputs

The input such that:

– inputs[0]

is the structure space tensor representing the geometry of the many receptive fields in the batch.

\[\mathcal{X} \in \mathbb{R}^{K \times R \times n_x}\]
– inputs[1]

is the feature space tensor representing the features of the many receptive fields in the batch.

\[\mathcal{F} \in \mathbb{R}^{K \times R \times n_f}\]
– inputs[2]

is the indexing tensor representing the neighborhoods of \(\kappa\) neighbors for each input point, in the same space.

\[\mathcal{N} \in \mathbb{Z}^{K \times R \times \kappa}\]

Returns:

The output feature space \(\mathcal{Y} \in \mathbb{R}^{K \times R \times D_{\text{out}}}\).
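The whole computation can be sketched in NumPy as follows. This is a minimal sketch, not the layer's actual TensorFlow implementation: the function name `grouping_pointnet_forward` is hypothetical, both activations are assumed to be the identity, and the concatenation is done before gathering the neighborhoods, which yields the same \(\pmb{P}\) matrices per neighborhood.

```python
import numpy as np


def grouping_pointnet_forward(X, F, N, H, Gamma, gamma=None):
    """X: (K, R, n_x), F: (K, R, n_f), N: (K, R, kappa) integer indices.

    Returns Y with shape (K, R, D_out).
    """
    P = np.concatenate([X, F], axis=-1)      # (K, R, n) with n = n_x + n_f
    K = P.shape[0]
    batch_idx = np.arange(K)[:, None, None]  # (K, 1, 1), broadcasts against N
    groups = P[batch_idx, N]                 # (K, R, kappa, n): one P matrix per point
    conv = groups @ H.T                      # (K, R, kappa, D_out): SharedMLP
    P_star = conv.max(axis=2)                # (K, R, D_out): component-wise max
    Y = P_star @ Gamma                       # MLP weights
    if gamma is not None:
        Y = Y + gamma                        # broadcast sum of the bias
    return Y


rng = np.random.default_rng(0)
K, R, kappa, n_x, n_f, d_out = 2, 6, 3, 3, 4, 5
X = rng.normal(size=(K, R, n_x))
F = rng.normal(size=(K, R, n_f))
N = rng.integers(0, R, size=(K, R, kappa))
H = rng.normal(size=(d_out, n_x + n_f))
Gamma = rng.normal(size=(d_out, d_out))
gamma = rng.normal(size=d_out)

Y = grouping_pointnet_forward(X, F, N, H, Gamma, gamma)
print(Y.shape)  # (2, 6, 5)
```

The indices in `N` are interpreted here as positions along the \(R\) axis of the same batch element, which matches the description of \(\mathcal{N}\) as linking neighbors in the same space.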

calc_full_dimensionality()

Compute the full dimensionality on which the PointNet operator works.

Returns:

The dimensionality of the feature space considered by the PointNet operator. Note that it is not necessarily the same as self.feature_dimensionality because the structure space can be concatenated to the feature space before applying the PointNet operator.

Return type:

int

get_config()

Return the data necessary to serialize the layer.

classmethod from_config(config)

Use the given config data to deserialize the layer.
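The serialization round trip that get_config() and from_config() are meant to support can be illustrated with a plain-Python stand-in. The class `TinyGroupingLayer` below is hypothetical and keeps only two of the constructor arguments; it is not the real GroupingPointNetLayer and says nothing about which keys the real config contains.

```python
class TinyGroupingLayer:
    """Hypothetical stand-in used only to illustrate the config round trip."""

    def __init__(self, dim_out, gamma_bias_enabled=True):
        self.dim_out = dim_out
        self.gamma_bias_enabled = gamma_bias_enabled

    def get_config(self):
        # Everything needed to rebuild the layer (weights are stored separately)
        return {
            "dim_out": self.dim_out,
            "gamma_bias_enabled": self.gamma_bias_enabled,
        }

    @classmethod
    def from_config(cls, config):
        # Rebuild the layer from the dictionary returned by get_config()
        return cls(**config)


layer = TinyGroupingLayer(dim_out=64, gamma_bias_enabled=False)
clone = TinyGroupingLayer.from_config(layer.get_config())
print(clone.get_config() == layer.get_config())  # True
```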