src.model.deeplearn.layer.grouped_point_transformer_layer
Classes
- class src.model.deeplearn.layer.grouped_point_transformer_layer.GroupedPointTransformerLayer(*args, **kwargs)
- Author:
Alberto M. Esmoris Pena
A grouped point transformer layer receives batches of \(R\) points with \(C\) features (channels) each, together with \(\kappa\) known neighbors per point in the same space. These inputs are used to compute an output feature space of \(R\) points with \(C\) features each. In doing so, the indexing tensor \(\mathcal{N} \in \mathbb{Z}^{B \times R \times \kappa}\) is used to link \(\kappa\) neighbors for each of the \(R\) input points in each of the \(B\) input batches.
The layer applies a feature space transformation on each element of the batch
\[\tilde{f}_{igp} = \sum_{k=1}^{\kappa}{ \tilde{v}_{ikgp} w_{ikg} }\]that yields a feature tensor \(\pmb{\widetilde{F}} \in \mathbb{R}^{R \times G \times H}\).
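The aggregation above can be illustrated with a minimal pure-Python sketch. It assumes nested lists in place of tensors and 0-based indices; the function name `grouped_aggregate` is hypothetical and not part of the layer's API.

```python
def grouped_aggregate(v_tilde, w):
    """Compute f[i][g][p] = sum_k v_tilde[i][k][g][p] * w[i][k][g].

    v_tilde has shape R x kappa x G x H and w has shape R x kappa x G.
    """
    R = len(v_tilde)
    kappa = len(v_tilde[0])
    G = len(v_tilde[0][0])
    H = len(v_tilde[0][0][0])
    return [
        [
            [
                sum(v_tilde[i][k][g][p] * w[i][k][g] for k in range(kappa))
                for p in range(H)
            ]
            for g in range(G)
        ]
        for i in range(R)
    ]


# One point (R=1), two neighbors (kappa=2), two groups (G=2), H=2.
v_tilde = [[[[1.0, 2.0], [3.0, 4.0]],
            [[5.0, 6.0], [7.0, 8.0]]]]
w = [[[0.5, 1.0],
      [0.5, 0.0]]]
f_tilde = grouped_aggregate(v_tilde, w)
```

Each group shares a single weight per neighbor, so all \(H\) channels inside a group are scaled by the same \(w_{ikg}\).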
The final output is \(\pmb{\widehat{F}} \in \mathbb{R}^{R \times C}\), a reshaped version of the \(\pmb{\widetilde{F}}\) tensor
\[\hat{f}_{i*} = \bigl( \tilde{f}_{i11}, \ldots, \tilde{f}_{i1H}, \tilde{f}_{i21}, \ldots, \tilde{f}_{iGH} \bigr) ,\]where \(H=C/G\) for a given number of groups \(G\) such that \(G \mid C\).
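As a small illustration of this reshaping, the sketch below flattens the group and per-group channel axes into a single channel axis. It assumes nested lists instead of tensors; `flatten_groups` is a hypothetical helper name.

```python
def flatten_groups(f_tilde):
    """Reshape an R x G x H nested list into R x C, with C = G * H.

    Channels are laid out group by group, matching the ordering
    (f_i11, ..., f_i1H, f_i21, ..., f_iGH).
    """
    return [[value for group in point for value in group] for point in f_tilde]


f_tilde = [[[3.0, 4.0], [30.0, 40.0]]]  # R=1, G=2, H=2
f_hat = flatten_groups(f_tilde)         # R=1, C=4
```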
The equations above rely on several intermediate quantities that must be computed first. Let \(\pmb{Q}, \pmb{K}, \pmb{V} \in \mathbb{R}^{C \times C}\) be weight matrices and \(\pmb{q}, \pmb{k}, \pmb{v} \in \mathbb{R}^{C}\) be weight vectors. Now compute
\[\pmb{\widehat{Q}} = \sigma_{Q}\bigl(Z_{Q}\bigl( \pmb{F} \pmb{Q} \oplus \pmb{q} \bigr)\bigr) ,\]\[\pmb{\widehat{K}} = \sigma_{K}\bigl(Z_{K}\bigl( \pmb{F} \pmb{K} \oplus \pmb{k} \bigr)\bigr) ,\]and
\[\pmb{\widehat{V}} = \sigma_{V}\bigl(Z_{V}\bigl( \pmb{F} \pmb{V} \oplus \pmb{v} \bigr)\bigr) ,\]where \(\sigma_{Q}, \sigma_{K}, \sigma_{V}\) are activation functions (typically ReLU), \(Z_{Q}, Z_{K}, Z_{V}\) are batch normalizations, and \(\oplus\) is the broadcast vector summation, i.e., adding a vector to the fibers of a tensor (in this case, adding the vector to every row of the matrix product on its left). Concerning their dimensionalities, note that \(\pmb{\widehat{Q}}, \pmb{\widehat{K}}, \pmb{\widehat{V}} \in \mathbb{R}^{R \times C}\). The input feature space corresponding to a single element in the batch is given by \(\pmb{F} \in \mathbb{R}^{R \times C}\).
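The following sketch shows one such projection on a single batch element. It is a simplification under stated assumptions: nested lists stand in for tensors, the batch normalization is an optional callable that defaults to the identity (the real normalization keeps running statistics that are not modeled here), and the names `project` and `relu` are hypothetical.

```python
def relu(x):
    """Rectified linear unit on a scalar."""
    return max(0.0, x)


def project(F, W, b, normalize=None, activation=relu):
    """Return activation(normalize(F W (+) b)) for F of shape R x C.

    W is a C x C_out matrix and b a vector of length C_out that is
    broadcast over all the rows of the product F W.
    """
    out = [
        [sum(row[c] * W[c][d] for c in range(len(row))) + b[d]
         for d in range(len(b))]
        for row in F
    ]
    if normalize is not None:  # assumption: identity when omitted
        out = normalize(out)
    return [[activation(x) for x in row] for row in out]


F = [[1.0, 2.0], [0.0, -1.0]]  # R=2, C=2
Q = [[1.0, 0.0], [0.0, 1.0]]   # identity weights, for readability
q = [0.5, -3.0]
Q_hat = project(F, Q, q)
```

The same helper would serve for the keys and values by passing the corresponding weight matrix and vector.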
Let us consider a relative position encoding tensor for each neighborhood \(\pmb{\widehat{X}} \in \mathbb{R}^{R \times \kappa \times n_x}\) for an \(n_x\)-dimensional structure space such that
\[\pmb{\hat{x}}_{ij*} = \pmb{x}_{j_i*} - \pmb{x}_{i*} ,\]with \(\pmb{x}_{j_i*} \in \mathcal{N}(\pmb{x}_{i*})\) (i.e., \(\pmb{x}_{j_i*} \in \mathbb{R}^{n_x}\) belongs to the neighborhood of the \(i\)-th point).
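A minimal sketch of this relative position encoding for a single batch element follows. It assumes nested lists, 0-based neighbor indices that point into the same \(R\) points, and a hypothetical function name `relative_positions`.

```python
def relative_positions(X, N):
    """Return x_hat[i][j] = X[N[i][j]] - X[i].

    X has shape R x n_x and N has shape R x kappa, so the result has
    shape R x kappa x n_x.
    """
    return [
        [[X[j][d] - X[i][d] for d in range(len(X[i]))] for j in N[i]]
        for i in range(len(X))
    ]


X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # R=3 points in 2D
N = [[0, 1], [1, 2], [2, 0]]              # kappa=2 neighbors per point
X_hat = relative_positions(X, N)
```

When a point is listed as its own neighbor, as in the first column of `N` here, the corresponding relative position is the zero vector.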
Now we can compute two positional encoding tensors
\[\pmb{\Delta_{A}} = \sigma_{A}\bigl(Z_{A}\bigl( \pmb{\widehat{X}} \pmb{\Theta_{A}} \oplus \pmb{\theta_{A}} \bigr)\bigr) \pmb{\widetilde{\Theta}_A} \oplus \pmb{\tilde{\theta}_A}\]and
\[\pmb{\Delta_{B}} = \sigma_{B}\bigl(Z_{B}\bigl( \pmb{\widehat{X}} \pmb{\Theta_{B}} \oplus \pmb{\theta_{B}} \bigr)\bigr) \pmb{\widetilde{\Theta}_B} \oplus \pmb{\tilde{\theta}_B} .\]Note that \(\sigma_A, \sigma_B\) are activation functions (typically ReLU) and \(Z_A, Z_B\) correspond to batch normalizations. The weights are represented through the matrices \(\pmb{\Theta_A}, \pmb{\Theta_B} \in \mathbb{R}^{n_x \times C}\), \(\pmb{\widetilde{\Theta}_A}, \pmb{\widetilde{\Theta}_B} \in \mathbb{R}^{C \times C}\), and the vectors \(\pmb{\theta_A}, \pmb{\theta_B}, \pmb{\tilde{\theta}_A}, \pmb{\tilde{\theta}_B} \in \mathbb{R}^{C}\). Concerning the dimensionalities, note that \(\pmb{\Delta_A}, \pmb{\Delta_B} \in \mathbb{R}^{R \times \kappa \times C}\).
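The two positional encodings share the same structure: a linear map, a normalization, an activation, and a second linear map. The sketch below illustrates that structure on nested lists. It omits the batch normalization for brevity, which is an assumption of this sketch, and the names `linear` and `positional_encoding` are hypothetical.

```python
def linear(rows, W, b):
    """Apply rows W (+) b to a list of feature vectors."""
    return [
        [sum(r[c] * W[c][d] for c in range(len(r))) + b[d]
         for d in range(len(b))]
        for r in rows
    ]


def positional_encoding(X_hat, Theta, theta, Theta_tilde, theta_tilde):
    """Map R x kappa x n_x relative positions to an R x kappa x C encoding."""
    encoded = []
    for neighborhood in X_hat:
        hidden = linear(neighborhood, Theta, theta)
        # ReLU activation; batch normalization is skipped in this sketch.
        hidden = [[max(0.0, h) for h in row] for row in hidden]
        encoded.append(linear(hidden, Theta_tilde, theta_tilde))
    return encoded


X_hat = [[[1.0, -1.0], [0.0, 2.0]]]       # R=1, kappa=2, n_x=2
Theta = [[1.0, 0.0], [0.0, 1.0]]          # n_x x C
theta = [0.0, 0.0]
Theta_tilde = [[2.0, 0.0], [0.0, 3.0]]    # C x C
theta_tilde = [1.0, 1.0]
Delta = positional_encoding(X_hat, Theta, theta, Theta_tilde, theta_tilde)
```

Calling the helper twice with independent weights would produce the multiplier \(\pmb{\Delta_A}\) and the bias \(\pmb{\Delta_B}\).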
Now, we can also compute the differences between keys and queries
\[\pmb{\gamma}_{ij*} = \pmb{\hat{k}}_{j_i*} - \pmb{\hat{q}}_{i*}\]with (again) \(\pmb{x}_{j_i*} \in \mathcal{N}(\pmb{x}_{i*})\) (i.e., \(\pmb{x}_{j_i*} \in \mathbb{R}^{n_x}\) belongs to the neighborhood of the \(i\)-th point), which leads to the tensor \(\pmb{\Gamma} \in \mathbb{R}^{R \times \kappa \times C}\).
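The key-query differences follow the same gather-and-subtract pattern as the relative positions. A minimal sketch, assuming nested lists, 0-based neighbor indices, and a hypothetical function name `key_query_differences`:

```python
def key_query_differences(K_hat, Q_hat, N):
    """Return gamma[i][j] = K_hat[N[i][j]] - Q_hat[i].

    K_hat and Q_hat have shape R x C and N has shape R x kappa, so the
    result has shape R x kappa x C.
    """
    return [
        [[K_hat[j][c] - Q_hat[i][c] for c in range(len(Q_hat[i]))]
         for j in N[i]]
        for i in range(len(Q_hat))
    ]


K_hat = [[1.0, 2.0], [3.0, 4.0]]
Q_hat = [[0.5, 0.5], [1.0, 1.0]]
N = [[0, 1], [1, 0]]
Gamma = key_query_differences(K_hat, Q_hat, N)
```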
At this point, we can compute
\[\pmb{\widetilde{\Gamma}} = \pmb{\Gamma} \odot \pmb{\Delta_{A}} + \pmb{\Delta_{B}} .\]Now we can compute the weight tensor \(\pmb{W} \in \mathbb{R}^{R \times \kappa \times G}\), for a number of groups \(G\) such that \(G \mid C\), that yields the \(w_{ikg}\) terms in the first equation
\[\pmb{W} = \sigma\biggl( \sigma_{\omega}\bigl(Z_{\omega}\bigl( \pmb{\widetilde{\Gamma}} \pmb{\Omega} \oplus \pmb{\omega} \bigr)\bigr) \pmb{\widetilde{\Omega}} \oplus \pmb{\tilde{\omega}} \biggr) .\]In the above equation, \(\sigma\) is a softmax activation function that normalizes by the sum of all the values along the axis of the \(G\) groups, \(\sigma_{\omega}\) is an activation function (typically ReLU), and \(Z_{\omega}\) is a batch normalization.
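Two building blocks of this step are sketched below: the element-wise modulation \(\pmb{\Gamma} \odot \pmb{\Delta_A} + \pmb{\Delta_B}\) and a softmax applied along the last axis of a weight-encoding tensor. The linear maps and the batch normalization are left out, the `logits` values are made up for illustration, and the names `modulate` and `softmax` are hypothetical.

```python
import math


def modulate(Gamma, Delta_A, Delta_B):
    """Element-wise Gamma * Delta_A + Delta_B on R x kappa x C nested lists."""
    return [
        [
            [g * a + b for g, a, b in zip(g_row, a_row, b_row)]
            for g_row, a_row, b_row in zip(g_pt, a_pt, b_pt)
        ]
        for g_pt, a_pt, b_pt in zip(Gamma, Delta_A, Delta_B)
    ]


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


Gamma = [[[1.0, 2.0], [3.0, 4.0]]]     # R=1, kappa=2, C=2
Delta_A = [[[2.0, 0.5], [1.0, 0.0]]]
Delta_B = [[[0.0, 1.0], [-1.0, 5.0]]]
Gamma_tilde = modulate(Gamma, Delta_A, Delta_B)

# Hypothetical pre-softmax weight encoding of shape R x kappa x G.
logits = [[[2.0, 2.0], [0.0, 0.0]]]
W = [[softmax(row) for row in point] for point in logits]
```

Subtracting the maximum before exponentiating leaves the result unchanged but avoids overflow for large inputs.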
Finally, we only need to compute the tensor \(\pmb{\widehat{V}} \in \mathbb{R}^{R \times \kappa \times C}\) such that
\[\pmb{\hat{v}}_{ij*} = \pmb{v}_{ij_i*} + \pmb{\delta^{(B)}}_{ij_i*} ,\]where \(\pmb{\delta^{(B)}}_{ij_i*}\) is the row of the \(\pmb{\Delta_{B}}\) tensor for the \(j\)-th closest neighbor of the \(i\)-th point, i.e., a neighbor in \(\mathcal{N}(\pmb{x}_{i*})\). If we rearrange this tensor such that
\[\tilde{v}_{ijgp} = \hat{v}_{ijc} \,,\quad c=(g-1)H+p\]we achieve the \(\pmb{\widetilde{V}} \in \mathbb{R}^{R \times \kappa \times G \times H}\) tensor necessary to compute the original equations to yield the output feature space.
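The rearrangement \(c=(g-1)H+p\) amounts to cutting each channel vector into \(G\) consecutive chunks of length \(H\). A minimal sketch with 0-based indices (so that \(c = gH + p\)), nested lists, and a hypothetical function name `split_into_groups`:

```python
def split_into_groups(V_hat, groups):
    """Reshape R x kappa x C into R x kappa x G x H, with H = C // G."""
    out = []
    for neighborhood in V_hat:
        rows = []
        for vec in neighborhood:
            channels = len(vec)
            if channels % groups != 0:
                raise ValueError("groups must divide the number of channels")
            H = channels // groups
            # Group g holds channels g*H, ..., (g+1)*H - 1.
            rows.append([vec[g * H:(g + 1) * H] for g in range(groups)])
        out.append(rows)
    return out


V_hat = [[[1.0, 2.0, 3.0, 4.0]]]               # R=1, kappa=1, C=4
V_tilde = split_into_groups(V_hat, groups=2)   # R=1, kappa=1, G=2, H=2
```

The check on divisibility mirrors the requirement \(G \mid C\) stated for the layer.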
This layer is inspired by the PointTransformer v2 paper (https://doi.org/10.48550/arXiv.2210.05666).
- Variables:
groups (int) – The number of groups \(G\) into which the channels will be divided. It must satisfy \(G \mid C\).
channels (int) – The number of input and output features \(C \in \mathbb{Z}_{>0}\).
dropout – The dropout layer that can be applied during training to the weight encoding tensor.
built_Q (bool) – Whether the weights matrix \(\pmb{Q}\) is built. Initially it is false, but it will be updated once the layer is built.
built_q (bool) – Whether the weights vector \(\pmb{q}\) is built. Initially it is false, but it will be updated once the layer is built.
built_K (bool) – Whether the weights matrix \(\pmb{K}\) is built. Initially it is false, but it will be updated once the layer is built.
built_k (bool) – Whether the weights vector \(\pmb{k}\) is built. Initially it is false, but it will be updated once the layer is built.
built_V (bool) – Whether the weights matrix \(\pmb{V}\) is built. Initially it is false, but it will be updated once the layer is built.
built_v (bool) – Whether the weights vector \(\pmb{v}\) is built. Initially it is false, but it will be updated once the layer is built.
built_ThetaA (bool) – Whether the weights matrix \(\pmb{\Theta_A}\) is built. Initially it is false, but it will be updated once the layer is built.
built_thetaA (bool) – Whether the weights vector \(\pmb{\theta_A}\) is built. Initially it is false, but it will be updated once the layer is built.
built_ThetaTildeA (bool) – Whether the weights matrix \(\pmb{\widetilde{\Theta}_A}\) is built. Initially it is false, but it will be updated once the layer is built.
built_thetaTildeA (bool) – Whether the weights vector \(\pmb{\tilde{\theta}_A}\) is built. Initially it is false, but it will be updated once the layer is built.
built_ThetaB (bool) – Whether the weights matrix \(\pmb{\Theta_B}\) is built. Initially it is false, but it will be updated once the layer is built.
built_thetaB (bool) – Whether the weights vector \(\pmb{\theta_B}\) is built. Initially it is false, but it will be updated once the layer is built.
built_ThetaTildeB (bool) – Whether the weights matrix \(\pmb{\widetilde{\Theta}_B}\) is built. Initially it is false, but it will be updated once the layer is built.
built_thetaTildeB (bool) – Whether the weights vector \(\pmb{\tilde{\theta}_B}\) is built. Initially it is false, but it will be updated once the layer is built.
built_Omega (bool) – Whether the weights matrix \(\pmb{\Omega}\) is built. Initially it is false, but it will be updated once the layer is built.
built_omega (bool) – Whether the weights vector \(\pmb{\omega}\) is built. Initially it is false, but it will be updated once the layer is built.
built_OmegaTilde (bool) – Whether the weights matrix \(\pmb{\widetilde{\Omega}}\) is built. Initially it is false, but it will be updated once the layer is built.
built_omegaTilde (bool) – Whether the weights vector \(\pmb{\tilde{\omega}}\) is built. Initially it is false, but it will be updated once the layer is built.
Q_initializer – The initializer for the matrix of weights \(\pmb{Q}\).
q_initializer – The initializer for the vector of weights \(\pmb{q}\).
Q_regularizer – The regularizer for the matrix of weights \(\pmb{Q}\).
q_regularizer – The regularizer for the vector of weights \(\pmb{q}\).
Q_constraint – The constraint for the matrix of weights \(\pmb{Q}\).
q_constraint – The constraint for the vector of weights \(\pmb{q}\).
K_initializer – The initializer for the matrix of weights \(\pmb{K}\).
k_initializer – The initializer for the vector of weights \(\pmb{k}\).
K_regularizer – The regularizer for the matrix of weights \(\pmb{K}\).
k_regularizer – The regularizer for the vector of weights \(\pmb{k}\).
K_constraint – The constraint for the matrix of weights \(\pmb{K}\).
k_constraint – The constraint for the vector of weights \(\pmb{k}\).
V_initializer – The initializer for the matrix of weights \(\pmb{V}\).
v_initializer – The initializer for the vector of weights \(\pmb{v}\).
V_regularizer – The regularizer for the matrix of weights \(\pmb{V}\).
v_regularizer – The regularizer for the vector of weights \(\pmb{v}\).
V_constraint – The constraint for the matrix of weights \(\pmb{V}\).
v_constraint – The constraint for the vector of weights \(\pmb{v}\).
ThetaA_initializer – The initializer for the matrix of weights \(\pmb{\Theta_A}\).
thetaA_initializer – The initializer for the vector of weights \(\pmb{\theta_A}\).
ThetaA_regularizer – The regularizer for the matrix of weights \(\pmb{\Theta_A}\).
thetaA_regularizer – The regularizer for the vector of weights \(\pmb{\theta_A}\).
ThetaA_constraint – The constraint for the matrix of weights \(\pmb{\Theta_A}\).
thetaA_constraint – The constraint for the vector of weights \(\pmb{\theta_A}\).
ThetaB_initializer – The initializer for the matrix of weights \(\pmb{\Theta_B}\).
thetaB_initializer – The initializer for the vector of weights \(\pmb{\theta_B}\).
ThetaB_regularizer – The regularizer for the matrix of weights \(\pmb{\Theta_B}\).
thetaB_regularizer – The regularizer for the vector of weights \(\pmb{\theta_B}\).
ThetaB_constraint – The constraint for the matrix of weights \(\pmb{\Theta_B}\).
thetaB_constraint – The constraint for the vector of weights \(\pmb{\theta_B}\).
Omega_initializer – The initializer for the matrix of weights \(\pmb{\Omega}\).
omega_initializer – The initializer for the vector of weights \(\pmb{\omega}\).
Omega_regularizer – The regularizer for the matrix of weights \(\pmb{\Omega}\).
omega_regularizer – The regularizer for the vector of weights \(\pmb{\omega}\).
Omega_constraint – The constraint for the matrix of weights \(\pmb{\Omega}\).
omega_constraint – The constraint for the vector of weights \(\pmb{\omega}\).
OmegaTilde_initializer – The initializer for the matrix of weights \(\pmb{\widetilde{\Omega}}\).
omegaTilde_initializer – The initializer for the vector of weights \(\pmb{\tilde{\omega}}\).
OmegaTilde_regularizer – The regularizer for the matrix of weights \(\pmb{\widetilde{\Omega}}\).
omegaTilde_regularizer – The regularizer for the vector of weights \(\pmb{\tilde{\omega}}\).
OmegaTilde_constraint – The constraint for the matrix of weights \(\pmb{\widetilde{\Omega}}\).
omegaTilde_constraint – The constraint for the vector of weights \(\pmb{\tilde{\omega}}\).
Q_act – The activation function for the queries.
Q_bn – The batch normalization layer for the queries.
K_act – The activation function for the keys.
K_bn – The batch normalization layer for the keys.
deltaA_act – The activation function for the positional encoding multiplier.
deltaA_bn – The batch normalization layer for the positional encoding multiplier.
deltaB_act – The activation function for the positional encoding bias.
deltaB_bn – The batch normalization layer for the positional encoding bias.
omega_act – The activation function for the weight encoding.
omega_bn – The batch normalization layer for the weight encoding.
sigma_act – A softmax activation function to apply neighbor-wise normalization.
- __init__(groups, channels=None, dropout=None, dropout_rate=0.0, built_Q=False, built_q=False, built_K=False, built_k=False, built_V=False, built_v=False, built_ThetaA=False, built_thetaA=False, built_ThetaTildeA=False, built_thetaTildeA=False, built_ThetaB=False, built_thetaB=False, built_ThetaTildeB=False, built_thetaTildeB=False, built_Omega=False, built_omega=False, built_OmegaTilde=False, built_omegaTilde=False, Q_initializer=None, Q_regularizer=None, Q_constraint=None, K_initializer=None, K_regularizer=None, K_constraint=None, V_initializer=None, V_regularizer=None, V_constraint=None, ThetaA_initializer=None, ThetaA_regularizer=None, ThetaA_constraint=None, ThetaTildeA_initializer=None, ThetaTildeA_regularizer=None, ThetaTildeA_constraint=None, ThetaB_initializer=None, ThetaB_regularizer=None, ThetaB_constraint=None, ThetaTildeB_initializer=None, ThetaTildeB_regularizer=None, ThetaTildeB_constraint=None, Omega_initializer=None, Omega_regularizer=None, Omega_constraint=None, OmegaTilde_initializer=None, OmegaTilde_regularizer=None, OmegaTilde_constraint=None, q_initializer=None, q_regularizer=None, q_constraint=None, k_initializer=None, k_regularizer=None, k_constraint=None, v_initializer=None, v_regularizer=None, v_constraint=None, thetaA_initializer=None, thetaA_regularizer=None, thetaA_constraint=None, thetaTildeA_initializer=None, thetaTildeA_regularizer=None, thetaTildeA_constraint=None, thetaB_initializer=None, thetaB_regularizer=None, thetaB_constraint=None, thetaTildeB_initializer=None, thetaTildeB_regularizer=None, thetaTildeB_constraint=None, omega_initializer=None, omega_regularizer=None, omega_constraint=None, omegaTilde_initializer=None, omegaTilde_regularizer=None, omegaTilde_constraint=None, Q_act=<function relu>, Q_bn=None, Q_bn_momentum=0.98, K_act=<function relu>, K_bn=None, K_bn_momentum=0.98, deltaA_act=<function relu>, deltaA_bn=None, deltaA_bn_momentum=0.98, deltaB_act=<function relu>, deltaB_bn=None, deltaB_bn_momentum=0.98, 
omega_act=<function relu>, omega_bn=None, omega_bn_momentum=0.98, sigma_act=<function softmax>, **kwargs)
See Layer and Layer.__init__().
- Parameters:
dropout_rate (float) – The ratio of units that will be disabled when applying dropout. It must be given inside \([0, 1)\), where \(0\) means no dropout at all and \(1\) means all the units will be deactivated.
- build(dim_in)
Build the weight vectors and matrices.
See Layer and layer.Layer.build().
- call(inputs, training=False, mask=False)
Compute the GroupedPointTransformerLayer on an input batch.
- Parameters:
inputs –
The input such that:
- inputs[0]
is the structure space tensor representing the geometry of the many receptive fields in the batch.
\[\mathcal{X} \in \mathbb{R}^{K \times R \times n_x}\]
- inputs[1]
is the feature space tensor representing the features of the many receptive fields in the batch.
\[\mathcal{F} \in \mathbb{R}^{K \times R \times n_f}\]
- inputs[2]
is the indexing tensor representing the neighborhoods of \(\kappa\) neighbors for each input point, in the same space.
\[\mathcal{N} \in \mathbb{Z}^{K \times R \times \kappa}\]
- Returns:
The output feature space \(\mathcal{\widehat{F}} \in \mathbb{R}^{K \times R \times D_{\mathrm{out}}}\).
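To make the expected input layout concrete, the sketch below builds a tiny input triple as nested lists and checks that the three tensors agree on the batch and point dimensions. It is not part of the layer: `shape` and `check_inputs` are hypothetical helpers, and the requirement that neighbor indices lie in \([0, R)\) is an assumption drawn from the neighborhoods being defined in the same space.

```python
def shape(t):
    """Return the dimensions of a non-empty rectangular nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


def check_inputs(inputs):
    """Validate [X, F, N] and return the inferred dimensions."""
    X, F, N = inputs
    sx, sf, sn = shape(X), shape(F), shape(N)
    if not (len(sx) == len(sf) == len(sn) == 3):
        raise ValueError("each input must be a rank-3 tensor")
    if not (sx[:2] == sf[:2] == sn[:2]):
        raise ValueError("inputs must share the batch and point dimensions")
    R = sx[1]
    # Assumption: indices refer to points of the same receptive field.
    if any(not 0 <= j < R for batch in N for row in batch for j in row):
        raise ValueError("neighbor indices must lie in [0, R)")
    return {"batch": sx[0], "R": R, "n_x": sx[2], "n_f": sf[2], "kappa": sn[2]}


X = [[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]            # 1 x 2 x 3
F = [[[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]]  # 1 x 2 x 4
N = [[[0, 1], [1, 0]]]                              # 1 x 2 x 2
dims = check_inputs([X, F, N])
```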
- get_config()
Return necessary data to serialize the layer.
- classmethod from_config(config)
Use given config data to deserialize the layer.
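The serialization round trip performed by get_config() and from_config() can be illustrated with a toy class. `ToyGroupedLayer` is a hypothetical stand-in that keeps only three of the documented arguments. It is not the real layer and makes no claim about which keys the real configuration contains.

```python
class ToyGroupedLayer:
    """Hypothetical stand-in used only to show the config round trip."""

    def __init__(self, groups, channels=None, dropout_rate=0.0):
        if channels is not None and channels % groups != 0:
            raise ValueError("groups must divide channels")
        if not 0.0 <= dropout_rate < 1.0:
            raise ValueError("dropout_rate must lie in [0, 1)")
        self.groups = groups
        self.channels = channels
        self.dropout_rate = dropout_rate

    def get_config(self):
        """Return the constructor arguments needed to rebuild the object."""
        return {
            "groups": self.groups,
            "channels": self.channels,
            "dropout_rate": self.dropout_rate,
        }

    @classmethod
    def from_config(cls, config):
        """Rebuild an equivalent object from a config dictionary."""
        return cls(**config)


layer = ToyGroupedLayer(groups=4, channels=16, dropout_rate=0.1)
clone = ToyGroupedLayer.from_config(layer.get_config())
```

The useful property is that the clone's configuration equals the original's, so an object can be rebuilt from plain data.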