src.model.deeplearn.layer.contextual_point_layer
Classes
- class src.model.deeplearn.layer.contextual_point_layer.ContextualPointLayer(*args, **kwargs)
- Author:
Alberto M. Esmoris Pena
A contextual point layer receives batches of \(R\) points with \(D_{\text{in}}\) input features and \(\kappa\) known neighbors in the same space. The layer combines features from local neighborhoods, features from local neighborhoods weighted by spatial distances, and global features for the entire input batch, together with positional information.
Formally speaking, for a batch size \(B \in \mathbb{Z}_{>0}\), the input consists of an \(n_x\)-dimensional structure space tensor \(\mathcal{X} \in \mathbb{R}^{B \times R \times n_x}\), a \(D_{\text{in}}\)-dimensional feature space tensor \(\mathcal{F} \in \mathbb{R}^{B \times R \times D_{\text{in}}}\), and a tensor \(\mathcal{N} \in \mathbb{Z}^{B \times R \times \kappa}\) indexing the \(\kappa \in \mathbb{Z}_{>0}\) neighbors of each point.
First, the global information is computed for each element in the batch as a matrix \(\pmb{G} \in \mathbb{R}^{R \times D_H}\) such that
\[\pmb{G} = \sigma_{\gamma} \left(\mathcal{Z}_{\gamma}\left( \pmb{F} \pmb{\Gamma} \oplus \pmb{\gamma} \right)\right) ,\]where \(\pmb{F} \in \mathbb{R}^{R \times D_{\text{in}}}\) is the feature matrix of a single element in the batch, \(D_H \in \mathbb{Z}_{>0}\) is the dimensionality of the hidden feature space, \(\oplus\) represents the vector sum broadcast over a matrix (typical in machine learning contexts), \(\sigma_{\gamma}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\gamma}\) represents batch normalization, \(\pmb{\Gamma} \in \mathbb{R}^{D_{\text{in}} \times D_H}\) is the matrix of weights, and \(\pmb{\gamma} \in \mathbb{R}^{D_H}\) is the vector of weights.
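The global feature computation can be sketched with NumPy for a single batch element. This is only an illustration of the formula, not the layer's actual implementation: the sizes, the random weights, and the `standardize` helper (a simplified stand-in for batch normalization) are all assumptions.

```python
import numpy as np

def relu(x):
    # ReLU activation, the typical choice for sigma_gamma.
    return np.maximum(x, 0.0)

def standardize(x, eps=1e-5):
    # Simplified stand-in for batch normalization (assumption): each feature
    # column is shifted to zero mean and scaled to unit variance.
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
R, D_in, D_H = 8, 4, 16               # illustrative sizes only
F = rng.normal(size=(R, D_in))        # features of one batch element
Gamma = rng.normal(size=(D_in, D_H))  # matrix of weights
gamma = rng.normal(size=D_H)          # vector of weights

# F @ Gamma has shape (R, D_H); adding gamma broadcasts it over the R rows,
# which is the role of the circled-plus operator in the formula.
G = relu(standardize(F @ Gamma + gamma))
print(G.shape)  # (8, 16)
```

Every point thus gets one \(D_H\)-dimensional row in \(\pmb{G}\), computed without looking at its neighbors.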
Then, the computation of the features from local neighborhoods starts by composing the tensor \(\mathcal{\widetilde{F}} \in \mathbb{R}^{R \times \kappa \times D_{\text{in}}}\) that represents the features of the \(\kappa\) neighbors of each of the \(R\) points. The features of the neighbors can then be transformed such that
\[\mathcal{H} = \sigma_{\phi}\left( \mathcal{Z}_{\phi}\left( \mathcal{\widetilde{F}} \pmb{\Phi} \oplus \pmb{\phi} \right)\right) ,\]where \(\sigma_{\phi}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\phi}\) represents batch normalization, \(\pmb{\Phi} \in \mathbb{R}^{D_{\text{in}} \times D_H}\) is the matrix of weights, and \(\pmb{\phi} \in \mathbb{R}^{D_H}\) is the vector of weights.
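Composing \(\mathcal{\widetilde{F}}\) amounts to a gather operation with the neighbor indices. The following NumPy sketch shows this for one batch element, under the assumption that the neighborhoods are given as integer row indices into the feature matrix. Batch normalization is left out for brevity, and the sizes and random values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
R, kappa, D_in, D_H = 6, 3, 4, 5         # illustrative sizes only
F = rng.normal(size=(R, D_in))           # features of one batch element
N = rng.integers(0, R, size=(R, kappa))  # neighbor indices (random here)

# Integer-array indexing gathers, for each point i and neighbor slot j,
# the feature vector of point N[i, j].
F_tilde = F[N]                           # shape (R, kappa, D_in)

Phi = rng.normal(size=(D_in, D_H))       # matrix of weights
phi = np.zeros(D_H)                      # vector of weights

# The same weights are shared by every neighbor of every point: the matrix
# product acts on the last axis only.
H = np.maximum(F_tilde @ Phi + phi, 0.0)  # shape (R, kappa, D_H)
print(F_tilde.shape, H.shape)  # (6, 3, 4) (6, 3, 5)
```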
Now, we need to compute a matrix of point-wise distances for each local neighborhood \(\pmb{D} \in \mathbb{R}^{R \times \kappa}\). One alternative is to do this with the Euclidean distances or standard vector norms, i.e., \(d_{ij} = \lVert \pmb{x}_{j*} - \pmb{x}_{i*} \rVert\). The other alternative is to consider the squared distances, i.e., \(d_{ij} = \lVert \pmb{x}_{j*} - \pmb{x}_{i*} \rVert^2\). It can be interesting to force an ascending order such that \(d_{ij} \leq d_{i(j+1)}\). With this matrix we can compute a hidden distance matrix \(\pmb{\widetilde{D}} \in \mathbb{R}^{R \times D_H}\)
\[\pmb{\widetilde{D}} = \sigma_{\psi}\left( \mathcal{Z}_{\psi}\left( \pmb{D} \pmb{\Psi} \oplus \pmb{\psi} \right)\right) ,\]where \(\sigma_{\psi}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\psi}\) represents batch normalization, \(\pmb{\Psi} \in \mathbb{R}^{\kappa \times D_H}\) is the matrix of weights, and \(\pmb{\psi} \in \mathbb{R}^{D_H}\) is the vector of weights.
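The distance matrix and its hidden representation can be sketched as follows. The helper name `neighborhood_distances` is hypothetical, and the choice to reorder the neighbor indices together with the distances when an ascending order is requested is an assumption about how such an ordering could be kept consistent. Batch normalization is again left out.

```python
import numpy as np

def neighborhood_distances(X, N, distance="euclidean", ascending_order=True):
    """Return D with shape (R, kappa) and the (possibly reordered) indices."""
    # Offset from each point to each of its neighbors: (R, kappa, n_x).
    diff = X[N] - X[:, None, :]
    D = np.sum(diff * diff, axis=-1)     # squared distances
    if distance == "euclidean":
        D = np.sqrt(D)
    elif distance != "squared":
        raise ValueError(f"Unexpected distance: {distance}")
    if ascending_order:
        # Sort each neighborhood so that d_ij <= d_i(j+1).
        order = np.argsort(D, axis=1)
        D = np.take_along_axis(D, order, axis=1)
        N = np.take_along_axis(N, order, axis=1)
    return D, N

rng = np.random.default_rng(2)
R, kappa, n_x, D_H = 6, 3, 3, 5          # illustrative sizes only
X = rng.normal(size=(R, n_x))            # structure space of one batch element
N = rng.integers(0, R, size=(R, kappa))  # neighbor indices (random here)

D, N_sorted = neighborhood_distances(X, N)
Psi = rng.normal(size=(kappa, D_H))      # matrix of weights
psi = np.zeros(D_H)                      # vector of weights
D_tilde = np.maximum(D @ Psi + psi, 0.0)  # shape (R, D_H)
```

Note that \(\pmb{\Psi}\) has \(\kappa\) rows, so each column of \(\pmb{D}\) (a neighbor slot) is tied to its own weights. This is one reason why a consistent ordering of the neighbors can matter.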
The \(\pmb{\widetilde{D}}\) matrix is then reduced again to a \(\kappa\)-dimensional space such that
\[\pmb{\widehat{D}} = \sigma_{\hat{\psi}}\left( \mathcal{Z}_{\hat{\psi}}\left( \pmb{\widetilde{D}} \pmb{\widehat{\Psi}} \oplus \pmb{\hat{\psi}} \right)\right) ,\]where \(\sigma_{\hat{\psi}}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\hat{\psi}}\) represents batch normalization, \(\pmb{\widehat{\Psi}} \in \mathbb{R}^{D_H \times \kappa}\) is the matrix of weights, and \(\pmb{\hat{\psi}} \in \mathbb{R}^{\kappa}\) is the vector of weights.
Let us define a tensor \(\mathcal{\widetilde{H}} \in \mathbb{R}^{R \times \kappa \times D_H}\) such that \(\pmb{\tilde{h}}_{ij*} = \hat{d}_{ij} \pmb{h}_{ij*}\), i.e., the hidden features of the \(j\)-th neighbor of the \(i\)-th point are scaled by the corresponding entry of \(\pmb{\widehat{D}}\). At this point, the features from local neighborhoods weighted by spatial distances can be computed as a feature tensor
\[\mathcal{\widehat{H}} = \sigma_{\hat{\phi}}\left( \mathcal{Z}_{\hat{\phi}}\left( \mathcal{\widetilde{H}} \pmb{\widehat{\Phi}} \oplus \pmb{\hat{\phi}} \right)\right) ,\]where \(\sigma_{\hat{\phi}}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\hat{\phi}}\) represents batch normalization, \(\pmb{\widehat{\Phi}} \in \mathbb{R}^{D_H \times D_H}\) is the matrix of weights, and \(\pmb{\hat{\phi}} \in \mathbb{R}^{D_H}\) is the vector of weights.
Finally, the output feature space \(\pmb{\hat{F}} \in \mathbb{R}^{R \times D_{\text{out}}}\) can be calculated as
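The reduction back to \(\kappa\) dimensions and the distance-based weighting can be sketched in NumPy as follows. The inputs `D_tilde` and `H` are random placeholders for the hidden distance matrix and the neighborhood features, batch normalization is omitted, and all sizes are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(3)
R, kappa, D_H = 6, 3, 5                  # illustrative sizes only
D_tilde = rng.random(size=(R, D_H))      # placeholder hidden distances
H = rng.random(size=(R, kappa, D_H))     # placeholder neighborhood features

# Reduce the hidden distances back to one scalar per neighbor.
Psi_hat = rng.normal(size=(D_H, kappa))
psi_hat = np.zeros(kappa)
D_hat = relu(D_tilde @ Psi_hat + psi_hat)  # shape (R, kappa)

# Scale the feature vector of neighbor j of point i by D_hat[i, j]; the
# trailing axis of length one broadcasts over the D_H features.
H_tilde = D_hat[:, :, None] * H          # shape (R, kappa, D_H)

Phi_hat = rng.normal(size=(D_H, D_H))
phi_hat = np.zeros(D_H)
H_hat = relu(H_tilde @ Phi_hat + phi_hat)  # shape (R, kappa, D_H)
```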
\[\pmb{\hat{F}} = \sigma_{\theta}\left( \mathcal{Z}_{\theta}\left( \left( \pmb{G} + \mathcal{A}(\mathcal{H}) + \mathcal{A}(\mathcal{\widehat{H}}) \right) \pmb{\Theta} \oplus \pmb{\theta} \right)\right) ,\]where \(\sigma_{\theta}\) is an activation function (typically a ReLU), \(\mathcal{Z}_{\theta}\) represents batch normalization, \(\pmb{\Theta} \in \mathbb{R}^{D_H \times D_{\text{out}}}\) is the matrix of weights, and \(\pmb{\theta} \in \mathbb{R}^{D_{\text{out}}}\) is the vector of weights. Note that \(\mathcal{A}\) is a symmetric aggregation function (typically max pooling or mean reduction).
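The final combination can be sketched as below, with random placeholders for \(\pmb{G}\), \(\mathcal{H}\), and \(\mathcal{\widehat{H}}\) and without batch normalization. The sketch also checks what symmetric means for \(\mathcal{A}\): permuting the neighbors of every point does not change the output. The helper name `output_features` and all sizes are assumptions made for illustration.

```python
import numpy as np

def aggregate(T, aggregation="max"):
    # Reduce the neighbor axis of a (R, kappa, D_H) tensor to (R, D_H).
    return T.max(axis=1) if aggregation == "max" else T.mean(axis=1)

def output_features(G, H, H_hat, Theta, theta):
    # Sum of the global, local, and distance-weighted features, followed by
    # a final linear map and a ReLU.
    S = G + aggregate(H) + aggregate(H_hat)
    return np.maximum(S @ Theta + theta, 0.0)

rng = np.random.default_rng(4)
R, kappa, D_H, D_out = 6, 3, 5, 7        # illustrative sizes only
G = rng.random(size=(R, D_H))
H = rng.random(size=(R, kappa, D_H))
H_hat = rng.random(size=(R, kappa, D_H))
Theta = rng.normal(size=(D_H, D_out))
theta = np.zeros(D_out)

F_hat = output_features(G, H, H_hat, Theta, theta)  # shape (R, D_out)

# Shuffling the neighbor axis leaves the aggregated result unchanged.
perm = rng.permutation(kappa)
F_hat_perm = output_features(G, H[:, perm], H_hat[:, perm], Theta, theta)
```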
- Variables:
hidden_channels (int) – The dimensionality of the hidden feature space.
output_channels (int) – The dimensionality of the output feature space.
bn (bool) – Whether to include batch normalization or not.
bn_momentum (float) – The momentum for the batch normalization layers.
distance (str) – The distance to be used. It can be either "squared" or "euclidean".
ascending_order (bool) – Whether to force distance-based ascending order of neighborhoods (True) or not (False).
aggregation (str) – The aggregation strategy to be used. It can be either "max" or "mean".
initializer – The initializer for the matrices and vectors of weights.
regularizer – The regularizer for the matrices and vectors of weights.
constraint – The constraint for the matrices and vectors of weights.
bn_along_neighbors (bool) – Whether to normalize neighborhood tensors along the neighborhood axis (True) or along the feature axis (False).
activation – The activation function for the MLPs.
BNgamma – The batch normalization layer \(\mathcal{Z}_{\gamma}\) for the gamma MLP.
BNphi – The batch normalization layer \(\mathcal{Z}_{\phi}\) for the phi MLP.
BNphiHat – The batch normalization layer \(\mathcal{Z}_{\hat{\phi}}\) for the phiHat MLP.
BNpsi – The batch normalization layer \(\mathcal{Z}_{\psi}\) for the psi MLP.
BNpsiHat – The batch normalization layer \(\mathcal{Z}_{\hat{\psi}}\) for the psiHat MLP.
BNtheta – The batch normalization layer \(\mathcal{Z}_{\theta}\) for the theta MLP.
- __init__(hidden_channels=64, output_channels=64, bn=True, bn_momentum=0.95, bn_along_neighbors=True, activation=<class 'keras.src.layers.activations.relu.ReLU'>, distance='euclidean', ascending_order=True, aggregation='max', initializer='glorot_normal', regularizer=None, constraint=None, BNgamma=None, BNphi=None, BNphiHat=None, BNpsi=None, BNpsiHat=None, BNtheta=None, built_Gamma=False, built_gamma=False, built_Phi=False, built_phi=False, built_PhiHat=False, built_phiHat=False, built_Psi=False, built_psi=False, built_PsiHat=False, built_psiHat=False, built_Theta=False, built_theta=False, **kwargs)
See Layer and Layer.__init__().
- build(dim_in)
Build the weight vectors and matrices.
See Layer and layer.Layer.build().
- call(inputs, training=False, mask=False)
Compute the ContextualPointLayer on an input batch.
- Parameters:
inputs –
The input such that:
- inputs[0] is the structure space tensor representing the geometry of the many receptive fields in the batch.
\[\mathcal{X} \in \mathbb{R}^{B \times R \times n_x}\]
- inputs[1] is the feature space tensor representing the features of the many receptive fields in the batch, where \(n_f = D_{\text{in}}\) is the number of input features.
\[\mathcal{F} \in \mathbb{R}^{B \times R \times n_f}\]
- inputs[2] is the indexing tensor representing the neighborhoods of \(\kappa\) neighbors for each input point, in the same space.
\[\mathcal{N} \in \mathbb{Z}^{B \times R \times \kappa}\]
- Returns:
The output feature space \(\mathcal{\widehat{F}} \in \mathbb{R}^{B \times R \times D_{\mathrm{out}}}\).
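To make the expected shapes concrete, the following NumPy sketch builds an input triplet with the layout described above. The brute-force nearest-neighbor search is only an assumption for illustration, since the neighborhoods can come from any neighborhood definition. Whether each point should be included in its own neighborhood is also an assumption of this sketch. The layer itself is not called here.

```python
import numpy as np

rng = np.random.default_rng(5)
B, R, n_x, n_f, kappa = 2, 5, 3, 4, 3    # illustrative sizes only

X = rng.normal(size=(B, R, n_x))         # inputs[0]: structure space
F = rng.normal(size=(B, R, n_f))         # inputs[1]: feature space

# Brute-force k-nearest neighbors inside each receptive field: squared
# distances between all pairs of points, shape (B, R, R).
d2 = np.sum((X[:, :, None, :] - X[:, None, :, :]) ** 2, axis=-1)

# inputs[2]: indices of the kappa closest points, shape (B, R, kappa).
# Each point is its own closest neighbor here because its distance is zero.
N = np.argsort(d2, axis=-1)[:, :, :kappa]

inputs = [X, F, N]
print(X.shape, F.shape, N.shape)  # (2, 5, 3) (2, 5, 4) (2, 5, 3)
```

The indices in `N` are local to each receptive field, i.e., they take values in \(\{0, \ldots, R-1\}\).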
- normalize(X, bn)
Normalize the tensor X if batch normalization is requested and available.
- Parameters:
X – The input tensor to be normalized.
bn – The batch normalization layer to be applied, if any.
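The conditional behavior described above can be sketched as a plain function. This is a hypothetical stand-in, not the method's actual code: `use_bn` plays the role of the layer's boolean `bn` attribute, and `toy_bn` is a simplified callable replacing a real batch normalization layer.

```python
import numpy as np

def normalize(X, bn, use_bn=True):
    # Apply the normalization only when it is requested (use_bn) and
    # available (bn is not None); otherwise return X untouched.
    if use_bn and bn is not None:
        return bn(X)
    return X

def toy_bn(X, eps=1e-5):
    # Simplified stand-in for a batch normalization layer (assumption).
    return (X - X.mean(axis=0)) / np.sqrt(X.var(axis=0) + eps)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
same = normalize(X, None)                  # no layer available
skipped = normalize(X, toy_bn, use_bn=False)  # not requested
normed = normalize(X, toy_bn)              # requested and available
```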
- aggregate(X)
Assist ContextualPointLayer.call in aggregating the H and Hhat tensors. Aggregating means that the \(B \times R \times \kappa \times D_H\) tensor will be reduced to a \(B \times R \times D_H\) tensor.
- Parameters:
X – The tensor to be aggregated, i.e., one of its axes will be reduced to a single value.
- Returns:
The aggregated/reduced tensor.
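For a batched tensor, the reduction acts on the neighbor axis. The function below is a minimal NumPy sketch of that reduction, assuming the neighbor axis is the third one (index 2). The function name mirrors the method for readability, but it is not the method's actual code.

```python
import numpy as np

def aggregate(X, aggregation="max"):
    # Reduce the neighbor axis of a (B, R, kappa, D_H) tensor.
    if aggregation == "max":
        return X.max(axis=2)
    if aggregation == "mean":
        return X.mean(axis=2)
    raise ValueError(f"Unexpected aggregation: {aggregation}")

B, R, kappa, D_H = 2, 3, 4, 5            # illustrative sizes only
X = np.arange(B * R * kappa * D_H, dtype=float).reshape(B, R, kappa, D_H)

X_max = aggregate(X, "max")              # shape (B, R, D_H)
X_mean = aggregate(X, "mean")            # shape (B, R, D_H)
print(X_max.shape, X_mean.shape)  # (2, 3, 5) (2, 3, 5)
```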
- get_config()
Return necessary data to deserialize the layer.
- classmethod from_config(config)
Use given config data to deserialize the layer.
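The serialization round trip that these two methods support can be sketched with a plain Python class. `TinyLayer` is a hypothetical stand-in that keeps only three of the constructor arguments; it is not the actual ContextualPointLayer, and a real Keras layer would typically also merge in the configuration of its base class.

```python
class TinyLayer:
    """Hypothetical stand-in used to illustrate the config round trip."""

    def __init__(self, hidden_channels=64, output_channels=64,
                 aggregation="max"):
        self.hidden_channels = hidden_channels
        self.output_channels = output_channels
        self.aggregation = aggregation

    def get_config(self):
        # Everything needed to rebuild the layer, as plain Python values.
        return {
            "hidden_channels": self.hidden_channels,
            "output_channels": self.output_channels,
            "aggregation": self.aggregation,
        }

    @classmethod
    def from_config(cls, config):
        # Rebuild the layer by passing the stored values to the constructor.
        return cls(**config)

layer = TinyLayer(hidden_channels=32, aggregation="mean")
clone = TinyLayer.from_config(layer.get_config())
```

The clone is a new object with the same configuration; trained weights are not part of the config and must be restored separately.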