src.model.deeplearn.layer.rbf_feat_processing_layer
Classes
- class src.model.deeplearn.layer.rbf_feat_processing_layer.RBFFeatProcessingLayer(*args, **kwargs)
- Author:
Alberto M. Esmoris Pena
An RBF feature processing layer is governed by a matrix of kernel centers \(\pmb{M} \in \mathbb{R}^{K \times n_f}\), representing the \(K\) kernels for each of the \(n_f\) features, and a matrix of kernel sizes \(\pmb{\Omega} \in \mathbb{R}^{K \times n_f}\).
Each column \(\pmb{\mu}_{*j}\) of the matrix \(\pmb{M}\), together with the corresponding column \(\pmb{\omega}_{*j}\) of the matrix \(\pmb{\Omega}\), defines the \(K\) kernels for the \(j\)-th input feature.
The output of an RBFFeatProcessingLayer consists of a matrix \(\pmb{Y}\in\mathbb{R}^{m \times K n_f}\) with \(K \times n_f\) output features for each of the \(m\) input points.
Let \(\mathcal{Y} \in \mathbb{R}^{K \times m \times n_f}\) be a tensor that can be sliced into \(K\) matrices of \(m\) rows and \(n_f\) columns representing the point-wise output features derived from a given kernel. Then, any cell of this tensor can be defined as follows (assuming a Gaussian RBF):
\[\mathcal{y}_{kij} = \exp\left[-\dfrac{ (f_{ij} - \mu_{kj})^2 }{ \omega_{kj}^2 }\right]\]
Now, the output matrix \(\pmb{Y}\) can be simply defined by reorganizing the tensor \(\mathcal{Y}\) as a matrix such that:
\[\begin{split}\pmb{Y} = \left[\begin{array}{ccccc} | & & | & & | \\ \pmb{y}_{1*1} & \cdots & \pmb{y}_{K*1} & \cdots & \pmb{y}_{K*n_f} \\ | & & | & & | \end{array}\right]\end{split}\]
Technically, it is also convenient to express the output matrix as a component-wise exponential \(\pmb{Y} = \exp\left[- \pmb{D} \odot \pmb{D}\right]\), where \(\odot\) is the Hadamard product and \(\pmb{D} \in \mathbb{R}^{m \times K n_f}\) is a matrix of scaled differences as defined below:
\[\begin{split}\pmb{D} = \left[\begin{array}{ccccc} | & & | & & | \\ \dfrac{\pmb{f}_{*1}-\mu_{11}}{\omega_{11}} & \cdots & \dfrac{\pmb{f}_{*1}-\mu_{K1}}{\omega_{K1}} & \cdots & \dfrac{\pmb{f}_{*n_f}-\mu_{Kn_f}}{\omega_{Kn_f}} \\ | & & | & & | \end{array}\right]\end{split}\]
For initialization, the mean \(\mu_j\) and the standard deviation \(\sigma_j\) of each \(j\)-th feature are assumed to be known. With this information, it is possible to initialize the columns of the matrix \(\pmb{M}\) by taking \(K\) linearly-spaced samples from the interval \([\mu_j - 3\sigma_j, \mu_j + 3\sigma_j]\). Besides, the matrix \(\pmb{\Omega}\) can be initialized by drawing samples from a uniform distribution \(x \sim U(-b, b)\) such that \(\forall 1 \leq k \leq K,\, \omega_{kj} = a + \sigma_j(b+x)\). Typically, \(a=10^{-2}\) and \(b=1\).
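The initialization described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the layer's actual implementation: the function name `init_rbf_matrices` and its `seed` argument are hypothetical, and \(k\) is taken to range over all \(K\) kernels.

```python
import numpy as np


def init_rbf_matrices(means, stdevs, num_kernels, a=1e-2, b=1.0, seed=None):
    """Hypothetical helper: build M and Omega, both of shape (K, n_f)."""
    mu = np.asarray(means, dtype=float)      # per-feature means, shape (n_f,)
    sigma = np.asarray(stdevs, dtype=float)  # per-feature stdevs, shape (n_f,)
    # Column j of M holds K linearly spaced centers
    # in [mu_j - 3 sigma_j, mu_j + 3 sigma_j].
    M = np.linspace(mu - 3.0 * sigma, mu + 3.0 * sigma, num_kernels)
    # omega_kj = a + sigma_j * (b + x), with x drawn from U(-b, b).
    rng = np.random.default_rng(seed)
    x = rng.uniform(-b, b, size=M.shape)
    Omega = a + sigma * (b + x)
    return M, Omega


# Example: two features, five kernels per feature.
M, Omega = init_rbf_matrices([0.0, 10.0], [1.0, 2.0], num_kernels=5, seed=0)
# M and Omega both have shape (K, n_f) = (5, 2).
```

Since \(b + x\) is never negative, every \(\omega_{kj}\) in this sketch is at least \(a\), which keeps the kernel sizes away from zero.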
- Variables:
means (list or tuple or np.ndarray) – The mean value for each feature to be processed.
stdevs (list or tuple or np.ndarray) – The standard deviation for each feature to be processed.
num_feats (int) – The number of features.
num_kernels (int) – The number of kernels per feature.
a (float) – The offset or intercept for the kernel’s sizes.
b (float) – The parameter governing the uniform distribution.
trainable_M (bool) – Whether the matrix of centers is trainable (True) or not (False).
trainable_Omega (bool) – Whether the matrix of kernel’s sizes is trainable (True) or not (False).
kernel_function_type (str) – The type of kernel function. Supported functions are “Gaussian” and “Markov”.
M (tf.Tensor) – The matrix of kernel’s centers.
built_M (bool) – Whether the matrix of kernel’s centers has been built (True) or not (False).
Omega (tf.Tensor) – The matrix of kernel’s sizes.
built_Omega (bool) – Whether the matrix of kernel’s sizes has been built (True) or not (False).
- __init__(num_kernels, means, stdevs, a=0.01, b=1, kernel_function_type='Gaussian', trainable_M=True, trainable_Omega=True, built_M=False, built_Omega=False, **kwargs)
See Layer and layer.Layer.__init__().
- build(dim_in)
Build the \(\pmb{M} \in \mathbb{R}^{K \times n_f}\) and \(\pmb{\Omega} \in \mathbb{R}^{K \times n_f}\) matrices representing the feature processing kernel’s centers and sizes (curvatures), respectively.
See Layer and layer.Layer.build().
- call(inputs, training=False, mask=False)
Compute the \(\pmb{Y} \in \mathbb{R}^{m \times Kn_f}\) output matrix.
- Returns:
The processed output features.
- Return type:
tf.Tensor
- compute_gaussian_kernel(F)
Compute a Gaussian kernel function.
\[y_{ip} = \exp\left[ - \dfrac{(f_{ij} - \mu_{kj})^2}{\omega_{kj}^2} \right]\]
- Parameters:
F – The feature space matrix.
- Returns:
The computed Gaussian kernel function.
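For illustration, the Gaussian kernel can be sketched with NumPy broadcasting. The function name `gaussian_rbf_features` is hypothetical, and the column ordering (all kernels of the first feature, then all kernels of the second, and so on) is an assumption read from the definition of \(\pmb{Y}\); the layer itself works on tf.Tensor objects and may be implemented differently.

```python
import numpy as np


def gaussian_rbf_features(F, M, Omega):
    """Hypothetical sketch. F: (m, n_f); M, Omega: (K, n_f). Returns (m, K * n_f)."""
    F = np.asarray(F, dtype=float)
    M = np.asarray(M, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    # Scaled differences, broadcast to shape (m, K, n_f):
    # D[i, k, j] = (f_ij - mu_kj) / omega_kj
    D = (F[:, None, :] - M[None, :, :]) / Omega[None, :, :]
    Y = np.exp(-D * D)
    # Move the feature axis before the kernel axis, then flatten, so the
    # columns run over the K kernels of feature 1, then feature 2, etc.
    return Y.transpose(0, 2, 1).reshape(F.shape[0], -1)


# Example: one point, two features, two kernels per feature.
Y = gaussian_rbf_features([[0.0, 1.0]], [[0.0, 1.0], [1.0, 3.0]], np.ones((2, 2)))
```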
- compute_markov_kernel(F)
Compute a Markov kernel function.
\[y_{ip} = \exp\left[ - \dfrac{\lvert{f_{ij} - \mu_{kj}}\rvert}{\omega_{kj}^2} \right]\]
- Parameters:
F – The feature space matrix.
- Returns:
The computed Markov kernel function.
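The Markov kernel differs from the Gaussian one only in using the absolute difference instead of the squared difference, while still dividing by \(\omega_{kj}^2\). A minimal NumPy sketch follows; the function name `markov_rbf_features` and the column ordering (kernels grouped by feature) are assumptions, not the layer's actual code.

```python
import numpy as np


def markov_rbf_features(F, M, Omega):
    """Hypothetical sketch. F: (m, n_f); M, Omega: (K, n_f). Returns (m, K * n_f)."""
    F = np.asarray(F, dtype=float)
    M = np.asarray(M, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    # Absolute differences, broadcast to shape (m, K, n_f).
    abs_diff = np.abs(F[:, None, :] - M[None, :, :])
    # y = exp(-|f_ij - mu_kj| / omega_kj^2)
    Y = np.exp(-abs_diff / Omega[None, :, :] ** 2)
    # Group the output columns by feature, with the K kernels of each
    # feature stored contiguously.
    return Y.transpose(0, 2, 1).reshape(F.shape[0], -1)


# Example: one point, two features, two kernels per feature.
Y = markov_rbf_features([[0.0, 1.0]], [[0.0, 1.0], [1.0, 3.0]], 2.0 * np.ones((2, 2)))
```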
- get_config()
Return the data necessary to serialize the layer.
- classmethod from_config(config)
Use the given config data to deserialize the layer.
- export_representation(dir_path, out_prefix=None)
Export a set of files representing the state of the kernel.
- Parameters:
dir_path (str) – The directory where the representation files will be exported.
out_prefix (str) – The output prefix to name the output files.
- Returns:
Nothing at all, but the representation is exported as a set of files inside the given directory.