src.model.deeplearn.layer.rbf_feat_processing_layer

Classes

RBFFeatProcessingLayer(*args, **kwargs)

class src.model.deeplearn.layer.rbf_feat_processing_layer.RBFFeatProcessingLayer(*args, **kwargs)

Author:: Alberto M. Esmoris Pena

An RBF feature processing layer is governed by a matrix \(\pmb{M} \in \mathbb{R}^{K \times n_f}\) holding the centers of the \(K\) kernels for each of the \(n_f\) features, and a matrix \(\pmb{\Omega} \in \mathbb{R}^{K \times n_f}\) holding the corresponding kernel sizes.

Each column \(\pmb{\mu}_{*j}\) of the matrix \(\pmb{M}\), together with the corresponding column \(\pmb{\omega}_{*j}\) of the matrix \(\pmb{\Omega}\), defines the \(K\) kernels for the \(j\)-th input feature.

The output of an RBFFeatProcessingLayer consists of a matrix \(\pmb{Y}\in\mathbb{R}^{m \times K n_f}\) with \(K \times n_f\) output features for each of the \(m\) input points.

Let \(\mathcal{Y} \in \mathbb{R}^{K \times m \times n_f}\) be a tensor that can be sliced into \(K\) matrices of \(m\) rows and \(n_f\) columns, each representing the point-wise output features derived from a given kernel. Then, any cell of this tensor can be defined as follows (assuming a Gaussian RBF):

\[y_{kij} = \exp\left[-\dfrac{ (f_{ij} - \mu_{kj})^2 }{ \omega_{kj}^2 }\right]\]

Now, the output matrix \(\pmb{Y}\) can be simply defined by reorganizing the tensor \(\mathcal{Y}\) as a matrix such that:

\[\begin{split}\pmb{Y} = \left[\begin{array}{ccccc} | & & | & & | \\ \pmb{y}_{1*1} & \cdots & \pmb{y}_{K*1} & \cdots & \pmb{y}_{K*n_f} \\ | & & | & & | \end{array}\right]\end{split}\]

It is also convenient to express the output matrix as a component-wise exponential \(\pmb{Y} = \exp\left[- \pmb{D} \odot \pmb{D}\right]\), where \(\odot\) is the Hadamard product and \(\pmb{D} \in \mathbb{R}^{m \times K n_f}\) is a matrix of scaled differences as defined below:

\[\begin{split}\pmb{D} = \left[\begin{array}{ccccc} | & & | & & | \\ \dfrac{\pmb{f}_{*1}-\mu_{11}}{\omega_{11}} & \cdots & \dfrac{\pmb{f}_{*1}-\mu_{K1}}{\omega_{K1}} & \cdots & \dfrac{\pmb{f}_{*n_f}-\mu_{Kn_f}}{\omega_{Kn_f}} \\ | & & | & & | \end{array}\right]\end{split}\]
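The matrix form above can be illustrated with a short NumPy sketch. It is not the layer's TensorFlow implementation; the function name `rbf_gaussian_features` is hypothetical, and the column ordering (the \(K\) kernels of the first feature, then the \(K\) kernels of the second feature, and so on) is assumed from the layout of \(\pmb{Y}\) and \(\pmb{D}\) given in this section.

```python
import numpy as np


def rbf_gaussian_features(F, M, Omega):
    """Hypothetical helper: Y = exp(-D * D) for F (m, n_f), M and Omega (K, n_f)."""
    F = np.asarray(F, dtype=float)
    M = np.asarray(M, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    m, n_f = F.shape
    K = M.shape[0]
    # D3[i, j, k] = (f_ij - mu_kj) / omega_kj, built by broadcasting
    D3 = (F[:, :, None] - M.T[None, :, :]) / Omega.T[None, :, :]
    # Flatten (j, k) with k varying fastest: 0-based column p = j * K + k
    D = D3.reshape(m, n_f * K)
    # Component-wise exponential of the negated Hadamard square
    return np.exp(-D * D)


F = np.array([[0.0, 1.0], [1.0, 3.0]])      # m = 2 points, n_f = 2 features
M = np.array([[0.0, 1.0], [1.0, 2.0]])      # K = 2 kernels per feature
Omega = np.ones((2, 2))
Y = rbf_gaussian_features(F, M, Omega)      # shape (2, 4)
```

With 0-based indices, the value for point `i`, feature `j`, and kernel `k` lands in column `j * K + k` of `Y`.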

For initialization, the mean \(\mu_j\) and the standard deviation \(\sigma_j\) of each \(j\)-th feature are assumed to be known (the feature mean \(\mu_j\) is distinct from the kernel centers \(\mu_{kj}\)). With this information, it is possible to initialize the \(j\)-th column of the matrix \(\pmb{M}\) by taking \(K\) linearly spaced samples from the interval \([\mu_j - 3\sigma_j, \mu_j + 3\sigma_j]\). Similarly, the matrix \(\pmb{\Omega}\) can be initialized from samples of a uniform distribution \(x \sim U(-b, b)\) such that \(\forall 1 \leq k \leq K,\, \omega_{kj} = a + \sigma_j(b+x)\). Typically, \(a=10^{-2}\) and \(b=1\).
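The initialization described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the helper name `init_rbf_params` and its `seed` argument are hypothetical, and an independent uniform sample is assumed for every entry of \(\pmb{\Omega}\), which the text does not state explicitly.

```python
import numpy as np


def init_rbf_params(means, stdevs, num_kernels, a=1e-2, b=1.0, seed=None):
    """Hypothetical helper returning (M, Omega), each of shape (K, n_f)."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    stdevs = np.asarray(stdevs, dtype=float)
    # K linearly spaced centers in [mu_j - 3 sigma_j, mu_j + 3 sigma_j] per feature
    M = np.linspace(means - 3.0 * stdevs, means + 3.0 * stdevs, num_kernels)
    # Assumption: one independent sample x ~ U(-b, b) per kernel size
    x = rng.uniform(-b, b, size=M.shape)
    # omega_kj = a + sigma_j * (b + x), so every size is at least a
    Omega = a + stdevs[None, :] * (b + x)
    return M, Omega


M0, Omega0 = init_rbf_params([0.0, 10.0], [1.0, 2.0], num_kernels=5, seed=0)
```

Because \(b + x \geq 0\), the offset \(a\) acts as a lower bound on the kernel sizes, which keeps the division by \(\omega_{kj}\) well defined.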

Variables:

means (list or tuple or np.ndarray) – The mean value for each feature to be processed.
stdevs (list or tuple or np.ndarray) – The standard deviation for each feature to be processed.
num_feats (int) – The number of features.
num_kernels (int) – The number of kernels per feature.
a (float) – The offset or intercept for the kernel’s sizes.
b (float) – The parameter governing the uniform distribution.
trainable_M (bool) – Whether the matrix of centers is trainable (True) or not (False).
trainable_Omega (bool) – Whether the matrix of kernel’s sizes is trainable (True) or not (False).
kernel_function_type (str) – The type of kernel function. Supported functions are “Gaussian” and “Markov”.
M (tf.Tensor) – The matrix of kernel’s centers.
built_M (bool) – Whether the matrix of kernel’s centers has been built (True) or not (False).
Omega (tf.Tensor) – The matrix of kernel’s sizes.
built_Omega (bool) – Whether the matrix of kernel’s sizes has been built (True) or not (False).

__init__(num_kernels, means, stdevs, a=0.01, b=1, kernel_function_type='Gaussian', trainable_M=True, trainable_Omega=True, built_M=False, built_Omega=False, **kwargs): See Layer and layer.Layer.__init__().

build(dim_in)

Build the \(\pmb{M} \in \mathbb{R}^{K \times n_f}\) and \(\pmb{\Omega} \in \mathbb{R}^{K \times n_f}\) matrices representing the feature processing kernel’s centers and sizes (curvatures), respectively.

See Layer and layer.Layer.build().

call(inputs, training=False, mask=False)

Compute the \(\pmb{Y} \in \mathbb{R}^{m \times Kn_f}\) output matrix.

Returns:: The processed output features.
Return type:: tf.Tensor

compute_gaussian_kernel(F)

Compute a Gaussian kernel function.

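\[y_{ip} = \exp\left[ - \dfrac{(f_{ij} - \mu_{kj})^2}{\omega_{kj}^2} \right]\]

Here \(p = (j-1)K + k\) indexes the column of \(\pmb{Y}\) that holds the \(k\)-th kernel of the \(j\)-th feature.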

Parameters:: F – The feature space matrix.
Returns:: The computed Gaussian kernel function.

compute_markov_kernel(F)

Compute a Markov kernel function.

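\[y_{ip} = \exp\left[ - \dfrac{\lvert f_{ij} - \mu_{kj} \rvert}{\omega_{kj}^2} \right]\]

Here \(p = (j-1)K + k\) indexes the column of \(\pmb{Y}\) that holds the \(k\)-th kernel of the \(j\)-th feature.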

Parameters:: F – The feature space matrix.
Returns:: The computed Markov kernel function.
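As a minimal illustration of the Markov kernel formula, the following NumPy sketch evaluates it for a small input. It is not the layer's TensorFlow code; the name `rbf_markov_features` is hypothetical, and the output columns are assumed to be ordered with the kernel index varying fastest within each feature.

```python
import numpy as np


def rbf_markov_features(F, M, Omega):
    """Hypothetical helper: exp(-|f_ij - mu_kj| / omega_kj^2), shape (m, K * n_f)."""
    F = np.asarray(F, dtype=float)
    M = np.asarray(M, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    m, n_f = F.shape
    K = M.shape[0]
    # abs_diff[i, j, k] = |f_ij - mu_kj|
    abs_diff = np.abs(F[:, :, None] - M.T[None, :, :])
    # Unlike the Gaussian case, the numerator is not squared
    Y3 = np.exp(-abs_diff / (Omega.T[None, :, :] ** 2))
    # 0-based output column p = j * K + k
    return Y3.reshape(m, n_f * K)


F_demo = np.array([[0.0], [2.0]])       # m = 2 points, n_f = 1 feature
M_demo = np.array([[0.0], [1.0]])       # K = 2 kernel centers
Omega_demo = np.array([[1.0], [2.0]])   # K = 2 kernel sizes
Y_demo = rbf_markov_features(F_demo, M_demo, Omega_demo)  # shape (2, 2)
```

Since the absolute difference replaces the squared difference, the response decays exponentially rather than as a Gaussian away from each center.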

get_config(): Return necessary data to serialize the layer

classmethod from_config(config): Use the given config data to deserialize the layer.

export_representation(dir_path, out_prefix=None)

Export a set of files representing the state of the kernel.

Parameters:

dir_path (str) – The directory where the representation files will be exported.
out_prefix (str) – The output prefix to name the output files.

Returns:

Nothing. The representation is exported as a set of files inside the given directory.