src.model.deeplearn.optimizer.centralized_lamb

Classes

CentralizedLamb([learning_rate, beta_1, ...])

Stochastic gradient descent with layer-wise adaptive moments to tune the parameter-wise learning rate.

class src.model.deeplearn.optimizer.centralized_lamb.CentralizedLamb(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, weight_decay=None, clipnorm=None, clipvalue=None, global_clipnorm=None, use_ema=False, ema_momentum=0.99, ema_overwrite_frequency=None, loss_scale_factor=None, gradient_accumulation_steps=None, name='lamb', **kwargs)

Stochastic gradient descent with layer-wise adaptive moments to tune the parameter-wise learning rate. The optimizer behaves like keras.optimizers.Lamb, except that each gradient is centered before the update is applied.

update_step(gradient, variable, learning_rate)

Center the gradient before the backbone optimizer (Lamb) applies it to update the model’s parameters.

See CentralizedAdam.center_gradients().

Returns: None. The parameters are updated with the centered gradients instead of the original ones.
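
As a rough illustration of what centering a gradient can mean, the sketch below uses NumPy and assumes the common gradient-centralization recipe: for tensors of rank 2 or higher, subtract the mean taken over every axis except the last one, and leave rank-0 and rank-1 tensors (biases, scales) unchanged. The helper name center_gradient is hypothetical; this is not the code of CentralizedAdam.center_gradients(), whose exact behavior may differ.

```python
import numpy as np


def center_gradient(gradient):
    """Hypothetical helper: subtract the mean over all axes but the last.

    Assumption: the last axis is the output axis, as in a Keras Dense
    kernel of shape (inputs, outputs).
    """
    gradient = np.asarray(gradient, dtype=float)
    # Assumption: scalars and vectors (e.g. biases) are not centered.
    if gradient.ndim <= 1:
        return gradient
    reduce_axes = tuple(range(gradient.ndim - 1))
    # keepdims=True lets the mean broadcast back against the gradient.
    return gradient - gradient.mean(axis=reduce_axes, keepdims=True)


# Gradient of a 2x2 kernel: each output column is shifted to zero mean.
kernel_grad = np.array([[1.0, 2.0], [3.0, 6.0]])
centered = center_gradient(kernel_grad)

# A bias gradient passes through unchanged under the assumption above.
bias_grad = np.array([0.5, -0.5])
bias_centered = center_gradient(bias_grad)
```

Under these assumptions, the centered gradient of each output unit sums to zero, and update_step would hand this centered tensor to the Lamb update in place of the raw gradient.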