src.model.deeplearn.optimizer.centralized_adamw
Classes
CentralizedAdamW: Adam optimizer with extra weight decay and centralized gradients.
- class src.model.deeplearn.optimizer.centralized_adamw.CentralizedAdamW(learning_rate=0.001, weight_decay=0.004, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, clipnorm=None, clipvalue=None, global_clipnorm=None, use_ema=False, ema_momentum=0.99, ema_overwrite_frequency=None, loss_scale_factor=None, gradient_accumulation_steps=None, name='adamw', **kwargs)
Adam optimizer with extra weight decay and centralized gradients. The optimizer behaves like keras.optimizers.AdamW, except that each gradient is centered before the update is applied.
- update_step(gradient, variable, learning_rate)
Center the gradient before handing it to the backbone optimizer, which then uses it to update the model’s parameters. See CentralizedAdam.center_gradients().
- Returns:
Nothing. The parameters are updated with the centered gradients instead of the original ones.
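The centering step itself is not spelled out here, so the following is only a minimal, dependency-free sketch of what it might look like. It assumes the usual gradient centralization recipe: the mean is taken over every axis except the last, so that each output column of a 2-D kernel gradient has zero mean. The helper name center_gradient is hypothetical and is not the actual CentralizedAdam.center_gradients() implementation.

```python
def center_gradient(grad):
    """Center a 2-D gradient given as a list of rows (shape: inputs x outputs).

    Assumption (not confirmed by this page): the mean is taken over every
    axis except the last one, so each output column ends up with zero mean.
    """
    n_rows, n_cols = len(grad), len(grad[0])
    # Mean of each column, i.e. the mean over the input axis.
    col_means = [sum(row[j] for row in grad) / n_rows for j in range(n_cols)]
    # Subtract the column mean from every entry of that column.
    return [[row[j] - col_means[j] for j in range(n_cols)] for row in grad]


grad = [[1.0, 2.0], [3.0, 6.0]]
centered = center_gradient(grad)
print(centered)  # [[-1.0, -2.0], [1.0, 2.0]]
```

Under this assumption, update_step would apply such a transformation to the incoming gradient and then delegate to the AdamW update with the centered values.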