Hi,
All existing optimizers in Keras (Adam, SGD, Nadam, …) distinguish between tf.IndexedSlices gradients and dense gradients when updating variables in their update_step() function:
def update_step(self, gradient, variable):
    if isinstance(gradient, tf.IndexedSlices):
        # Sparse gradients: update manner 1
        ...
    else:
        # Dense gradients: update manner 2 (a different update manner)
        ...
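To make the question concrete, here is a minimal sketch of where each gradient type comes from and what a separate sparse update could look like. It is only an illustration, not the Keras implementation: the variable `table`, the learning rate `lr`, and the plain SGD-style update are all assumptions chosen for the example.

```python
import tensorflow as tf

# Hypothetical toy variable standing in for an embedding table (4 rows, 2 columns).
table = tf.Variable(tf.ones([4, 2]))
lr = 0.1  # assumed learning rate, for illustration only

# A row lookup touches only rows 0 and 2, so the gradient comes back
# as tf.IndexedSlices (values for the touched rows plus their indices).
with tf.GradientTape() as tape:
    rows = tf.gather(table, [0, 2])
    loss_sparse = tf.reduce_sum(rows)
sparse_grad = tape.gradient(loss_sparse, table)

# An op that uses every element yields an ordinary dense tensor gradient.
with tf.GradientTape() as tape:
    loss_dense = tf.reduce_sum(table * 2.0)
dense_grad = tape.gradient(loss_dense, table)

print(type(sparse_grad).__name__, type(dense_grad).__name__)

# SGD-style sparse update: only the rows listed in sparse_grad.indices change.
table.scatter_sub(tf.IndexedSlices(lr * sparse_grad.values, sparse_grad.indices))

# The dense counterpart would rewrite the whole variable instead:
# table.assign_sub(lr * dense_grad)
```

In this sketch the sparse branch writes to two rows only, while the dense branch would touch all four, which may be part of why the two cases are handled separately.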
I want to write a custom optimizer, and I want to know why I should distinguish between tf.IndexedSlices gradients and dense gradients.
Thank you.