In many cases, it’s useful to train different layer groups with different learning rates. How do we achieve this with the standard keras.optimizers?
I don’t think we have an off-the-shelf solution.
You could probably use a custom function with the optimizer's gradient_transformers argument.
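To make the idea concrete, here is a minimal sketch of such a custom function. It assumes the optimizer version in use accepts a gradient_transformers argument whose functions take and return a list of (gradient, variable) pairs, and that layer groups can be told apart by a variable-name prefix. The helper name make_lr_scaler and the prefixes "backbone" and "head" are hypothetical, and the stand-in objects are only there so the sketch runs without TensorFlow installed.

```python
from types import SimpleNamespace


def make_lr_scaler(multipliers, default=1.0):
    """Build a function that scales gradients per layer group.

    `multipliers` maps a variable-name prefix (hypothetical grouping scheme)
    to a factor applied to that group's gradients.
    """
    def scale(grads_and_vars):
        scaled = []
        for grad, var in grads_and_vars:
            factor = default
            # Use the first prefix that matches the variable name.
            for prefix, mult in multipliers.items():
                if var.name.startswith(prefix):
                    factor = mult
                    break
            # Variables without a gradient are passed through unchanged.
            scaled.append((None if grad is None else grad * factor, var))
        return scaled

    return scale


# Stand-ins for (gradient, variable) pairs; real ones would be tensors
# and Keras variables.
pairs = [
    (2.0, SimpleNamespace(name="backbone/conv1/kernel:0")),
    (2.0, SimpleNamespace(name="head/dense/kernel:0")),
    (None, SimpleNamespace(name="backbone/unused/bias:0")),
]
scale = make_lr_scaler({"backbone": 0.1})
scaled_grads = [g for g, _ in scale(pairs)]
```

If your optimizer supports it, the function would be passed as something like gradient_transformers=[scale] when constructing the optimizer. One caveat: scaling a gradient is equivalent to scaling the learning rate for plain SGD, but not for adaptive optimizers such as Adam, which normalize by gradient magnitude, so there the effect on step size would be much weaker.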