VeLO: Training Versatile Learned Optimizers by Scaling Up

Does anyone know of a TF implementation of the VeLO learned optimizer?

Supposedly, it is about twice as fast as standard optimizers and doesn’t really require optimizer hyperparameter tuning. Both things I could really use.

I think it was first published in June or July 2022, so I was thinking that someone might have gotten something up and running? I don’t know anything at all about JAX, so I won’t even try to do anything with it :frowning:

Hi @Mog, if you want to implement your own optimizer, you have to subclass _BaseOptimizer and make the necessary changes by overriding these methods:

  • build: Create your optimizer-related variables, such as the momentum variables in the SGD optimizer.
  • update_step: Implement your optimizer’s update logic.
  • get_config: Serialize the optimizer, including all hyperparameters.
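
To make the split between those three methods concrete, here is a minimal framework-free sketch in plain Python. It is not the Keras API: the class name ToyMomentumSGD, the index argument, and the helper minimize_quadratic are all hypothetical and exist only for illustration. A real Keras optimizer would work on tensors and follow the base-class signatures in the linked document.

```python
# Framework-free toy that mirrors the build / update_step / get_config split.
# NOT the Keras API: every name below is hypothetical and for illustration only.

class ToyMomentumSGD:
    def __init__(self, learning_rate=0.1, momentum=0.9):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self._velocities = None  # created lazily in build()

    def build(self, variables):
        # Create optimizer state: one zero-initialized velocity per variable.
        self._velocities = [[0.0] * len(v) for v in variables]

    def update_step(self, gradient, variable, index):
        # Classic momentum: v <- m * v - lr * g, then w <- w + v (in place).
        velocity = self._velocities[index]
        for i, g in enumerate(gradient):
            velocity[i] = self.momentum * velocity[i] - self.learning_rate * g
            variable[i] += velocity[i]

    def apply_gradients(self, grads_and_vars):
        # Build state on first use, then update every variable in turn.
        grads_and_vars = list(grads_and_vars)
        if self._velocities is None:
            self.build([v for _, v in grads_and_vars])
        for index, (g, v) in enumerate(grads_and_vars):
            self.update_step(g, v, index)

    def get_config(self):
        # Everything needed to recreate the optimizer (hyperparameters only).
        return {"learning_rate": self.learning_rate, "momentum": self.momentum}


def minimize_quadratic(steps=200):
    # Minimize f(w) = sum(w_i ** 2), whose gradient is 2 * w.
    w = [3.0, -2.0]
    opt = ToyMomentumSGD(learning_rate=0.1, momentum=0.9)
    for _ in range(steps):
        grad = [2.0 * x for x in w]
        opt.apply_gradients([(grad, w)])
    return w, opt
```

Calling minimize_quadratic() should drive both entries of w close to zero, and get_config() returns only the hyperparameters, not the velocity state, which is the same division of responsibilities the three methods above describe.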

For more details, please refer to this document. Thank you.

Ah, looking at the documentation for learned optimizers, I definitely do not know enough to implement them in Keras / TF.
@Kiran_Sai_Ramineni the document you linked to says specifically NOT to subclass _BaseOptimizer and to use Optimizer instead.

https://learned-optimization.readthedocs.io/en/latest/index.html