While deep learning models have replaced hand-designed features across many domains,
these models are still trained with hand-designed optimizers. In this work, we leverage the same
scaling approach behind the success of deep learning to learn versatile optimizers. We train an
optimizer for deep learning which is itself a small neural network that ingests gradients and
outputs parameter updates. Meta-trained with approximately four thousand TPU-months of
compute on a wide variety of optimization tasks, our optimizer not only exhibits compelling
performance, but optimizes in interesting and unexpected ways. It requires no hyperparameter
tuning, instead automatically adapting to the specifics of the problem being optimized. We open
source our learned optimizer, meta-training code, the associated train and test data, and an
extensive optimizer benchmark suite with baselines at velo-code.github.io.
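To make the core idea concrete, here is a minimal conceptual sketch of an optimizer that is itself a small neural network ingesting per-parameter gradients and emitting updates. This is not VeLO's actual architecture (which uses per-tensor LSTMs and a hypernetwork); the feature choices and names below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class TinyLearnedOptimizer(nn.Module):
    """Toy learned optimizer: maps per-parameter features to a parameter update."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        # Two per-parameter input features (gradient, momentum) -> scalar update.
        self.net = nn.Sequential(
            nn.Linear(2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, grad: torch.Tensor, momentum: torch.Tensor) -> torch.Tensor:
        # Stack features per parameter: (num_params, 2) -> (num_params, 1).
        feats = torch.stack([grad.flatten(), momentum.flatten()], dim=-1)
        return self.net(feats).view_as(grad)

# Usage: the predicted update replaces a hand-designed rule such as SGD's -lr * grad.
param = torch.randn(10, requires_grad=True)
loss = (param ** 2).sum()
loss.backward()

opt_net = TinyLearnedOptimizer()
momentum = torch.zeros_like(param)
with torch.no_grad():
    param -= 1e-3 * opt_net(param.grad, momentum)
```

In VeLO this network is meta-trained across many tasks, so its weights encode the "hyperparameters" that would otherwise be tuned by hand.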
Actually, I have read the paper and the reference implementation before. However, I'm not confident I can implement the VeLO optimizer both effectively and in a PyTorch-friendly way. But I'll try! (See the rough sketch below the links.)
https://arxiv.org/abs/2211.09760
https://github.com/google/learned_optimization/tree/main/learned_optimization/research/general_lopt
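As a starting point, a PyTorch-friendly port would probably be a `torch.optim.Optimizer` subclass that delegates the update computation to the learned network. The sketch below assumes a hypothetical `update_fn` callable standing in for the VeLO network; it is not part of the `google/learned_optimization` API, and the real port would need to carry the per-tensor recurrent state and features that VeLO uses.

```python
import torch
from torch.optim import Optimizer

class LearnedOptimizerWrapper(Optimizer):
    """Hypothetical wrapper: routes each parameter's gradient through a learned update rule."""
    def __init__(self, params, update_fn, lr: float = 1.0):
        defaults = dict(lr=lr)
        super().__init__(params, defaults)
        # update_fn(param, grad, state) -> update tensor; stands in for the learned network.
        self.update_fn = update_fn

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]  # per-parameter state (e.g. momentum, accumulators)
                update = self.update_fn(p, p.grad, state)
                p.add_(update, alpha=-group["lr"])
        return loss
```

The harder part is loading the meta-trained VeLO weights (released for JAX) and reproducing its feature computation, which this sketch does not attempt.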