CaffeRanger

Implementation of Solver(Optimizer) Ranger (Radam + look ahead)

Radam : On the Variance of the Adaptive Learning Rate and Beyond
Look ahead : Lookahead Optimizer: k steps forward, 1 step back
Ptorch Version : https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer

caffe.proto

optional float ranger_alpha = 43 [default = 0.5];
optional int32 ranger_k_thres = 44 [default = 6];
optional float ranger_n_sma_threshold = 45 [default = 5.0];
optional bool ranger_use_radam = 45 [default = true];
optional bool ranger_use_lookahead = 45 [default = true];

Here use ranger_use_lookahead (ranger_use_radam has not decided where to use) to switch between radam and ranger because when using l1 training, the training error will increase.
It should because the lookahead is not a soft gradient when using l1. The loss between fast_move and slow_move may get a high loss and the model will confuse where to go.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
include		include
src		src
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CaffeRanger

About

Releases

Packages

Languages

ricky40403/CaffeRanger

Folders and files

Latest commit

History

Repository files navigation

CaffeRanger

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages