Releases: kozistr/pytorch_optimizer
pytorch-optimizer v3.2.0
Change Log
Feature
- Implement `SOAP` optimizer. (#275)
- Support `AdEMAMix` variants. (#276)
    - `bnb_ademamix8bit`, `bnb_ademamix32bit`, `bnb_paged_ademamix8bit`, `bnb_paged_ademamix32bit`
- Support 8/4-bit and fp8 optimizers. (#208, #281)
    - `torchao_adamw8bit`, `torchao_adamw4bit`, `torchao_adamwfp8`
- Support a module-name-level (e.g. `LayerNorm`) weight decay exclusion for `get_optimizer_parameters` (see the sketch below). (#282, #283)
- Implement `CPUOffloadOptimizer`, which offloads the optimizer to the CPU for single-GPU training. (#284)
- Support a regex-based filter for searching the names of optimizers, lr schedulers, and loss functions.
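A minimal sketch tying the new parameter-grouping and registry-search features together. The positional argument order of `get_optimizer_parameters`, the `'adamp'` registry key, and the filter argument of `get_supported_optimizers` are assumptions drawn from this release note and the README, not a verified reference.

```python
# Hedged sketch: module-name-level weight decay exclusion (#282, #283) plus
# the regex-based registry search. Argument order and names are assumptions.
import torch
from pytorch_optimizer import get_optimizer_parameters, get_supported_optimizers, load_optimizer

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.LayerNorm(32),   # excluded from weight decay by module name below
    torch.nn.Linear(32, 4),
)

# Parameters under `LayerNorm` (and any bias) land in the weight_decay=0.0 group.
param_groups = get_optimizer_parameters(model, 1e-2, ['bias', 'LayerNorm'])

optimizer = load_optimizer('adamp')(param_groups, lr=1e-3)

# Regex-based search over the optimizer registry (new in v3.2.0).
print(get_supported_optimizers('ademamix'))
```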
Bug
Contributions
thanks to @Vectorrent
pytorch-optimizer v3.1.2
pytorch-optimizer v3.1.1
pytorch-optimizer v3.1.0
Change Log
Feature
- Implement `AdaLomo` optimizer. (#258)
- Support `Q-GaLore` optimizer. (#258)
    - Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
    - You can use it via `optimizer = load_optimizer('q_galore_adamw8bit')`.
- Support more bnb optimizers. (#258)
    - `bnb_paged_adam8bit`, `bnb_paged_adamw8bit`, `bnb_*_*32bit`.
- Improve `power_iteration()` speed by up to 40%. (#259)
- Improve `reg_noise()` (E-MCMC) speed by up to 120%. (#260)
- Support the `disable_lr_scheduler` parameter for the `Ranger21` optimizer to disable its built-in learning rate scheduler (see the sketch below). (#261)
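A minimal sketch of the new `disable_lr_scheduler` flag from #261. The `'ranger21'` registry key and the `num_iterations` keyword (which Ranger21 uses for its built-in warm-up/warm-down schedule, as I recall) are assumptions; check the docs for the exact signature.

```python
# Hedged sketch: Ranger21 with its built-in lr schedule turned off (#261).
import torch
from pytorch_optimizer import load_optimizer

model = torch.nn.Linear(10, 2)

Ranger21 = load_optimizer('ranger21')
optimizer = Ranger21(
    model.parameters(),
    num_iterations=1_000,        # total training steps used by the built-in scheduler
    lr=1e-3,
    disable_lr_scheduler=True,   # new in v3.1.0: keep the learning rate fixed at `lr`
)
```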
Refactor
- Refactor `AdamMini` optimizer. (#258)
- Deprecate the optional dependency, `bitsandbytes`. (#258)
- Move the `get_rms` and `approximate_sq_grad` functions to `BaseOptimizer` for reusability. (#258)
- Refactor `shampoo_utils.py`. (#259)
- Add `debias` and `debias_adam` methods to `BaseOptimizer`. (#261)
- Refactor to inherit from `BaseOptimizer` only, instead of multiple classes. (#261)
Bug
- Fix several bugs in the `AdamMini` optimizer. (#257)
Contributions
thanks to @sdbds
pytorch-optimizer v3.0.2
pytorch-optimizer v3.0.1
Change Log
Feature
- Implement `FAdam` optimizer. (#241, #242)
- Tweak `AdaFactor` optimizer (see the sketch after this list). (#236, #243)
    - Support not using first momentum when `beta1` is not given.
    - Default dtype for the first momentum is `bfloat16`.
    - Clip the second momentum to 0.999.
- Implement `GrokFast` optimizer. (#244, #245)
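A minimal sketch of the AdaFactor tweak in #243 (no first momentum when `beta1` is not given). Expressing "beta1 is not given" as `betas=(None, 0.999)` is an assumption based on this note, not a verified interface.

```python
# Hedged sketch: AdaFactor without first momentum, per the #243 change note.
import torch
from pytorch_optimizer import load_optimizer

model = torch.nn.Linear(10, 2)

AdaFactor = load_optimizer('adafactor')
optimizer = AdaFactor(model.parameters(), lr=1e-3, betas=(None, 0.999))
```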
Bug
- Wrong typing of `reg_noise`. (#239, #240)
- `Lookahead`'s `param_groups` attribute is not loaded from checkpoint. (#237, #238)
Contributions
thanks to @michaldyczko
pytorch-optimizer v3.0.0
Change Log
The major version is updated! (v2.12.0 -> v3.0.0) (#164)

Many optimizers, learning rate schedulers, and objective functions are in `pytorch-optimizer`. Currently, `pytorch-optimizer` supports 67 optimizers (+ `bitsandbytes`), 11 lr schedulers, and 13 loss functions, and has reached about 4 ~ 50K downloads / month (peak is 75K downloads / month)!

The reason for updating the major version from `v2` to `v3` is that I think it's a good time to ship the recent implementations (the last update was about 7 months ago) and to plan a pivot to new concepts like training utilities while maintaining the original features (e.g. optimizers). Also, rich test cases, benchmarks, and examples are on the list!

Finally, thanks for using `pytorch-optimizer`, and feel free to make any requests :)
Feature
- Implement `REX` lr scheduler. (#217, #222)
- Implement `Aida` optimizer (loading it by name is sketched after this list). (#220, #221)
- Implement `WSAM` optimizer. (#213, #216)
- Implement `GaLore` optimizer. (#224, #228)
- Implement `Adalite` optimizer. (#225, #229)
- Implement `bSAM` optimizer. (#212, #233)
- Implement `Schedule-Free` optimizer. (#230, #233)
- Implement `EMCMC`. (#231, #233)
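A minimal sketch of loading one of the v3.0.0 additions by name. The `'aida'` registry key is an assumption; `get_supported_optimizers()` lists the exact strings available in your installed version.

```python
# Hedged sketch: load a newly added optimizer by its registry name.
import torch
from pytorch_optimizer import get_supported_optimizers, load_optimizer

model = torch.nn.Linear(10, 2)

print(get_supported_optimizers())  # includes the optimizers added in this release

Aida = load_optimizer('aida')
optimizer = Aida(model.parameters(), lr=1e-3)
```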
Fix
- Fix `SRMM` to allow operation beyond `memory_length`. (#227)
Dependency
- Drop `Python 3.7` support officially. (#221)
    - Please check the README.
- Update `bitsandbytes` to `0.43.0`. (#228)
Docs
- Add missing parameters in the `Ranger21` optimizer document. (#214, #215)
- Fix the `WSAM` optimizer paper link. (#219)
Contributions
Diff
- from the previous major version: 2.0.0...3.0.0
- from the previous version: 2.12.0...3.0.0
pytorch-optimizer v2.12.0
Change Log
Feature
- Support `bitsandbytes` optimizer. (#211)
    - Now you can install with `pip3 install pytorch-optimizer[bitsandbytes]`.
    - Supports 8 bnb optimizers: `bnb_adagrad8bit`, `bnb_adam8bit`, `bnb_adamw8bit`, `bnb_lion8bit`, `bnb_lamb8bit`, `bnb_lars8bit`, `bnb_rmsprop8bit`, `bnb_sgd8bit` (see the usage sketch below).
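A minimal sketch of using one of the bnb-backed optimizers from #211. It requires the optional extra shown above and a CUDA device, since the 8-bit optimizers come from `bitsandbytes`; the `'bnb_adamw8bit'` key is taken from the list above.

```python
# Hedged sketch: loading a bitsandbytes-backed 8-bit optimizer (#211).
import torch
from pytorch_optimizer import load_optimizer

model = torch.nn.Linear(10, 2).cuda()

AdamW8bit = load_optimizer('bnb_adamw8bit')
optimizer = AdamW8bit(model.parameters(), lr=1e-3)
```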
Docs
- Introduce `mkdocs` with the `material` theme. (#204, #206)
    - Documentation: https://pytorch-optimizers.readthedocs.io/en/latest/
Diff
pytorch-optimizer v2.11.2
Change Log
Feature
- Implement DAdaptLion optimizer (#203)
Fix
- Fix Lookahead optimizer (#200, #201, #202)
    - when using PyTorch Lightning, which expects your optimiser to be a subclass of `Optimizer`
- Fix default `rectify` to `False` in `AdaBelief` optimizer (see the sketch below) (#203)
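A minimal sketch of the default change in #203: `rectify` now defaults to `False`, so the rectified (RAdam-style) update must be requested explicitly. The `'adabelief'` registry key and the `rectify` keyword follow this note; treat them as assumptions.

```python
# Hedged sketch: opting back into AdaBelief's rectified update after #203.
import torch
from pytorch_optimizer import load_optimizer

model = torch.nn.Linear(10, 2)

AdaBelief = load_optimizer('adabelief')
optimizer = AdaBelief(model.parameters(), lr=1e-3, rectify=True)  # no longer the default
```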
Test
- Add `DynamicLossScaler` test case
Docs
- Highlight the code blocks
- Fix pepy badges
Contributions
thanks to @georg-wolflein
Diff
pytorch-optimizer v2.11.1
Change Log
Feature
- Implement Tiger optimizer (#192)
- Implement CAME optimizer (#196)
- Implement loss functions (#198); see the sketch after this list
    - Tversky Loss: Tversky loss function for image segmentation using 3D fully convolutional deep networks
    - Focal Tversky Loss
    - Lovasz Hinge Loss: The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
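A minimal sketch of one of the segmentation losses added in #198. The `TverskyLoss` class name and its top-level export are assumptions drawn from this note; whether it expects raw predictions or probabilities, and its alpha/beta defaults, should be checked against the documentation.

```python
# Hedged sketch: computing a Tversky loss on dummy binary masks (#198).
import torch
from pytorch_optimizer import TverskyLoss

criterion = TverskyLoss()  # library defaults for alpha/beta (assumed)

predictions = torch.rand(4, 1, 32, 32, requires_grad=True)   # predicted masks
targets = torch.randint(0, 2, (4, 1, 32, 32)).float()        # binary ground truth

loss = criterion(predictions, targets)
loss.backward()
```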