Pre alpha v0.3
Pre-release
What's Changed
- Download simulacra by @reciprocated in #62
- Update documentation (first review) by @simoninithomas in #64
- Add ckpt/ to gitignore by @ayulockin in #70
- change version in package to match lib by @cat-state in #73
- Docs by @shahbuland in #71
- [fix] Remove stale options from ppo_gptj.yml by @jon-tow in #77
- Add entity name config for wandb logging by @jon-tow in #78
- EXAMPLE : Interpreter grounded Neural Program Synthesis [WIP] by @reshinthadithyan in #81
- Update TrainConfig optimizer hyperparameters by @jon-tow in #82
- Add examples tip to contribution guide by @jon-tow in #84
- Fix pipeline's context overflow by @reciprocated in #87
- Refactor PPO objective function by @jon-tow in #88
- Fix slow ilql eval by @reciprocated in #91
- rerun #89 by @cat-state in #92
- Hyperparameter Optimization with Ray Tune and Weights and Biases by @ayulockin in #76
- Update readme instructions by @reciprocated in #93
- Update README to align nomenclature correctness by @ayulockin in #97
- Add optional reward scaling by @reciprocated in #95
- Force class registry via imports by @jon-tow in #100
- Add optional normalization (cont.) by @reciprocated in #98
- Restructure sweeps for reuse by @reciprocated in #102
New Contributors
- @simoninithomas made their first contribution in #64
- @ayulockin made their first contribution in #70
- @reshinthadithyan made their first contribution in #81
Full Changelog: v0.2...v0.3