Release Pre alpha v0.3 · CarperAI/trlx

What's Changed

Download simulacra by @reciprocated in #62
Update documentation (first review) by @simoninithomas in #64
Add ckpt/ to gitignore by @ayulockin in #70
Change version in package to match lib by @cat-state in #73
Docs by @shahbuland in #71
[fix] Remove stale options from ppo_gptj.yml by @jon-tow in #77
Add entity name config for wandb logging by @jon-tow in #78
EXAMPLE : Interpreter grounded Neural Program Synthesis [WIP] by @reshinthadithyan in #81
Update TrainConfig optimizer hyperparameters by @jon-tow in #82
Add examples tip to contribution guide by @jon-tow in #84
Fix pipeline's context overflow by @reciprocated in #87
Refactor PPO objective function by @jon-tow in #88
Fix slow ilql eval by @reciprocated in #91
Rerun #89 by @cat-state in #92
Hyperparameter Optimization with Ray Tune and Weights and Biases by @ayulockin in #76
Update readme instructions by @reciprocated in #93
Update README to align nomenclature correctness by @ayulockin in #97
Add optional reward scaling by @reciprocated in #95
Force class registry via imports by @jon-tow in #100
Add optional normalization (cont.) by @reciprocated in #98
Restructure sweeps for reuse by @reciprocated in #102

Full Changelog: v0.2...v0.3