Hi, according to this issue, VecNormalize's gamma should match the gamma of the RL algorithm (e.g., gamma=0.99 should be used in both PPO2 and VecNormalize) to ensure a consistent sliding window size. However, it seems the normalization arguments used in create_env are always the defaults read from the .yml file (i.e., gamma=0.99 by default), although gamma has different candidate values in hyperparams_opt.py.
The same applies to rl-baselines3-zoo. Is this a bug? Should create_env take the changed gamma into account when initializing VecNormalize for each trial? Please give me a hint if I missed anything, thank you!
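To make the role of gamma concrete, here is a minimal Python sketch of reward normalization based on a running discounted return. It is a simplified illustration, not the library's actual implementation: the class name `RunningReturnNormalizer`, its attributes, and the Welford-style variance update are assumptions made for this example. The tracked return weights a reward from k steps ago by gamma^k, which is why gamma acts like an effective window size.

```python
import math


class RunningReturnNormalizer:
    """Hypothetical, simplified reward normalizer (illustration only)."""

    def __init__(self, gamma=0.99, epsilon=1e-8):
        self.gamma = gamma
        self.epsilon = epsilon
        self.ret = 0.0    # running discounted return
        self.count = 0    # number of returns seen so far
        self.mean = 0.0   # running mean of the returns
        self.m2 = 0.0     # running sum of squared deviations

    def normalize(self, reward, done=False):
        # Older rewards are discounted by gamma at every step.
        self.ret = self.ret * self.gamma + reward

        # Welford's online update for the variance of the returns.
        self.count += 1
        delta = self.ret - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (self.ret - self.mean)
        var = self.m2 / self.count

        # Scale the raw reward by the standard deviation of the returns.
        scaled = reward / math.sqrt(var + self.epsilon)

        # The discounted return starts over at the end of an episode.
        if done:
            self.ret = 0.0
        return scaled
```

In this sketch, a gamma of 0.99 lets past rewards fade over roughly 1 / (1 - gamma) = 100 steps, while a gamma of 0.9 shortens that to roughly 10 steps. If the algorithm and the normalizer use different values, the two therefore look at returns over different horizons.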
Overall, it should not make a big difference as the main point is to normalize the reward magnitude.
But for consistency, I agree that gamma should be updated.
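One way the update could look is sketched below in Python. This is only an assumption about the shape of a fix, not the change made in the repository: the helper name `sync_normalize_gamma` and the two dictionaries are hypothetical, and the way `create_env` actually receives its normalization arguments is not shown in this thread. The only library fact relied on is that `VecNormalize` accepts a `gamma` keyword argument.

```python
def sync_normalize_gamma(normalize_kwargs, hyperparams):
    """Return a copy of normalize_kwargs whose gamma follows the algorithm's gamma.

    Both arguments are plain dicts: `normalize_kwargs` stands for the
    normalization arguments read from the .yml file, and `hyperparams`
    for the hyperparameters sampled for the current trial (hypothetical names).
    """
    kwargs = dict(normalize_kwargs)  # copy, so the .yml defaults stay untouched
    if "gamma" in hyperparams:
        # Use the trial's gamma instead of the default from the .yml file.
        kwargs["gamma"] = hyperparams["gamma"]
    return kwargs


# Example: a trial that sampled gamma=0.95.
trial_kwargs = sync_normalize_gamma(
    {"norm_obs": True, "norm_reward": True},
    {"gamma": 0.95, "learning_rate": 3e-4},
)
# The result could then be forwarded, e.g. VecNormalize(env, **trial_kwargs).
```

Copying the dictionary before modifying it keeps the defaults from the .yml file unchanged, so every trial starts from the same base configuration.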
Code referenced in the issue:
rl-baselines-zoo/train.py, line 269 in fd9d388 (normalization arguments used in create_env)
rl-baselines-zoo/utils/hyperparams_opt.py, line 188 in fd9d388 (candidate values for gamma)