Release Bug Fixes and Improvements (VecEnv) · hill-a/stable-baselines

Warning: breaking change when using custom policies

doc update (fix example of result plotter + improve doc)
fixed logger issues when stdout lacks a read function
fixed a bug in common.dataset.Dataset where shuffling was not disabled properly (it affects only PPO1 with recurrent policies)
fixed output layer name for DDPG q function, used in pop-art normalization and l2 regularization of the critic
added support for multi env recording to generate_expert_traj (@XMaster96)
added support for LSTM model recording to generate_expert_traj (@XMaster96)
GAIL: remove mandatory matplotlib dependency and refactor as subclass of TRPO (@kantneel and @AdamGleave)
added get_attr(), env_method() and set_attr() methods for all VecEnv.
Those methods now all accept an indices keyword to select a subset of envs.
set_attr now returns None rather than a list of None values. (@kantneel)
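
The indices behaviour described above can be illustrated with a toy stand-in. This is a minimal sketch, not the stable-baselines implementation: ToyEnv, ToyVecEnv, seed_value and describe are hypothetical names introduced only for illustration, and the real VecEnv classes may resolve indices differently.

```python
from typing import Any, Iterable, List, Optional, Union

Indices = Optional[Union[int, Iterable[int]]]


class ToyEnv:
    """Hypothetical single environment with one attribute and one method."""

    def __init__(self, seed_value: int) -> None:
        self.seed_value = seed_value

    def describe(self, prefix: str) -> str:
        return f"{prefix}-{self.seed_value}"


class ToyVecEnv:
    """Toy stand-in for a vectorized env (an assumption, not the library class)."""

    def __init__(self, envs: Iterable[ToyEnv]) -> None:
        self.envs = list(envs)

    def _select(self, indices: Indices) -> List[ToyEnv]:
        # None selects every env; a bare int selects a single env.
        if indices is None:
            return self.envs
        if isinstance(indices, int):
            indices = [indices]
        return [self.envs[i] for i in indices]

    def get_attr(self, attr_name: str, indices: Indices = None) -> List[Any]:
        return [getattr(env, attr_name) for env in self._select(indices)]

    def set_attr(self, attr_name: str, value: Any, indices: Indices = None) -> None:
        # Returns None, matching the changelog note, rather than a list of None.
        for env in self._select(indices):
            setattr(env, attr_name, value)

    def env_method(self, method_name: str, *args: Any,
                   indices: Indices = None, **kwargs: Any) -> List[Any]:
        return [getattr(env, method_name)(*args, **kwargs)
                for env in self._select(indices)]


vec_env = ToyVecEnv(ToyEnv(i) for i in range(4))
all_seeds = vec_env.get_attr("seed_value")                  # every env
result = vec_env.set_attr("seed_value", 99, indices=[1, 3])  # subset only
updated = vec_env.get_attr("seed_value")
labels = vec_env.env_method("describe", "env", indices=0)    # single env
```

In this sketch, indices is keyword-only on env_method so that it cannot be confused with positional arguments forwarded to the env's own method.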
GAIL: gail.dataset.ExpertDataset supports loading from memory rather than from a file, and
gail.dataset.record_expert supports returning the data in memory rather than saving it to a file.
added support in VecEnvWrapper for accessing attributes of arbitrarily deeply nested
instances of VecEnvWrapper and VecEnv. This is allowed as long as the attribute belongs
to exactly one of the nested instances, i.e. it must be unambiguous. (@kantneel)
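
The "exactly one owner" rule above can be sketched as a walk down the wrapper chain. This is a toy illustration under stated assumptions, not the library's code: ToyVecEnv, ToyWrapper and lookup_unambiguous are hypothetical names, and the sketch assumes each wrapper stores the wrapped env in a venv attribute.

```python
class ToyVecEnv:
    """Hypothetical innermost vectorized env."""

    def __init__(self) -> None:
        self.num_envs = 2
        self.gamma = 0.99


class ToyWrapper:
    """Hypothetical wrapper that stores the wrapped env as `venv`."""

    def __init__(self, venv, **attrs) -> None:
        self.venv = venv
        # Extra keyword arguments become instance attributes of this layer.
        self.__dict__.update(attrs)


def lookup_unambiguous(env, name):
    """Return `name` if exactly one layer in the chain defines it."""
    owners = []
    layer = env
    while layer is not None:
        if name in vars(layer):
            owners.append(layer)
        # Move one level inward; the innermost env has no `venv`.
        layer = vars(layer).get("venv")
    if not owners:
        raise AttributeError(f"no layer defines {name!r}")
    if len(owners) > 1:
        raise AttributeError(f"{name!r} is ambiguous: {len(owners)} layers define it")
    return vars(owners[0])[name]


base = ToyVecEnv()
stack = ToyWrapper(ToyWrapper(base, clip_obs=10.0), norm_reward=True)
num_envs = lookup_unambiguous(stack, "num_envs")  # owned by the innermost env
clip_obs = lookup_unambiguous(stack, "clip_obs")  # owned by the inner wrapper

clash = ToyWrapper(base, gamma=0.9)  # gamma now exists on two layers
try:
    lookup_unambiguous(clash, "gamma")
    ambiguous_rejected = False
except AttributeError:
    ambiguous_rejected = True
```

The sketch inspects instance attributes only (via vars), which keeps the ambiguity check simple; a real implementation would also need to account for class-level attributes and properties.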
fixed bug where result plotter would crash on very short runs (@Pastafarianist)
added option to not trim output of result plotter by number of timesteps (@Pastafarianist)
clarified the public interface of BasePolicy and ActorCriticPolicy. Breaking change when using custom policies: masks_ph is now called dones_ph.
added support for custom stateful policies.
fixed episode length recording in trpo_mpi.utils.traj_segment_generator (@GerardMaggiolino)
