Last N actions as mlp_keys encoder input for dreamer_v3 #239
Comments
Hi @geranim0, The key is Let me know if it works Note: Discrete actions are converted into one-hot actions (as the agent works with one-hot actions in the discrete case). We can discuss which is the best option. cc @belerico
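To illustrate the note about one-hot actions, here is a minimal sketch in plain Python. The helper name `one_hot` is hypothetical and is not taken from the branch under discussion; it only shows the kind of conversion described.

```python
def one_hot(action: int, n_actions: int) -> list[float]:
    """Return a vector of length n_actions with 1.0 at index `action`.

    Hypothetical helper for illustration; not sheeprl's implementation.
    """
    if not 0 <= action < n_actions:
        raise ValueError(f"action {action} is outside [0, {n_actions})")
    vec = [0.0] * n_actions
    vec[action] = 1.0
    return vec


# Discrete action 2 out of 4 possible actions.
encoded = one_hot(2, 4)
```

A buffer of N such actions would then occupy N * n_actions entries if the vectors are concatenated.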
Hi @michele, Thanks for the branch! Taking a look and doing some tests with it.
So, did some testing, here are the results Where the gray line represents the agent trained with the last It also suggests that the wrapper works 👍 Only modification I made to your branch was add an input buffer to the wrapper.
Great, I'm glad it works.
Sure, actually it is in my first message, in the The purpose of this is to simulate human reaction time. That's why I wanted to test adding the input buffer to the observation, to see if it would improve performance (looks like it does).
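One way to picture an input buffer that simulates reaction time is a fixed-length queue: the action chosen now is only executed after `delay` steps, and the pending actions can be exposed to the agent as part of the observation. The sketch below assumes that interpretation; the class name, the `noop` default, and the method names are hypothetical and do not come from the wrapper in this thread.

```python
from collections import deque


class DelayedActionBuffer:
    """Queue that delays each action by `delay` steps (illustrative only)."""

    def __init__(self, delay: int, noop: int = 0) -> None:
        if delay < 1:
            raise ValueError("delay must be at least 1")
        # Start filled with no-ops so the first `delay` steps execute no-ops.
        self._queue = deque([noop] * delay, maxlen=delay)

    def push(self, action: int) -> int:
        """Enqueue the newly chosen action and return the one to execute now."""
        executed = self._queue[0]    # oldest pending action
        self._queue.append(action)   # maxlen drops the oldest entry
        return executed

    def pending(self) -> list[int]:
        """Actions still waiting to be executed, oldest first."""
        return list(self._queue)


buf = DelayedActionBuffer(delay=2, noop=0)
executed = [buf.push(a) for a in (3, 4, 5)]
```

After these three calls, `buf.pending()` holds the two actions that have been chosen but not yet applied, which is the information the comment proposes adding to the observation.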
Understood, thanks
Hi @geranim0, if this is done we can add this feature in a new PR and put it in the next release
Hi @belerico, sure! Side note though, in tests using
real_actions = (
    torch.cat([real_act.argmax(dim=-1) for real_act in real_actions], dim=-1).cpu().numpy()
)
step_data["actions"] = actions.reshape((1, cfg.env.num_envs, -1))
For now got around it by reshaping my action space to
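For readers unfamiliar with the quoted snippet, its core step is taking an argmax over each one-hot block to recover integer action indices, one per action dimension. The following is a rough plain-Python sketch of that step for a single environment; it uses lists instead of tensors, the function name is hypothetical, and it does not attempt to reproduce the tensor shapes or the issue described in the comment.

```python
def decode_multidiscrete(one_hot_blocks: list[list[float]]) -> list[int]:
    """Return the argmax index of each one-hot block (illustrative only)."""
    return [
        max(range(len(block)), key=block.__getitem__)  # first index of the max
        for block in one_hot_blocks
    ]


# Two action dimensions with 3 and 2 choices respectively.
indices = decode_multidiscrete([[0.0, 1.0, 0.0], [1.0, 0.0]])
```

Each action dimension contributes exactly one integer, so an action space with k dimensions yields k indices per environment.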
Hi @geranim0, I should have fixed the problem. Could you check with the multidiscrete action space?
Hi,
Working on an Atari environment wrapper with an action input buffer with len=N that I want to feed as input to mlp_keys.
Algo config:
However, unable to get it working, getting error
TypeError: object of type 'NoneType' has no len()
at
Because gym.spaces.Tuple has no member shape.
Wondering what should change in this wrapper so it correctly interfaces with what sheeprl expects? Would there be a way to augment Tuple to have a shape, or should it change to a Box? If it needs to be a Box, what should be its config?
Edit:
I have a working setup using a hard-coded, implementation-details-aware wrapper using stuff like this. Still wondering how to achieve a generic solution though.
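One possible direction for the Box option raised above is to flatten the last N actions into a single fixed-length vector of concatenated one-hot blocks, so the observation entry has a well-defined shape of (N * n_actions,). The sketch below shows only that bookkeeping in plain Python; the class and attribute names are hypothetical, and it is an assumption (not confirmed in this thread) that a Box space with low=0, high=1 and this shape is what the mlp_keys encoder would accept.

```python
from collections import deque


class ActionHistory:
    """Keep the last `length` discrete actions as one flat one-hot vector."""

    def __init__(self, n_actions: int, length: int, noop: int = 0) -> None:
        self.n_actions = n_actions
        # Shape a Box-style space could declare for this observation entry.
        self.shape = (length * n_actions,)
        self._history = deque([noop] * length, maxlen=length)

    def record(self, action: int) -> None:
        """Append the latest action, discarding the oldest one."""
        if not 0 <= action < self.n_actions:
            raise ValueError(f"action {action} is outside [0, {self.n_actions})")
        self._history.append(action)

    def as_vector(self) -> list[float]:
        """Concatenate one-hot encodings of the stored actions, oldest first."""
        vec = [0.0] * self.shape[0]
        for slot, action in enumerate(self._history):
            vec[slot * self.n_actions + action] = 1.0
        return vec


history = ActionHistory(n_actions=3, length=2)
history.record(2)
flat = history.as_vector()
```

Because the vector length never changes, a wrapper could expose it under its own observation key alongside the image observation, instead of returning a Tuple that has no shape.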