ReplayBuffer storing actions size mismatch during env reset #278

Open
defrag-bambino opened this issue May 3, 2024 · 7 comments
Labels
wontfix This will not be worked on

Comments

@defrag-bambino

Hi,

I am trying to write a simple gym wrapper for an existing env.
During testing, I am now facing the following issue:

  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/dreamer_v3.py", line 647, in main
    rb.add(reset_data, dones_idxes, validate_args=cfg.buffer.validate_args)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/data/buffers.py", line 656, in add
    self._buf[env_idx].add(env_data, validate_args=validate_args)
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/data/buffers.py", line 220, in add
    self.buffer[k][idxes] = data_to_store[k]
  File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/utils/memmap.py", line 264, in __setitem__
    self.array[idx] = value
ValueError: shape mismatch: value array of shape (1,1,5) could not be broadcast to indexing result of shape (1,1,4)

I think this originates from line 643 in dreamer_v3.py: reset_data["actions"] = np.zeros((1, reset_envs, np.sum(actions_dim))). My env has an action_space.shape of (1, 4), but this line sums the dimensions to 1 + 4 = 5.
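The arithmetic can be checked in isolation. This is only a minimal sketch: it assumes that actions_dim ends up holding the raw action_space.shape and that the buffer slot was allocated with 4 action entries (as the error message suggests). The buffer and idxes names below are illustrative stand-ins, not sheeprl's actual objects.

```python
import numpy as np

# ASSUMPTION: actions_dim is taken directly from action_space.shape.
actions_dim = (1, 4)
reset_envs = 1

# Same expression as the line quoted above: np.sum((1, 4)) == 5.
reset_actions = np.zeros((1, reset_envs, np.sum(actions_dim)))
print(reset_actions.shape)  # (1, 1, 5)

# Hypothetical stand-in for the replay buffer storage with 4 action entries.
buffer = np.zeros((8, 1, 4))
idxes = np.array([0])

try:
    buffer[idxes] = reset_actions
    mismatch = False
except ValueError as err:
    mismatch = True
    print(err)

# With a 1D action shape the sum equals the number of actions.
flat_actions = np.zeros((1, reset_envs, np.sum((4,))))
buffer[idxes] = flat_actions  # fits
```

With a shape of (4,) the sum is 4 and the assignment goes through, which is why the shape of the action space matters here.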

Is this the desired behavior?

Thanks

@michele-milesi
Member

Hi @defrag-bambino,
thank you for reporting this problem.

Which action space are you using? Are they continuous actions?
In this case, we assume that continuous actions have a shape with a single dimension, something like this: (n,). This allows us to handle continuous, discrete, and multidiscrete actions in the same way.
I would suggest you try changing the action space to dimension (4,).

@belerico might it make sense to have a wrapper that flattens the continuous actions?

@defrag-bambino
Author

Yes, it is a continuous "Box" Space.
The problem is that this particular action_space is (N_AGENTS, 4), so there are different versions of the gym env with different action_space shapes.

@defrag-bambino
Author

I've tried to work around it using np.squeeze() and np.expand_dims() in relevant places of my env wrapper. This seems to work for now.
However, after a few seconds it crashes with this error

Stacktrace

Traceback (most recent call last):
File "/home/drt/miniconda3/envs/sheeprl/bin/sheeprl", line 8, in <module>
sys.exit(run())
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/main.py", line 90, in decorated_main
_run_hydra(
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/utils.py", line 222, in run_and_report
raise ex
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/utils.py", line 219, in run_and_report
return func()
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 352, in run
run_algorithm(cfg)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 190, in run_algorithm
fabric.launch(reproducible(command), cfg, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 839, in launch
return self._wrap_and_launch(function, self, *args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 924, in _wrap_and_launch
return launcher.launch(to_run, *args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/strategies/launchers/subprocess_script.py", line 104, in launch
return function(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/fabric.py", line 930, in _wrap_with_setup
return to_run(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/cli.py", line 186, in wrapper
return func(fabric, cfg, *args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/dreamer_v3.py", line 677, in main
train(
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/dreamer_v3.py", line 113, in train
embedded_obs = world_model.encoder(batch_obs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/lightning/fabric/wrappers.py", line 119, in forward
output = self._forward_module(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1523, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1359, in _run_ddp_forward
return self.module(*inputs, **kwargs) # type: ignore[index]
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/models/models.py", line 469, in forward
mlp_out = self.mlp_encoder(obs, *args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/algos/dreamer_v3/agent.py", line 151, in forward
return self.model(x)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/sheeprl/models/models.py", line 119, in forward
return self.model(obs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
input = module(input)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/drt/miniconda3/envs/sheeprl/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x72 and 1x512)

Seems like the same holds for the observation shape (1, 72).
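A rough way to see where (1024x72 and 1x512) could come from, using NumPy as a stand-in for the nn.Linear call. The big assumption here (not confirmed anywhere in this thread) is that the first layer's input size is read from the first entry of the observation shape; first_layer_weight is a made-up helper for illustration only.

```python
import numpy as np


def first_layer_weight(obs_shape, hidden=512):
    """Build a dummy weight matrix for the first dense layer.

    ASSUMPTION: the input size is taken from obs_shape[0]. This is a guess
    at how the encoder is sized, not sheeprl's confirmed behaviour.
    """
    in_features = obs_shape[0]
    return np.zeros((in_features, hidden))


# A batch of 1024 observations with 72 features each, as in the error message.
batch = np.zeros((1024, 72))

w_bad = first_layer_weight((1, 72))   # leading 1 kept -> weight of shape (1, 512)
w_good = first_layer_weight((72,))    # 1D observation -> weight of shape (72, 512)

try:
    batch @ w_bad
    failed = False
except ValueError:
    # NumPy raises ValueError here; torch reports the analogous RuntimeError.
    failed = True

out = batch @ w_good
print(w_bad.shape, failed, out.shape)
```

Under that assumption, an observation shape of (72,) gives a matching first layer, while (1, 72) gives a layer that expects a single input feature.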

@belerico
Member

belerico commented May 3, 2024

> Hi @defrag-bambino, thank you for reporting this problem.
>
> Which action space are you using? Are they continuous actions? In this case, we assume that continuous actions have a shape with a single dimension, something like this: (n,). This allows us to handle continuous, discrete, and multidiscrete actions in the same way. I would suggest you try changing the action space to dimension (4,).
>
> @belerico might it make sense to have a wrapper that flattens the continuous actions?

Yep, we can add it and leave it to the user to use it

@belerico
Member

belerico commented May 3, 2024

> I've tried to work around it using np.squeeze() and np.expand_dims() in relevant places of my env wrapper. This seems to work for now. However, after a few seconds it crashes with this error
>
> Stacktrace (same as above): RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x72 and 1x512)
>
> Seems like the same holds for the observation shape (1, 72).

If your observation space is a 1D vector, then I suppose you should also remove the leading 1 from its shape. Can you try it?

@belerico
Member

belerico commented May 6, 2024

Hi @defrag-bambino, we're sorry, but right now Multi-Agent RL (MARL) is not supported, so your action and observation spaces must not depend on the number of agents, which are considered independent from one another. This means that:

  • Observations must be 1D vectors or 2D/3D images: anything that is not a 1D vector will be processed by the agent's CNN. A 2D image, or a 3D image of shape [H,W,1] or [1,H,W], will be considered a grayscale image; any other 3D image will be considered a multi-channel image.
  • An action of type gymnasium.spaces.Box must be of shape (n,), where n is the number of (possibly continuous) actions the environment supports.
  • Every agent runs in its own environment.
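As a quick pre-flight check, the shape rules above can be written down as a small helper. This is only a sketch of the rules as stated in this comment; classify_observation and is_supported_box_action are hypothetical names, not sheeprl functions.

```python
def classify_observation(shape):
    """Classify an observation shape according to the rules listed above."""
    ndim = len(shape)
    if ndim == 1:
        return "vector"
    if ndim == 2:
        # Any 2D observation is treated as a grayscale image.
        return "grayscale image"
    if ndim == 3:
        # [H, W, 1] or [1, H, W] is grayscale, anything else is multi-channel.
        if shape[0] == 1 or shape[-1] == 1:
            return "grayscale image"
        return "multi-channel image"
    raise ValueError(f"unsupported observation shape: {shape}")


def is_supported_box_action(shape):
    """A Box action must have shape (n,)."""
    return len(shape) == 1


print(classify_observation((72,)))        # vector
print(classify_observation((1, 72)))      # grayscale image
print(is_supported_box_action((1, 4)))    # False
print(is_supported_box_action((4,)))      # True
```

Note that under these rules an observation of shape (1, 72) counts as a 2D image rather than a vector, so dropping the leading 1 is what makes it a 1D vector.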

belerico closed this as completed on May 6, 2024
belerico added the "wontfix" (This will not be worked on) label on May 6, 2024
belerico reopened this on May 6, 2024

@belerico
Member

belerico commented May 6, 2024

Maybe there could be a solution as explained in #241
