
[Feature Request] Include group-level done fields in the VMAS environment #2536

Open · 9 comments

@thomasbbrunner (Contributor) opened this issue Nov 5, 2024:

Motivation

In the current implementation of the VMAS environment, the done fields are only available at the root of the tensordict. However, for training, it is useful to have them at the group level.

This is also an issue for the TorchRL tutorial scripts multiagent_ppo.py and multiagent_competitive_ddpg.py, which have to manually add the done and terminated fields at the group level with the right shape:

https://github.com/pytorch/rl/blob/main/tutorials/sphinx-tutorials/multiagent_ppo.py#L655
https://github.com/pytorch/rl/blob/main/tutorials/sphinx-tutorials/multiagent_competitive_ddpg.py#L732
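That patch amounts to roughly the following (a self-contained sketch, not the tutorials' exact code; the sizes and the "agents" group name are hypothetical):

```python
import torch
from tensordict import TensorDict

# Hypothetical rollout slice: 32 envs, 3 agents, group name "agents".
data = TensorDict(
    {
        "next": TensorDict(
            {
                "done": torch.zeros(32, 1, dtype=torch.bool),
                "agents": TensorDict(
                    {"reward": torch.zeros(32, 3, 1)},
                    batch_size=[32, 3],
                ),
            },
            batch_size=[32],
        )
    },
    batch_size=[32],
)

# Copy the root-level done into the group, expanded to the reward shape.
data.set(
    ("next", "agents", "done"),
    data.get(("next", "done"))
    .unsqueeze(-1)  # [32, 1] -> [32, 1, 1]
    .expand(data.get(("next", "agents", "reward")).shape),  # -> [32, 3, 1]
)
```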

The PettingZoo env implementation does include the group-level done fields.

So, can we also include these fields in the VMAS code? If so, I'd be willing to work on this feature.

Checklist

  • I have checked that there is no similar issue in the repo (required)
thomasbbrunner added the enhancement (New feature or request) label on Nov 5, 2024
@thomasbbrunner (Contributor, Author):

Tagging @matteobettini, as you also worked on the VmasEnv.

@matteobettini (Contributor):

Hey, thanks for opening this.

So the idea is that the location of the done keys of a multi-agent env will tell you how the env operates. For example:

  • in PettingZoo, the done and the rewards are per-agent, so these keys are in the group tds (we also keep the global done, as it allows us to decide when to reset)
  • in VMAS, the done is global (thus in the root of the td) and the reward is per-agent (in the subgroups). This is the reason why, before training, we expand the done to match the reward shape and make torch happy
  • in SMAC, both rewards and done are global (thus in the root), so potentially we could avoid expanding them for computations
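Concretely, the first two layouts look roughly like this (a sketch with hypothetical sizes: 32 envs, 3 agents, group name "agents"):

```python
import torch
from tensordict import TensorDict

n_envs, n_agents = 32, 3  # hypothetical sizes

# VMAS-style step output: done at the root, reward in the agent group.
vmas_td = TensorDict(
    {
        "done": torch.zeros(n_envs, 1, dtype=torch.bool),
        "agents": TensorDict(
            {"reward": torch.zeros(n_envs, n_agents, 1)},
            batch_size=[n_envs, n_agents],
        ),
    },
    batch_size=[n_envs],
)

# PettingZoo-style step output: per-agent done and reward in the group,
# plus a root-level done that decides when to reset.
pettingzoo_td = TensorDict(
    {
        "done": torch.zeros(n_envs, 1, dtype=torch.bool),
        "agents": TensorDict(
            {
                "reward": torch.zeros(n_envs, n_agents, 1),
                "done": torch.zeros(n_envs, n_agents, 1, dtype=torch.bool),
            },
            batch_size=[n_envs, n_agents],
        ),
    },
    batch_size=[n_envs],
)

assert "done" not in vmas_td["agents"].keys()  # no per-agent done in VMAS
```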

In general, the rationale is that if some of the keys need to be expanded later, this can be done lazily, at the time you need it.

Pre-adding these extra keys and creating new entries in the env tds, regardless of whether you will need them, could lead to extra computation, since torchrl will track a larger done_spec and so on.

Another motivation is: now, by looking at the td coming from a vmas env, you immediately know that there is no per-agent done.
If instead we add this, a user would not immediately know that it is just a copy of the global done that is useful to have for some training operations.

Let me know what you think

@thomasbbrunner (Contributor, Author):

> So the idea is that the location of the done keys of a multi-agent env will tell you how the env operates.

That makes sense! And I agree that it is more intuitive this way instead of having the fields everywhere.

However, IMO it is suboptimal to have to manually patch the data so that it is compatible with the trainer. I'd expect the TorchRL environments and the TorchRL trainers to be plug-and-play compatible.

Or at least to have a default solution for this. I see some options for such a solution:

  1. Adding an env transform to patch the data (I don't think such a transform exists?).
  2. Adding a postproc for the collector to patch the data (IMO the preferred approach; a wiring sketch follows below).
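
For concreteness, option 2 could be wired up like this (a sketch; `env`, `policy`, and the key names are assumptions, not an existing API beyond SyncDataCollector's postproc argument):

```python
from torchrl.collectors import SyncDataCollector

def add_group_done(batch):
    # Same expand-and-copy as the tutorial patch quoted above (a sketch).
    reward = batch.get(("next", "agents", "reward"))
    for key in ("done", "terminated"):
        batch.set(
            ("next", "agents", key),
            batch.get(("next", key)).unsqueeze(-1).expand(reward.shape),
        )
    return batch

# Hypothetical wiring: env and policy are assumed to exist.
collector = SyncDataCollector(
    env,
    policy,
    frames_per_batch=1_000,
    total_frames=10_000,
    postproc=add_group_done,  # applied to each collected batch
)
```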

What do you think of this?

@matteobettini (Contributor):

Yeah, 2 is the best.

We actually already have that transform in our implementations: DoneTransform.

class DoneTransform(Transform):
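For reference, a paraphrased sketch of what that transform does (not the exact repo code; the key names below are illustrative):

```python
from torchrl.envs.transforms import Transform

class DoneTransform(Transform):
    """Paraphrased sketch, not the exact repo code: expands root-level
    done-like keys to the reward shape and writes them into the agent
    group, so losses can consume per-agent done flags directly."""

    def __init__(self, reward_key, done_keys):
        super().__init__()
        self.reward_key = reward_key  # e.g. ("agents", "reward")
        self.done_keys = done_keys  # e.g. ("done", "terminated")

    def forward(self, tensordict):
        reward_shape = tensordict.get(("next", *self.reward_key)).shape
        group = self.reward_key[:-1]  # e.g. ("agents",)
        for done_key in self.done_keys:
            tensordict.set(
                ("next", *group, done_key),
                tensordict.get(("next", done_key))
                .unsqueeze(-1)
                .expand(reward_shape),
            )
        return tensordict
```

In the example implementations it is applied to the collected batches rather than inside the env, i.e. passed to the collector as a postproc.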

We could consider extending it and adding it to the lib!

@vmoens

@thomasbbrunner (Contributor, Author):

Oh, this is great! Should we move this transform to the main lib and also use this in the tutorials? I'm open to working on this.

@vmoens (Contributor) commented Nov 6, 2024:

I'm open to it. @matteobettini, do you remember why it's not part of the core lib?
The docstrings are a bit dull and the name of the transform is a bit too generic for my taste, but it'd fit nicely in the lib!

@matteobettini (Contributor):

I don't remember.

Yes, agreed, we need to extend it and test it a bit, but it would be very useful to have.

@matteobettini (Contributor):

We could also consider having a “multiagent” file in the transforms to avoid clutter.

@vmoens (Contributor) commented Nov 6, 2024:

Of course, if the usage is only in MARL settings, that makes sense.
