gymnasium provides reward wrappers
## Why

**As a** user of pyCMO
**I want** to be able to specify different reward models for my scenarios
**So that** I can train RL agents

## Acceptance Criteria

**Given** we currently only export the player's side's total score as the reward
**When** we implement a way for users to specify a reward model
**Then** we get closer to being able to train RL agents
## Notes

One idea is to create a custom `RewardHandler` class that gets passed into `CMOEnv` and can calculate the reward based on the current observation.
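A minimal sketch of what that `RewardHandler` idea could look like. Everything here is an assumption for illustration: the `reward()` method name, the dict-style observation, and the `side_score` / `units_lost` fields are hypothetical, not pyCMO's actual interface:

```python
class RewardHandler:
    """Default handler: reproduce the current behaviour, i.e. return
    the player's side's total score as the reward. The observation
    layout below is a hypothetical stand-in for pyCMO's real one."""

    def reward(self, observation: dict) -> float:
        return observation["side_score"]  # assumed observation field


class UnitLossPenalty(RewardHandler):
    """Example custom handler: side score minus a fixed penalty for
    each friendly unit lost (penalty value chosen arbitrarily)."""

    def __init__(self, penalty: float = 50.0):
        self.penalty = penalty

    def reward(self, observation: dict) -> float:
        return observation["side_score"] - self.penalty * observation["units_lost"]


# Usage: CMOEnv would call handler.reward(obs) each step instead of
# reading the side score directly (the constructor hook is assumed).
obs = {"side_score": 200.0, "units_lost": 3}
r = UnitLossPenalty().reward(obs)  # 200 - 50 * 3 = 50.0
```

Passing the handler into `CMOEnv`'s constructor would let users swap reward models per scenario without touching the environment code.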