I want to build a 2-player game where the players take turns. At the start there are 114 possible actions, and the number of available actions decreases by 1 every time a player makes a move. The game is played for 10 turns in total (5 per player), after which the state is terminal. I have my own function for the reward.
Here is a sample game tree:
START - actions available to both players -> [1,2,3,4,5,6....112,113,114]
Player 1 takes action 5 -> [5,0,0,0,0,0,0,0,0,0] - remove action 5 from the available actions
Player 2 takes action 32 -> [5,0,0,0,0,32,0,0,0,0] - remove action 32 from the available actions
Player 1 takes action 97 -> [5,97,0,0,0,32,0,0,0,0] - remove action 97 from the available actions
Player 2 takes action 56 -> [5,97,0,0,0,32,56,0,0,0] - remove action 56 from the available actions
....
Final (example) game state after each player has made 5 actions -> [5,97,3,5,1,32,56,87,101,8]
The first 5 entries are the actions taken by Player 1; the last 5 entries are the actions taken by Player 2.
Finally, I apply a reward function to this vector: [5,97,3,5,1,32,56,87,101,8]
My Python skills are weak, so I would appreciate any help with this.
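To make the rules above concrete, here is a minimal sketch of the game in plain Python. It is only one possible way to model it, not tied to any particular framework: the names `TurnGame`, `step`, `play_random_game` and the placeholder `reward` are hypothetical and chosen for illustration, and the real reward function would replace the placeholder. Note that with the removal rule enforced, a repeated action (such as the second 5 in the example final state) would be rejected.

```python
import random

NUM_ACTIONS = 114      # actions are numbered 1..114, as in the description
MOVES_PER_PLAYER = 5   # 10 turns in total


class TurnGame:
    """Two-player, turn-based game over a shrinking shared pool of actions."""

    def __init__(self):
        self.available = list(range(1, NUM_ACTIONS + 1))
        # Slots 0-4 hold Player 1's actions, slots 5-9 hold Player 2's.
        self.state = [0] * (2 * MOVES_PER_PLAYER)
        self.turn = 0  # number of moves made so far

    def current_player(self):
        return self.turn % 2  # 0 = Player 1, 1 = Player 2

    def is_terminal(self):
        return self.turn == 2 * MOVES_PER_PLAYER

    def step(self, action):
        if self.is_terminal():
            raise ValueError("game is already over")
        if action not in self.available:
            raise ValueError(f"action {action} is not available")
        # Each player fills their own block of five slots from left to right.
        slot = self.current_player() * MOVES_PER_PLAYER + self.turn // 2
        self.state[slot] = action
        self.available.remove(action)
        self.turn += 1


def reward(state):
    # Hypothetical placeholder: swap in your own reward function here.
    return sum(state[:MOVES_PER_PLAYER]) - sum(state[MOVES_PER_PLAYER:])


def play_random_game(seed=None):
    """Play one full game with uniformly random legal moves."""
    rng = random.Random(seed)
    game = TurnGame()
    while not game.is_terminal():
        game.step(rng.choice(game.available))
    return game


# Replaying the first four moves from the sample game tree:
game = TurnGame()
for action in [5, 32, 97, 56]:
    game.step(action)
print(game.state)  # [5, 97, 0, 0, 0, 32, 56, 0, 0, 0]
```

Keeping the available actions in a list makes the legal moves at any point easy to read off (`game.available`), and `step` refuses any action that has already been taken. Once `is_terminal()` returns True, the reward can be computed from `game.state`.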