Distributed Framework #327
Codecov Report
@@            Coverage Diff             @@
##           master     #327      +/-   ##
==========================================
- Coverage   90.87%   86.83%   -4.05%
==========================================
  Files          88       97       +9
  Lines        3705     4017     +312
==========================================
+ Hits         3367     3488     +121
- Misses        338      529     +191
This pull request introduces 3 alerts when merging 4cba727 into 0fe4180 - view on LGTM.com new alerts:
Have you tried it on any agent yet?
This pull request introduces 10 alerts when merging 3d233c4 into 0fe4180 - view on LGTM.com new alerts:
This pull request introduces 15 alerts when merging 1c504cc into bb85ea1 - view on LGTM.com new alerts:
This pull request introduces 19 alerts when merging 2c3298a into 608fc03 - view on LGTM.com new alerts:
This pull request introduces 14 alerts when merging 73586d5 into 52b0b4c - view on LGTM.com new alerts:
This pull request introduces 2 alerts when merging 072d545 into 52b0b4c - view on LGTM.com new alerts:
class WeightHolder:
    def __init__(self, init_weights):
        self._weights = init_weights
nit: You could add the @dataclass decorator to avoid writing __init__.
The same goes for other classes whose constructors only assign their arguments to attributes.
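For illustration, here is a minimal sketch of what the @dataclass suggestion could look like. It assumes the class only needs to store the weights; the field is named `weights` rather than `_weights` so that the generated `__init__` has a readable parameter name, and `store_weights` is a hypothetical method that does not appear in the diff above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class WeightHolder:
    # @dataclass generates __init__(self, weights) from this field.
    # Assumption: the PR stores this value as self._weights instead.
    weights: Any

    def store_weights(self, weights: Any) -> None:
        # Hypothetical setter, shown only to make the sketch complete.
        self.weights = weights

    def get_weights(self) -> Any:
        return self.weights


holder = WeightHolder({"layer": [0.1, 0.2]})
holder.store_weights({"layer": [0.3, 0.4]})
```

One side effect worth knowing about: @dataclass also generates `__eq__` and `__repr__`, so two holders with equal weights compare equal.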
genrl/distributed/actor.py
Outdated
learner_rref = get_rref(learner_name)
print(f"{name}: Beginning experience collection")
while not learner_rref.rpc_sync().is_done():
    agent.load_weights(parameter_server_rref.rpc_sync().get_weights())
Would it be better to assign the agent in the constructor itself?
Assign the agent weights? They will need to be updated in the loop, right?
examples/distributed.py
Outdated
def collect_experience(agent, experience_server_rref):
    obs = agent.env.reset()
    done = False
    for i in range(MAX_ENV_STEPS):
        action = agent.select_action(obs)
        next_obs, reward, done, info = agent.env.step(action)
        experience_server_rref.rpc_sync().push((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
Does this experience collection run on only a single agent / single thread?
This is run in multiple different processes. It's passed to the ActorNode, which runs it in its own infinite loop.
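A rough, standard-library-only sketch of that arrangement: several workers call the same per-episode collection function and push into one shared store. Threads stand in for the separate processes here, a plain in-process object stands in for the RPC experience server, and each worker runs a fixed number of episodes instead of looping until the learner is done. `ToyEnv`, `ToyAgent`, `LocalExperienceServer`, and `actor_loop` are hypothetical names, not part of the PR.

```python
import random
import threading
from collections import deque

MAX_ENV_STEPS = 50


class ToyEnv:
    """Hypothetical environment: every episode ends after 5 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 5, {}


class ToyAgent:
    """Hypothetical agent that owns its own environment and acts randomly."""

    def __init__(self):
        self.env = ToyEnv()

    def select_action(self, obs):
        return random.choice([0, 1])


class LocalExperienceServer:
    """In-process stand-in for the remote experience server."""

    def __init__(self):
        self._buffer = deque()
        self._lock = threading.Lock()

    def push(self, transition):
        with self._lock:
            self._buffer.append(transition)

    def __len__(self):
        with self._lock:
            return len(self._buffer)


def collect_experience(agent, experience_server):
    # Same shape as the function in the diff, minus the RPC indirection.
    obs = agent.env.reset()
    for _ in range(MAX_ENV_STEPS):
        action = agent.select_action(obs)
        next_obs, reward, done, info = agent.env.step(action)
        experience_server.push((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break


def actor_loop(agent, experience_server, n_episodes):
    # Assumption: the PR loops until the learner reports it is done;
    # a fixed episode count keeps this sketch deterministic.
    for _ in range(n_episodes):
        collect_experience(agent, experience_server)


server = LocalExperienceServer()
workers = [
    threading.Thread(target=actor_loop, args=(ToyAgent(), server, 3))
    for _ in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Each worker has its own agent and environment, so `collect_experience` itself stays single-threaded; only the shared store needs to handle concurrent pushes.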
How is the Actor definition going to work? Can I define any architecture for the actor? (This would be the ideal behavior.)
(I don't see any neural network definitions as of now.)
Another thought I had: do you think we could somehow use decorators here? We could get rid of a bunch of core details that way.
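One possible reading of the decorator idea, sketched under heavy assumptions: a decorator wraps a per-episode collection function in the "pull weights, collect, repeat until the learner is done" loop, so the user only writes the inner step. The `actor` decorator, `StubLearner`, and `StubAgent` are hypothetical, and plain callables stand in for the RPC references.

```python
import functools


def actor(get_weights, is_done):
    """Hypothetical decorator: wrap a per-episode function in the actor loop."""

    def decorate(collect_fn):
        @functools.wraps(collect_fn)
        def run(agent, *args, **kwargs):
            episodes = 0
            while not is_done():
                # Refresh the agent's weights before every episode.
                agent.load_weights(get_weights())
                collect_fn(agent, *args, **kwargs)
                episodes += 1
            return episodes

        return run

    return decorate


class StubLearner:
    """Reports done after `budget` checks; stands in for the learner node."""

    def __init__(self, budget):
        self.budget = budget
        self.calls = 0

    def is_done(self):
        self.calls += 1
        return self.calls > self.budget


class StubAgent:
    """Records the weights it was last given."""

    def __init__(self):
        self.weights = None
        self.loads = 0

    def load_weights(self, weights):
        self.weights = weights
        self.loads += 1


learner = StubLearner(budget=3)
collected = []


@actor(get_weights=lambda: {"w": 1.0}, is_done=learner.is_done)
def collect_experience(agent, buffer):
    # User-written part: one episode of collection (trivial here).
    buffer.append(agent.weights)


episodes = collect_experience(StubAgent(), collected)
```

Whether this hides the right details depends on how much of the loop users are expected to customize; it is only meant as a starting point for the discussion.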
This pull request introduces 2 alerts when merging e2eef66 into 25eb018 - view on LGTM.com new alerts:
This pull request introduces 2 alerts when merging 837eb18 into 25eb018 - view on LGTM.com new alerts:
Yeah. The |
This is happening internally in the |
This pull request introduces 5 alerts when merging 8030b2a into 25eb018 - view on LGTM.com new alerts:
I haven't used decorators very extensively before, but I'll look into it. Did you have any specific ideas in mind?
This is a very rough draft of a trainer for distributed off-policy agents.
I am currently working on getting DDPG to train in a distributed manner using this.