Semantics of collection chain set wrt nextStepCollective #410
Comments
This sounds at least superficially reasonable. I'll think about it more tomorrow.
Any more comments? I’m thinking about implementing this soon.
I realize there's a tension/inconsistency between different points of your proposal. As in your point 3, I believe that readiness to handle messages in an epoch needs to be on a per-object basis. I might possibly extend that to a per-handler basis. That conflicts with the notion in points 1-2 that this is a per-epoch notion, seemingly across all objects. More broadly, this in some ways resembles notions in more dogmatic Actor programming models of objects having a 'mailbox' from which they select inbound messages. We're definitely distinguishing our approach in that control of mailbox processing order lives at least partially outside the recipient, but we should keep that in mind.
The “dependent” bit can never be unset for a given epoch. It just puts the epoch in that category so the runtime can treat it accordingly.
The lack of readiness is handled by each component in VT. For collections, each element can track its readiness for a given epoch individually.
Had a thought earlier today, which may or may not be a productive direction: The chain manager can arrange things so that none of the messages in a chain actually get delivered until it's got the whole chain defined (by not calling
Per a discussion with @PhilMiller, we've concluded that a mechanism that actually holds back messages for a given collective step (from one chain to another) could provide clearer semantics.
First, note that a message marked with a collective epoch could arrive before that epoch is even created on another node. This is because
theTerm()->makeCollectiveEpoch(..)
is not synchronized by design. This definitely complicates any mechanism that we build.
My current proposition:
1. Add a bit to the epoch that indicates whether the epoch is ready to execute immediately or must be made ready. Currently, all epochs are ready immediately.
2. If that bit is set as unready, the virtual context collection manager will not deliver the message until the user indicates that it is ready.
3. Add a call proxy(idx).ready(epoch) that allows delivery.
4. Add the proxy to the collection chain set so it can call this for each index at the right time, and set the epoch bit appropriately.
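To make the proposal concrete, here is a minimal, self-contained sketch of the hold-back behavior. It is not VT code: `CollectionManagerSketch`, `makeDependent`, `isDependent`, `Message`, and the choice of the top bit as the "dependent" bit are all hypothetical names and assumptions made for illustration. Readiness is tracked per (index, epoch) pair, which is one way to reconcile a per-epoch bit with per-element release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are the real VT types.
using Epoch = std::uint64_t;
using Index = int;

// Assumption: the top bit marks an epoch whose messages must be released
// explicitly. Once set, the bit is never cleared for that epoch.
constexpr Epoch dependent_bit = Epoch{1} << 63;

inline Epoch makeDependent(Epoch ep) { return ep | dependent_bit; }
inline bool isDependent(Epoch ep) { return (ep & dependent_bit) != 0; }

struct Message {
  Epoch epoch;
  int payload;
};

class CollectionManagerSketch {
public:
  using Handler = std::function<void(Index, Message const&)>;

  explicit CollectionManagerSketch(Handler handler)
    : handler_(std::move(handler)) {
    assert(handler_ && "a delivery handler is required");
  }

  // Deliver right away unless the epoch is dependent and this element has
  // not been released for it yet; in that case, hold the message.
  void send(Index idx, Message const& msg) {
    auto const key = std::make_pair(idx, msg.epoch);
    if (isDependent(msg.epoch) && released_.count(key) == 0) {
      held_[key].push_back(msg);
    } else {
      handler_(idx, msg);
    }
  }

  // Models proxy(idx).ready(epoch): release one element for one epoch and
  // flush whatever was held for it, in arrival order.
  void ready(Index idx, Epoch ep) {
    auto const key = std::make_pair(idx, ep);
    released_.insert(key);
    auto it = held_.find(key);
    if (it == held_.end()) {
      return;
    }
    auto pending = std::move(it->second);
    held_.erase(it);
    for (auto const& msg : pending) {
      handler_(idx, msg);
    }
  }

  // Number of messages currently held back across all elements.
  std::size_t numHeld() const {
    std::size_t total = 0;
    for (auto const& entry : held_) {
      total += entry.second.size();
    }
    return total;
  }

private:
  Handler handler_;
  std::set<std::pair<Index, Epoch>> released_;
  std::map<std::pair<Index, Epoch>, std::vector<Message>> held_;
};
```

Because held messages are keyed only by the epoch value carried in the message, this sketch does not care whether the epoch has been created locally yet, which is relevant given that epoch creation is not synchronized. A chain set holding the proxy would call `ready(idx, epoch)` for each index when the preceding step for that index completes.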