Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronous room callbacks + documentation #75

Open
msackman opened this issue Jul 1, 2022 · 5 comments
Open

Synchronous room callbacks + documentation #75

msackman opened this issue Jul 1, 2022 · 5 comments

Comments

@msackman
Copy link

msackman commented Jul 1, 2022

I have some questions about the correct way to use RoomCallbacks, and I can't find any documentation. If someone could answer the following questions then I would happily write the documentation and send in a PR.

  1. For each Room, is each callback invoked in its own go-routine, or do they all get invoked from the same go-routine.
  2. If the same go-routine, what happens if the callback blocks for a while? Do other callback invocations queue up behind it?
  3. If the same go-routine, is this just the same for each room (so each room has its own go-routine for callbacks), or do all rooms share the same go-routine?
  4. If every callback is invoked in its own go-routine, how can one manage ordering? What is to stop you seeing a participant leave before they join, just because of the order in which the go-runtime schedules threads?

Because of 4, I would really love to see a chan of events and some guarantee that the events on that chan reflects the total order of events as they happened in the room.

(I'm also curious how this impacts the webhooks. I wonder if they suffer from the same challenges - particularly with ordering).

@davidzhao
Copy link
Member

Hey @msackman, these are some good points & improvements for the SDK.

  1. For each Room, is each callback invoked in its own go-routine, or do they all get invoked from the same go-routine.

It's not entirely consistent today. For events originated from a participant, they are triggered synchronously. However, some of the room events are being triggered in their own goroutines.

If the same go-routine, what happens if the callback blocks for a while? Do other callback invocations queue up behind it?

blocking calls would be an issue here, so it's ideal to avoid performing heavy processing within the callbacks.

If every callback is invoked in its own go-routine, how can one manage ordering? What is to stop you seeing a participant leave before they join, just because of the order in which the go-runtime schedules threads?

For this reason, I think we should move to a model that fires events synchronously. clients should handle blocking queries / spin up goroutines as appropriate.

it's unlikely, but possible for webhooks to see events out of order too. We do include a timestamp of the event, but that is in seconds, and likely not granular enough. we could include a nanosecond timestamp in addition to that. wdyt?

@msackman
Copy link
Author

msackman commented Jul 15, 2022

Hi @davidzhao
Thanks for your reply.

My preference is the same as yours: a single go routine invokes all callbacks per room, in a blocking fashion. And the docs make clear that it's your responsibility to not block for very long. The user could then send them all into a channel as a reasonable way to not block the server, but maintain correctly ordered events.

For the webhooks, personally I always prefer seeing a simple counter (if it's good enough for TCP, it's good enough pretty much everywhere else too!). Yes, a nanosecond timestamp would very likely do, but then it always risks implying more - eg exactly when within the event was this timestamp taken? So if it was me doing it, I would have a simple eventCounter uint64 in some appropriate room struct, then use the atomic eventId := atomic.AddUint64(&room.eventCounter, 1), and include that in the webhooks (and for consistency and convenience, the sdk room callbacks too).

@davidzhao davidzhao changed the title Documentation for RoomCallbacks Synchronous room callbacks + documentation Jul 15, 2022
@msackman
Copy link
Author

Oh I just thought of something else; apologies if I'm stating the obvious:
Using a plain counter allows you to spot missing messages (again, what TCP does...). Using a timestamp does not.
So if you know you've received event 8 and you next receive event 10, you know you need to wait for 9 before you can act on 10. This is not possible if you just use timestamps.

@davidzhao
Copy link
Member

a plain counter would definitely work, but would make it trickier for distributed rooms (in the future). Generally speaking, we are moving to a timed verion system in all of our cross-node communication, which we might end up using for webhooks too. It's also known as a hybrid clock

@msackman
Copy link
Author

msackman commented Aug 2, 2022

Curious. Ok, I've not come across hybrid clocks before. And basically because of that I'd have probably picked vector clocks, simply because I know them very well. Either way, it means that you can no longer have a total order of events, right - if you're going for truly distributed rooms with no "leader" node, then events can be truly concurrent. I wouldn't be surprised if that raises a lot of challenges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants