Synchronous room callbacks + documentation #75

msackman · 2022-07-01T08:39:46Z

I have some questions about the correct way to use RoomCallbacks, and I can't find any documentation. If someone could answer the following questions then I would happily write the documentation and send in a PR.

For each Room, is each callback invoked in its own go-routine, or do they all get invoked from the same go-routine.
If the same go-routine, what happens if the callback blocks for a while? Do other callback invocations queue up behind it?
If the same go-routine, is this just the same for each room (so each room has its own go-routine for callbacks), or do all rooms share the same go-routine?
If every callback is invoked in its own go-routine, how can one manage ordering? What is to stop you seeing a participant leave before they join, just because of the order in which the go-runtime schedules threads?

Because of 4, I would really love to see a chan of events and some guarantee that the events on that chan reflects the total order of events as they happened in the room.

(I'm also curious how this impacts the webhooks. I wonder if they suffer from the same challenges - particularly with ordering).

davidzhao · 2022-07-15T06:45:16Z

Hey @msackman, these are some good points & improvements for the SDK.

For each Room, is each callback invoked in its own go-routine, or do they all get invoked from the same go-routine.

It's not entirely consistent today. For events originated from a participant, they are triggered synchronously. However, some of the room events are being triggered in their own goroutines.

If the same go-routine, what happens if the callback blocks for a while? Do other callback invocations queue up behind it?

blocking calls would be an issue here, so it's ideal to avoid performing heavy processing within the callbacks.

If every callback is invoked in its own go-routine, how can one manage ordering? What is to stop you seeing a participant leave before they join, just because of the order in which the go-runtime schedules threads?

For this reason, I think we should move to a model that fires events synchronously. clients should handle blocking queries / spin up goroutines as appropriate.

it's unlikely, but possible for webhooks to see events out of order too. We do include a timestamp of the event, but that is in seconds, and likely not granular enough. we could include a nanosecond timestamp in addition to that. wdyt?

msackman · 2022-07-15T13:01:17Z

Hi @davidzhao
Thanks for your reply.

My preference is the same as yours: a single go routine invokes all callbacks per room, in a blocking fashion. And the docs make clear that it's your responsibility to not block for very long. The user could then send them all into a channel as a reasonable way to not block the server, but maintain correctly ordered events.

For the webhooks, personally I always prefer seeing a simple counter (if it's good enough for TCP, it's good enough pretty much everywhere else too!). Yes, a nanosecond timestamp would very likely do, but then it always risks implying more - eg exactly when within the event was this timestamp taken? So if it was me doing it, I would have a simple eventCounter uint64 in some appropriate room struct, then use the atomic eventId := atomic.AddUint64(&room.eventCounter, 1), and include that in the webhooks (and for consistency and convenience, the sdk room callbacks too).

msackman · 2022-07-30T17:47:26Z

Oh I just thought of something else; apologies if I'm stating the obvious:
Using a plain counter allows you to spot missing messages (again, what TCP does...). Using a timestamp does not.
So if you know you've received event 8 and you next receive event 10, you know you need to wait for 9 before you can act on 10. This is not possible if you just use timestamps.

davidzhao · 2022-08-02T00:34:57Z

a plain counter would definitely work, but would make it trickier for distributed rooms (in the future). Generally speaking, we are moving to a timed verion system in all of our cross-node communication, which we might end up using for webhooks too. It's also known as a hybrid clock

msackman · 2022-08-02T07:24:04Z

Curious. Ok, I've not come across hybrid clocks before. And basically because of that I'd have probably picked vector clocks, simply because I know them very well. Either way, it means that you can no longer have a total order of events, right - if you're going for truly distributed rooms with no "leader" node, then events can be truly concurrent. I wouldn't be surprised if that raises a lot of challenges.

davidzhao changed the title ~~Documentation for RoomCallbacks~~ Synchronous room callbacks + documentation Jul 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Synchronous room callbacks + documentation #75

Synchronous room callbacks + documentation #75

msackman commented Jul 1, 2022

davidzhao commented Jul 15, 2022

msackman commented Jul 15, 2022 •

edited

Loading

msackman commented Jul 30, 2022

davidzhao commented Aug 2, 2022

msackman commented Aug 2, 2022

Synchronous room callbacks + documentation #75

Synchronous room callbacks + documentation #75

Comments

msackman commented Jul 1, 2022

davidzhao commented Jul 15, 2022

msackman commented Jul 15, 2022 • edited Loading

msackman commented Jul 30, 2022

davidzhao commented Aug 2, 2022

msackman commented Aug 2, 2022

msackman commented Jul 15, 2022 •

edited

Loading