Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node: Make stream sync more resilient #408

Merged
merged 16 commits into from
Jul 24, 2024
Merged

node: Make stream sync more resilient #408

merged 16 commits into from
Jul 24, 2024

Conversation

bas-vk
Copy link
Contributor

@bas-vk bas-vk commented Jul 15, 2024

Currently is one or more streams from a client sync are managed by a remote node that goes down/or is unreachable the entire stream sync subscription is cancelled. This is undesirable.

With this change a new sync op message type SyncOp_SYNC_DOWN is introduced. When the node that managed the subscription for the client detects a remote node is down it will send SyncOp_SYNC_DOWN message to the client to report that it will not get updates for this stream and drops all streams managed by the remote node from the existing subscription. It is up to the client after that to add these streams again.

Architecture

Introduces the rpc/sync package. This defines a handler interface that the service uses to forward incoming client requests. The handlerImpl implements this interface and keeps a mapping of active StreamSyncOperation. It will either create a subscription (SyncStreams) or find and forwards the request to the sync operation. The sync operation manages a set of syncers. A syncer is either local (streamCache) or remote (grpc api). Syncers are defined in the rpc/sync/client package. Syncers write messages to an internal channel from which the sync operation sends them to the client.

@bas-vk bas-vk force-pushed the basvk/syncstreams branch 5 times, most recently from 671a903 to a590464 Compare July 16, 2024 13:01
@bas-vk bas-vk marked this pull request as ready for review July 16, 2024 14:07
@bas-vk bas-vk requested review from sergekh2 and texuf as code owners July 16, 2024 14:07
@bas-vk bas-vk force-pushed the basvk/syncstreams branch 4 times, most recently from f8ccbdb to 35cd87f Compare July 16, 2024 15:59
@@ -694,6 +694,7 @@ message SyncStreamsResponse {
SyncOp sync_op = 2;
StreamAndCookie stream = 3;
string pong_nonce = 4;
bytes stream_id = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like it's a good idea to add some comments to this struct.

GetStreamId() string
}

func ctxAndLogForRequest[T any](ctx context.Context, req *connect.Request[T]) (context.Context, *slog.Logger) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's suboptimal to have a copy in rpc and here. Lets make it public (needs a new package?)

nodeRegistry: nodeRegistry,
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To highlight interface comformance:

Suggested change
var _ Handler = (*handlerImpl)(nil)
var _ DebugHandler = (*handlerImpl)(nil)

// StreamCookieSetGroupedByNodeAddress is a mapping from a node address to a SyncCookieSet
StreamCookieSetGroupedByNodeAddress map[common.Address]SyncCookieSet
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add var _ to indicate which structs should implement which interfaces

}

// use the syncID as used between client and subscription node
msg.SyncId = syncOp.SyncID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But is there really a need to set it if it's not a SYNC_NEW op? (Or rather should it be empty for all other ops?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be set for SYNC_NEW and SYNC_CLOSE. It is debatable if it needs to be set for other updates. Currently it is set and several unittests test (go + sdk) do check for it.

delete(ss.syncers, syncerAddr)
ss.muSyncers.Unlock()
ss.syncerTasks.Done()
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this capture work correctly? should syncerAddr, syncer be args to the goroutine instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From go 1.22 this isn't an issue anymore.

ss.streamID2Syncer[streamID] = syncer

ss.syncerTasks.Add(1)
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this can be a functions on ss, then here and above in can be go-called instead of same code.

type remoteSyncer struct {
syncStreamCtx context.Context
syncStreamCancel context.CancelFunc
syncID atomic.Value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems it doesn't have to be atomic: it is written once during object creation?

}
case <-s.syncStreamCtx.Done():
return
case <-s.syncStreamCtx.Done():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same case twice


latestMsgReceived.Store(time.Now())

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put this func in the method and go-call it here: it's relatively long and makes Run() less readable as such

@bas-vk bas-vk force-pushed the basvk/syncstreams branch from f5a2588 to d7c8309 Compare July 22, 2024 13:16
@bas-vk bas-vk force-pushed the basvk/syncstreams branch from d7c8309 to 53f0374 Compare July 24, 2024 07:06
@bas-vk bas-vk merged commit cafc5a5 into main Jul 24, 2024
6 of 8 checks passed
@bas-vk bas-vk deleted the basvk/syncstreams branch July 24, 2024 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants