Checkpoint store in the target cluster #8

Open
alexeyzimarev opened this issue Mar 15, 2021 · 9 comments

@alexeyzimarev
Member

No description provided.

@thinkingerrol

I've encountered jumps back in time when reading a stream from a replica Event Store, whereas in the original Event Store the timestamps embedded in my data always increased. I wonder if it's a wrong configuration on my part, or will this always happen whenever I restart the replicator? My replica Event Store was built from an older backup of the /var/lib/eventstore directory, where all the chunks are, and was then fed only by the Replicator.

@ylorph
Contributor

ylorph commented Dec 3, 2021

What do you mean by "jumped back in time"?
If you don't apply any transformation, no data is changed
(except for the event timestamp, which is system metadata and should not be used for any application purpose).
Replicator uses a checkpoint from the source store in order to know where it is.
And as you know, timestamps and time are not reliable: servers do go back and forward in time when they synchronize their clocks with an NTP server.

@thinkingerrol

Thanks for the quick reply. By the way, the replicator is quite solid; I especially like how easy it is to set up based on your docker-compose example. I probably did something wrong. My transformation function is: function transform(original) {return original;}. I use the timestamp from our own metaData inside the event (copied by the replicator), not the system metadata event timestamp, which is freshly stamped on every replicated event. By "jumping back in time" I mean jumping back by 8 days at a certain point when reading our custom ALL stream from the replica Event Store. That stream is filled by the following projection, which is actively running in both Event Stores:

fromAll()
  .when({
    $any: function(stream, event, metaData) {
      linkTo('ALL', event, metaData);
    }
});

Maybe I should exclude such streams from replication, or disable such projections in the Replica Event Store?

@ylorph
Contributor

ylorph commented Dec 3, 2021

That seems strange. What are you using that replica for?
You know you have the $all stream that you can read or subscribe from, and most clients we maintain support that, if not now, then in the very near future?

@thinkingerrol

thinkingerrol commented Dec 3, 2021

I think the reason and the difference is that $all additionally contains tons of system events like $statsCollected, $statistics-report, $result, $state etc., while we just wanted a stream to read all our events from, regardless of which stream they were originally published to.

Our replica Event Store is used for running heavier projections for monitoring and debugging purposes, which could otherwise degrade the main Event Store performance.
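
For readers who only need to skip system events on the consuming side, a minimal sketch follows. It assumes events are plain objects with `eventType` and `streamId` fields, and that system events can be recognized by a leading `$` (as in the `$statsCollected` and `$result` examples above). The helper names and the sample stream names are hypothetical, not part of any Event Store client.

```javascript
// Hypothetical helper: treats any event whose type or stream name
// starts with "$" as a system event. This prefix rule is an assumption
// based on the system event names mentioned in this thread.
function isSystemEvent(event) {
  const type = event.eventType || "";
  const stream = event.streamId || "";
  return type.startsWith("$") || stream.startsWith("$");
}

// Keep only application events from a batch of events read from $all.
function applicationEvents(events) {
  return events.filter((e) => !isSystemEvent(e));
}

// Made-up sample batch for illustration only.
const sampleBatch = [
  { eventType: "$statsCollected", streamId: "$stats-example" },
  { eventType: "EntityAssigned", streamId: "Assignment-Stream" },
  { eventType: "$result", streamId: "$example-result" },
];

console.log(applicationEvents(sampleBatch).map((e) => e.eventType));
```

This filters after reading, so the system events still travel over the wire; it only keeps them out of the application's processing loop.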

@ylorph
Contributor

ylorph commented Dec 3, 2021

I guess you're on v5.x. If you upgrade to v21.10.x, then those are not in the database itself anymore (that applies to the stats-related streams; $result and $state are still in).
(And you get server-side filtered reads as a bonus with the gRPC-based clients: https://developers.eventstore.com/clients/grpc/subscriptions.html#filter-options)

@alexeyzimarev
Member Author

In addition, version 20+ allows applying server-side filters on subscriptions to $all, which is a much better way to get the same outcome, as it won't require running a replica server.

It's also possible to use a server-side projection to link all the events to a certain stream, so it will become a pseudo-$all. But then again, the only use case I am aware of for this is persistent subscriptions. And we will soon release the client that supports persistent subscriptions to $all, as the server has supported it since 21.6 (I believe).

@alexeyzimarev
Member Author

Using the target server to store checkpoints essentially doubles the write load to the target, so I always had doubts about the usefulness of this feature...

@thinkingerrol

> Using the target server to store checkpoints essentially doubles the write load to the target, so I always had doubts about usefulness of this feature...

I think you are right and storing the checkpoint in a file configurable via replicator.yml is optimal.

The use case that I'm missing is being able to bootstrap the target by copying the /var/lib/eventstore directory from the source Event Store, to give the whole replication process a head start. Currently, I think only an empty target Event Store is supported, until the checkpoint file has been written by the replicator for the first time.
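
A toy model of why a pre-populated target seems to need a matching checkpoint is sketched below. It is not the Replicator's implementation: the stores are plain arrays, the checkpoint is a simple index into the source, and the `replicate` function is hypothetical. It only assumes that, without a checkpoint, replication starts from the beginning of the source.

```javascript
// Toy model: "stores" are arrays and the checkpoint is an index into
// the source. Everything from the checkpoint onward is appended to
// a copy of the target.
function replicate(source, target, checkpoint) {
  const result = target.slice();
  for (let i = checkpoint; i < source.length; i++) {
    result.push(source[i]);
  }
  return result;
}

const source = ["e1", "e2", "e3", "e4", "e5"];
// Target bootstrapped from an older backup that already holds e1..e3.
const backup = source.slice(0, 3);

// No checkpoint yet: replication starts at 0 and re-appends e1..e3.
const withoutCheckpoint = replicate(source, backup, 0);
// Checkpoint aligned with the backup: only e4 and e5 are appended.
const withAlignedCheckpoint = replicate(source, backup, backup.length);

console.log(withoutCheckpoint.join(","));
console.log(withAlignedCheckpoint.join(","));
```

In this model the first run ends up with the backed-up events twice, which matches the symptom of the first event reappearing after the last backed-up one.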

I'm using Event Store 5.0.9.0 and Replicator 0.4.1

By unsupported I mean exactly these jumps backwards in time by multiple days. Example: I wrote a Python script which prints the first event of a stream and then scans the rest of it for temporal inconsistencies:

Fetching events from http://localhost:2113/streams/Assignment-Stream/0/forward/4000
2021-11-16T14:20:56.934Z        498613ec-2496-4d4f-9090-2aa1733a0ca1    EntityAssigned        Assignment-Stream
.................................3339: timestamp jumped back in time by -P18DT6H4M41.791S within Assignment-Stream:
2021-12-04T20:25:38.725Z        7b82011c-87b3-4144-bf32-68c8de2d7bed    EntityUnassigned    Assignment-Stream
2021-11-16T14:20:56.934Z        498613ec-2496-4d4f-9090-2aa1733a0ca1    EntityAssigned        Assignment-Stream
.........

2021-12-04 is the date of the last backup of the /var/lib/eventstore directory from the source ES (copied over to target ES before starting the Replicator, to minimize the initial gap)
2021-11-16 is the date of the first event in the source ES, which after starting the Replicator now appears twice in my target ES, even with the same eventId which I thought was supposed to be unique.
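
The kind of check described above can be sketched roughly as follows. This is not the original Python script: it assumes the events have already been fetched into an array of `{ eventId, timestamp }` objects with ISO 8601 timestamps, and the function names are hypothetical. The sample data reuses the three events from the output above.

```javascript
// Report every position where the embedded timestamp is earlier than
// the timestamp of the preceding event.
function findBackwardJumps(events) {
  const jumps = [];
  for (let i = 1; i < events.length; i++) {
    const previous = Date.parse(events[i - 1].timestamp);
    const current = Date.parse(events[i].timestamp);
    if (current < previous) {
      jumps.push({ index: i, millisecondsBack: previous - current });
    }
  }
  return jumps;
}

// Report event IDs that occur more than once in the stream.
function findDuplicateIds(events) {
  const seen = new Set();
  const duplicates = new Set();
  for (const e of events) {
    if (seen.has(e.eventId)) duplicates.add(e.eventId);
    seen.add(e.eventId);
  }
  return [...duplicates];
}

const observed = [
  { eventId: "498613ec-2496-4d4f-9090-2aa1733a0ca1", timestamp: "2021-11-16T14:20:56.934Z" },
  { eventId: "7b82011c-87b3-4144-bf32-68c8de2d7bed", timestamp: "2021-12-04T20:25:38.725Z" },
  { eventId: "498613ec-2496-4d4f-9090-2aa1733a0ca1", timestamp: "2021-11-16T14:20:56.934Z" },
];

console.log(findBackwardJumps(observed));
console.log(findDuplicateIds(observed));
```

Checking for repeated event IDs alongside the timestamp scan helps distinguish a replayed event from a genuine clock adjustment.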

As a workaround I will simply start with an empty target Event Store. Clearly my use case could only work in the absence of any kind of transformation or filtering during replication - which I don't need at the moment.

3 participants