-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MESDK-88 Minimal v2 persistence #2801
Conversation
🦋 Changeset detectedLatest commit: 075d664 The changes in this PR will be included in the next version bump. This PR includes changesets to release 30 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Tests are now green, however dumping all messages into a single JSON file on every update is not a tenable design even for this first PR. When there a multiple writers, the large-file save lock-contention eventually produces errors (see below). options
cc: @janfjohannes @martin-lysk @samuelstroschein
|
@martin-lysk @jldec my last status was that you use many small files with a locking mechanism for persistnece. what changed since then? |
go for option 1: split by bundle (we have to do this anyways). for prototyping you likely don't need to re-implement @martin-lysk's namespacing algorithm to avoid more than 1000 files per directory.
i think @jldec wants to incrementally build the feature. dropping in one file seemed good enough but turned out to be insufficient even for prototyping |
@samuelstroschein could you clarify the reason why you think we have to do this? I will look at both options to see which will unblock us to ship this PR soonest. File-per-message has more complexity which may not be justified given our immediate goal of shipping the v2 persistence api for apps asap.
@janfjohannes, the POC for new sdk persistence with a file-per-message was removed from PR 2108 in order to limit the scope of that PR and address data-loss issues first. The plan to build bidirectional data-sync between plugins and sdk files was simplified to ship the new persistence separately, behind a feature flag, which is what we're doing in this PR. The locking mechanism currently in place is a global (project-level) lock. We never implemented per-file locking. |
One more comment about file-per-message persistence, and watching those files for udpates: We currently use fs.watch to detect changes on disk (e.g. when the cli or git modifies message files directly), and we know that the current fs watcher is not reliable (see MESDK-91) I think we can ship this PR without implementing file watching - but thinking ahead: I would like to avoid registering a separate watcher per file per message for the same scalability reasons as avoiding reactivity per message in the core. There are other ways to do this e.g. by watching the root directory of the persisted tree or by using a different store-level write-coordination mechanism. |
i think a good batching algorithm is needed in any case and would always be my first step before going into the other optimizations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had a look into the PR to clarify questions.
With a look at the query api interface and with the discussion @jldec and me had via around I feel like we are not 100% alinged about the steps forward and the subscribe functionality exposed by the SDK.
* E.g. `await project.messageBundles.get({ id: "..." })` | ||
**/ | ||
export interface Query<T> { | ||
get: (args: { id: string }) => Promise<T | undefined> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clearification - will the returned object T expose a .subscribe later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why on T and not a separate subscribe method?
query.get("id")
query.subscribe("id")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Samuel's answer is closer to what I was planning.
You won't be able to subscribe directly to stored entities at this level of the api, but a layer above this should be able to provide this capability so that apps don't need to implement their own change tracking. As mentioned below, I will rename this api to Store (instead of Query) to make this clearer.
**/ | ||
export interface Query<T> { | ||
get: (args: { id: string }) => Promise<T | undefined> | ||
set: (args: { data: T }) => Promise<void> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be split into insert, update and maybe upsert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The api of rxdb looks pretty clean and carefully drafted see I use this as an inspiration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I would just copy the mango query builder and be done with discussions how our query Api should look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the argument is that a mango query builder would take 1-2 weeks longer to implement, i would invest those 1-2 weeks:
- having a basic query builder now will ship the SDK faster but slow down apps (aka no speed advantage)
- breaking change in the future because apps will ask for more sophisticated querys
the implementation can be non-performance optimized. as long as the API is set, apps have an easy time querying now and we avoid a breaking change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's helpful guidance.
The query builder api needs to live in a layer above this single entity, low-level store api, so that it can include other entities like lint reports. I will rename this api to Store<T>
(instead of Query<T>
) to make this clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay Store instead of Query makes sense.
the implementation can be non-performance optimized. as long as the API is set, apps have an easy time querying now and we avoid a breaking change
Addding to this remark: The query API doesn't even need to be complete. Functionality like filters ($gte
) can be added incrementally.
inlang/source-code/sdk/src/persistence/filelock/acquireFileLock.ts
Outdated
Show resolved
Hide resolved
@martin-lysk i understand that @jldec (correct me if i am wrong) want's to unblock the message component (cc @NilsJacobsen) by shipping persistency in this PR without backwards compatibility faster, then think about backwards compatibility afterwards (if it is even needed)
|
This is step 1 toward full persistence and apis for v2 MessageBundles and resolves opral/inlang-sdk#50.
The (bare-bones) store dumps messageBundles into a single JSON file.
The goal of this PR is to be able to run the load-test and multi-project-test with v2 MessageBundles persisted to disk using the experimental persistence feature flag.
inlang machine translate
, adapted to a new CRUD api.Only the minimum read/write CRUD api necessary for the above will be included in this PR. This api should be considered internal / temporary.
project.store.messageBundles
CRUD api.project.query.*
apiinlang machine translate
to call the store api behind experimental persistence flag.to test
$
DEBUG=sdk:store,sdk:fileLock,sdk:batchedIO pnpm --filter @inlang/sdk-load-test test
not included in this PR (but planned in future PRs)
Note
Because the existing messagesQueryApi is stubbed out, existing apps will no longer work in projects with the experimental persistence feature flag turned on.