Hi, thanks for your interest in Irmin! Sorry for the late reply, but here are some answers to your questions:
Schema
irmin allows several "store" types (fs, mem, git, git_mem; and custom store types).
irmin allows several "content" types (string, json (objects), json_value; and custom content types).
So any schema, even an extremely complex one, can technically be implemented with irmin, irrespective of performance (discussed below).
Is that correct?
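To make that claim concrete, here is a sketch, deliberately independent of the Irmin API so it runs with the OCaml stdlib alone, of flattening a structured record into the path-to-contents shape a key-value store expects. The `order` record and the `/orders/<id>/...` layout are hypothetical:

```ocaml
(* Sketch (no Irmin dependency): any structured schema can be flattened
   into (path, contents) pairs, the shape an Irmin KV store holds.
   The record and path layout here are hypothetical. *)
type order = { id : int; customer : string; total : float }

(* Encode one order as (path, contents) pairs under /orders/<id>/... *)
let to_kv (o : order) : (string list * string) list =
  let base = [ "orders"; string_of_int o.id ] in
  [ (base @ [ "customer" ], o.customer);
    (base @ [ "total" ], string_of_float o.total) ]

let () =
  to_kv { id = 42; customer = "alice"; total = 9.5 }
  |> List.iter (fun (path, v) ->
         Printf.printf "/%s -> %s\n" (String.concat "/" path) v)
(* prints:
   /orders/42/customer -> alice
   /orders/42/total -> 9.5 *)
```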
Performance (speed) / Applications
What kind of application can be implemented advantageously with irmin instead of a (traditional) full-featured RDBMS?
What kind of application must still be built on an RDBMS, and for which reasons?
In particular, an RDBMS has optimization mechanisms (indexes, query planning, caching) that relieve the client of heavy or repetitive computation. How should this be handled when using irmin? Should we reimplement these mechanisms in a dedicated program running on the same server? Or does irmin have built-in optimization mechanisms?
(Transactional) consistency of a distributed store
I read: "irmin is a distributed database". I first understand that as:
"I configure an irmin store (complying with my very complex schema) and host it at some location accessible via TCP/IP. This store can answer simultaneous requests in a client-server mode, with a secure transaction mechanism ensuring the store's integrity at the level of ONE irmin server."
Of course, a backup copy of this store is done automatically, because that's IT life.
In addition, I can decide to distribute this store as a number of replicas located closer to their clients/users. This is intended to reduce request/response latency (closer, and especially more computation capacity). It also increases availability, because at least one replica should always be reachable (if things go well), even at the cost of increased latency (if more requests converge on the last available replica).
The key question is: while there are ongoing transactions, how are the replicas synchronized in a way that preserves the store's integrity?
For a slow process with a possible time limit, there is no problem. For example, after an order is created, it can be prepared and shipped. Since a cut-off time (4 pm) can be imposed, orders are prepared if and only if time > 4 pm (with possible predictive preparation, depending on other constraints).
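The 4 pm cut-off is just a predicate on the clock, so the slow case needs no coordination between replicas. A minimal sketch, where the `now_hour` parameter is a hypothetical stand-in for a real clock source:

```ocaml
(* Sketch of the cut-off rule: prepare an order only after 4 pm (16:00).
   `now_hour` stands in for a real clock; any replica can evaluate this
   locally without coordinating with the others. *)
let ready_to_prepare ~now_hour = now_hour > 16

let () =
  assert (not (ready_to_prepare ~now_hour:10));
  assert (ready_to_prepare ~now_hour:17);
  print_endline "cut-off rule ok"
```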
But for high-speed activity, how is this handled by irmin? Or how should the architect of the distributed store handle it?
I imagine that each new transaction should lock the manipulated objects on ALL replicas (which can be a struggle), then commit or roll back, then unlock those objects on ALL replicas. This is simple and reliable, but it has a cost (communication).
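That lock/commit/unlock protocol can be simulated in plain OCaml, with no Irmin involved. The replica names, the in-process `Hashtbl` lock table, and the all-or-nothing `lock_all` are hypothetical stand-ins for real network RPCs with timeouts:

```ocaml
(* Sketch: lock on ALL replicas, commit, then unlock, simulated
   in-process. Replica names and the lock table are hypothetical. *)
let replicas = [ "eu-west"; "us-east"; "ap-south" ]

(* Per-replica lock table: (replica, object key) -> held. *)
let locks : (string * string, bool) Hashtbl.t = Hashtbl.create 16

let try_lock replica key =
  if Hashtbl.mem locks (replica, key) then false
  else (Hashtbl.replace locks (replica, key) true; true)

let unlock replica key = Hashtbl.remove locks (replica, key)

(* Acquire the lock on every replica, rolling back partial
   acquisitions if any replica refuses. *)
let lock_all key =
  let rec go acquired = function
    | [] -> Ok ()
    | r :: rest ->
        if try_lock r key then go (r :: acquired) rest
        else begin
          List.iter (fun r' -> unlock r' key) acquired;
          Error ("replica busy: " ^ r)
        end
  in
  go [] replicas

let () =
  match lock_all "orders/42" with
  | Ok () ->
      print_endline "locked everywhere; commit, then unlock";
      List.iter (fun r -> unlock r "orders/42") replicas
  | Error e -> print_endline ("abort: " ^ e)
```

The cost the question points at is visible even in this toy: every transaction pays one round trip per replica to lock and another to unlock, before any useful work happens.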
How does irmin address that key requirement?
Thanks.