Range-Based Set Reconciliation vs Merkle trees #793

Nuhvi · 2023-02-24T08:16:26Z

Nuhvi
Feb 24, 2023

Disclaimer: I am mostly learning by asking questions, not speaking from authority, so feel free to ignore this.

I asked about what does the Range-Based Set Reconciliation offer over a Merkle binary tree, and I got this response:
#707 (reply in thread) (thanks @dignifiedquire)

Usually this doesn't quite work out in practice for large trees/sets, because you don't know what side has which pieces when you are doing reconciliation, and so the amount of data you have to transfer (especially if you use hashes) becomes quite large.

I am not quite sure I understand the response so I will try to assume why would large trees need much larger data transferred, than the binary search approach.

What we know about the Merkle tree that Iroh would create:
1- Leaves are going to be all at the same depth (256) by default since keys are 32 bytes hashes.
2- Leaves are going to be very sparse since they are uniformly distributed on a very large key space.

So a typical tree or a subtree could look like this:

graph TD
    R(["Root"]) --> i0(("0"))
    R  --> i1(("1"))
    i0 --> i00(("00"))
    i0 --> i01(("01"))
    i1 --> i10(("10"))
    i1 --> i11(("11"))
    i00 --> e000(("-"))
    i01 --> e010(("-"))
    i10 --> e100(("-"))
    i11 --> e110(("-"))
    i00 --> e00(("-"))
    i01 --> L01((("011")))
    i10 --> L010((("101")))
    i11 --> L11((("111")))

Indeed too many empty and internal nodes.

We can get rid of empty branches by encoding if a given internal node has left/right or both children. So we get this tree:

graph TD
    R(["Root"]) --> i0(("0"))
    R  --> i1(("1"))
    i0 --> i01(("01"))
    i1 --> i10(("10"))
    i1 --> i11(("11"))
    i01 --> L01((("011")))
    i10 --> L010((("101")))
    i11 --> L11((("111")))

And with collapsing nodes we can remove more internal nodes to get this tree instead:

graph TD
    R(["Root"])
    R  --> L01((("011")))
    R  --> i1(("1"))
    i1 --> L010((("101")))
    i1 --> L11((("111")))

If that is possible and feasible, wouldn't that reduce the data needed to almost the same as in the case of Range-Based Set Reconciliation?

Much more elaborate documentation at Quadrable where I learned about both left/right branches encoding, and collapsing nodes.

AljoschaMeyer · 2023-02-24T09:51:32Z

AljoschaMeyer
Feb 24, 2023

@nazeh

why do all this live recalculation when a Binary Merkle tree or a Radix trie can do the same thing but only compute the tree once on write?

TLDR: Merkle stuff is vulnerable to (possibly maliciously crafted) data sets whose necessarily unique tree representation is degenerate; monoid-tree-based range-based set reconciliation protects against this case.

With the more straightforward Merkle tree approach, all peers need to agree on a single tree shape, because the tree representation of a data set affects its hash: ((a b) c) and (a (b c)) have different hashes. As a consequence, the reconciliation process with Merkle-tree-based approaches has to be guided by the exact shape of the tree: given some (sub-)tree you wish to reconcile, the only way you can partition it into smaller trees is by partitioning it into some of its subtrees.

Unfortunately, agreeing on unique tree shapes for every set such that each such tree has logarithmic height can only be done with balancing schemes for which insertion and deletion take sqrt(n) time [Sny77]. I haven't seen any project explore using such a tree, neither have I seen anyone explore solutions that stay logarithmic as long as the tree is small compared to the full universe [ST94]. The most popular alternative are probabilistic schemes, where the height of the tree is logarithmic with high probability. Sparse Merkle trees, treaps, hash-tries all fall into this category.

In the worst case, these probabilistic solutions degrade to linear height. And unfortunately, maliciously crafting worst-case instances is very much feasible. Since a straightforward implementation of Merkle reconciliation approaches partitions into subtrees, the number of roundtrips becomes linear in that case as well.

The whole idea behind the monoid-tree-based range-based set reconciliation is that partitioning choices are fully decoupled from any concrete tree representation of the (sub-)set to reconcile. Peers can always split their set in k subsets of roughly equal size, which guarantees a logarithmic number of roundtrips (and the choice of k determines the base of the logarithm, so if bandwidth is cheap but roundtrips are not, they can increase k to speed things up). This means that peers may receive fingerprints for ranges that do not fit nicely into their local tree - but their balanced local tree allows them to compute the fingerprint for such a range in logarithmic time regardless.

0 replies

AljoschaMeyer · 2023-02-24T10:00:57Z

AljoschaMeyer
Feb 24, 2023

Section 6 of the range-based paper demonstrates how these two worlds can be combined, using treaps as an example. You keep a probabilistic tree with Merkle labels, but the fingerprints you used for reconciliation are not directly those tree labels. By accepting the risk of degenerate trees, you gain the ability to use standard, non-monoidal hash functions. You still need to compute fingerprints for the ranges you receive in time proportional to the height of your tree (so linear in the worst case, but logarithmic in the expected case), but the number of roundtrips is guaranteed to be logarithmic, even if the tree is degenerate.

I haven't looked into a similar construction for sparse Merkle trees, but I imagine it should be quite straightforward.

0 replies

Nuhvi · 2023-02-24T10:29:35Z

Nuhvi
Feb 24, 2023
Author

Well, I think I understand the attack surface if not the math; in a p2p network, someone can keep creating CIDs so that the Merkle tree used by Iroh takes O(sqrt n) to sync instead of O(log n).

I wonder if this means a Strfry is vulnerable to this, especially since it uses integer keys not hashes, so it is easier to create that malicious set.

Thanks, @AljoschaMeyer for answering my question, I have had more questions about append-only logs, and now I am encouraged to ask those at Bamboo.

0 replies

AljoschaMeyer · 2023-02-24T10:50:55Z

AljoschaMeyer
Feb 24, 2023

someone can keep creating CIDs so that the Merkle tree used by Iroh takes O(sqrt n) to sync instead of O(log n).

O(n) even when the tree degenerates to a linked list. For sparse merkle trees, a worst-case tree would consist of the minimal amount of items to have a path to a leaf whose height (256) cannot be compressed.

0 replies

Nuhvi · 2023-02-24T11:16:37Z

Nuhvi
Feb 24, 2023
Author

Ah, so something like this but all the way to [0; 32]? That makes sense.

graph TD
    R(["Root"])
    R  --> i0(("0"))
    i0 --> i00(("0"))
    i0 --> L01((("01")))
    i00 --> L001((("000")))
    i00 --> i000((("001")))

And it would be worse in systems using integer keys submitted by users not hashes of content.

0 replies

ribasushi · 2023-02-27T06:42:03Z

ribasushi
Feb 27, 2023

I don't have good input to the discussion topic, but this sentence caught my attention:

I haven't seen any project explore using such a tree, neither have I seen anyone explore solutions that stay logarithmic as long as the tree is small compared to the full universe

@AljoschaMeyer the trickle dag builder initially implemented in Kubo is exactly this scheme. It looks kinda/sorta like this ( the diagram is not entirely correct ) ipfs-inactive/js-ipfs-unixfs-engine#45 (comment)

I thought @dignifiedquire ported it to iroh, but can't find the code...

The "english description" I came up with when reimplementing it way back is:

Trickle produces a "side-balanced" DAG optimized for streaming. Data blocks
further away from the stream start are arranged in nodes at increasing depth
away from the root. The "placement group" for a particular node LeafIndex
away from the stream start can be derived numerically via:

int( log( LeafIndex / MaxDirectLeaves ) / log( 1 + MaxSiblingSubgroups ) )

0 replies

AljoschaMeyer · 2023-02-28T09:56:36Z

AljoschaMeyer
Feb 28, 2023

@ribasushi Thank you, I didn't know about these. But on a first glance, it looks like these are append-only for new items, whereas for sets we need to support item insertion at arbitrary positions (it has to be a search tree, after all). My instinctual reaction to the trickle dags is that they fall into a similar category as linking schemes or transparency logs-

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Range-Based Set Reconciliation vs Merkle trees #793

{{title}}

Replies: 7 comments

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Range-Based Set Reconciliation vs Merkle trees #793

Nuhvi Feb 24, 2023

Replies: 7 comments

AljoschaMeyer Feb 24, 2023

AljoschaMeyer Feb 24, 2023

Nuhvi Feb 24, 2023 Author

AljoschaMeyer Feb 24, 2023

Nuhvi Feb 24, 2023 Author

ribasushi Feb 27, 2023

AljoschaMeyer Feb 28, 2023

Nuhvi
Feb 24, 2023

AljoschaMeyer
Feb 24, 2023

AljoschaMeyer
Feb 24, 2023

Nuhvi
Feb 24, 2023
Author

AljoschaMeyer
Feb 24, 2023

Nuhvi
Feb 24, 2023
Author

ribasushi
Feb 27, 2023

AljoschaMeyer
Feb 28, 2023