Rollback support & Transaction inclusion tracking #230

rpanic · 2024-11-21T14:14:11Z

This spec outlines a generic system for L2 and L1 interactions that takes into account L1 reorgs.

This mainly consists of three parts:

Monitoring of any sent L1 transactions
Considerations about the treatment and finality of L1 actions affecting the L2
Rollback support in case of reorgs that go beyond the justification period

The implementation should have very few dependencies on the concrete properties of the L2 itself, and should instead serve as a library that makes as few assumptions about the L2 as possible. However, assumptions about the L1 are necessary and expected.

Requirements:

Soundness in all configurations
Configurability of justification period
Rollback execution on reorgs to be O(n) in the reorg length

Preliminaries:

Finality on the Mina L1 is currently around 250 blocks (~23 hours), though most exchanges accept blocks after 20 confirmations

States

For any object, we define the following states:

Pending
Preconfirmed (L2 confirmed)
Settled (settlement included in an L1 block)
Justified (considered unlikely to be reorged)
Finalized (mathematically impossible to be reorged)

Current objects include: transactions, incoming messages, blocks, batches and settlements. It's worth noting, however, that those objects transfer their state properties to the next higher-order object once they get included in it. E.g. txs included in a block derive their state from the block's state, the block from the batch, and so forth.
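As a minimal sketch of that derivation rule (the class and field names here are hypothetical and not taken from any existing codebase), an object can simply delegate to whatever higher-order object it was included in:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    PENDING = "pending"
    PRECONFIRMED = "preconfirmed"
    SETTLED = "settled"
    JUSTIFIED = "justified"
    FINALIZED = "finalized"


@dataclass
class TrackedObject:
    """Hypothetical wrapper for a tx, message, block, batch or settlement."""
    name: str
    own_state: State = State.PENDING
    parent: Optional["TrackedObject"] = None  # the object this one is included in

    def effective_state(self) -> State:
        # Once included in a higher-order object, the state is derived from it.
        if self.parent is not None:
            return self.parent.effective_state()
        return self.own_state


# tx -> block -> batch -> settlement
settlement = TrackedObject("settlement", State.SETTLED)
batch = TrackedObject("batch", parent=settlement)
block = TrackedObject("block", parent=batch)
tx = TrackedObject("tx", parent=block)
```

With this shape, only the outermost object has to be updated when the L1 progresses or reorgs; everything included in it follows automatically.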

Behaviours

L2 -> L1 (settlement, outgoing messages)

Mempool: Pending
Included in L2 block: Preconfirmed
Settlement of object included in L1 block: Settled
L1 block of settlement passed justification period: Justified
L1 block final: Finalized
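The L2 -> L1 transitions above could be computed roughly as follows. This is a sketch under assumptions: confirmations are counted as the L1 tip height minus the inclusion height, and all parameter names are made up for illustration.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    PENDING = "pending"
    PRECONFIRMED = "preconfirmed"
    SETTLED = "settled"
    JUSTIFIED = "justified"
    FINALIZED = "finalized"


def outgoing_state(
    in_l2_block: bool,
    settlement_l1_height: Optional[int],  # None while not included in an L1 block
    l1_tip_height: int,
    justification_period: int,  # configurable, in L1 blocks
    finality_depth: int,        # e.g. around 250 on Mina, per the preliminaries
) -> State:
    if settlement_l1_height is None:
        return State.PRECONFIRMED if in_l2_block else State.PENDING
    # Assumption: confirmations = distance between the L1 tip and the inclusion block.
    confirmations = l1_tip_height - settlement_l1_height
    if confirmations >= finality_depth:
        return State.FINALIZED
    if confirmations >= justification_period:
        return State.JUSTIFIED
    return State.SETTLED
```

Note that a justification period of 0 makes every settled object justified immediately, which matches the based sequencing configuration described further down.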

L1 -> L2 (incoming messages)

L1 transaction in L1 mempool: Pending (or more likely nothing)
L1 transaction included in unjustified L1 block: Pending
L1 block justified -> message will be included in an L2 block: Preconfirmed
Message execution is settled: Settled
L1 block of settlement passed justification period: Justified
L1 block final: Finalized
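For the L1 -> L2 direction, the key step is that a message only becomes eligible for inclusion once its L1 block is justified. A small sketch of that filter, with hypothetical names and the same tip-minus-inclusion-height assumption for counting confirmations:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IncomingMessage:
    message_id: str
    l1_height: int  # height of the L1 block that included the message


def includable_messages(
    messages: List[IncomingMessage],
    l1_tip_height: int,
    justification_period: int,
) -> List[IncomingMessage]:
    """Return the messages whose L1 block has passed the justification period.

    Messages in younger L1 blocks stay Pending and are not handed to the
    L2 block builder yet.
    """
    return [
        m for m in messages
        if l1_tip_height - m.l1_height >= justification_period
    ]


inbox = [
    IncomingMessage("deposit-a", l1_height=100),
    IncomingMessage("deposit-b", l1_height=115),
]
ready = includable_messages(inbox, l1_tip_height=120, justification_period=20)
```

In this example only `deposit-a` is ready, since `deposit-b` has just 5 confirmations.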

Configuration scenarios

Based sequencing

Since, in based sequencing, the L2's state is always directly dependent on the L1's state, it is safe to set the justification period to 0 (i.e. everything that is included in an L1 block is immediately justified). That is because a reorg on the L1 has the exact same effect on the L2, and users should expect that behavior in that mode of operation.
However, this also has the drawback that L2 rollbacks happen rather often - namely every time the L1 reorgs. In based sequencing, every L1 block where some transaction is submitted on L1 is a volatile event (see below).

Hybrid sequencing

The justification period has to cover the following scenarios:

Likelihood of short-range forks
  Impact mainly on UX - frequent pre-confirmation rollbacks
  Also - rollbacks are expensive to compute
MEV attack angles
  Block producers can introduce short-range forks to exploit users for MEV

Rollback support

The architecture for rollbacks already exists with our layered / masked services architecture.

Tldr on Masking

One can think of masking as an aggregated changelog for incremental time periods.
Let's say we operate on a state $S_1$ and apply some operations on it to arrive at $S_2$. When we use our masked services, under the hood we have two services $C_1, C_2$, where $C_2$ only holds a reference to $C_1$ (its parent). Whenever writes to $C_2$ happen, they get stored there, but reads pass through to $C_1$: if some value is not present in $C_2$, it is fetched from $C_1$.
Basically, we lay a "mask" over $C_1$ and execute our operations on that mask. That has the effect that the original state stays untouched and the mask instead obtains the state diff $S_2 \setminus S_1$. This stays that way until we either apply the state diff to $C_1$, at which point $C_1$ holds the full $S_2$ and $C_2$ is empty (everything passed through the "transparent mask"), or discard $C_2$, and it will be like nothing ever happened.
The same principle can be applied repeatedly: a mask can itself serve as the parent of another mask.
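To make the idea concrete, here is a minimal in-memory sketch of a masked key-value store. It is not the actual service implementation; `MaskedStore` and its methods are hypothetical, and deletions are left out for brevity.

```python
from typing import Dict, Optional


class MaskedStore:
    """A store that records writes locally and reads through to its parent."""

    def __init__(self, parent: Optional["MaskedStore"] = None) -> None:
        self.parent = parent
        self.writes: Dict[str, int] = {}  # the state diff held by this mask

    def get(self, key: str) -> Optional[int]:
        if key in self.writes:
            return self.writes[key]
        if self.parent is not None:
            return self.parent.get(key)  # pass-through read
        return None

    def set(self, key: str, value: int) -> None:
        self.writes[key] = value  # never touches the parent

    def merge_into_parent(self) -> None:
        """Apply the diff to the parent; afterwards this mask is empty."""
        if self.parent is None:
            raise ValueError("base store has no parent to merge into")
        self.parent.writes.update(self.writes)
        self.writes.clear()

    def discard(self) -> None:
        """Drop the diff, as if the operations never happened."""
        self.writes.clear()


c1 = MaskedStore()            # holds S_1
c1.set("alice", 10)
c1.set("bob", 5)
c2 = MaskedStore(parent=c1)   # mask over C_1
c2.set("alice", 7)            # C_2 now sees S_2, C_1 still holds S_1
```

Here `c2.get("alice")` returns the masked value 7, `c2.get("bob")` falls through to `c1` and returns 5, and `c1.get("alice")` is still 10 until the mask is merged.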

This architecture allows us to efficiently roll back changes made to the internal state without having to perform operations on the entire state and also without reversing individual operations.

Rollbacks on L1 reorgs

We can make the following observation: L1 state progresses and reverts strictly in-order. For example, for states $S_1 \rightarrow S_2 \rightarrow S_3$, the system cannot revert $S_2$ without also reverting $S_3$.
This allows us to apply the same to our masking flow.
So the strategy is to create a new mask at every point in time where state changes can be reorged. In this case, that is every settlement or L1 message injection that we do.
We call the points where those $S_n$ change "volatile events".

So, during operation of the rollup, we need to keep constructing a mapping between the volatile events and the masks that we used for this period. After that, when a reorg happens, we can look at this mapping and determine the masks that we need to remove to then start re-executing the new fork.
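One way to picture this mapping is an ordered stack of (volatile event, mask) pairs, where a rollback pops masks from the top until the last still-valid event is reached. The sketch below uses plain dicts as masks and hypothetical names throughout; the cost of `rollback_to` is proportional to the number of dropped masks, in line with the O(n) requirement.

```python
from typing import Dict, List, Optional, Tuple


class MaskStack:
    def __init__(self, base: Dict[str, int]) -> None:
        self.base = dict(base)
        # (volatile event id, state diff), oldest first
        self.layers: List[Tuple[str, Dict[str, int]]] = []

    def open_mask(self, event_id: str) -> None:
        """Start a new mask when a volatile event happens."""
        self.layers.append((event_id, {}))

    def set(self, key: str, value: int) -> None:
        if not self.layers:
            raise RuntimeError("open a mask before writing")
        self.layers[-1][1][key] = value

    def get(self, key: str) -> Optional[int]:
        for _, diff in reversed(self.layers):  # newest mask wins
            if key in diff:
                return diff[key]
        return self.base.get(key)

    def rollback_to(self, last_valid_event_id: Optional[str]) -> List[str]:
        """Drop every mask opened after the given event (all masks if None
        or if the id is unknown). Returns the dropped event ids, newest first."""
        dropped: List[str] = []
        while self.layers and self.layers[-1][0] != last_valid_event_id:
            dropped.append(self.layers.pop()[0])
        return dropped


stack = MaskStack({"alice": 10})
stack.open_mask("settlement-1")
stack.set("alice", 8)
stack.open_mask("message-2")
stack.set("bob", 3)
stack.open_mask("settlement-3")
stack.set("alice", 1)
```

If a reorg invalidates everything after `settlement-1`, calling `stack.rollback_to("settlement-1")` discards the two newer masks without touching the base state or reversing individual operations.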

Rebuilding the chain

The strategy of the sequencer to rebuild the chain is the following:

For all previous volatile events still existent on the new fork in the right order (note: enforced by L1 nonces), keep as-is. Update state to the correct one.
For all previous volatile events not existent, but resubmittable (i.e. settlements without changes in incoming messages), resubmit. Update state.
For all other volatile events, roll them back:
  1. Roll back state as described in a).
  2. Re-fetch inputs to masking event (L1 messages).
  3. Rebuild L2 blocks with the minimal amount of changes possible:
    1. Keep L2 tx order if possible.
    2. Keep incoming messages order as well as possible.
  4. Create new settlement if applicable.
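The three cases above can be sketched as a single pass over the previous volatile events. This is only an illustration with hypothetical types; in particular, it assumes that once one event has to be rolled back, all later events are rolled back too, following the strictly in-order observation made earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Action(Enum):
    KEEP = "keep"
    RESUBMIT = "resubmit"
    ROLLBACK = "rollback"


@dataclass(frozen=True)
class VolatileEvent:
    event_id: str
    resubmittable: bool  # e.g. a settlement whose incoming messages are unchanged


def plan_rebuild(
    previous: List[VolatileEvent], new_fork_ids: List[str]
) -> Dict[str, Action]:
    plan: Dict[str, Action] = {}
    cursor = 0            # position in the new fork after the last kept event
    rolling_back = False
    for event in previous:
        if rolling_back:
            plan[event.event_id] = Action.ROLLBACK
            continue
        try:
            # "Still existent in the right order": found at or after the cursor.
            position = new_fork_ids.index(event.event_id, cursor)
        except ValueError:
            position = -1
        if position >= 0:
            plan[event.event_id] = Action.KEEP
            cursor = position + 1
        elif event.resubmittable:
            plan[event.event_id] = Action.RESUBMIT
        else:
            plan[event.event_id] = Action.ROLLBACK
            rolling_back = True  # assumption: later events depend on this one
    return plan


previous = [
    VolatileEvent("settlement-1", resubmittable=True),
    VolatileEvent("message-2", resubmittable=False),
    VolatileEvent("settlement-3", resubmittable=True),
]
plan = plan_rebuild(previous, new_fork_ids=["settlement-1"])
```

In this example `settlement-1` is kept, while `message-2` and everything after it is rolled back and rebuilt.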

a) Things to do when actually "rolling back":

Throw away masks
Delete settlements
Delete batches
Delete blocks -> re-add transactions onto mempool
Delete pending L1 messages up to the last valid volatile event
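As a rough sketch of the "delete blocks -> re-add transactions" step (hypothetical types; it assumes the chain list is sorted by height and that re-added transactions should go in front of the mempool to preserve their original order):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    height: int
    txs: List[str]


def rollback_blocks(
    chain: List[Block], mempool: List[str], first_invalid_height: int
) -> List[Block]:
    """Remove all blocks at or above first_invalid_height from the chain and
    put their transactions back into the mempool. Returns the removed blocks."""
    removed = [b for b in chain if b.height >= first_invalid_height]
    # The chain is sorted by height, so the removed blocks form its tail.
    del chain[len(chain) - len(removed):]
    # Re-add in original L2 order, ahead of transactions that arrived later.
    mempool[:0] = [tx for b in removed for tx in b.txs]
    return removed


chain = [Block(1, ["tx-a"]), Block(2, ["tx-b", "tx-c"]), Block(3, ["tx-d"])]
mempool = ["tx-e"]
removed = rollback_blocks(chain, mempool, first_invalid_height=2)
```

Afterwards the chain ends at height 1 and the mempool holds `tx-b`, `tx-c`, `tx-d`, `tx-e` in that order, ready for block rebuilding.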

Rebuilding blocks (Hybrid mode)

Rough priorities when rebuilding L2 blocks after rollbacks:

Keep L2 txs in the same order
Keep L1 txs in order, where applicable. If we encounter a different ordering,

Finalization

After a masking event has been finalized on the L1, we can merge the corresponding mask into the base state. This makes sure that we don't keep unnecessary masks around, since they take up more storage space than necessary and add a component to lookup times that is linear in the number of stacked masks.
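A sketch of that merge step, assuming masks are kept oldest-first and represented as plain dicts (`merge_finalized` and `is_finalized` are hypothetical names). Masks are only merged from the oldest end, so that a newer diff is never applied underneath an older, still-unmerged one:

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple


def merge_finalized(
    base: Dict[str, int],
    layers: Deque[Tuple[str, Dict[str, int]]],  # (event id, diff), oldest first
    is_finalized: Callable[[str], bool],
) -> int:
    """Fold finalized masks into the base state. Returns how many were merged."""
    merged = 0
    while layers and is_finalized(layers[0][0]):
        _, diff = layers.popleft()
        base.update(diff)  # the diff becomes part of the base state
        merged += 1
    return merged


base = {"alice": 10}
layers: Deque[Tuple[str, Dict[str, int]]] = deque([
    ("settlement-1", {"alice": 8}),
    ("message-2", {"bob": 3}),
    ("settlement-3", {"alice": 1}),
])
finalized = {"settlement-1", "message-2"}
count = merge_finalized(base, layers, lambda event_id: event_id in finalized)
```

After this call two masks have been folded into the base, and reads only have to walk the single remaining mask before reaching the base state.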

Open questions

How do we construct "blocks" with based sequencing? It makes little sense for the sequencer to be able to decide that; a 1:1 mapping makes more sense.

Work items

Based sequencing:

L1 chain monitoring
Transaction inclusion monitoring
L2 rollback support

Hybrid sequencing:

All of the above
Delayed L1 message inclusion (justification period)
L2 block rebuilding

Future work

Caching of proving results - cached proving work might come in handy for rollback re-computation
Implement replayability / determinism of Flows to make the best use of caching


rpanic moved this to In Progress in Main Board Nov 24, 2024

rpanic added this to Main Board Nov 24, 2024
