[Docs] Document consensus orchestration #951

247 changes: 247 additions & 0 deletions consensus/README.md
@@ -12,6 +12,16 @@ This README serves as a guide to the implementation of the [1.0 Pocket's Consens
- [Block Validation](#block-validation)
- [Consensus Lifecycle](#consensus-lifecycle)
- [State Sync](#state-sync)
- [Synchronization between the consensus processes](#synchronization-between-the-consensus-processes)
- [StateSync Mode](#statesync-mode)
- [Download routine](#download-routine)
- [Apply routine](#apply-routine)
- [Consensus Mode](#consensus-mode)
- [PaceMaker block proposal delaying](#pacemaker-block-proposal-delaying)
- [Concurrent requests](#concurrent-requests)
- [Late requests](#late-requests)
- [Example height,round,step increment](#example-heightroundstep-increment)
- [Example invalid increments](#example-invalid-increments)
- [Implementation](#implementation)
- [Code Organization](#code-organization)
- [Testing](#testing)
@@ -140,6 +150,243 @@ graph TB
J --> note2
```

## Synchronization between the consensus processes

The consensus module currently depends on the `PaceMaker`, `StateSync`, `LeaderElection` and `Networking`.
It has a bootstrapping state where it:
* Initializes connections to the network through a bootstrap node
* Keeps an updated current height (the greatest block height seen on the network)
* Compares network and local current heights, before switching to one of two mutually exclusive modes: `sync` or `consensus`

@0xRampey 0xRampey Aug 9, 2023

There's another doc which goes over StateSync in more detail over here https://github.com/pokt-network/pocket/blob/consensus-orchestration-doc/consensus/doc/PROTOCOL_STATE_SYNC.md#state-sync-lifecycle.

The names used here are different from the above doc such as the types of mode.
I'd expect a dev would want to go here for more in-depth info about state sync, so do we

  • Leave a link to the doc here? or maybe
  • Reconcile the terms used between both docs?

I think the idea is to use terminology that is as close as possible to the variable names used in the codebase.

cc: @Olshansk for your thoughts


```mermaid
sequenceDiagram
title: Consensus: Internal orchestration

participant Consensus
participant StateSync
Note over Node: This embeds Networking, FSM
participant Node
participant Network

Node->>Node:Start
Node->>Network:Join
loop
Node->>+Network:RequestCurrentHeight
Network-->>-Node:
Node->>Node:SwitchMode
end

alt SyncMode
par
Node->>Consensus:Pause
and
Node->>+StateSync:Start
Note over Node,StateSync: Details omitted for another diagram
StateSync-->>-Node:SyncDone
end
else ConsensusMode
par
Node->>StateSync:Pause
and
Node->>+Consensus:Start
Consensus-->>-Node:OutOfSync
end
end
```
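
The mode decision in the bootstrapping state can be sketched as follows. This is a minimal illustration only: `Mode`, `SyncMode`, `ConsensusMode` and `selectMode` are hypothetical names introduced here, not identifiers from the codebase, and the real node drives this through its FSM.

```go
package main

import "fmt"

// Mode is a hypothetical enum for the two mutually exclusive modes.
type Mode int

const (
	SyncMode Mode = iota
	ConsensusMode
)

// String makes the mode readable when printed.
func (m Mode) String() string {
	if m == SyncMode {
		return "sync"
	}
	return "consensus"
}

// selectMode compares the local height with the greatest height seen on the
// network: a node that is behind must sync, otherwise it can join consensus.
func selectMode(localHeight, networkHeight uint64) Mode {
	if localHeight < networkHeight {
		return SyncMode
	}
	return ConsensusMode
}

func main() {
	fmt.Println(selectMode(10, 42)) // node is behind the network
	fmt.Println(selectMode(42, 42)) // node is up to date
}
```

In the diagram above this decision corresponds to the `SwitchMode` step, which is re-evaluated each time the network height is refreshed.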

### StateSync Mode

In this mode the node is behind the latest height known to the network and tries to catch up by downloading blocks and applying them sequentially to its local state.

The `download` and `apply` routines may run in parallel, but `apply` blocks whenever the block it needs is missing and waits until that block has been downloaded.

#### Download routine

* The `download` routine is alive as long as `sync` mode is on
* It looks up the latest downloaded block in its persistence layer, then fetches and stores every missing block up to the current network height
* After downloading, and before inserting the block, basic (stateless) verification is performed on the block
* A downloaded and inserted block is structurally valid but should by no means be considered valid w.r.t. its validators' signatures or the transactions within it

```mermaid
sequenceDiagram
title: Consensus: Download blocks
participant Persistence
Note over Node: (Download routine)
participant Node
participant Network

par
loop
Note left of Network: Constantly asking for the highest<br /> known block from network
Node->>+Network:UpdateNetworkCurrentHeight
Network-->>-Node:
end
and
Node->>+Persistence:GetLastDownloadedBlock
Persistence-->>-Node:

loop LastDownloadedBlock < NetworkCurrentHeight
Node->>+Network:GetBlock
Network-->>-Node:
Note left of Network: Currently the node asks all the network<br />for the block it wants to download

Node->>+Persistence:Append
Persistence-->>-Node:
Note left of Node: append if new block

Node->>Node:Wait(until new block)
Note right of Node: If latest network block is reached<br />Wait for a new block to continue
end
end
```
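
The download loop can be sketched as below. This is a simplified model under stated assumptions: `Block`, `validateBasic` and `downloadUpTo` are hypothetical names, persistence is replaced by an in-memory map, and the network is replaced by a `fetch` callback.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a hypothetical, simplified block with only the fields needed here.
type Block struct {
	Height uint64
	Hash   string
}

// validateBasic stands in for the stateless checks done before insertion.
func validateBasic(b Block, wantHeight uint64) error {
	if b.Height != wantHeight {
		return errors.New("unexpected height")
	}
	if b.Hash == "" {
		return errors.New("missing hash")
	}
	return nil
}

// downloadUpTo fetches every block after lastDownloaded up to networkHeight
// and stores only the blocks that pass basic validation. It returns the new
// last downloaded height.
func downloadUpTo(store map[uint64]Block, lastDownloaded, networkHeight uint64,
	fetch func(height uint64) (Block, error)) (uint64, error) {
	for h := lastDownloaded + 1; h <= networkHeight; h++ {
		b, err := fetch(h)
		if err != nil {
			return lastDownloaded, err
		}
		if err := validateBasic(b, h); err != nil {
			return lastDownloaded, fmt.Errorf("block %d rejected: %w", h, err)
		}
		store[h] = b // structurally valid only; signatures are checked by apply
		lastDownloaded = h
	}
	return lastDownloaded, nil
}

func main() {
	store := map[uint64]Block{}
	fetch := func(h uint64) (Block, error) {
		return Block{Height: h, Hash: fmt.Sprintf("hash-%d", h)}, nil
	}
	last, err := downloadUpTo(store, 2, 5, fetch)
	fmt.Println(last, err, len(store))
}
```

Returning the last successfully stored height lets a caller resume from that point, which mirrors the `GetLastDownloadedBlock` step in the diagram.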

#### Apply routine

_Note: We do not detail how individual transactions are applied or how state is derived from them. Just assume that the state (specifically validator set) may be mutated after each block application._

* The `apply` routine remains alive as long as `sync` mode is on
* It needs a starting state (genesis, or loaded local state) to build blocks from
* Each block is validated and applied before moving to the next one
* Block application begins at the genesis block (1) and processes blocks sequentially until it reaches the head of the chain (i.e., the most recent block or highest block height)
* Because basic validation is done at the download step, and assuming the Pocket node trusts its persistence layer, it is safe to skip basic re-validation
* The `apply` mechanism needs to maintain a chain of trust while applying blocks by performing the following:
  * Before applying the block at height `h`, verify that it is signed by a quorum of the validator set at height `h-1`; note that the genesis validator set is used for block `1`
  * Applying each block updates the validator set (validators joining or leaving), starting from the genesis validator set for any new node
* With this chain of trust form a total of `3t+1` validators, where at least `2t+1` validators are honest and live. A synching node systematically detects invalid blocks

Suggested change
* With this chain of trust form a total of `3t+1` validators, where at least `2t+1` validators are honest and live. A synching node systematically detects invalid blocks
* With this chain of trust form a total of `3t+1` validators, where at least `2t+1` validators are honest and live. A syncing node systematically detects invalid blocks

* No malicious or faulty node could inject an alternative block without making at least `2t+1` validators sign it
* The persistence layer is used as a resume point for the block application, so a node won't restart block application from genesis each time it's rebooted
* When the routine applies `NetworkCurrentHeight`, it signals it so the node could switch to `consensus` mode. Meanwhile, it waits to apply a new downloaded block

@0xRampey 0xRampey Aug 9, 2023

Suggested change
* When the routine applies `NetworkCurrentHeight`, it signals it so the node could switch to `consensus` mode. Meanwhile, it waits to apply a new downloaded block
* When the routine applies a block with height that matches `NetworkCurrentHeight`, it triggers a signal, prompting the node to switch to `consensus` mode. Meanwhile, it waits to apply a new downloaded block.


```mermaid
sequenceDiagram
title: Consensus: Apply blocks

participant Persistence
participant SyncRoutine
Note over ApplyBlockRoutine: This routine is an abstraction<br /> around block application logic
participant ApplyBlockRoutine
participant State

loop AppliedBlock < NetworkCurrentHeight
SyncRoutine->>+Persistence:GetDownloadedBlock
Persistence-->>-SyncRoutine:

Note right of Persistence: Or genesis if starting from scratch
SyncRoutine->>+State:GetValidatorSet
State-->>-SyncRoutine:

SyncRoutine->>SyncRoutine:Verify(block.qc, ValidatorSet)

SyncRoutine->>+ApplyBlockRoutine:ApplyBlock
ApplyBlockRoutine-->>-SyncRoutine:

Note left of ApplyBlockRoutine: Tells if it's valid block and what are the <br />changes to be made to the state
SyncRoutine->>State:Update

Note left of State: ValidatorSet should be updated here
SyncRoutine->>Persistence:MarkBlockAsApplied

SyncRoutine->>SyncRoutine:Wait(until block downloaded)
Note left of ApplyBlockRoutine: Wait for the next block to apply<br /> if it's not downloaded yet

SyncRoutine->>SyncRoutine:If all blocks applied:<br />SignalSyncEnd
Note right of SyncRoutine: StateSync has finished<br />Switch to consensus mode
end
```
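
The chain of trust maintained by the `apply` routine can be sketched as below. This is an illustration under loud assumptions: every validator counts for one vote, signers are treated as already authenticated addresses (no cryptographic verification is shown), and `Block`, `hasQuorum` and `applyAll` are hypothetical names.

```go
package main

import "fmt"

// Block is a hypothetical, simplified block for this sketch.
type Block struct {
	Height         uint64
	Signers        []string        // validators whose signatures are in the QC
	NextValidators map[string]bool // validator set after applying this block
}

// hasQuorum reports whether the signers form a quorum of the validator set
// from the previous height.
func hasQuorum(prevValidators map[string]bool, signers []string) bool {
	seen := make(map[string]bool)
	for _, s := range signers {
		if prevValidators[s] { // ignore signers outside the validator set
			seen[s] = true // count each validator at most once
		}
	}
	// With n = 3t+1 validators a quorum is 2t+1, i.e. strictly more than 2n/3.
	return 3*len(seen) > 2*len(prevValidators)
}

// applyAll applies blocks in order, checking each one against the validator
// set produced by the previous block (the genesis set for the first block).
// It returns the height of the last applied block.
func applyAll(genesis map[string]bool, blocks []Block) (uint64, error) {
	vals := genesis
	var applied uint64
	for _, b := range blocks {
		if !hasQuorum(vals, b.Signers) {
			return applied, fmt.Errorf("block %d lacks a quorum", b.Height)
		}
		vals = b.NextValidators // the validator set may change after each block
		applied = b.Height
	}
	return applied, nil
}

func main() {
	genesis := map[string]bool{"a": true, "b": true, "c": true, "d": true}
	blocks := []Block{
		{Height: 1, Signers: []string{"a", "b", "c"}, NextValidators: genesis},
		{Height: 2, Signers: []string{"a", "b"}, NextValidators: genesis},
	}
	fmt.Println(applyAll(genesis, blocks))
}
```

With four validators (`t = 1`) the first block carries three signatures and is accepted, while the second carries only two and stops the application.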

## Consensus Mode

In this mode, the current node is up to date w.r.t. the latest block applied by the network and can now start participating in the consensus process.

This process is driven by:
* A pace maker, that alerts the consensus flow about key timings in the block production process
  * `MinBlockTime`: The earliest time a leader node should reap the mempool and start proposing a new block
  * `RoundTimeout`: The amount of time a replica waits before starting a new round
* A random but deterministic process to elect the leader of each round
  * Given a unique random seed known to all validators and information about the current (height, round) being validated, any validator is able to know who is the leader without needing to communicate with others
  * The leader election strategy aims to give validators a chance of leading the round proportional to their stake
  * Falls back to a round robin strategy if probabilistic election doesn't work
* A consensus flow that aims to increment the height (thus block production) of the chain
  * See [Block Generation](#block-generation)
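
To illustrate how a leader can be derived locally from a shared seed and the current (height, round), here is a minimal sketch. It is not the election algorithm used by the codebase: the hashing scheme, the `Validator` type and `electLeader` are assumptions made for illustration, and the validator list is assumed to be non-empty.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// Validator is a hypothetical, simplified validator record.
type Validator struct {
	Address string
	Stake   uint64
}

// electLeader deterministically picks a leader for (height, round) with a
// probability proportional to stake. It falls back to round robin when no
// validator has any stake.
func electLeader(seed []byte, height, round uint64, vals []Validator) string {
	// Sort a copy so every node iterates validators in the same order.
	sorted := append([]Validator(nil), vals...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Address < sorted[j].Address })

	var total uint64
	for _, v := range sorted {
		total += v.Stake
	}
	if total == 0 {
		// Round robin fallback when stake-weighted selection is impossible.
		return sorted[(height+round)%uint64(len(sorted))].Address
	}

	// Hash seed || height || round and map the digest onto the stake range.
	buf := make([]byte, 16)
	binary.BigEndian.PutUint64(buf[:8], height)
	binary.BigEndian.PutUint64(buf[8:], round)
	digest := sha256.Sum256(append(append([]byte(nil), seed...), buf...))
	target := binary.BigEndian.Uint64(digest[:8]) % total

	// Walk the cumulative stake until the target falls inside a validator's share.
	for _, v := range sorted {
		if target < v.Stake {
			return v.Address
		}
		target -= v.Stake
	}
	return sorted[len(sorted)-1].Address // not reached when total > 0
}

func main() {
	vals := []Validator{{"a", 10}, {"b", 30}, {"c", 60}}
	for round := uint64(0); round < 3; round++ {
		fmt.Println(round, electLeader([]byte("seed"), 7, round, vals))
	}
}
```

Because the result depends only on the seed, the height, the round and the validator set, every validator computes the same leader without exchanging messages.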

### PaceMaker block proposal delaying

The pace maker ensures minimum block production time with the aim to have a constant production pace.

pacemaker is one word

* Adding a delay instead of directly proposing a block makes the process concurrent
* It ensures that the block proposal is done only once after each `NewRound` step
* Minimum block production time behavior is automatically disabled when the consensus is in debug mode with manual mode enabled (manual next view triggering)
* When the leader gathers enough `NewRound` messages from replicas to propose a block, a first call to propose a block is made
  * The proposal attempt may happen before `MinBlockTime` has elapsed, in which case the `PaceMaker` delays it
  * While delayed, more `NewRound` messages may come in; the node uses the higher QC obtained from these late messages to propose the block and discards the previous QC
  * If the timer expires before any block proposal attempt, any call (with enough signatures) triggers the block proposal without delay
  * If a late message is received after a block has already been proposed by another call, the late message is discarded

#### Concurrent requests
```mermaid
sequenceDiagram
title: Pacemaker early concurrent requests

participant ReapMempool
participant BlockPrepare
participant Delay

ReapMempool->>BlockPrepare:RequestPrepare (id=1)
BlockPrepare->>+Delay:Start delay
Note right of Delay: Wait for minBlockTime
ReapMempool->>BlockPrepare:RequestPrepare (id=2)
BlockPrepare--xReapMempool:doNotPrepare (id=1)
Delay->>-BlockPrepare:End delay
BlockPrepare->>ReapMempool:doPrepare (id=2)
ReapMempool->>BlockPrepare: Prepare block
```

#### Late requests

```mermaid
sequenceDiagram
title: Pacemaker late request

participant ReapMempool
participant BlockPrepare
participant Delay

BlockPrepare->>+Delay:Start delay
Note right of Delay: Wait for minBlockTime
Delay->>-BlockPrepare:End delay
ReapMempool->>BlockPrepare:RequestPrepare (id=1)
BlockPrepare->>ReapMempool:doPrepare (id=1)
ReapMempool->>BlockPrepare: Prepare block
```
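
Both scenarios above can be modelled with a small state machine. The sketch below is a hypothetical model, not the `PaceMaker` implementation: `proposalGate`, `Request` and `Expire` are invented names, and time is passed in explicitly (as the duration elapsed since the round started) instead of using real timers so the behavior is easy to follow.

```go
package main

import (
	"fmt"
	"time"
)

// proposalGate is a hypothetical model of the delay logic for a single round.
type proposalGate struct {
	minBlockTime time.Duration // delay measured from the start of the round
	pending      *int          // id of the latest request received during the delay
	proposed     bool          // true once a block proposal has been triggered
}

// Request registers a proposal attempt made `elapsed` after the round start.
// It returns the id to prepare and true when the proposal may proceed now.
func (g *proposalGate) Request(id int, elapsed time.Duration) (int, bool) {
	if g.proposed {
		return 0, false // late message: a block was already proposed
	}
	if elapsed < g.minBlockTime {
		g.pending = &id // supersedes any earlier request (doNotPrepare)
		return 0, false
	}
	g.proposed = true // delay already over: propose without waiting
	return id, true
}

// Expire is called when the delay ends. It releases the latest pending
// request, if there is one and nothing has been proposed yet.
func (g *proposalGate) Expire() (int, bool) {
	if g.proposed || g.pending == nil {
		return 0, false
	}
	g.proposed = true
	return *g.pending, true
}

func main() {
	// Concurrent requests: two early requests, only the latest one is prepared.
	g := &proposalGate{minBlockTime: 2 * time.Second}
	g.Request(1, 0)
	g.Request(2, time.Second)
	fmt.Println(g.Expire())

	// Late request: the delay ends first, then the request proceeds at once.
	late := &proposalGate{minBlockTime: 2 * time.Second}
	late.Expire()
	fmt.Println(late.Request(1, 3*time.Second))
}
```

Keeping a single `pending` slot is what guarantees that at most one proposal happens per round, whichever of the two paths triggers it.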

### Example height,round,step increment

| Height | Round | Step | Comment |
|--------|-------|------|------------------------------------------|
| 1 | 0 | 1 | Initial round, initial block |
| 1 | 0 | 2 | Enter Prepare step |
| 1 | 0 | 3 | Enter Pre-Commit step |
| 1 | 1 | 1 | Round interrupted, reset step |
| 1 | 1 | 2 | Enter Prepare step again |
| 1 | 1 | 3 | Enter Pre-Commit step again |
| 1 | 1 | 4 | Enter Commit step for the first time |
| 2 | 0 | 1 | Incremented height, reset round and step |

#### Example invalid increments

| Height | Round | Step | Comment |
|--------|-------|------|----------------------------------------------------------------|
| 1 | 1 | 3 | Enter Pre-Commit step |
| 1 | 2 | 3 | **Invalid**: If `Round` increments, `Step` has to reset to `1` |

| Height | Round | Step | Comment |
|--------|-------|------|--------------------------------------------------------------------|
| 1 | 1 | 3 | Enter Pre-Commit step |
| 2 | 0 | 4 | **Invalid**: `Height` only increments when previous Step is at `4` |

| Height | Round | Step | Comment |
|--------|-------|------|-------------------------------------------------------------------------------|
| 1 | 2 | 4 | Enter Commit step |
| 2 | 2 | 4 | **Invalid**: When `Height` increments, reset `Round` to `0` and `Step` to `1` |
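
The rules illustrated by the tables above can be expressed as a single transition check. This is a sketch inferred from the examples only: `View`, `validNext`, and the assumptions that steps run from `1` to `4` and that a round advances by exactly one are hypothetical and not taken from the codebase.

```go
package main

import "fmt"

// View is a hypothetical (height, round, step) triple.
type View struct {
	Height, Round, Step uint64
}

const (
	firstStep  = 1 // step a round starts at
	commitStep = 4 // step that must be reached before the height can increment
)

// validNext reports whether moving from cur to next follows the rules
// illustrated by the tables above.
func validNext(cur, next View) bool {
	switch {
	case next.Height == cur.Height && next.Round == cur.Round:
		// Same round: the step advances by one, up to the commit step.
		return next.Step == cur.Step+1 && next.Step <= commitStep
	case next.Height == cur.Height && next.Round == cur.Round+1:
		// Round interrupted: the step resets.
		return next.Step == firstStep
	case next.Height == cur.Height+1:
		// New height: only after the commit step, with round and step reset.
		return cur.Step == commitStep && next.Round == 0 && next.Step == firstStep
	}
	return false
}

func main() {
	fmt.Println(validNext(View{1, 1, 4}, View{2, 0, 1})) // height increment after commit
	fmt.Println(validNext(View{1, 1, 3}, View{1, 2, 3})) // step not reset on new round
}
```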

## Implementation
