Merge pull request #305 from a16z/book/moodlezoup-core-components

Flesh out wiki

moodlezoup authored Apr 17, 2024
2 parents 6f857f5 + f205ccb commit c859eb0
Showing 23 changed files with 244 additions and 131 deletions.
9 changes: 6 additions & 3 deletions book/src/SUMMARY.md
@@ -11,15 +11,18 @@
- [Troubleshooting](./usage/troubleshooting.md)
- [Contributors](./contributors.md)
- [How it works](./how_it_works.md)
- [Lookups](./how/lookups.md)
- [Memory Checking](./how/memory-checking.md)
- [GKR](./how/gkr.md)
- [Jolt](./how/jolt.md)
- [Instruction lookups](./how/instruction_lookups.md)
- [Read-write memory](./how/read_write_memory.md)
- [Bytecode](./how/bytecode.md)
- [R1CS constraints](./how/r1cs_constraints.md)
- [Background](./background.md)
- [Sumcheck](./background/sumcheck.md)
- [Multilinear Extensions](./background/multilinear-extensions.md)
- [Eq Extension](./background/eq-polynomial.md)
- [Batched Polynomial Openings](./background/batched-openings.md)
- [Memory Checking](./background/memory-checking.md)
- [GKR](./background/gkr.md)
- [Dev](./dev/README.md)
- [Install](./dev/install.md)
- [Tools](./dev/tools.md)
2 changes: 2 additions & 0 deletions book/src/background.md
@@ -1,5 +1,7 @@
# Background
If you're interested in the nitty-gritty details of SNARKs, including questions like "why does this work?" and "is it sound?", you should read [Proofs and Args ZK](https://people.cs.georgetown.edu/jthaler/ProofsArgsAndZK.pdf) by Thaler. If you're simply interested in "how", this section is for you.
- [Sumcheck](./background/sumcheck.md)
- [GKR](./background/gkr.md)
- [Multilinear Extensions](./background/multilinear-extensions.md)
- [EQ Polynomial](./background/eq-polynomial.md)
- [Memory Checking](./background/memory-checking.md)
6 changes: 3 additions & 3 deletions book/src/background/eq-polynomial.md
@@ -1,13 +1,13 @@
# Eq Extension
The $\widetilde{eq}$ MLE is useful throughout the protocol
The $\widetilde{\text{eq}}$ MLE is useful throughout the protocol
$$
\widetilde{eq}(r,x) = \begin{cases}
\widetilde{\text{eq}}(r,x) = \begin{cases}
1 & \text{if } r = x \\
\mathbb{F} & \text{otherwise}
\end{cases}
$$


$$
\widetilde{eq}(r,x) = \prod_{i=1}^{\log(m)} (r_i \cdot x_i + (1 - r_i) \cdot (1 - x_i)) \quad \text{where } r, x \in \{0,1\}^{\log(m)}
\widetilde{\text{eq}}(r,x) = \prod_{i=1}^{\log(m)} (r_i \cdot x_i + (1 - r_i) \cdot (1 - x_i)) \quad \text{where } r, x \in \{0,1\}^{\log(m)}
$$
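As a concrete toy sketch of the product formula above, the following evaluates $\widetilde{\text{eq}}$ directly over a small prime field. The modulus and helper functions are illustrative stand-ins, not Jolt's actual field arithmetic:

```rust
// Toy prime field: 2^61 - 1, a Mersenne prime (illustrative, not Jolt's field).
const P: u128 = 2305843009213693951;

fn mul(a: u128, b: u128) -> u128 { (a * b) % P }
fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b) % P }

/// eq~(r, x) = prod_i (r_i * x_i + (1 - r_i) * (1 - x_i)),
/// with all inputs assumed already reduced mod P.
fn eq_eval(r: &[u128], x: &[u128]) -> u128 {
    assert_eq!(r.len(), x.len());
    r.iter().zip(x).fold(1, |acc, (&ri, &xi)| {
        let term = add(mul(ri, xi), mul(sub(1, ri), sub(1, xi)));
        mul(acc, term)
    })
}
```

On the Boolean hypercube this reduces to the 0/1 indicator of $r = x$; at a random $r \in \mathbb{F}^{\log(m)}$ it yields the field element used to weight sumcheck terms.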
35 changes: 35 additions & 0 deletions book/src/background/gkr.md
@@ -0,0 +1,35 @@
# GKR
GKR is a SNARK protocol for binary trees of multiplication / addition gates. The standard form supports layers that mix both gate types, using the layer-value MLE $\widetilde{V}_i$ together with two wiring-predicate MLEs $\widetilde{\text{add}}_i$ and $\widetilde{\text{mult}}_i$.

$\widetilde{V}_i(j)$ evaluates to the value of the $j$-th gate in the $i$-th layer of the circuit. For example, $\widetilde{V}_1(0)$ corresponds to the output gate.

$\widetilde{\text{add}}_i(j)$ evaluates to 1 if the $j$-th gate of the $i$-th layer is an addition gate.

$\widetilde{\text{mult}}_i(j)$ evaluates to 1 if the $j$-th gate of the $i$-th layer is a multiplication gate.

The sumcheck protocol is applied to the following:
$$
\widetilde{V}_i(z) = \sum_{(p,\omega_1,\omega_2) \in \{0,1\}^{s_i+2s_{i+1}}} f_{i,z}(p,\omega_1,\omega_2),
$$

where

$$
f_{i,z}(p, \omega_1, \omega_2) = \beta_{s_i}(z, p) \cdot \left[ \widetilde{\text{add}}_i(p, \omega_1, \omega_2) \left( \widetilde{V}_{i+1}(\omega_1) + \widetilde{V}_{i+1}(\omega_2) \right) + \widetilde{\text{mult}}_i(p, \omega_1, \omega_2) \cdot \widetilde{V}_{i+1}(\omega_1) \cdot \widetilde{V}_{i+1}(\omega_2) \right]
$$
$$
\beta_{s_i}(z, p) = \prod_{j=1}^{s_i} ((1-z_j)(1-p_j) + z_j p_j).
$$


Lasso and Jolt implement the [Thaler13](https://eprint.iacr.org/2013/351.pdf) version of GKR which is optimized for the far simpler case of a binary tree of multiplication gates. This simplifies each sumcheck to:
$$
\widetilde{V}_i(z) = \sum_{p \in \{0,1\}^{s_i}} g^{(i)}_z(p),
$$

where

$$
g^{(i)}_z(p) = \beta_{s_i}(z, p) \cdot \widetilde{V}_{i+1}(p,0) \cdot \widetilde{V}_{i+1}(p,1)
$$
GKR is utilized in [memory-checking](./memory-checking.html) for the multi-set permutation check.
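To make the "binary tree of multiplication gates" concrete, the following hedged sketch (not Jolt's code) evaluates such a circuit layer by layer; gate $p$ in layer $i$ multiplies its two children, $\widetilde{V}_{i+1}(p,0) \cdot \widetilde{V}_{i+1}(p,1)$:

```rust
// Evaluate a binary multiplication tree bottom-up, returning every layer.
// In the real protocol these values are field elements; plain u64s keep
// this toy example simple.
fn mult_tree_layers(leaves: Vec<u64>) -> Vec<Vec<u64>> {
    assert!(leaves.len().is_power_of_two());
    let mut layers = vec![leaves];
    while layers.last().unwrap().len() > 1 {
        let next: Vec<u64> = layers
            .last()
            .unwrap()
            .chunks(2)
            .map(|c| c[0] * c[1]) // gate p computes V_{i+1}(p,0) * V_{i+1}(p,1)
            .collect();
        layers.push(next);
    }
    layers.reverse(); // index 0 = output layer, matching the top-down indexing above
    layers
}
```

The prover runs one sumcheck per layer, starting from the output layer and working toward the leaves.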
File renamed without changes.
2 changes: 1 addition & 1 deletion book/src/background/multilinear-extensions.md
@@ -29,4 +29,4 @@ for i in 0..half {
```

### Multi Variable Binding
Another common algorithm is to take the MLE $\tilde{f}(x_1, ... x_v)$ and compute its evaluation at a single $v$-variate point outside the boolean hypercube $x \in \mathbb{F}^v$. This algorithm can be performed in $O(n log(n))$ time by preforming the single variable binding algorithm $log(n)$ times.
Another common algorithm is to take the MLE $\tilde{f}(x_1, \ldots, x_v)$ and compute its evaluation at a single $v$-variate point outside the boolean hypercube, $x \in \mathbb{F}^v$. This algorithm can be performed in $O(n \log(n))$ time by performing the single variable binding algorithm $\log(n)$ times.
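A minimal sketch of this multi-variable binding (toy 61-bit modulus, not Jolt's field): each round binds one variable via the single-variable rule, halving the evaluation table, until one value remains:

```rust
const P: u128 = 2305843009213693951; // toy modulus: 2^61 - 1

/// Bind the most significant variable of the MLE to r: entry i pairs with
/// entry i + half, and f(r, rest) = f(0, rest) + r * (f(1, rest) - f(0, rest)).
fn bind_variable(evals: &mut Vec<u128>, r: u128) {
    let half = evals.len() / 2;
    for i in 0..half {
        let (lo, hi) = (evals[i], evals[i + half]);
        evals[i] = (lo + r * ((hi + P - lo) % P)) % P;
    }
    evals.truncate(half);
}

/// Evaluate the MLE of `evals` (values on {0,1}^v, most significant
/// variable first) at an arbitrary point in F^v.
fn evaluate_mle(mut evals: Vec<u128>, point: &[u128]) -> u128 {
    assert_eq!(evals.len(), 1 << point.len());
    for &r in point {
        bind_variable(&mut evals, r);
    }
    evals[0]
}
```

At Boolean points this recovers the original table entries, consistent with the MLE agreeing with $f$ on the hypercube.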
2 changes: 1 addition & 1 deletion book/src/background/sumcheck.md
@@ -6,7 +6,7 @@ Suppose we are given a $v$-variate polynomial $g$ defined over a finite field $\mathbb{F}$
$$ H := \sum_{b_1 \in \{0,1\}} \sum_{b_2 \in \{0,1\}} \cdots \sum_{b_v \in \{0,1\}} g(b_1, \ldots, b_v). $$

In order to execute the protocol, the verifier needs to be able to evaluate $g(r_1, \ldots, r_v)$ for a randomly chosen vector $(r_1, \ldots, r_v) \in \mathbb{F}^v$. Hence, from the verifier's perspective, the sum-check protocol _reduces_ the task
of summing $g$'s evaluations over $2^v$ inputs (namely, all inputs in $\{0, 1}^{v}$) to the task of evaluating $g$
of summing $g$'s evaluations over $2^v$ inputs (namely, all inputs in $\{0, 1\}^{v}$) to the task of evaluating $g$
at a _single_ input in $(r_1, \ldots, r_v) \in \mathbb{F}^v$.
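This reduction can be exercised end to end in a toy honest-prover sumcheck for a *multilinear* $g$, given by its evaluations on $\{0,1\}^v$. The modulus is an illustrative stand-in and the verifier's random challenges are passed in explicitly:

```rust
const P: u128 = 2305843009213693951; // toy modulus: 2^61 - 1

/// Bind the most significant variable of the evaluation table to r.
fn bind(evals: &mut Vec<u128>, r: u128) {
    let half = evals.len() / 2;
    for i in 0..half {
        let (lo, hi) = (evals[i], evals[i + half]);
        evals[i] = (lo + r * ((hi + P - lo) % P)) % P;
    }
    evals.truncate(half);
}

/// Run all v rounds with an honest prover; returns Some(H) iff every round
/// check and the final evaluation check pass.
fn sumcheck(mut g: Vec<u128>, challenges: &[u128]) -> Option<u128> {
    assert_eq!(g.len(), 1 << challenges.len());
    let h = g.iter().fold(0, |a, &b| (a + b) % P);
    let mut claim = h;
    for &r in challenges {
        let half = g.len() / 2;
        // Prover's round message: g_j(0) and g_j(1). Since g is multilinear,
        // g_j has degree 1 and these two values determine it.
        let g0 = g[..half].iter().fold(0, |a, &b| (a + b) % P);
        let g1 = g[half..].iter().fold(0, |a, &b| (a + b) % P);
        // Verifier's round check: g_j(0) + g_j(1) must equal the running claim.
        if (g0 + g1) % P != claim {
            return None;
        }
        // New claim: g_j(r) by linear interpolation.
        claim = (g0 + r * ((g1 + P - g0) % P)) % P;
        bind(&mut g, r);
    }
    // Final check: a single evaluation of g at (r_1, ..., r_v). Here the
    // fully bound table stands in for the verifier's oracle access to g.
    if g[0] == claim { Some(h) } else { None }
}
```
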

The protocol proceeds in $v$ rounds as follows. In the first round, the prover sends a polynomial $g_1(X_1)$, and claims that
45 changes: 45 additions & 0 deletions book/src/how/bytecode.md
@@ -0,0 +1,45 @@
# Bytecode

Jolt proves the validity of the bytecode trace using offline memory checking.

## Decoding an ELF file

The tracer iterates over the `.text` sections of the program ELF file and decodes the RISC-V instructions. Each instruction gets mapped to a `BytecodeRow` struct:

```rust
pub struct BytecodeRow {
/// Memory address as read from the ELF.
address: usize,
/// Packed instruction/circuit flags, used for r1cs
bitflags: u64,
/// Index of the destination register for this instruction (0 if register is unused).
rd: u64,
/// Index of the first source register for this instruction (0 if register is unused).
rs1: u64,
/// Index of the second source register for this instruction (0 if register is unused).
rs2: u64,
/// "Immediate" value for this instruction (0 if unused).
imm: u64,
}
```

The registers (`rd`, `rs1`, `rs2`) and `imm` are described in the [RISC-V spec](https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf).
The `bitflags` field is described in greater detail [below](#bitflags).

The preprocessed bytecode serves as the (read-only) "memory" over which we perform offline memory checking.

![bytecode_trace](../imgs/bytecode_trace.png)

We currently assume the program bytecode is sufficiently short and known by the verifier, so the prover doesn't need to provide a commitment to it: the verifier can compute the MLE of the bytecode on its own.
This guarantees that the bytecode trace is consistent with the agreed-upon program.

Instead of $(a, v, t)$ tuples, each step in the bytecode trace is mapped to a tuple $(a, v_\texttt{bitflags}, v_\texttt{rd}, v_\texttt{rs1}, v_\texttt{rs2}, v_\texttt{imm}, t)$.
Otherwise, the offline memory checking proceeds as described in [Offline Memory Checking](../background/memory-checking.md).


## Bitflags

The `bitflags` of a given instruction is the concatenation of its [circuit flags](./r1cs_constraints.md#circuit-and-instruction-flags) and [instruction flags](./instruction_lookups.md).
This concatenation is enforced by R1CS constraints.

![bitflags](../imgs/bitflags.png)
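A hypothetical packing sketch of that concatenation; the flag counts and bit ordering here are assumptions for illustration, not Jolt's actual layout:

```rust
/// Concatenate circuit flags and instruction flags into a single u64,
/// circuit flags in the higher-order positions, one bit per flag.
fn pack_bitflags(circuit_flags: &[bool], instruction_flags: &[bool]) -> u64 {
    assert!(circuit_flags.len() + instruction_flags.len() <= 64);
    circuit_flags
        .iter()
        .chain(instruction_flags)
        .fold(0u64, |acc, &flag| (acc << 1) | flag as u64)
}
```
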
35 changes: 0 additions & 35 deletions book/src/how/gkr.md

This file was deleted.

53 changes: 53 additions & 0 deletions book/src/how/instruction_lookups.md
@@ -0,0 +1,53 @@
# Instruction lookups

In Jolt, the "execute" part of the "fetch-decode-execute" loop is handled using a lookup argument (Lasso).
At a high level, the lookup queries are the instruction operands, and the lookup outputs are the instruction outputs.

## Lasso

Lasso is a lookup argument (equivalent to a SNARK for reads into a read-only memory). Lookup arguments allow the prover to convince the verifier that a (committed) set of values $Q$ is a subset of a lookup table $T$. Lasso is a special lookup argument whose asymptotic costs scale primarily with the number of queries $|Q|$ rather than with the length of the table $T$.

A conversational background on lookups can be found [here](https://a16zcrypto.com/posts/article/building-on-lasso-and-jolt/). In short: lookups are great for zkVMs as they allow constant cost / developer complexity for the prover algorithm per VM instruction.

A detailed engineering overview of Lasso (as a standalone lookup argument) can be found [here](https://www.youtube.com/watch?v=iDcXj9Vx3zY).

## Using Lasso to handle instruction execution

Lasso requires that each primitive instruction satisfies a decomposability property.
The property needed is that the input(s) to the instruction can be broken into "chunks" (say, with each chunk
consisting of 16 bits), such that one can obtain the answer to the original instruction by
evaluating a simple function on each chunk and then "collating" the results together.
For example, the bitwise-OR of two 32-bit inputs x and y can be computed by breaking each input up into 8-bit chunks, ORing
each 8-bit chunk of x with the associated chunk of y, and concatenating the results together.
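The chunk-then-collate pattern for that OR example might look like this sketch; the chunk width and count are chosen to match the example, and the per-chunk OR stands in for the subtable lookup:

```rust
/// Bitwise-OR of two 32-bit inputs computed chunk-by-chunk.
fn chunked_or(x: u32, y: u32) -> u32 {
    (0..4).rev().fold(0u32, |acc, i| {
        let xc = (x >> (8 * i)) & 0xff; // i-th 8-bit chunk of x (0 = low chunk)
        let yc = (y >> (8 * i)) & 0xff;
        let chunk_out = xc | yc; // "subtable lookup": OR of two 8-bit chunks
        (acc << 8) | chunk_out   // "collation": concatenate chunk outputs
    })
}
```

The relevant subtable here would store all $2^{16}$ values of the 8-bit-by-8-bit OR function, indexed by the pair of chunks.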

In Lasso, we call the task of evaluating a simple function on each chunk a "subtable lookup" (the relevant lookup table
being the table storing all $2^{16}$ evaluations of the simple function). And the "collating" of
the results of the subtable lookups into the result of the original lookup (instruction execution on the un-chunked inputs)
is handled via an invocation of the sum-check protocol. We call this the "primary sumcheck" instance in Lasso.

The "primary sumcheck" collation looks as follows for a trace of length $m$ and a VM with $F$ unique instructions.

$\sum_{x \in \{0,1\}^{\log_2(m)}} \left[\widetilde{\text{eq}}(r,x) \cdot \sum_{f \in \{0,1\}^{\log_2(F)}} \widetilde{\text{flags}_f}(x) \cdot g_f(\text{terms}_f(x))\right]$

$\widetilde{\text{flags}_f}(x)$ is an indicator function for the $f$-th instruction: $\widetilde{\text{flags}_f}(x) = 1$ if instruction $f$ is used during the $x$-th step of the trace, when $x \in \{0,1\}^{\log_2(m)}$.

$g_f(...)$ is the collation function used by the $f$-th instruction.

$\text{terms}_f(x) = [E_1(x), \ldots, E_\alpha(x)]$ where $\alpha$ is the number of independent memories used by an instruction. For simple instructions like the EQ instruction, $\alpha = C$ and $\text{terms}_f(x) = [E_1(x), \ldots, E_C(x)]$. More complicated instructions such as LT might have $\text{terms}_f(x) = [E_{\texttt{EQ}}(x), E_{\texttt{LT1}}(x), E_{\texttt{LT2}}(x)]$. The exact layout depends on the number of subtables required by the instruction's decomposition. The mappings can be found in the `JoltInstruction::subtable` method implementations.

### Mental Model
For a given $r = x \in \{0,1\}^{\log(m)}$ (think of $x$ as the integer index of a step within the trace), $\widetilde{\text{eq}}(r,x) = 0$ for all but one term of the outer sum. Similarly, $\widetilde{\text{flags}}_f(x) = 0$ for all but one term of the inner sum. What remains is the collation function of a single instruction, evaluating to that instruction's collated lookup output. In reality, $r$ is a random point $r \in \mathbb{F}^{\log(m)}$ selected by the verifier over the course of the protocol; the evaluation at this point provides a distance-amplified encoding of the entire trace of instructions.


To illustrate more concretely imagine a two-instruction VM for LT and EQ instructions with $C=1$.

$$
\sum_{x \in \{0, 1\}^{\log_2(m)}}{\widetilde{\text{eq}}(r,x) \cdot \left[ \widetilde{\text{flags}}_{\texttt{LT}}(x) \cdot g_{\texttt{LT}}(E_{\texttt{LT}}(x)) + \widetilde{\text{flags}}_{\texttt{EQ}}(x) \cdot g_{\texttt{EQ}}(E_{\texttt{EQ}}(x)) \right]}
$$
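Restricted to $r$ on the Boolean hypercube, this two-instruction sum can be traced in a toy sketch. All values below are hypothetical; the real protocol evaluates at a random $r \in \mathbb{F}^{\log(m)}$ rather than a step index:

```rust
/// Evaluate the two-instruction collation sum at a hypercube point r,
/// i.e. at an integer step index. Flags are one-hot per step; e_lt / e_eq
/// hold each step's subtable read values.
fn primary_sumcheck_at_step(
    r: usize,
    flags_lt: &[u64],
    flags_eq: &[u64],
    e_lt: &[u64],
    e_eq: &[u64],
) -> u64 {
    (0..flags_lt.len())
        .map(|x| {
            let eq_rx = (x == r) as u64; // eq~(r, x) restricted to the hypercube
            // With C = 1, the collation functions g_LT and g_EQ are the
            // identity on their single subtable term.
            eq_rx * (flags_lt[x] * e_lt[x] + flags_eq[x] * e_eq[x])
        })
        .sum()
}
```

Every term but the one at $x = r$ vanishes, so the sum returns step $r$'s collated lookup output.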


## Subtable Flags
`TODO`
- We then use memory checking to show that each of the memories $E_i$ is well formed.
- At a given step of the CPU, only a single instruction is used, so only that instruction's subtables are accessed. For the rest of the memories we insert a no-op with $(a, v) = 0$. To make the GKR trees cheaper to compute and sumcheck, we add a single additional layer to the GKR tree. In this layer we "toggle" each GKR leaf to $1$ whenever it corresponds to an unused step of the CPU. This makes the binary tree of multiplication gates cheaper. We toggle based on a new flags polynomial, $\widetilde{\text{subtable-flags}}_f$, which is the sum of all of the instruction flags $\widetilde{\text{flags}}_f$ used in the instruction collation above.
- The function to compute each of the leaves becomes $\text{leaf}[i] = \text{subtable-flags}[i] \cdot \text{fingerprint}[i] + (1 - \text{subtable-flags}[i])$.
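The toggling rule in the last bullet can be sketched as follows (toy Goldilocks-style modulus, not necessarily Jolt's field; flags are assumed to be 0 or 1):

```rust
/// leaf[i] = flag * fingerprint[i] + (1 - flag): used steps keep their
/// fingerprint, unused steps become 1, the multiplicative identity in the
/// GKR product tree.
fn toggled_leaves(subtable_flags: &[u64], fingerprints: &[u64]) -> Vec<u64> {
    const P: u64 = 0xffff_ffff_0000_0001; // toy 64-bit prime modulus
    subtable_flags
        .iter()
        .zip(fingerprints)
        .map(|(&flag, &fp)| {
            let prod = (flag as u128 * fp as u128) % P as u128;
            let one_minus_flag = ((1 + P - flag) % P) as u128;
            ((prod + one_minus_flag) % P as u128) as u64
        })
        .collect()
}
```

Because unused leaves are 1, the product over a layer is unchanged by no-op steps, which is exactly what makes the multiplication tree cheap.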