RFC: Snarky(rs) #774

mimoo · 2022-12-22T22:59:35Z

mimoo
Dec 22, 2022

Summary

Today, Mina's cryptography stack is split into two large parts: an OCaml part and a Rust part. The proof system, kimchi, is in Rust. Snarky, the frontend to write circuits, and pickles, the recursion layer, are in OCaml. Our medium term goal is to transition this part of the cryptography stack written in OCaml to Rust. This RFC specifically documents the transition of snarky(ml) to snarky(rs).

Motivation

There are several reasons to replace snarky(ml) with snarky(rs):

Productivity. We've had a hard time making protocol changes, improvement of life changes, refactors, and other type of changes in the cryptography stack due to most changes impacting both the OCaml side and the Rust side. Most of our engineers are not OCaml experts, and so a lot of changes are either postponed (sometimes indefinitely) or unnecessarily slow to roll out.

Maintainability and security. The OCaml implementation is quite disorganized, and lacks documentation. It is quite a burden to onboard new engineers on this codebase. We are hoping that moving things to Rust will allow us to structure it well, and document it well. Improving the maintainability, as well as the security, of the library.

Performances. Rust is a much more performant language, and having everything in Rust will allow us to profile and improve performances in much more predictable ways.

FFI. SnarkyJS is currently making use of both the Rust side (by compiling it into WASM) and the OCaml side (by compiling it into javascript). This means that the current FFI code and infrastructure (how it gets updated) is quite a nightmare to maintain and update. If we can remove as much surface area as we can, it'll make exposing Rust code to both OCaml and javascript/typescript much more clean and manageable.

Detailed design

The current architecture of snarky(ml) within Mina looks something like this:

snarky(ml) relies on a lot of Rust code, which is exposed in OCaml through the kimchi backend.

Ideally, we would want the kimchi backend to export more than just kimchi, but snarky(rs) as well (which would allow SnarkyJS to only rely on kimchi):

This is not exactly true, as snarkyJS relies heavily on pickles and so SnarkyJS might not be able to more directly use snarky(rs) without converting parts of pickles(ml) to pickles(rs) (out of scope for this RFC).

Snarky(ml) architecture

As said previously, snarky(ml) is quite disorganized, undocumented, and lacks abstractions. For this reason, it is hard to succinctly explain how snarky(ml) is architected. The principles, though, are less complex and we have started to document them for snarky(rs).

In this section, we will avoid focus on how snarky(ml) works (so refer to the previous link if you're interested in that). We will instead focus on how snarky(ml) is structured, and the challenges that exist in converting snarky(ml) to snarky(rs). As you will see they are mostly unrelated to the concepts behind snarky, and more related to the organization of the code (that make changes difficult) and peculiarities of OCaml (that don't have an easy 1:1 translation in Rust).

If we could visualize the structure of snarky(ml) it might look like something like this:

But this diagram is a complete lie, as all components are heavily intertwined (which we will discuss in more details shortly). What matter are the following points:

there are distinct components that we can isolate to produce a much cleaner structure in snarky(rs)
snark0 in yellow at the top is the orchestrator that ends up constructing the interface exposed to the users of snarky
Rust code comes from the backend

in the next section let's talk about specific work that needs to be completed in order to fulfill the goals of this RFC.

Implementation details of snarky(ml) and transition to snarky(rs)

In this section we talk about implementation details in snarky(ml) and the path to transitioning to a snarky(rs) implementation.

Two interfaces for two monads. Snarky(ml) was originally based on an AST monad (which we documented here). The interface of snarky thus allowed one to build a circuit by building its AST, and then compile it by parsing the AST. The use of monad meant that an OCaml "let syntax" had to be used, and this made code quite confusing. Later, a new interface was introduced by implementing a state monad (which we documented here). This gave birth to an imperative interface which looks much cleaner. The problem: snarky was split in two with lots of redundant and unnecessary complex code. Worse, Mina made extended use of the two different (monadic, and imperative) APIs of snarky.

As the AST monad did not allow for meaningful optimizations, it not only complicated the design of snarky unnecessarily but also made a rewrite in Rust quite challenging. Early on, we realized that we could easily build a monadic interface on top of the imperative one, and thus heavily simplify the code and the implementation of snarky(rs) by removing the AST monad while maintaining backward compatibility with all the snarky code written in Mina. These changes were made in a series of PRs (with the final one still waiting to be merged).

Snarky(ml) as a thin layer. Snarky(ml) is not going to go away in the medium term, because we need to continue maintaining circuits and types written using snarky(ml) in the Mina codebase. Thus, we need to understand where the OCaml starts, and where it should stop. In other words, what is the surface area that we want to expose from Rust to OCaml. Following the insight of the previous point, we should aim at keeping the logic built on top of the state monad, and provide the same interface accessible from that monad to construct circuits (exists, as_prover, add_constraint, etc.) or types (typ, cvar). On the other hand, we should be able to fully reimplement high-level APIs like constraint_system which compiles a circuit and generate_witness(_conv) which creates a witness (or produces a proof).

Functor heavy code. Snarky has the particularity of heavily using functors. While functors are a powerful concept in OCaml, they can mask equality between types to the compiler, and thus require a lot of manual wiring. Today, functors are combined with top-level instantiations as well, and the code of snarky is heavily intertwined. This makes it difficult to understand the codebase, as well as perform small refactors and changes. Functors in snarky are used in two different ways: to specialize abstract types or type variables present in a module (often, to a specific field) or to assemble a number of modules together (for example, two modules might rely on each other, and as such have to use abstract types as well until they get combined with a functor).

A layered approach would work much better than the described mesh-network-of-functors approach, in which functors are used as a chain, each extending the result of the previous one (snark0 being the final functor). Since this would require a large amount of work, our current strategy is not to perform this refactor, but to reach something closer to this model with the numerous refactors that will be needed to integrate snarky(rs). (See the diagram at the end of this section.)

Incremental approach. Due to the last insight, we are taking an incremental approach to rewriting snarky(ml) in Rust: lower-level components should be isolated, replaced, and tested first. We then should work our way up, until we manage to replace the larger components. Knowing that, we expect to first replace components like Cvars and Constraints, and end up with components like Runstate and Checked which are more high-level. We also propose to provide the Rust FFI through the interface that currently expose to snarky: the Backend interface.

Testing. It is important that the correct behavior of snarky(ml) remains in the transition to a snarky(rs). Especially as Mina and SnarkyJS uses different parts of snarky(ml). For this reason, we have devised a list of tests and are implementing them. These tests will allow us to perform comparative testing, in which the behavior of snarky(rs) will be checked to match the exact same behavior of snarky(ml). We can achieve that by either serializing the output of snarky(ml) and checking that it matches the serialization of the output of snarky(rs), or by testing both implementations in conjunction, or by doing both. Note that writing tests will also provide a form of documentation of the current API of snarky(ml), and how it is supposed to be used.

GADTs. GADTs are an advanced feature of OCaml, and are used in a number of places in snarky. GADTs have no equivalent in Rust, which makes it tricky to translate. The bigger user of this feature is the Typ component, which allows users of snarky(ml) to construct snarky types--types that can be represented in OCaml as well as in snarky (as cvars).

Extensible variants. The Constraints module in snarky(ml) is constructed via extensible variants, which also does not have an equivalent in Rust. Our proposal is to rewrite this module as a type exposed by the constraint system.

Polymorphic code. A lot of code is polymorphic on the OCaml side, which makes it hard to translate to Rust as 1) using OCaml values with generic Rust structures is not trivial and 2) we currently do not have a way to have expose generic Rust functions to OCaml. It seems like most use of polymorphism in snarky(ml) are due to snarky supporting different fields, in which case we propose to simply expose declinations of all Rust types for the different fields we want to support (only two at the moment: The Pasta curve's Fp and Fq).

Communication from OCaml to Rust. OCaml sometimes pass functions, or types (via GADTs), as arguments to functions. For example, exists and as_prover can both path a closure as argument. exists also takes a Typ GADT to advertise the types that need to be used. We want to preserve the Typ feature, which is that users of snarky can create their own types. It is not clear exactly what is the simplest path, but we might be able to take inspiration on SnarkyJS and implement a similar thing. This will potentially mean that a lot of code on the Mina side will be have to be refactored.

In summary, the proposal is about:

implementing refactors in snarky(ml) to facilitate an integration of snarky(rs)
implementing snarky(rs) (you can see a proof of concept in the mimoo/snarky2 branch)
incrementally integrating and testing snarky(rs) within snarky(ml)

If we were to visualize that, perhaps the final design would look like this:

Bonus: SnarkyJS

For completion, here's a diagram of how snarkyJS accesses both Rust code and OCaml code:

Drawbacks

None that we can think of. The split of the cryptography stack between OCaml and Rust is one of the largest bottleneck in our organization today, and one of the biggest risk (as not enough engineers understand this part of the stack).

Rationale and alternatives

Why is this design the best in the space of possible designs?

What other designs have been considered and what is the rationale for not choosing them?

What is the impact of not doing this?

We also considered rewriting snarky from scratch, in Rust, and then swapping the two implementations. The problem is that snarky's logic is deeply connected to how it is being used, by both Mina and SnarkyJS. Thus we need to preserve a number of design decisions.

We could in theory clean snarky(ml), and only when we are done, proceed with the implementation of snarky(rs). But 1) this implies a lot of unnecessary work, and 2) we are naturally doing this as we go as a way to facilitate the integration of snarky(rs).

Finally, we could also do nothing, but as pointed out before this would increase the risk overtime and would continue to slow us down further and further.

Prior art

We're not aware of previous refactors of OCaml code into Rust, besides Facebook's Hack type checker. But we could not find any lessons learned there.

Unresolved questions

This proposal does not provide concrete solutions for some of the OCaml-to-Rust translation problems that we outlined. It also does not provide a step-by-step instruction guide on how to perform the incremental approach we outlined. This is because we predict that there will be some unknowns on the way and/or these are implementations details that don't need to be outlined here.

In addition, a number of snarky types are implemented outside of snarky. An MVP would consist of moving most of the logic out of snarky(ml), but an ideal world would also mean that building blocks implemented on top of snarky would be moved to Rust as well.

Note also that we will have to figure out how to rewrite pickles in Rust after the completion of this work. But this is out of scope for this RFC.

mimoo · 2022-12-22T23:31:37Z

mimoo
Dec 22, 2022
Author

cc @mitschabaude what I talked to you in DM:

Communication from OCaml to Rust. OCaml sometimes pass functions, or types (via GADTs), as arguments to functions. For example, exists and as_prover can both path a closure as argument. exists also takes a Typ GADT to advertise the types that need to be used. We want to preserve the Typ feature, which is that users of snarky can create their own types. It is not clear exactly what is the simplest path, but we might be able to take inspiration on SnarkyJS and implement a similar thing. This will potentially mean that a lot of code on the Mina side will be have to be refactored.

7 replies

bkase Jan 10, 2023
Maintainer

We could consider manually "lambda lift"ing closures to work around passing closures -- it's actually not as annoying as you'd think https://en.wikipedia.org/wiki/Lambda_lifting

mitschabaude Jan 11, 2023

Does snarky-rs need to call the callbacks later, or do they just exist so that they can be called immediately and snarky-rs can set some global state before and after? if the latter is the case, we could replace passing the closures with "start context" and "end context" calls, and in between OCaml / TS run the logic of their callbacks

mimoo Feb 16, 2023
Author

I think we could write most of the logic on the OCaml side, like so:

let exists typ ~checked ~compute = 
  let cvars = match has_witness state with
    true -> 
      (* we run our closure ourselves *)
      let as_prover_backup = get_as_prover state in
      set_as_prover state true;
      let val = compute state in
      set_as_prover state as_prover_backup;
      let (fields, aux) = typ.value_to_fields val in
      (* this will store the value in the witness + return the correct cvars *)
      compute_ocaml_witness state fields
    false ->
      (* this will allocated cvars and return them *)
      compute_ocaml state typ.size_in_field_elements
  in
  let aux = typ.constraint_system_auxiliary () in
  let var = typ.var_of_fields (cvars, aux) in
  if checked then typ.check state var;
  var

where we exported some Rust functions in OCaml:

val has_witness : state -> bool
val get_as_prover : state -> bool
val set_as_prover : state -> bool -> unit
val compute_ocaml_witness : state -> field array -> cvar array
val compute_ocaml : state -> int -> cvar array

the annoying thing is how much internal logic is leaking on the OCaml side, and if we make a change in snarky internally without considering that then this might create bugs. I'm wondering how much added logic like that there is on the js side now (@mitschabaude do you have a pointer :D?)

mitschabaude Feb 16, 2023

@mimoo yeah this is how I imagined it! I think it's fine, honestly. Both the OCaml side and the JS side will unavoidably have a bit of internal logic on top of the snarky-rs bindings themselves.

For reference, here is the current wrapper for exists in snarkyjs: https://github.com/o1-labs/snarkyjs/blob/main/src/lib/circuit_value.ts#L942-L973
Somewhere inside, it calls another function Circuit._witness which is the actual exists from OCaml. Currently the wrapper is there to process custom typs, and gracefully handle being called out-of-circuit. It wouldn't hurt if it had to include more low level calls to the lower stack, like set_as_prover.

Actually, now that I'm thinking of it, this would be preferable to the current way, because it would immensely cut down the stack traces when calling some user code, and make them stay in JS land.

cc @mrmr1993 this is something to think about -- moving from a "closure-passing" paradigm to one that directly mutates snarky state. If this was done also for Pickles (when it calls the main function), we could get past most of the bad debugging experience that isn't already addressed by the planned snarkyjs refactor.

mimoo Feb 16, 2023
Author

there actually seem to be a way to call ocaml closures in Rust but I think the way I outlined there (and in https://hackmd.io/4zUl4nMDQrWtJnKEC4bpDw) makes more sense

bkase · 2023-01-10T23:40:51Z

bkase
Jan 10, 2023
Maintainer

This proposal does not provide concrete solutions for some of the OCaml-to-Rust translation problems that we outlined. It also does not provide a step-by-step instruction guide on how to perform the incremental approach we outlined. This is because we predict that there will be some unknowns on the way and/or these are implementations details that don't need to be outlined here.

I think it would be nice to have RFC-style documents outlining the concrete solutions for each component before too much production implementation work is done so we can have discussions about them, but this seems good for now 👍

We may need another document describing requirements for this rewrite -- in addition to obvious ones to resolve pain-points around usage in Rust / untangling code, to best of our ability beforehand, we should agree upon the manner which SnarkyRS circuits can be ingested/inspected by SnarkyJS circuits (in my mind, ideally folks can author libraries that describe, in Rust, circuit snippets -- functions, var, value, etc -- and use them directly from SnarkyJS in JS: This supports the use-case of Rust zk/cryptography-engineers building infra libraries that product engineers consume). This may impact some of the actual API implementation work.

0 replies

mitschabaude · 2023-01-12T09:47:07Z

mitschabaude
Jan 12, 2023

Btw, I found something related to the open questions: You actually can pass closures from JS to Rust. https://rustwasm.github.io/docs/wasm-bindgen/reference/receiving-js-closures-in-rust.html
So, if we eventually go with some complicated solution like lambda-lifting for the OCaml - Rust bridge, we can completely skip that for the JS - Rust bridge and just pass normal closures

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Snarky(rs) #774

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 3 comments 7 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

RFC: Snarky(rs) #774

mimoo Dec 22, 2022

Summary

Motivation

Detailed design

Snarky(ml) architecture

Implementation details of snarky(ml) and transition to snarky(rs)

Bonus: SnarkyJS

Drawbacks

Rationale and alternatives

Prior art

Unresolved questions

Replies: 3 comments · 7 replies

mimoo Dec 22, 2022 Author

bkase Jan 10, 2023 Maintainer

mitschabaude Jan 11, 2023

mimoo Feb 16, 2023 Author

mitschabaude Feb 16, 2023

mimoo Feb 16, 2023 Author

bkase Jan 10, 2023 Maintainer

mitschabaude Jan 12, 2023

mimoo
Dec 22, 2022

Replies: 3 comments 7 replies

mimoo
Dec 22, 2022
Author

bkase Jan 10, 2023
Maintainer

mimoo Feb 16, 2023
Author

mimoo Feb 16, 2023
Author

bkase
Jan 10, 2023
Maintainer

mitschabaude
Jan 12, 2023