Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: zero-bin is now able again to accesses evm_arithmetization for circuit versions #310

Merged
merged 12 commits into from
Aug 15, 2024

Conversation

BGluth
Copy link
Contributor

@BGluth BGluth commented Jun 21, 2024

Resolves #304.

Note however that the issue was actually something separate from what is mentioned in the issue. zero-bin was actually getting the version of evm_arithmetization from Cargo.lock and was not using tags. The issue was zero-bin was one directory deeper from Cargo.lock after it got migrated into zk_evm.

@BGluth BGluth requested review from muursh and Nashtare as code owners June 21, 2024 16:05
@github-actions github-actions bot added the crate: zero_bin Anything related to the zero-bin subcrates. label Jun 21, 2024
@Nashtare Nashtare added this to the Cleanups and Misc. milestone Jun 21, 2024
zero_bin/.cargo/config.toml Outdated Show resolved Hide resolved
BGluth added a commit that referenced this pull request Jun 24, 2024
@BGluth BGluth enabled auto-merge (squash) June 24, 2024 17:49
BGluth added a commit that referenced this pull request Jun 25, 2024
- Now writes `circuits/` to the root directory and not inside `tools/`.
BGluth added a commit that referenced this pull request Jun 25, 2024
- Now writes `circuits/` to `zero_bin` and not inside `tools/`.
@BGluth BGluth requested a review from 0xaatif June 25, 2024 17:47
BGluth added a commit that referenced this pull request Jun 27, 2024
- Now writes `circuits/` to `zero_bin` and not inside `tools/`.
.cargo/config.toml Outdated Show resolved Hide resolved
@0xaatif
Copy link
Contributor

0xaatif commented Jul 2, 2024

Any interactions with #357 ?

BGluth added a commit that referenced this pull request Jul 8, 2024
@BGluth BGluth force-pushed the fix/304_circuit_version_fix branch from ec16420 to acc8fd1 Compare July 8, 2024 21:19
@BGluth
Copy link
Contributor Author

BGluth commented Jul 8, 2024

@0xaatif Rebased and tested locally with #310, and it seems fine.

@BGluth BGluth requested a review from 0xaatif July 8, 2024 21:21
@BGluth BGluth force-pushed the fix/304_circuit_version_fix branch from e4230fd to 3df7730 Compare July 8, 2024 21:26
@0xaatif
Copy link
Contributor

0xaatif commented Jul 8, 2024

Overwriting CARGO_WORKSPACE_DIR still feels surprising to me - could you explain why this is the right solution (and not e.g changing the paths in the code so they point to wherever they're supposed to)?

I might be missing context :)

@BGluth
Copy link
Contributor Author

BGluth commented Jul 9, 2024

Yeah I don't blame you here, as it's not very obvious what [env] is doing in .cargo/config.toml.

So like we kind of both agree above, this is a potentially unstable way to get the project directory. Anything specified in [env] adds these environment variables during runtime (and compile time?). CARGO_WORKSPACE_DIR isn't an environment variable set by cargo. It's just one that I came up with. If the user ever happens to specify CARGO_WORKSPACE_DIR on their own, then [env] will never overwrite it (unless you set force = true).

Now why this works is not really plain either. When relative is set to true, cargo treats the value as a path and prepends the project's directory path to the value. If we set the value to just "", then we just get a path to the project directory.

@0xaatif
Copy link
Contributor

0xaatif commented Jul 9, 2024

CARGO_WORKSPACE_DIR isn't an environment variable set by cargo. It's just one that I came up with

Ah I didn't realise this!
It looks like an environment variable owned by cargo - I'd advocate for a different name like ZK_EVM_WORKSPACE_DIR
That would unblock this for me as a stopgap, but I'm interested in the actual problem this code is trying to solve.

Having a closer look at the code, it basically looks like have some local asset that we want to bundle in the binary, but fallback to a user-provided one - does that sound right? I can't find a tools/circuits directory on a fresh checkout - am I missing something?

If that is the case, can we include_str and parse on startup, or databake it in?

@BGluth
Copy link
Contributor Author

BGluth commented Jul 9, 2024

Yeah I agree with the rename to ZK_EVM_WORKSPACE_DIR.

So the idea is we want to always have the project use a constant circuit directory whenever we run the prover inside of the project directory. When we run the binary and want to figure out where the circuit directory is/should be created, we are doing this:

  • If we are in the project directory (ie. CARGO_WORKSPACE_DIR is set), the circuit directory is set to <CARGO_WORKSPACE_DIR>/tools/circuits.
  • If it's not set (currently set by the [env] section if running through cargo or the scripts in tools or by the user), then default to using circuits in the CWD.

If we're just distributing a binary around to users without the project directory (question: will we be?), then the best that we can do is to have it default to the current directory or have the user specify the env variable.

Also a bit of an aside and maybe only somewhat related, but if the user tries running the scripts in tools/ outside of the project directory, we're going to have issues because of this:

# Set the environment variable to let the binary know that we're running in the project workspace.
export CARGO_WORKSPACE_DIR="${TOOLS_DIR}/../"

BGluth added a commit that referenced this pull request Jul 9, 2024
- Renamed "CARGO_WORKSPACE_DIR" --> "ZK_EVM_WORKSPACE_DIR".
@0xaatif
Copy link
Contributor

0xaatif commented Jul 10, 2024

So the idea is we want to always have the project use a constant circuit directory whenever we run the prover inside of the project directory

Ah I see!
So really we want a cache of files keyed by evm_arithmetization_version!
I think my dream solution here would be $ZK_EVM_CACHE_DIR/<evm_arith_version> (probably XDG_CACHE_DIR/something), where the user may override ZK_EVM_CACHE_DIR, but has no visibility into evm_arith_version.
We can read the version from build.rs, but never need to set the env var, and probably never even need build.rs - just using env!("CARGO_PKG_VER") should suffice.
We use semver's ordering to decide which to take (and that way if we need pre-release, we can have it - I suspect there'll be bugs here )

See also

What do you think about the above?
I'm no longer blocking this review now that I understand more, but please either:

  • implement the above
  • create a tracking issue for it

@BGluth
Copy link
Contributor Author

BGluth commented Jul 10, 2024

Yeah I think it's a bit of an improvement using a standard directory for storing the circuits.

I guess we could switch to using our own crate version at this point. Technically the version of evm_arithmetization is more accurate (eg. if the version of evm_arithmetization stays the same when we bump zk_evm, then the circuits are compatible), but this made a lot more sense when evm_arithmetization was still a separate repo. At the time this was really nice during development since using a newer version of evm_arithmetization meant changing the version used of the crate. Now that the repos are merged, we can easily make breaking changes without changing the semver.

@Nashtare Do you have any thoughts on how we should version circuits? We have the work done at this point to use evm_arithmetization, so my gut would be to stick with that.

Also, should we add a clean command to zero_bin to remove old circuit state? Might be more worthwhile now that the location of the circuits directory is not as obvious for the user to find.

@Nashtare
Copy link
Collaborator

Now that we have zero-bin part of the workspace, relying on the version is indeed not really sustainable, as we would need to pre-bump upon breaking changes, but then any follow-up breaking change prior next release would require on its own some versioning distinction.

Alternatively, we could review the way that preprocessing is done now that zero-bin lives here. The idea is really to not have to redo it, when not needed, because of the time and resources it takes.

What we care about for compatibility are:

  • changes in the KERNEL: located in evm_arithmetization/src/cpu/kernel, we can easily identify any change with let kernel_hash = KERNEL.hash();
  • changes in the STARK constraints: this is a bit trickier. We could have version control though, through the recursive counterparts of each STARK module, and extracting their circuit.digest (a hash).
  • changes in the recursive aggregation circuits: this is where the overhead lies, because to have the upper circuits for proving aggregation / blocks, we need to generate all the recursive chains from the initial STARK proofs for each length in the ranges we want to support. This is why we relied on versioning. We could probably ignore it for development, as these circuits are never touched nowadays.

@0xaatif
Copy link
Contributor

0xaatif commented Jul 11, 2024

Alternatively, we could review the way that preprocessing is done now that zero-bin lives here.

I'd advocate for offloading this onto the user, and making our program less smart with a generate-circuits subcommand, which grabs the sizes etc.

So we could have separate subcommands for:

  • getting a witness from an RPC node
  • generating circuits
  • doing the proof

This would simplify the code we deliver, allow users to cache as needed, remove some params and concerns from our code.
We already half do the above, but also deliver "integrated" tools for other bits, which I think is a bit unnecessary, especially at the stage our product is at

@BGluth
Copy link
Contributor Author

BGluth commented Jul 25, 2024

Hey @Nashtare I have a dependent upstream PR that handles cleaning up old circuit state which deals with the "bloat" issue. Are we cool to merge this one?

@Nashtare
Copy link
Collaborator

Ah sorry it got buried in my notifications. I'll try to review it today or tomorrow.

Copy link
Collaborator

@Nashtare Nashtare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got lost in my notifications, apologies for the delay.
Will need a rebase with latest changes though, but otherwise no real blocker for me.

zero_bin/common/src/prover_state/persistence.rs Outdated Show resolved Hide resolved
zero_bin/tools/prove_rpc.sh Outdated Show resolved Hide resolved
@BGluth
Copy link
Contributor Author

BGluth commented Aug 9, 2024

Just an update.

Once we switch to using the OS directories by default (#405), we don't really need to be setting the env var ZK_EVM_WORKSPACE_DIR anymore since it will default to an OS specific directory. Lets hold off until that merges and then I'll update this PR.

BGluth added a commit that referenced this pull request Aug 9, 2024
@Nashtare Nashtare linked an issue Aug 9, 2024 that may be closed by this pull request
@BGluth BGluth requested a review from Nashtare August 12, 2024 22:51
BGluth added 8 commits August 13, 2024 09:40
- Renamed "CARGO_WORKSPACE_DIR" --> "ZK_EVM_WORKSPACE_DIR".
- Was missing `zero_bin`.
- Also made the circuit directory error a bit more helpful (now also
  contains underlying IO error).
- Realized that the kernel hash is the best indicator if two serialized
  circuits were built using a compatable kernel.
- I don't think we really have a use for this anymore if we are relying
  solely on the kernel hash for circuit versioning.
- Also swapped out the `evm_arithmetization` versioning variable with
  the kernel hash.
@BGluth BGluth force-pushed the fix/304_circuit_version_fix branch from e9a6307 to 3f3a66c Compare August 14, 2024 18:06
@BGluth BGluth requested a review from atanmarko as a code owner August 14, 2024 19:03
@BGluth BGluth force-pushed the fix/304_circuit_version_fix branch from 7273a85 to 5ae99d3 Compare August 14, 2024 19:14
- If the directory path that we needed to create contained more than `1`
  directory, it would fail.
@BGluth BGluth force-pushed the fix/304_circuit_version_fix branch from 5ae99d3 to 0722475 Compare August 14, 2024 19:30
@BGluth
Copy link
Contributor Author

BGluth commented Aug 14, 2024

Should be good to merge now @Nashtare

Copy link
Collaborator

@Nashtare Nashtare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple nits then good to me for merging!

zero_bin/common/src/prover_state/persistence.rs Outdated Show resolved Hide resolved
@BGluth BGluth requested a review from Nashtare August 15, 2024 17:35
@BGluth BGluth merged commit e7e83de into develop Aug 15, 2024
15 checks passed
@BGluth BGluth deleted the fix/304_circuit_version_fix branch August 15, 2024 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crate: zero_bin Anything related to the zero-bin subcrates.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Make getting circuit version more stable zero-bin isn't distinguishing preprocessed circuit versions anymore
3 participants