-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add EIP: Separate Metadata Section for EOF
Merged by EIP-Bot.
- Loading branch information
Showing
1 changed file
with
92 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
--- | ||
eip: 7834 | ||
title: Separate Metadata Section for EOF | ||
description: Introduce a new separate metadata section to the EOF | ||
author: Kaan Uzdogan (@kuzdogan), Marco Castignoli (@marcocastignoli), Manuel Wedler (@manuelwedler) | ||
discussions-to: https://ethereum-magicians.org/t/eip-7834-separate-metadata-section-for-eof/22138 | ||
status: Draft | ||
type: Standards Track | ||
category: Core | ||
created: 2024-12-06 | ||
requires: 3540 | ||
--- | ||
|
||
## Abstract | ||
|
||
Introduce a new separate metadata section to the Ethereum Object Format (EOF) that is unreachable by the code, and any changes to which does not affect the code. | ||
|
||
## Motivation | ||
|
||
It is desirable to include metadata in contract's bytecode for various reasons. For instance, both the Solidity and Vyper compilers by default include the language and compiler version used to compile. Vyper (with 0.4.1) appends an integrity hash to the initcode in CBOR encoding. Solidity additionally includes the IPFS or the Swarm hash of the Solidity contract metadata.json file, and the experimental Solidity flag. The current (pre-EOF) practice is to append this CBOR encoded metadata section in the contract's runtime bytecode, followed by the 2 bytes length of the CBOR encoded bytes. | ||
|
||
``` | ||
Solidity ┌──────────────────────────────────────────0x0033 bytes──────────────────────────────────────────────┐ | ||
...7265206c656e677468a2646970667358221220dceca8706b29e917dacf25fceef95acac8d90d765ac926663ce4096195952b6164736f6c634300060b0033 | ||
``` | ||
|
||
This poses a problem for source code verification where the onchain bytecode is compared to the compiled bytecode of the given source code. During a contract verification, metadata sections, in particular the IPFS hash, need to be ignored and only the executional bytecode should be compared. Since pre-EOF bytecode is not structured, it is not possible to distinguish the metadata section from the executional bytecode easily. This gets even trickier in the case of factory contracts with multiple nested bytecodes, each having their own metadata sections. Verifiers need to implement their own heuristics and workarounds to find the metadata sections and ignore it. | ||
|
||
The EOF brings structure to the bytecode by separating the code from the data, and placing the code of each contract in their respective containers. In its current form, this makes it possible to find the data easier than the pre-EOF bytecode. However, the current spec also does not describe a metadata section. Compilers currently need to place the contract metadata inside the data section which poses several problems: | ||
|
||
1. It is not straightforward to distinguish the metadata part in the `data_section`, which poses the same problem as the pre-EOF bytecode. | ||
2. Any change to the metadata's size within the data section will change the executional bytecode, e.g. through shifting `DATALOADN` offsets. With that, two identical contracts with different metadata sizes will not match during source code, since the code will be different. | ||
3. The metadata can theoretically be reached by the code, e.g. via manipulating the `DATALOADN` instructions. | ||
|
||
## Specification | ||
|
||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174. | ||
|
||
Extending the format introduced in [EIP-3540](./eip-3540.md), this EIP proposes to add a new OPTIONAL section in the body called `metadata_section` before the `data_section`, and to add two new OPTIONAL fields `kind_metadata` (value: `0x05`) and `metadata_size` to the header before the `kind_data` and `data_size` fields. | ||
|
||
``` | ||
container := header, body | ||
header := | ||
magic, version, | ||
kind_type, type_size, | ||
kind_code, num_code_sections, code_size+, | ||
[kind_container, num_container_sections, container_size+,] | ||
[kind_metadata, metadata_size,] | ||
kind_data, data_size, | ||
terminator | ||
body := types_section, code_section+, container_section*, [metadata_section], data_section | ||
types_section := (inputs, outputs, max_stack_height)+ | ||
``` | ||
|
||
### Header | ||
|
||
| name | length | value | description | | ||
| ------------- | ------- | ------------- | --------------------------------------------------------------------------------------- | | ||
| ... | ... | ... | ... | | ||
| kind_metadata | 1 byte | 0x05 | kind marker for metadata size section | | ||
| metadata_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the metadata section content | | ||
| kind_data | 1 byte | 0x04 | kind marker for data size section | | ||
| data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (\*) | | ||
| terminator | 1 byte | 0x00 | marks the end of the header | | ||
|
||
### Body | ||
|
||
| name | length | value | description | | ||
| ---------------- | -------- | ----- | --------------------------- | | ||
| ... | ... | ... | ... | | ||
| metadata_section | variable | n/a | arbitrary sequence of bytes | | ||
| data_section | variable | n/a | arbitrary sequence of bytes | | ||
|
||
The strucure and the encoding of the `metadata_section` is not defined by this EIP. It is left to the compilers, tooling, or the contract developers to define the encoding and the content. The current practice by the Solidity and Vyper compilers is to use CBOR encoding. | ||
|
||
## Rationale | ||
|
||
The `metadata_section` in the `body`, as well as the `kind_metadata` and `metadata_size` fields in the `header`, are OPTIONAL. This way, the compilers can avoid additional bytes in the container if they don't want to write any metadata. The `data_section` can change in its size and content during deployment, therefore it needs to be REQUIRED, even if the data is empty. The `metadata_section` is not expected to change during the deployment. | ||
|
||
The reason for placing the `metadata_section` before the `data_section`, and assigning `kind_metadata` the value `0x05` (and not `0x04`) is to make it easier for the existing EOF tooling adapt the changes. Additionally, if the `metadata_section` was placed after the `data_section`, changes to the `data_section` in deploy time would cause the `metadata_section` to shift. By placing the `metadata_section` before, this could be mitigated. | ||
|
||
## Backwards Compatibility | ||
|
||
No backward compatibility issues are expected since [EIP-3540](./eip-3540.md) is not implemented yet. | ||
|
||
## Security Considerations | ||
|
||
No security considerations as this section is meant not to be executed. | ||
|
||
## Copyright | ||
|
||
Copyright and related rights waived via [CC0](../LICENSE.md). |