BEP: 206 Title: Hybrid Mode State Expiry Status: Stagnant Type: Standards Created: 2023-02-24 Discussions: https://forum.bnbchain.org/t/bep-idea-state-expiry-on-bnb-chain/646/10
- BEP: Hybrid Mode State Expiry
- 1.Summary
- 2.Motivation
- 3.Specification
- 3.1.Design Guide
- 3.2.The Components
- 3.3.General Workflow
- 3.4.The Meta
- 3.5.Epoch Information In Trie Node
- 3.6.About GetState/SetState
- 3.7.Gas Metering
- 3.8.Support Snapshot
- 3.8.The Rent Model
- 3.9.Precompile Contracts: BSCStateExpiry
- 3.10.New Account Structure
- 3.12.New Block Structure (TBD)
- 3.12.State Revive
- 3.13.Prune (Important)
- 3.14.Remote DB
- 3.15.New RPC
- 4.Rationale
- 5.Forward Compatibility
- 6.Backward Compatibility
- 7. License
This BEP proposes a practical solution to address the problem of increasing world state storage on the BNB Smart Chain, by removing expired storage state.
Storage presents a significant challenge for many blockchains, as new blocks are continually generated, and transactions within these blocks could invoke smart contracts that also add more states to the blockchain. A large storage size can cause several side effects on the chain, such as higher hardware requirements, increased network resources required for downloading and performing p2p sync, and performance degradation due to MPT write amplification.
Due to the high volume of traffic, the storage size on BSC grows very rapidly. As of the end of 2022, a pruned BSC full node snapshot file is approximately 1.6TB in size, compared to approximately 1TB just one year ago. The 1.6TB storage consists mainly of two parts:
- Block Data (~2/3), which includes the block header, block body, and receipt;
- World State (~1/3), which includes the account state and Key/Value (KV) storage state. The Ethereum community proposed EIP-4444 to address the first part, which is to prune old block data. However, EIP-4444 does not address the second part, which is more challenging. There have been many discussions on how to implement state expiry, with one proposed solution involving the removal of EOA accounts, extension of the address space, and the use of Verkle Trees to reduce the witness size. However, implementing such a solution would be a significant undertaking that could have a substantial impact on the ecosystem, and may not be feasible in the short term. As BSC continues to face high traffic volumes, it is essential to develop a short-term solution for state expiry that can be implemented quickly, while still retaining the ability to upgrade to a long-term solution once it becomes available.
Variable | Description |
---|---|
Epoch | A time span defined by block number, if set it to ~⅔ year, if BSC validators produce block every 3 second, it would be (365*2/3)*24*60*60/3 = 7,008,000 |
NumKeysLastEpoch | Number of keys that are accessed during last epoch |
NumKeysLive | Number of keys that are not expired, i.e. accessible |
NumKeysTotal | Total number of keys, both expired and not expired |
HotContractThreshold | The threshold to determine if the contract is hot or not, hot means a certain percentage of live states were accessed during the last epoch. If the majority of the state were accessed recently, then the state of the contract will not expire. |
WitnessPrice | The gas price of 1 byte witness, as witness will be part of the block and needs extra effort for validator to collect it, it could be more expensive than gas price of calldata. |
- Rule 1: Permanent delete is unacceptable, expired state must be able to be revived.
- Rule 2: Contract’s execution logic should stay unchanged, if it fails to get the correct state, the execution result will be discarded, i.e. transaction will be reverted.
-
Simple protocol
-
Less impact to UX
-
Affordable price: storage access fee, state revive fee…
There will be some new components introduced and some existing components will be updated.
- Meta Info: it is used to record the contract’s storage status.
- Rent Model: if a user wants to keep their state from being expired, they will need to pay for the storage rent fee. The rent model will define the detailed rules of how rent fees will be charged.
- Revive & Witness: it will define the procedure for user to revive their expired state. And witness will define the format of witness to revive state, to make it more compact, witness could be merged to remove duplicated bytes.
- Account & Block: the structure of Account and Block is quite important, they will be upgraded to support state expiry.
- Big Scan: it is a special phase to collect the storage information of each contract by scanning all the contract accounts. It could take several weeks, depending on the capability of the nodes.
- Epoch: it represents a certain time span, state expiry will be Epoch based, one Epoch could be around ⅔ year.
- SystemContract(StateExpiry): a new system contract would be created to manage the state expiry policies.
- TxPool: transactions with expired state accessed need to be treated specially, tx pool module may need to be upgraded to make it more efficient.
- Sync: will be upgraded to support sync of witness and metainfo.
- Prune: it will be upgraded to support prune expire state, beside MPT trie node, it also needs to prune the storage of snapshot.
- Snapshot: snapshot will be upgraded to record the status of each KV
- Governance: some of the arguments need to be adjusted dynamically, e.g. the ScanSpeed, RentPrice, ContractLivenessThreshold…
- Gas Metering: will be upgraded to support witness charge.
Users will not need to be aware of StateExpiry, generally before users send out the transaction, they would query a RPC node about the estimated gas needed. The RPC nodes will estimate the gas needed based on execution fee and the fee to revive all the expired states. If the estimated gas needed is acceptable, the user will send out the transaction as usual.
When the transaction is propagated to the tx pool of validator, the validator would know whether the transaction needs to access any expired state or not, if yes, the validator would be required to collect the witness and rewarded accordingly. It is ok if the validator does not get the witness, then the validator will not be allowed to include this transaction.
Nothing will be changed, nodes just operate as usual.
State will not expire when entering Epoch 1, so the whole state will be there in Epoch 1, no rent will be charged either. But the meta information will be updated on block finalization to record the latest status.
Since Epoch 2, state expiry will start to work. The state of the contract will be expired if none of the following requirements are met:
- The state has been accessed in current or last epoch.
- It is hot, i.e. the percentage of keys accessed during the last epoch is greater than or equal to $LivenessThreshold (unable to do it without big scan, )
- It has enough RentBalance left for KV pairs in old epochs, KV in current and previous epoch will be exempted. (not determine, if rent model will be used)
Version uint8 // in case meta definition upgrade in the future?
RLP {
EpochRecords []{epoch, numKeys} // for epoch based rent charge
NumKeysTotal uint64 // without keys in epoch0 if no big scan
RentBalance uint64 // deduct on access
}
// if no rent, no meta content needed, can be empty
There will be a meta hash in account structure, which will be the key to access encoded meta information.
It will be updated on block finalization, not by transaction level. When a block is finalized, then the access record will be determined, we will know the keys that are accessed, created or deleted. These information will be updated to the meta structure
Metahash is just simply the keccak256 hash of the meta bytecode
It would be very simple, only extend the current branch node, which will include an epoch map, which is used to mark its children’s accessed epoch value. Hash calculation will include this epoch map element, so once the epoch map is updated, the corresponding intermediate nodes will be updated as well.
If GetState tries to access an expired state, then the transaction will be reverted.
For SetState, it must do GetState first, if GetState fails due to accessing expired state, then the transaction will be reverted.
And since delete operation can be treated the same as set, it also needs to perform GetState first.
Depends on the witness size used in this contract, the cost can calculated simply by: WitnessSize * WitnessPrice
If several transactions have overlapped KVs to revive, the first transaction would probably pay more, as when the later transaction to access these KVs, they are already revived.
It is somehow reasonable, as the first transaction will be executed first, so it will pay slightly more.
Snapshot will still be corresponding to the MPT structure.
Once the MPT is shrinked due to more sub-paths being expired, which will make the MPT end up with some boundary nodes. Boundary nodes are trie nodes that are either leaf nodes or intermediate nodes with at least one of its children expired.
The snapshot shrink can be conducted by off-line prune according to the MPT.
After off line prune of the MPT trie, the boundary will be generated. Then just go through the MPT tree, prune the snapshot according to the intermediate boundary node.
// For contratc A, it has several several hashed keys with epoch marked:
0x000000000000000000000000_00000000, epoch 1
0x000000000000000000000000_01000000, epoch 0
0x000000000000000000000000_01010000, epoch 0
0x000000000000000000000000_01020000, epoch 0
0x000000000000000000000000_01030000, epoch 0
0x000000000000000000000000_02000000, epoch 1
0x000000000000000000000000_03000000, epoch 1
// Currently, it is in epoch 2, so after prune, the tree will be
0x000000000000000000000000_00000000, epoch 1
0x000000000000000000000000_01, epochMap(0,0,0,0) // 4 children expired in epoch 0
0x000000000000000000000000_02000000, epoch 1
0x000000000000000000000000_03000000, epoch 1
TBD
TBD
- KVs accessed in current or previous epoch will not be kept, no extra fee needed
- Liveness:
If the percentage of not expired KV accessed in the last epoch is greater than a threshold(30%? governable), then these alive KVs will not be expired.
update: no liveness check, since no big scan
- The Rent Balance will be used to pay for these alive KVs, pay by best
User could save a certain range of Epoch, user may prefer to only save the KV of a few recent epochs
There will be a system contract to handle it, users just need to call it with the target address & balance provided, the system contract will help add the rent balance to the target address.
The balance can not be reclaimed.
But if the user sends the balance to an un-existed address, can it be refunded?
// StateExpiryContract: 0x0000000000000000000000000000000000001008
// or Precompile Contract?
func AddRentBlance(target Address) {
balance = GetValue() // value in transaction
metaInfo = GetMetaInfo(target)
metaInfo.rentBalance += balance
}
TBD
On first access of metainfo in a new Epoch, i.e. CurrentEpoch is not in EpochRecord, there will be a rent price for each epoch
EpochRecords []{epoch, numKeys} // for epoch based rent charge
if curEpoch in EpochRecords {
// rent fee already charged, once per epoch
return
}
// first time to access
var NewEpochRecords := {curEpoch, 0} // reset
EpochRecordsSorted := sort(EpochRecords) // decending order
for epoch, numKeys := range EpochRecordsSorted {
fee := GetEpochFee(epoch, numKeys)
if RentBalance >= fee {
NewEpochRecords := append({epoch, numKeys})
RentBalance -= fee
}
}
This contract will has some variable to set for governance
contract BSCStateExpiry {
uint256 public scanSpeed;
mapping(uint256 -> uint256) public rentPriceMap;
uint256 public livenessThreshold;
function updateParam(string key, bytes value) {
if (Memory.compareStrings(key, "ScanSpeed")) {
scanSpeed = BytesToTypes.bytesToUint256(32, value);
} else if (Memory.compareStrings(key, "RentPrice")) {
rentPrice = BytesToTypes.bytesToUint256(32, value);
rentPriceMap[curEpoch] = rentPrice
} else if (Memory.compareStrings(key, "LivenessThreshold")) {
livenessThreshold = BytesToTypes.bytesToUint256(32, value);
}
}
}
type Account struct {
Nonce uint64
Balance *big.Int
Root []byte
CodeHash []byte
MetaHash []byte // for StateExpiry
}
- witness can be aggregated into block
- witness can be discarded after a certain period, like blob data, since witness can be generated by re-executing the blocks.
type Witness struct {
addr Hash
keys []Hash
proof []byte
}
type Block struct {
header *Header
uncles []*Header
transactions Transactions
witness []Witness // for StateExpiry
}
- transaction does not need to provide a witness, just pay the gas fee as usual, the validator will help collect the witness.
- partial revive will be supported, but need at least 1 KV
- depends on boundary nodes, as witnesses must begin from a boundary node?
off-line prune(bloom-filter) + off-line prune(epoch based, go through the tree, there is a safety check)
And if PBSS is enabled, on second off-line is needed
may not need to cover in BEP
query new account.meta, KV…
-
Keep L1 Account Trie Now, could introduce GC mechanism to remove “tiny account”
-
Verkle Still Can Be Used In Storage Trie?
There are several reasons to keep it:
- The size of the L1 account trie is relatively small, constituting only around 4% of the L2 storage trie on BSC as of the end of 2022.
- The L1 account trie contains crucial information about user accounts, such as their balance and nonce. If users were required to revive their accounts before accessing their assets, it would significantly impact their experience.
- By retaining the L1 account trie, the witness verification process can be much simpler.
In this proposal, the trie skeleton will be kept in a new epoch. There are other approaches
which will generate a new trie tree from scratch at the start of a new epoch. Although they
provide a comprehensive solution for state expiry, there are still two unsolved issues to address: account resurrection conflict and witness size. Additionally, they would have a significant impact on the ecosystem and rely on other infrastructure, such as address extension and Verkle Tree.
By keeping the skeleton of the trie, it would be much easier to do witness verification and have less impact on the current ecosystem.
The state will expire if it has not been accessed for at least 1 epoch or at most 2 epochs. On average, the expiry period is 1.5 epochs. If we set the epoch period to represent 2/3 of a year, then the average state expiry period would be one year, which seems like a reasonable value.
less impact to user’s business
Account abstraction implementation will be impacted, as these accounts could be stored in the L2 storage trie and could be expired.
Rollups could be impacted if the rollup transactions try to access expired storage.
The current transaction types will be supported, but if the transaction tries to access or insert through expired nodes, then it could be reverted.
There are several changes that could affect user experience. The behavior of many DApps may change and users will have to pay to revive their expired storage. If the revival size is very large, the cost could be expensive.
Some of the APIs could be impacted, such as: getProof, eth_getStorageAt...
The snap sync mode will heal the world state after the initial block sync. The procedure of world state healing in snap sync mode will need to be updated.
More storage volume would be needed for the archive node, since more metadata will be generated in each epoch. The increased size could be remarkable, which would make the current archive node reluctant to keep the whole state of BSC mainnet. Archive service may have to be supported in other approaches.
The implementation of the light client would be impacted, since the proof of the shadow tree would also be needed.
The content is licensed under CC0.