Confidential Token Ids #25
d6ba248 to 355ca2f
text/0000-confidential-token-ids.md
Outdated
Currently, the enclave aggregates the fees for multiple transactions in a block, in order to mint a single fee output for the block and conserve space. To support multiple confidential TokenTypes, when minting fee outputs, the enclave must create multiple fee outputs if multiple TokenTypes were used in the block. A consideration here is that we would like for an observer not to discern whether a block contains transactions of multiple asset types by statistically analyzing the number of outputs to derive information about the number of fee outputs.

A simple proposal is that the enclave discontinue fee aggregation, so that the number of fee outputs scales linearly with the number of transactions in a block (not output TXOs). For example, a block with 3 transactions includes 3 fee outputs. Because each transaction can mint up to 16 output TXOs, there may be a variable number of outputs in the block.
This would worsen the problem I discussed in the fee epochs MCIP.
text/0000-confidential-token-ids.md
Outdated
### Fee Ratio and Transaction Sorting

Currently the [WellFormedTxContext](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/enclave/api/src/lib.rs#L50) contains the fee, which must be used by untrusted to [sort transactions](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/enclave/api/src/lib.rs#L131) when [combining transactions](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/service/src/tx_manager/untrusted_interfaces.rs#L40) for consideration by the enclave, prioritizing higher fees for inclusion in the block during periods of network congestion. In order to keep the `token_id` confidential, we must provide a mechanism for untrusted to accurately sort transactions before asking the enclave to do work on the transactions, without revealing the `token_id`, and at the same time, allowing for different asset types to express differing minimum fee requirements.
How can users decide what fee to use if fees are hidden (that seems to be your goal here)?
I think it could work, because they could learn the priority they need to shoot for, and then relate that to the minimum fee for the token id they are trying to transact in?
Users would be shooting in the dark... what happens if someone is spamming high fee txs? Users would have no way to accurately estimate the fee they need to get an urgent tx into the chain. This gives me the willies just thinking about it.. a fee market without a market.
> How can users decide what fee to use if fees are hidden (that seems to be your goal here)?
The goal is not that fees are hidden, but we do want the token id to be hidden. To the extent that different token ids have different minimum fees, the fee could reveal the token id. That's what we want to avoid.
> Users would be shooting in the dark... what happens if someone is spamming high fee txs? Users would have no way to accurately estimate the fee they need to get an urgent tx into the chain. This gives me the willies just thinking about it.. a fee market without a market.
The "priority" of the transaction (computed as fee / minimum fee) is not a secret, and the recently accepted priority values are published to the users to guide their fee selection. That's how the users can determine what fee to use to submit an urgent tx.
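To make the priority idea concrete, here is a minimal Python sketch. It assumes only what the comment states, that priority is computed as fee / minimum fee. The token ids, the minimum fee values, and all function names are hypothetical and are not taken from the MobileCoin code.

```python
from fractions import Fraction

# Hypothetical minimum fees per token id (illustrative values only).
MINIMUM_FEES = {0: 400, 1: 1000}


def priority(token_id: int, fee: int) -> Fraction:
    """Priority = fee / minimum fee for the transaction's token id."""
    minimum = MINIMUM_FEES[token_id]
    if fee < minimum:
        raise ValueError("fee is below the minimum for this token id")
    return Fraction(fee, minimum)


def sort_for_inclusion(priorities):
    """Sort (tx_label, priority) pairs, highest priority first.

    Only the priority is visible here, not the token id or the raw fee.
    """
    return sorted(priorities, key=lambda item: item[1], reverse=True)


def fee_for_priority(token_id: int, target: Fraction) -> int:
    """Smallest integer fee that reaches a published target priority."""
    needed = target * MINIMUM_FEES[token_id]
    return -(-needed.numerator // needed.denominator)  # ceiling division
```

Under these assumptions, a user who sees a recently accepted priority of 5/2 and transacts in the hypothetical token 1 would call `fee_for_priority(1, Fraction(5, 2))` to pick a fee, without the sorting side ever learning which token id is involved.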
@UkoeHB - added a note at the bottom of the fee section to address how users may determine the appropriate fee given network congestion. Let us know what you think!
I went over this and provided some nit comments. I am not fluent enough in the cryptography part to properly review that, but everything else seemed correct to me and is aligned with my understanding of @garbageslam's POC implementation and prior design discussions.
Thank you for writing this!
I discussed this with Aaron Feickert, who wrote Spats (WIP draft of confidential assets compatible with the Spark protocol). Based on his work, I have a vision of a similar approach that permits hidden assets in the absence of fees (fees are a big problem for all hidden asset proposals, including this one, Spats, and this MCIP's original approach). Here is a brief sketch.
Looks mostly good, just a couple inline comments and suggestions.
text/0000-confidential-token-ids.md
Outdated
- To compute the 4 byte `masked_token_id` from the `token_id`, we hash the TXO shared secret, with a prefix, XORed with the `token_id`, and take the little endian representation of those bytes.

```
masked_token_id = (token_id ^ Blake2B(token_id_tag | shared_secret)).to_le()
```
What is token_id_tag? And does | denote concatenation or logical OR?
`token_id_tag` is meant to be a domain tag - some string, and `|` was meant to denote concatenation - definitely open to sugs for fixing up notation for clarity!
Maybe instead of `token_id_tag` just call it `domain_tag` and the reader will understand?
I think `||` is more commonly used to denote concatenation in these kinds of docus, that would be my sug
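For readers following the notation discussion, here is a minimal Python sketch of the masking formula, reading the operator as concatenation. It is an illustration under stated assumptions, not the MCIP's implementation: the domain tag bytes are made up, the `token_id` is treated as a 32-bit integer, and the mask is taken from the first 4 bytes of the Blake2b digest, which the excerpt does not specify.

```python
import hashlib

# Hypothetical domain tag; the real tag string is not given in this thread.
DOMAIN_TAG = b"example_token_id_mask"


def token_id_mask(shared_secret: bytes) -> int:
    """Derive a 4-byte mask from Blake2B(domain_tag || shared_secret).

    Assumption: the mask is the first 4 digest bytes, read little-endian.
    """
    digest = hashlib.blake2b(DOMAIN_TAG + shared_secret).digest()
    return int.from_bytes(digest[:4], "little")


def mask_token_id(token_id: int, shared_secret: bytes) -> bytes:
    """XOR the token id with the mask and encode as 4 little-endian bytes."""
    if not 0 <= token_id < 2**32:
        raise ValueError("token_id must fit in 32 bits for this sketch")
    return (token_id ^ token_id_mask(shared_secret)).to_bytes(4, "little")


def unmask_token_id(masked: bytes, shared_secret: bytes) -> int:
    """Recover the token id; XOR with the same mask undoes the masking."""
    return int.from_bytes(masked, "little") ^ token_id_mask(shared_secret)
```

Anyone who knows the shared secret can recompute the mask and recover the `token_id`, since XOR with the same mask is its own inverse.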
text/0000-confidential-token-ids.md
Outdated
The non-interactive zero knowledge Proof of Opening protocol is the following: (See [Zero Knowledge Proofs and Commitment Schemes, Page 27](https://www.cs.purdue.edu/homes/ninghui/courses/555_Spring12/handouts/555_Spring12_topic23.pdf))

1. The prover establishes a value, `d`, calculated by hashing their secret values from the commitment (the `token_id` and the `blinding_factor`). Implementation note: we may use [Merlin](https://docs.rs/merlin/latest/merlin/) rather than Blake2B.
I am obviously biased, but would recommend using Merlin. We put quite a lot of thought and design consideration into abstracting things like sponge function rollback/ratcheting and domain separation away from higher-level protocols like this, so that people just wouldn't have to think or worry about it.
We ultimately did not use the proof of opening approach, but we are using Blake2b for hashing, which matches how we implemented the `masked_amount`. I would like to better understand the tradeoffs for that use case and why we chose Blake2b, if @garbageslam or @isis-mc have thoughts?
the security requirement here is that the masked value must be indistinguishable from uniformly random bytes from the point of view of someone who doesn't know the shared secret.
If we assume blake2b has a "secret prefix prf" property, then I think it does that at > 128 bit security level. And that assumption is reasonable and already used elsewhere.
Merlin is another tool that we can use to do this.
I coded it using blake2b because we are already doing the value mask that way, so it seemed more consistent with the existing code. Happy to change it
Thank you! Will incorporate in the MCIP explanation in alternatives
text/0000-confidential-token-ids.md
Outdated
Currently the [WellFormedTxContext](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/enclave/api/src/lib.rs#L50) contains the fee, which must be used by untrusted to [sort transactions](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/enclave/api/src/lib.rs#L131) when [combining transactions](https://github.com/mobilecoinfoundation/mobilecoin/blob/a092039a4a5cc7f5e3e16eeadd0b2bc3a12667ae/consensus/service/src/tx_manager/untrusted_interfaces.rs#L40) for consideration by the enclave, prioritizing higher fees for inclusion in the block during periods of network congestion. In order to keep the `token_id` confidential, we must provide a mechanism for untrusted to accurately sort transactions before asking the enclave to do work on the transactions, without revealing the `token_id`, and at the same time, allowing for different asset types to express differing minimum fee requirements.

To derive a sorting order without revealing the transaction type, the `fee` in the WellFormedTxContext becomes a `fee_ratio`, which is calculated based on the ratio between the minimum fees configured by the node operator on startup. For example, the network operators may have configured the following fees:
Tbh I don't think you should allow fees outside the base asset type (MOB). It's really asking for trouble - how can you allow user-defined asset types? What if an asset type is completely worthless? Fees exist to mitigate DDOS, and must be carefully regulated by node operators for that purpose. The only robust way to do that is to coordinate on a single fee asset type.
Imo it would be better to require all special-asset-transfer txs have two sections: a 'base asset transfer' section, and a 'special asset transfer' section. The advantage here is you don't have to reveal the special asset type to validators. This can be done pretty trivially, since with my proposal all asset types can be CLSAGd and range proofed together (you only need to segregate inputs/outputs a little bit in order to do balance proofs).
The advantage of SGX over an open validator is you can 'clear away' the information that a block contains special asset transfers (aside from the heuristic that 'more inputs/outputs == maybe special asset transfer', and maybe some tx-size analysis...hmm).
This, and the proposal above, have nice properties. I may want to grab some time with you to flesh out my understanding a bit more!
sure
@UkoeHB thanks for discussing with Aaron and posting your thoughts (#25 (comment)). I'm trying to assess whether this will be more or fewer elliptic curve operations if we do it as you suggest. It's nice that there are only a few generators, as opposed to many which may have to be calculated at run time, but there are also a bunch of extra operations adding and subtracting
This file should be
Second review
text/0025-confidential-token-ids.md
Outdated
A Transaction Output (TxOut) has a confidential `token_id` bytes field (`masked_token_id`). Transactions with inputs and outputs, including a fee output, must be composed of a single `token_id`. Some tokens may have additional functionality, such as minting and burning, which require verification of the `token_id` as part of the authorization of the action.

Each `token_id` has an explicit minimum fee specified by node operators, and included in the attestation handshake, ensuring that all nodes in the network are configured with the same fee minimums and `token_id` sets. New `token_ids` can be added only with unanimous agreement from all node operators.
IMO centralized token issuance would hamstring MobileCoin as a 'multi-asset ecosystem'. A big part of Ethereum's success derives from users having free rein to define their own tokens. Not to mention the added burden on node operators to validate/justify hard forking for every new asset.
I think this is correct -- we want users to be able to define their own tokens. But as you have pointed out elsewhere, if those tokens can be used to pay transaction fees, then transaction fees become ineffective as a rate-limiting / DOS prevention mechanism.
However, we believe the following:
- In many cases, users of a payment rail would rather conduct transactions in something pegged to their local fiat currency
- Requiring them to purchase MOB in order to use our network is a negative user experience, and it adds another step to on-boarding. It also adds complexity because now the user has two balances instead of one.
- Therefore, we would rather let them pay the fee in their currency of choice, and the foundation will deal with the conversions on the backend. As long as they pay the foundation something that we know has value, then the fee serves its purpose of DOS mitigation. And this results in a superior user experience.
So this together suggests that we want to have some way to create alternate tokens that can be used to pay network fees, but we don't want just anyone to be able to stand up such tokens.
So where I hope we will end up is, there will eventually be a way to allow people to create their own tokens, but not all tokens will be allowed to be used to pay fees, and only tokens approved by the foundation will be allowed to do that.
We haven't put much thought yet into what that part of it looks like. If you have suggestions, would love to hear.
text/0025-confidential-token-ids.md
Outdated
Each `token_id` has an explicit minimum fee specified by node operators, and included in the attestation handshake, ensuring that all nodes in the network are configured with the same fee minimums and `token_id` sets. New `token_ids` can be added only with unanimous agreement from all node operators.

A transaction specifies the `token_id` for the entire transaction, with a cryptographic guarantee that all inputs and outputs are of the same `token_id`. This prevents spending one `token_id` as a member of another `token_id`. The TransactionBuilder now takes a `token_id` as an argument in its initializer. The `token_id` is set in the clear on the `TxPrefix` for a transaction, which is only accessed in the enclave, after the payload is delivered via an encrypted, attested channel. This is similar to the `TxInputs`, and thus has a similar threat model for confidentiality. See [Confidentiality Analysis](#confidentiality-analysis) for an assessment of risk.
If tokens are not truly confidential from the beginning, then there will likely be no way to remove SGX without also removing the 'confidential' from 'confidential tokens'.
Right now I think we believe it can be done using ZK-STARKs. I expect that if we accept this MCIP, then this would be a blocker for removing SGX.
text/0025-confidential-token-ids.md
Outdated
```
B_blinding = RISTRETTO_BASEPOINT
```

After this change, `range_proofs`, `rct_bulletproofs`, and `ring_signature/mlsags` will be relative to a Pedersen generator, and it is not possible to construct a range proof relative to another generator if those generators are orthogonal. Thus, it is guaranteed that all transaction inputs and outputs are using the same `token_ids`. This is implied by the homomorphic encryption property of [Pedersen Commitments](https://www.cs.cornell.edu/courses/cs754/2001fa/129.PDF), namely that addition on the commitments preserves the additive relationship of the pre-committed values, as long as they are using the same group generator. `H_i` is revealed to the verifier, who must show that the sum of inputs does not overflow, or `sum(outputs) = a * G + b * H_i` for some TxOut-specific values of `a` and `b`.
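The additive property mentioned above can be checked on a toy example. The Python sketch below is an illustration only and is NOT secure: a multiplicative group modulo a Mersenne prime stands in for Ristretto, the generators are arbitrary small numbers rather than properly derived orthogonal generators, and all names are hypothetical. It shows the arithmetic, not the hardness argument.

```python
# Toy parameters, NOT secure: integers modulo the prime 2**127 - 1 stand in
# for the Ristretto group, and the generators are arbitrary small numbers.
P = 2**127 - 1
G = 3             # blinding generator (stand-in for B_blinding)
H = {0: 5, 1: 7}  # hypothetical per-token generators H_i


def commit(token_id: int, value: int, blinding: int) -> int:
    """Toy Pedersen commitment: G^blinding * H_i^value mod P."""
    return (pow(G, blinding, P) * pow(H[token_id], value, P)) % P


def combine(*commitments: int) -> int:
    """Group operation (additive in the text, multiplicative in this toy)."""
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result
```

With the same generator `H[0]`, combining commitments to 3 and 4 gives a commitment to 7 under the summed blinding factors. If one of the two commitments uses `H[1]` instead, the combined value no longer matches a commitment to 7 under `H[0]`.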
Modifying range proof structures to accommodate this proposal is intrusive, prevents efficient batch-verification which directly affects the upper bound of tx throughput, and may impose a burden on/barrier to future proof upgrades.
That's a good point, and a reason to do it the way the Spats proposal works.
I think right now we don't have any of the batch verification stuff in place, so we'll have to think seriously about this. I don't know if this is actually the bottleneck right now in transaction validation.
text/0025-confidential-token-ids.md
Outdated
Currently, the enclave aggregates the fees for multiple transactions in a block, in order to mint a single fee output for the block and conserve space. To support multiple confidential `token_ids`, when minting fee outputs, the enclave must create multiple fee outputs if multiple `token_ids` were used in the block. A consideration here is that we would like for an observer not to discern whether a block contains transactions of multiple asset types by statistically analyzing the number of outputs to derive information about the number of fee outputs.

A simple proposal is that each block now contains a number of fee outputs scaling linearly with the total number of supported `token_ids`. This way we can continue to aggregate fees, but not reveal how many token types were included in the block. For blocks with fewer transactions than `token_ids`, the number of fee outputs is `min(num_token_ids, num_transactions_in_block)`.
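A minimal Python sketch of this counting rule follows. The output count `min(num_token_ids, num_transactions_in_block)` comes from the text; padding with zero-value outputs, and choosing the padding token ids in ascending order, are assumptions made for illustration, as are all names.

```python
def fee_outputs(block_fees, supported_token_ids):
    """Aggregate per-token fees and pad to a count independent of token mix.

    block_fees: list of (token_id, fee) pairs, one per transaction.
    Returns a sorted list of (token_id, total_fee) fee outputs.
    """
    totals = {}
    for token_id, fee in block_fees:
        if token_id not in supported_token_ids:
            raise ValueError("unsupported token id")
        totals[token_id] = totals.get(token_id, 0) + fee

    # The number of outputs depends only on public quantities.
    count = min(len(supported_token_ids), len(block_fees))

    # Assumption: pad with zero-value outputs for unused token ids.
    for token_id in sorted(supported_token_ids):
        if len(totals) >= count:
            break
        totals.setdefault(token_id, 0)
    return sorted(totals.items())
```

In this sketch, a block of two transactions on a network with three supported token ids always yields two fee outputs, whether the transactions used one token id or two.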
This greatly increases fee output spam (which has privacy implications), and prevents fee epochs as mentioned previously.
- I guess I'm not sure that this is greatly increasing fee output spam. We currently only have one transaction per block typically. So every 3rd transaction output or so is fee right now. I don't think this proposal can ever lead to more than this, and also in the case of heavy load, if there are hundreds of transactions in a block but many fewer token ids, then this will be better than that.
- Can you explain why this will prevent fee epochs? It seems to me that fee epochs would still work just fine if we decided it was necessary, we will just accumulate fees from several different token types.
- Ah my mistake, the ratio of fee outputs to normal outputs probably wouldn't change much even with PR Transactions with Contingent Inputs #31. Tbh the part about `min(num_token_ids, num_transactions_in_block)` didn't sink in properly.
- Fee epochs requires printing the fee amount in block headers, so it would eliminate the confidentiality of token ids. I don't see a way to hide the amount to support that proposal.
I see -- I didn't realize it was storing the fee amount in the block headers.
Is that necessary for it to work? What if we just stored the fee output in (ephemeral) enclave memory and wrote out some number of TxOuts every 1000 blocks or so, as you suggest?
It might be annoying in that we might have to give the enclaves a way to sync this state, so if nodes go down and come back up they will have the right accumulated fee values. I would kind of like them to sync up anyways on the hash of the latest block, so that doesn't seem too onerous to also sync the accumulated fee values.
The worst case would be that all the enclaves went down and some accumulated fees were lost. But, that might be acceptable, we've tried to engineer it all so that never happens anyways. The fees don't represent that much money anyways.
Idk what the path would be here when SGX is removed though, that seems pretty tricky.
> It might be annoying in that we might have to give the enclaves a way to sync this state, so if nodes go down and come back up they will have the right accumulated fee values. I would kind of like them to sync up anyways on the hash of the latest block, so that doesn't seem too onerous to also sync the accumulated fee values.
I feel like this would be an easy way to break the network, since it sounds like you don't want to use SCP for syncing... (which is designed to prevent breakage). Everything you do with the network that doesn't involve SCP makes the network more brittle.
For a lot of configuration right now, such as minimum fee settings, the strategy is to use start-up parameters that become part of the responder id, and leverage node-to-node attestation to ensure that the nodes are all configured the same way. This is because it is simpler to do it this way right now, than to use SCP to consense on config. (We would like to get there though.)
I don't think this actually makes the network more brittle -- it makes problems surface immediately if nodes are misconfigured because there is no way a node can proceed if it can't attest to the other nodes.
This is not dynamic state, which is more complex for sure. But it is an example of using attestation to ensure that some data is being shared correctly.
The reason I'm not sure we need to "consense on accumulated fees" is, the accumulated fees are strictly a function of the transactions in each block. And we are already using SCP to consense on that. So we can just compute (in the enclave) based on what we already consensed on, and we don't need to do a second round of consensus to agree on the (deterministic) computation of fees.
The only part about this that is different from what we already do is, when a node restarts, its enclave loses all state. Currently, nodes that fall far enough behind go into a catchup mode where they trust blocks based on peer signatures and don't expect to see the Tx's for those blocks. Eventually they finish catching up and then they are validating transactions in the enclave again. However, if we tried to do fee epochs with secret accumulation in the enclaves, they would have to recover the accumulated fees as well from their peers, when they exit catchup. I don't think this would be a problem though if the accumulated fees are transmitted using node to node attestation.
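To illustrate why no second round of consensus would be needed, here is a small Python sketch of fee accumulation as a deterministic function of the block sequence. The class, its methods, and the epoch length are hypothetical. The sketch ignores enclave restarts and state recovery, and it makes no attempt to hide which token ids appear in the emitted outputs.

```python
class FeeAccumulator:
    """Accumulates per-token fees and emits totals once per epoch."""

    def __init__(self, epoch_length: int):
        if epoch_length <= 0:
            raise ValueError("epoch_length must be positive")
        self.epoch_length = epoch_length
        self.blocks_seen = 0
        self.totals = {}

    def add_block(self, block_fees):
        """Fold one externalized block into the running totals.

        block_fees: iterable of (token_id, fee) pairs, one per transaction.
        Returns the fee outputs to mint if this block ends an epoch, else [].
        """
        for token_id, fee in block_fees:
            self.totals[token_id] = self.totals.get(token_id, 0) + fee
        self.blocks_seen += 1
        if self.blocks_seen % self.epoch_length != 0:
            return []
        outputs = sorted(self.totals.items())
        self.totals = {}  # start the next epoch from zero
        return outputs
```

Two accumulators fed the same blocks in the same order produce the same outputs, which is the property the argument above relies on. A node leaving catchup would still need to obtain `totals` and `blocks_seen` from its peers, as discussed.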
IIRC, node catch-up (state recovery) relies on SCP (see mechanics of mobilecoin section 9.4.1).
it relies on SCP in that, the decision to trust blocks based on peer signatures is based on if those peers form a quorum.
i think it is also valid to send fee counters over node-to-node attested channels, based on a group of peers that form a quorum
From the meeting: @garbageslam mentioned that you could use fee priorities to build a token ID oracle from user-transmitted transactions, since users don't validate node configuration.
Update on my previous proposed scheme: an observer would be able to divide the output set of each transaction into two groups, 'fee asset' outputs and 'non-fee asset' outputs. If fees are all in MOB, then the full ledger output set can be divided into 'base asset' outputs and 'secondary asset' outputs. This would reduce the efficacy of ring signatures unless users could use that distinction to only choose ring members that match their inputs' asset types (base vs secondary).
@sugargoat can you please update this docu at some point to include the fixes to the extended message when signing ring MLSAG? https://github.com/mobilecoinfoundation/mobilecoin/pull/1700/files/c926556e63220c754b29a05b2c72fd5f39ef0e26#r836646544
Please also state the block version that this feature is targeted to release in
We should mention the extended message digest somewhere in here as well
This new paper (Bulletproofs++) mentions 'typed' range proofs for multi-asset protocols.
I created a pull request here with updates and improvements:
Co-authored-by: Eran Rundstein <[email protected]>
…ments Rewrite several sections and add detail
LGTM, I had only minor suggestions
Co-authored-by: Chris Beck <[email protected]>
A proposal for confidential token IDs to support multiple asset types.
Rendered Proposal