Network Risk Score recalibration #99
Comments
@drinkcoffee Agree that it feels weird to have network consensus risk rated so low. I think there could also be more granularity here, for example for This score is also likely different when looking at different chains, would the actual score be an average of all the scores? Or perhaps a TVL weighted average across each chain? |
The scoring should be viewed as a score for a bridge between two chains. The same bridge between two other chains will yield a different score, because the chains it is bridging are different. |
A one in a million probability of relaying still implies (to me) an unreliable and unusable bridge. 1 in 10 is just a bit more unusable. However, both are unusable. As such, the score needs to reflect that. |
This is being addressed in this PR. Please have a look and comment on what you think the scores should be. Note that 0 is perfect; 100 is completely risky. Generally, the numbers add up within a risk category, capping out at 100, and then the overall score is the worst of all risk categories. |
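To make the aggregation rule concrete, here is a minimal sketch of "add up within a category, cap at 100, take the worst category". The category names and factor values are made up for illustration and are not taken from the framework.

```python
def category_score(factor_scores):
    """Sum the factor scores of one risk category, capped at 100."""
    return min(100, sum(factor_scores))


def overall_score(categories):
    """Overall bridge score: the worst (highest) capped category score."""
    return max(category_score(scores) for scores in categories.values())


# Hypothetical factor scores; 0 is perfect, 100 is completely risky.
example = {
    "network": [5, 3],
    "architecture": [40, 30, 50],  # sums to 120, capped to 100
    "operations": [10],
}

print(overall_score(example))
```

Because the overall score is a maximum, a category that can only ever reach 10 can never dominate a category that reaches 100.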
I see, thanks for clarifying. Does it make sense to scope the score to within two chains? It seems like some of these ( From my understanding, bridges have multiple spokes where the bridge contract on any chain can send a message to any other chain. In this case, with N spokes you have |
Curious how these numbers are calculated. One in a million for a particular message implies a ~1% chance that at least one of 10 thousand messages is bridged incorrectly. Similarly, one in a billion implies a ~0.01% chance that at least one of 100 thousand messages is bridged incorrectly. Both these cases feel essentially unusable in practice -- but where is the bright line? I feel like there should be a sharp boolean-like calculation here. Either essentially all messages are bridged correctly, or they aren't and the bridge shouldn't be used. |
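The arithmetic behind these percentages can be checked with a short sketch. It assumes each message fails independently with the same probability, which is a simplification; the function name is hypothetical.

```python
import math


def prob_any_failure(p_fail, n_messages):
    """Probability that at least one of n independent messages fails.

    Computes 1 - (1 - p_fail) ** n_messages using log1p/expm1 so that
    very small per-message probabilities do not lose precision.
    """
    return -math.expm1(n_messages * math.log1p(-p_fail))


# One in a million per message, 10 thousand messages: roughly 1%.
print(prob_any_failure(1e-6, 10_000))
# One in a billion per message, 100 thousand messages: roughly 0.01%.
print(prob_any_failure(1e-9, 100_000))
```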
Block finality, and hence transaction finality, is different from the wrong message being sent across the bridge. Imagine an event is emitted by a transaction on a source chain and is included in a block. That block ends up not being included in the canonical chain. The transaction will be put back into the transaction pool and could well end up in another block. It could even be in the block that replaced the original block in the canonical chain. The problem arises when the two histories diverge. For example, account A on the source chain sends 5 tokens to the bridge account (that is, does a crosschain transfer) in a transaction on the source chain. In a separate transaction on the source chain, they send the same tokens to account B on the source chain, thus double spending the tokens. Only one of these transactions can succeed. Suppose the first transaction was going to be in the canonical chain, and the bridging protocol uses a finality of 1 in a million, assuming a 10% attack on the PoW chain. If the user mounts such an attack and successfully creates a block that includes the second transaction and goes into the canonical chain, then the bridge will assume the 5 tokens are in escrow on the source chain, while the user / attacker will have transferred the tokens to account B. |
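One common way to arrive at a figure like "1 in a million under a 10% attack" is the attacker catch-up calculation from section 11 of the Bitcoin whitepaper. The sketch below is an assumption about where such numbers could come from, not necessarily how this framework derives them; `q` is the attacker's share of hash power and `z` is the number of confirmations the bridge waits for.

```python
import math


def attacker_success_probability(q, z):
    """Probability that an attacker with hash-power share q ever overtakes
    the honest chain from z blocks behind (Bitcoin whitepaper, section 11)."""
    p = 1.0 - q
    lam = z * (q / p)  # expected attacker progress while z honest blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


# With a 10% attacker, print the reorg probability for a few confirmation depths.
for z in (0, 5, 10):
    print(z, attacker_success_probability(0.1, z))
```

Under these assumptions, a bridge that relays after about 10 confirmations accepts roughly a one-in-a-million chance of the double spend described above.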
The degree of finality is a bridge specific configuration. As @chen-robert points out, the other factors are chain specific, independent of bridge. @ermyas , I am keen to hear your thoughts. |
Transaction Finality: The degree of finality to wait for before relaying a message is primarily a parameter of the crosschain protocol. So it makes sense to evaluate the risks associated with it as such. What the current set of scores doesn't yet capture is that some protocols (partly) delegate this decision to the application layer. In this model, the application decides what level of finality it requires. For such designs, I think we can either assign a score based on the protocol's default finality configuration (i.e. if the application doesn't provide one) or ascribe a very low-risk score.

Network Safety and Liveness Failure: As mentioned in the network risk section, network safety failures are risks of the underlying chain and impact all protocols that bridge to/from it. This type of risk is largely outside of the control of individual bridges. However, there are three ways they can choose to deal with it: Ignore this risk

For option 1, I think we should have a high-risk score that directly impacts the overall score of a bridge protocol, based on its weakest connected chain. For option 2, I think we can have a more nuanced score based on the mitigation strategy employed for the whole protocol and offer an additional score that is bridge-leg specific. |
Yeah apologies for any confusion, by "wrong" I was referring to a message which is not included in the canonical chain being sent, which is the same as what you mentioned. I suppose maybe my confusion was around if the 100M cost is meant to be incurred per attempt, or if it simply means any attacker with 100M can cause an X% chance that a particular message is dropped.
@ermyas wanted to clarify, how does network liveness affect the security of the rest of the bridge ecosystem? Even when the chain goes down, you would still be unable to spoof messages, right? I'm not sure if this is a fair metric for bridge security either; I think most of the risk is borne by applications that choose to deploy their endpoints on the various chains. |
@chen-robert good catch, that was a typo, I didn't mean to include Liveness in the heading there... as the sentence below it suggests. I discuss the considerations of network liveness failures on certain types of protocols (e.g. Optimistic protocols) separately here. |
Agreed, just checking :) |
How do we price a network outage for our transaction services? The existing Gas Pricing API can be used for this purpose by applying the principle of Little’s Law. We can take the value from Little’s Law (we apply this to a new field called ‘networkCongestion’) and apply it to the time of networkOutage, which is the time during which zero transactions are able to be included in the network. A period of networkOutage can be defined as a value that is three standard deviations above networkCongestion. Think of this as the time it takes to return to a normal value of networkCongestion after the outage is over, i.e. how long it takes to process the transactions that have accumulated during the outage.
New Field: networkCongestion - A normalized number that can be used to gauge the congestion level of the network, with 0 meaning not congested and 1 meaning extremely congested.
New Field: networkOutage - A true/false value indicating a recognized network outage event. True means we are currently experiencing a network outage.
|
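As a rough sketch of how the two fields could be computed: Little's Law says the average number of items in a system equals the arrival rate times the average time each item spends in it. The function names, the `max_backlog` normalizer, and the reading of "three standard deviations" as a threshold over recent congestion samples are all assumptions made for illustration, not part of the Gas Pricing API.

```python
import statistics


def littles_law_backlog(arrival_rate_tps, avg_wait_seconds):
    """Little's Law: average pending transactions = arrival rate * average wait."""
    return arrival_rate_tps * avg_wait_seconds


def network_congestion(backlog, max_backlog):
    """Normalize the backlog to [0, 1]; max_backlog is a hypothetical ceiling."""
    return min(1.0, max(0.0, backlog / max_backlog))


def is_network_outage(history, current):
    """Flag an outage when the current congestion sample sits more than
    three standard deviations above the mean of recent samples."""
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    return current > mean + 3 * std


backlog = littles_law_backlog(arrival_rate_tps=15.0, avg_wait_seconds=12.0)
congestion = network_congestion(backlog, max_backlog=1000.0)
recent = [0.10, 0.12, 0.11, 0.09, 0.10]  # made-up congestion samples
print(congestion, is_network_outage(recent, congestion))
```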
As a side note to the above suggestion, we considered applying this for 'taxing' cross chain transfers for assets being moved to less 'secure' networks. I.e. we want to effectuate the transfer balance to reflect the security subsidy that is implicit as an actual % change effect on the dest. chain balances. Meaning you move 100 tokens from Ethereum to 'LessSecureNetwork', you would get only 80 tokens, showing a risk of 20% or something along those lines. |
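A tiny sketch of that idea, using the 100 to 80 example above. The function name and the use of basis points are assumptions for illustration; how the risk percentage itself would be derived is left open.

```python
def risk_adjusted_amount(amount, risk_bps):
    """Amount credited on the destination chain after a risk discount.

    risk_bps is the discount in basis points (2000 = 20%); integer math
    avoids floating-point rounding on token balances.
    """
    return amount * (10_000 - risk_bps) // 10_000


# Moving 100 tokens to a network rated at 20% risk credits 80 tokens.
print(risk_adjusted_amount(100, 2000))
```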
The Network Risk Score currently goes between 0 and 10. However, the rest of the framework is between 0 and 100.
@ermyas and @prototypo, and others: How important are network issues relative to other issues? That is, if the network issues were the worst they possibly could be (protocol uses unfinalised information, the source chain goes offline, is congested, and has safety violations), is that as bad as really bad architectural, implementation, and operational issues? If they are, then we should rescale up network risk score to be out of 100.