Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate transaction broadcasting latency from regas logic #465

Merged
merged 2 commits into from
Apr 18, 2024

Conversation

ian-shim
Copy link
Contributor

@ian-shim ian-shim commented Apr 10, 2024

Why are these changes needed?

Currently, we send a replacement transaction if an existing transaction doesn't get confirmed within predetermined timeout. This duration is measured from the time TxnManager sends the transaction. If the transaction is published directly on eth network, this would have been a good measure to detect if the transaction is stuck on mempool and if a replacement transaction with higher gas is necessary.

However, when a transaction is submitted via Fireblocks, this is not a good measure to determine if the transaction is stuck on mempool because it takes non-trivial amount of time for Fireblocks to publish the txn to mempool due to multiparty computation.

This PR breaks down this part of latency (pre-broadcast), and excludes this duration from the timeout watched to create a replacement transaction. For example, if it takes a long time to broadcast the transaction to mempool (due to MPC operation from Fireblocks), no replacement transactions should be created. Only if it takes a long time to confirm transaction onchain after it's been published in mempool, it creates a replacement transaction with higher gas.

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@ian-shim ian-shim force-pushed the broadcasting-timeout branch from d240d87 to e4fed18 Compare April 10, 2024 17:17
@ian-shim ian-shim changed the title broadcasting timeout Isolate transaction broadcasting latency from regas logic Apr 10, 2024
@ian-shim ian-shim force-pushed the broadcasting-timeout branch from e4fed18 to d5cf05c Compare April 10, 2024 21:50
@ian-shim ian-shim marked this pull request as ready for review April 10, 2024 22:26
@ian-shim ian-shim requested a review from dmanc April 11, 2024 17:32
for {
for _, tx := range txs {
_, err := t.wallet.GetTransactionReceipt(ctx, tx.TxID)
if err == nil || errors.Is(err, walletsdk.ErrReceiptNotYetAvailable) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking error types seems like it's a bit brittle. I guess there's no other option here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change the implementation in eigensdk, but I think this signature should be kept.
One thing we can do is make the error type more expressive and return Fireblocks status as part of the error, but not sure if that will be less brittle

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment: "Before the transaction is broadcast, this wallet method will return error of type walletsdk.Whatever. Once the transaction has been broadcast, it will either successfully return the receipt or return an error of type walletsdk.ErrReceiptNotYetAvailable."

Copy link
Collaborator

@mooselumph mooselumph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we being retrying the fireblocks request if that times out? Or would we just set a long timeout for this? It looks like the current behavior is just to fail?

@ian-shim
Copy link
Contributor Author

Should we being retrying the fireblocks request if that times out? Or would we just set a long timeout for this? It looks like the current behavior is just to fail?

I think we should just set a long timeout for this. If the mpc operation latency is extended, it's probably not transient and the retry will likely encounter similar issue. The current behavior is that it will "cancel" the transaction and fail.

@ian-shim ian-shim force-pushed the broadcasting-timeout branch 2 times, most recently from 9e834ae to 5e41c1c Compare April 15, 2024 19:23
// Ensure transactions are broadcasted to the network before querying the receipt.
// This is to avoid querying the receipt of a transaction that hasn't been broadcasted yet.
// For example, when Fireblocks wallet is used, there may be delays in broadcasting the transaction due to latency from cosigning and MPC operations.
err = t.ensureAnyTransactionBroadcasted(ctxWithTimeout, req.txAttempts)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't retry here because there wouldn't be any advantage over just waiting longer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah as explained in this comment

Copy link
Collaborator

@mooselumph mooselumph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more comments

@ian-shim ian-shim force-pushed the broadcasting-timeout branch from 5e41c1c to f036cfe Compare April 17, 2024 21:56
@ian-shim ian-shim force-pushed the broadcasting-timeout branch from f036cfe to ab140d0 Compare April 18, 2024 02:59
@ian-shim ian-shim merged commit 13cf70c into Layr-Labs:master Apr 18, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants