-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update TXM confirmation logic to use the mined transaction count #14405
Update TXM confirmation logic to use the mined transaction count #14405
Conversation
common/txmgr/confirmer.go
Outdated
// CheckForConfirmation fetches the mined transaction count for each enabled address and marks transactions with a lower sequence as confirmed and ones with equal or higher sequence as unconfirmed | ||
func (ec *Confirmer[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, R, SEQ, FEE]) CheckForConfirmation(ctx context.Context, head types.Head[BLOCK_HASH]) error { | ||
for _, fromAddress := range ec.enabledAddresses { | ||
// Returns the total transaction count for from address which is the highest mined sequence + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Let's drop this comment. This RPC call is straightforward, and the functionality is already described at the top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, just added it in there because SequenceAt
is sort of a misnomer
.changeset/late-windows-clean.md
Outdated
"chainlink": minor | ||
--- | ||
|
||
Updated the TXM confirmation logic to use the mined transaction count to identify re-org'd or confirmed transactions. #internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might need to be updated. We'll have to collect all the changes and figure out the impact. For example, we're indirectly deprecating the ConfirmedMissingReceipt
type.
core/chains/evm/txmgr/finalizer.go
Outdated
@@ -137,14 +189,28 @@ func (f *evmFinalizer) DeliverLatestHead(head *evmtypes.Head) bool { | |||
func (f *evmFinalizer) ProcessHead(ctx context.Context, head *evmtypes.Head) error { | |||
ctx, cancel := context.WithTimeout(ctx, processHeadTimeout) | |||
defer cancel() | |||
// Fetch the latest finalized block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We don't need this
…c-to-use-the-mined-nonce
…c-to-use-the-mined-nonce
query := ` | ||
SELECT evm.tx_attempts.* FROM evm.tx_attempts | ||
JOIN evm.txes ON evm.txes.ID = evm.tx_attempts.eth_tx_id | ||
WHERE evm.tx_attempts.state = 'broadcast' AND evm.txes.state IN ('confirmed', 'confirmed_missing_receipt', 'fatal_error') AND evm.txes.evm_chain_id = $1 AND evm.txes.ID NOT IN ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is confirmed_missing_receipt
relevant anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is for now for backwards compatibility for existing transactions in this state. These changes should enable us to deprecate the confirmed_missing_receipt
state but I didn't want to include that effort in this change.
if err := ec.txStore.MarkAllConfirmedMissingReceipt(ctx, ec.chainID); err != nil { | ||
return fmt.Errorf("unable to mark txes as 'confirmed_missing_receipt': %w", err) | ||
// Mark the transactions included on-chain with a purge attempt as fatal error with the terminally stuck error message | ||
if err := ec.txStore.UpdateTxFatalError(ctx, purgeTxIDs, ec.stuckTxDetector.StuckTxFatalError()); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this functionality is better suited for the Finalizer instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we would want to mark tx as fatal for being stuck as soon as possible which is why this was left in the Confirmer. Otherwise, we would only surface the fatal error after finalization which could take a long time. I also wanted to avoid marking a fatal stuck transaction as confirmed at any point to avoid any confusion with what that state implies.
common/txmgr/confirmer.go
Outdated
var confirmedTxIDs []int64 | ||
for _, tx := range includedTxs { | ||
isPurgeTx := false | ||
for _, attempt := range tx.TxAttempts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this could be a transaction method like isPurgable
.
if receipt == nil { | ||
// NOTE: This should never happen, but it seems safer to check | ||
// regardless to avoid a potential panic | ||
l.AssumptionViolation("got nil receipt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AssumptionViolation
seems a bit excessive here, as empty responses on certain fields can happen occasionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left this untouched from what we had previously in the Confirmer but I'm not against updating this. Is Error
or Warn
maybe better?
func (f *evmFinalizer) buildReceiptIdList(finalizedReceipts []Receipt) []int64 { | ||
receiptIds := make([]int64, len(finalizedReceipts)) | ||
func (f *evmFinalizer) FetchAndStoreReceipts(ctx context.Context, head, latestFinalizedHead *evmtypes.Head) error { | ||
attempts, err := f.txStore.FindAttemptsRequiringReceiptFetch(ctx, f.chainID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to separate an overflowed from a fatal transaction? Although they are both classified as fatal, the former has a receipt, meaning this check will eventually stop. For the latter, seems like we'll try to fetch receipts indefinitely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that's implicitly separated by us returning attempts. Other transactions marked as fatal in the Broadcaster delete their attempts so this query wouldn't pick those up.
} | ||
_, err := orm.q.ExecContext(ctx, `UPDATE evm.tx_attempts SET broadcast_before_block_num = NULL, state = 'in_progress' WHERE id = $1`, attempt.ID) | ||
return pkgerrors.Wrap(err, "updateEthTxAttemptUnbroadcast failed") | ||
func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error { | |
func updateEthTxAttemptsUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error { |
} | ||
_, err := orm.q.ExecContext(ctx, `UPDATE evm.txes SET state = 'unconfirmed' WHERE id = $1`, etx.ID) | ||
return pkgerrors.Wrap(err, "updateEthTxUnconfirm failed") | ||
func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error { | |
func updateEthTxsUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error { |
} | ||
|
||
func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxID int64) (err error) { | ||
func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxIDs []int64) (err error) { | ||
_, err = orm.q.ExecContext(ctx, ` | ||
DELETE FROM evm.receipts | ||
USING evm.tx_attempts | ||
WHERE evm.receipts.tx_hash = evm.tx_attempts.hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably out of scope but this WHERE condition seems irrelevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm seems like it but I'd rather keep the query as is since it's carrying over from the previous code in case I'm misreading it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not in these changes, but can we update the following message:
Received finalized block older than one already processed. This should never happen and could be an issue with RPCs."
It is not correct as HeadTracker doesn't provide such a guarantee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I just remove the "This should never happen" part. Think this helped catch an RPC issue recently so I still want to warn for that possibility. So something like Received finalized block older than one already processed. There could be an issue with one or more configured RPCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop the: There could be an issue with one or more configured RPCs.
and let us do the interpretation instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed it in the latest commit
receiptIDs := f.buildReceiptIdList(finalizedReceipts) | ||
|
||
err = f.txStore.UpdateTxStatesToFinalizedUsingReceiptIds(ctx, receiptIDs, f.chainId) | ||
err = f.txStore.UpdateTxStatesToFinalizedUsingTxHashes(ctx, txHashes, f.chainID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
core/chains/evm/txmgr/finalizer.go
Outdated
// Update lastProcessedFinalizedBlockNum after processing has completed to allow failed processing to retry on subsequent heads | ||
// Does not need to be protected with mutex lock because the Finalizer only runs in a single loop | ||
f.lastProcessedFinalizedBlockNum = latestFinalizedHead.BlockNumber() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed this by accident
…c-to-use-the-mined-nonce
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, can you remind me the reason for moving minConfirmation callback logic to finalizer instead of confirmer?
Ya the |
) * Updated confirmation logic to use the mined transaction count * Added changeset * Fixed linting * Addressed feedback and fixed linting * Fixed linting * Addressed feedback * Fixed VRF v2 integration tests * Fixed linting * Removed misleading log and fixed vrf v2 integration tests * Updated receipt check for vrf v2 integration tests * Fixed CCIP integration tests * Updated tests for new head type * Fixed linting * Fixed VRF v2 integration tests * Fixed pipeline integration tests and added logs to nonce tracker * Fixed VRF e2e tests * Fixed VRF e2e test * Updated find attempts for receipt fetch query * Addressed feedback * Cleaned config interfaces and added back methods to txstore interface * Updated confirmer description and txstore method name * Changed logic to mark old transactions without receipts as fatal * Removed chain ID from the SaveFetchedReceipts method * Fixed tests and interfaces after merge conflict * Updated finalizer logs * Fixed linting * Fixed linting * Fixed linting * Pre-allocate slices for linting * Cleaned up some eventually's in vrf v2 tests * Added finalized transaction count prom metric * Fixed vrf v2 integration test * Fixed vrf integration test * Updated evm store mocks * Generated new mock and clean linting * Added no lint comments * Fixed ifElseChain linting * Changed filter timer in vrf smoke test * Undid timer change and switched to hardcoded 10s filter timeout * Extended randomness event timeout * Reverted hardcoded timeouts * Removed allowed logs * Increased integration test finality depth * Updated vrf integration test tomls and reverted the default toml * Updated finality depth for simulated chain config defaults in integration tests * Replaced gomega with require for two VRF test helper methods * Added simualted chain node configs and reverted default toml changes * Reduced vrf simulated test configs to fallback to defaults * Updated VRF simulated test config to match smoke defaults * Added HistoryDepth config to vrf smoke test configs * Removed VRF test simulates chain configs * Reverted vrf test changes * Fixed flakey vrf bhs integration tests * Updated finalization logic to delete stale receipts if detected * Addressed feedback * Updated log
BCFR-620
Confirmer changes
confirmed_missing_receipt
. This state is maintained in queries for backwards compatibilityFinalizer Changes