Update TXM confirmation logic to use the mined transaction count #14405

amit-momin · 2024-09-12T00:49:03Z

Confirmer changes
- The Confirmer uses the mined transaction count to determine whether transactions have been re-org'd or confirmed
- The Confirmer no longer sets transaction states to confirmed_missing_receipt. The state is still handled in queries for backwards compatibility
Finalizer changes
- The Finalizer is now responsible for fetching and storing receipts for confirmed transactions
- The Finalizer is now responsible for resuming pending task runs
- The Finalizer is now responsible for marking old transactions as fatal when they were broadcast before the finalized head and still have no receipt
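
The confirmation rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `tx` type, the state strings, and `classifyByMinedCount` are hypothetical stand-ins, and the mined transaction count is assumed to equal the highest mined sequence plus one.

```go
package main

import "fmt"

// tx is a hypothetical, simplified stand-in for a TXM transaction record.
type tx struct {
	ID       int64
	Sequence uint64 // the nonce on EVM chains
	State    string // "unconfirmed" or "confirmed"
}

// classifyByMinedCount compares each transaction's sequence with the mined
// transaction count for its sender. A sequence below the count has been
// included on-chain; an equal or higher sequence has not, so a transaction
// previously marked confirmed in that range is treated as re-org'd.
func classifyByMinedCount(txs []tx, minedCount uint64) (confirmed, reorged []int64) {
	for _, t := range txs {
		switch {
		case t.Sequence < minedCount && t.State == "unconfirmed":
			confirmed = append(confirmed, t.ID)
		case t.Sequence >= minedCount && t.State == "confirmed":
			reorged = append(reorged, t.ID)
		}
	}
	return confirmed, reorged
}

func main() {
	txs := []tx{
		{ID: 1, Sequence: 4, State: "unconfirmed"}, // below the count: newly confirmed
		{ID: 2, Sequence: 5, State: "confirmed"},   // at the count: re-org'd
		{ID: 3, Sequence: 6, State: "unconfirmed"}, // above the count: still pending
	}
	confirmed, reorged := classifyByMinedCount(txs, 5)
	fmt.Println(confirmed, reorged) // prints "[1] [2]"
}
```

No receipt is needed for this decision, which is why receipt fetching can move out of the Confirmer.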

dimriou · 2024-09-12T12:07:02Z

common/txmgr/confirmer.go

+// CheckForConfirmation fetches the mined transaction count for each enabled address and marks transactions with a lower sequence as confirmed and ones with equal or higher sequence as unconfirmed
+func (ec *Confirmer[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, R, SEQ, FEE]) CheckForConfirmation(ctx context.Context, head types.Head[BLOCK_HASH]) error {
+	for _, fromAddress := range ec.enabledAddresses {
+		// Returns the total transaction count for from address which is the highest mined sequence + 1


nit: Let's drop this comment. This RPC call is straightforward, and the functionality is already described at the top.

Will do, just added it in there because SequenceAt is sort of a misnomer

dimriou · 2024-09-12T12:09:13Z

.changeset/late-windows-clean.md

+"chainlink": minor
+---
+
+Updated the TXM confirmation logic to use the mined transaction count to identify re-org'd or confirmed transactions. #internal


This might need to be updated. We'll have to collect all the changes and figure out the impact. For example, we're indirectly deprecating the ConfirmedMissingReceipt type.

dimriou · 2024-09-12T14:22:08Z

core/chains/evm/txmgr/finalizer.go

@@ -137,14 +189,28 @@ func (f *evmFinalizer) DeliverLatestHead(head *evmtypes.Head) bool {
 func (f *evmFinalizer) ProcessHead(ctx context.Context, head *evmtypes.Head) error {
 	ctx, cancel := context.WithTimeout(ctx, processHeadTimeout)
 	defer cancel()
+	// Fetch the latest finalized block


nit: We don't need this

dimriou · 2024-09-23T12:33:26Z

core/chains/evm/txmgr/evm_tx_store.go

+	query := `
+		SELECT evm.tx_attempts.* FROM evm.tx_attempts
+		JOIN evm.txes ON evm.txes.ID = evm.tx_attempts.eth_tx_id
+		WHERE evm.tx_attempts.state = 'broadcast' AND evm.txes.state IN ('confirmed', 'confirmed_missing_receipt', 'fatal_error') AND evm.txes.evm_chain_id = $1 AND evm.txes.ID NOT IN (


Is confirmed_missing_receipt relevant anymore?

It is for now, for backwards compatibility with existing transactions in this state. These changes should enable us to deprecate the confirmed_missing_receipt state, but I didn't want to include that effort in this change.

dimriou · 2024-09-23T13:50:41Z

common/txmgr/confirmer.go

-	if err := ec.txStore.MarkAllConfirmedMissingReceipt(ctx, ec.chainID); err != nil {
-		return fmt.Errorf("unable to mark txes as 'confirmed_missing_receipt': %w", err)
+	// Mark the transactions included on-chain with a purge attempt as fatal error with the terminally stuck error message
+	if err := ec.txStore.UpdateTxFatalError(ctx, purgeTxIDs, ec.stuckTxDetector.StuckTxFatalError()); err != nil {


I'm wondering if this functionality is better suited for the Finalizer instead

I think we would want to mark a tx as fatal for being stuck as soon as possible, which is why this was left in the Confirmer. Otherwise, we would only surface the fatal error after finalization, which could take a long time. I also wanted to avoid marking a fatal stuck transaction as confirmed at any point, to avoid confusion about what that state implies.

dimriou · 2024-09-23T15:02:32Z

common/txmgr/confirmer.go

+	var confirmedTxIDs []int64
+	for _, tx := range includedTxs {
+		isPurgeTx := false
+		for _, attempt := range tx.TxAttempts {


I feel this could be a transaction method like isPurgable.
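
One way the suggestion could look, as a rough sketch only: the `tx` and `attempt` types below are simplified stand-ins, and `hasPurgeAttempt` and `partitionIncluded` are hypothetical names, not the methods in the codebase. The sketch also shows the split discussed in the previous thread, where included transactions with a purge attempt go straight to fatal and never pass through confirmed.

```go
package main

import "fmt"

// attempt and tx are simplified, hypothetical stand-ins for the TXM types.
type attempt struct {
	IsPurgeAttempt bool
}

type tx struct {
	ID         int64
	TxAttempts []attempt
}

// hasPurgeAttempt reports whether any attempt of the transaction is a purge
// attempt. The name is illustrative only.
func (t tx) hasPurgeAttempt() bool {
	for _, a := range t.TxAttempts {
		if a.IsPurgeAttempt {
			return true
		}
	}
	return false
}

// partitionIncluded splits transactions already included on-chain into those
// to mark confirmed and those to mark fatal because they were purged.
func partitionIncluded(includedTxs []tx) (confirmedTxIDs, purgeTxIDs []int64) {
	for _, t := range includedTxs {
		if t.hasPurgeAttempt() {
			purgeTxIDs = append(purgeTxIDs, t.ID)
			continue
		}
		confirmedTxIDs = append(confirmedTxIDs, t.ID)
	}
	return confirmedTxIDs, purgeTxIDs
}

func main() {
	included := []tx{
		{ID: 10, TxAttempts: []attempt{{IsPurgeAttempt: false}}},
		{ID: 11, TxAttempts: []attempt{{IsPurgeAttempt: false}, {IsPurgeAttempt: true}}},
	}
	confirmed, purged := partitionIncluded(included)
	fmt.Println("confirmed:", confirmed, "fatal (purged):", purged)
}
```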

dimriou · 2024-09-23T15:15:54Z

core/chains/evm/txmgr/finalizer.go

+	if receipt == nil {
+		// NOTE: This should never happen, but it seems safer to check
+		// regardless to avoid a potential panic
+		l.AssumptionViolation("got nil receipt")


AssumptionViolation seems a bit excessive here, as empty responses on certain fields can happen occasionally.

I left this untouched from what we had previously in the Confirmer but I'm not against updating this. Is Error or Warn maybe better?

dimriou · 2024-09-23T16:28:32Z

core/chains/evm/txmgr/finalizer.go

-func (f *evmFinalizer) buildReceiptIdList(finalizedReceipts []Receipt) []int64 {
-	receiptIds := make([]int64, len(finalizedReceipts))
+func (f *evmFinalizer) FetchAndStoreReceipts(ctx context.Context, head, latestFinalizedHead *evmtypes.Head) error {
+	attempts, err := f.txStore.FindAttemptsRequiringReceiptFetch(ctx, f.chainID)


Is there a way to separate an overflowed from a fatal transaction? Although they are both classified as fatal, the former has a receipt, meaning this check will eventually stop. For the latter, seems like we'll try to fetch receipts indefinitely.

I believe that's implicitly separated by us returning attempts. Other transactions marked as fatal in the Broadcaster delete their attempts so this query wouldn't pick those up.
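
The implicit separation can be illustrated with an in-memory model of the visible part of the query. This is only a sketch under assumptions: the types and `attemptsRequiringReceiptFetch` are hypothetical, and the truncated `NOT IN (...)` clause of the real query is deliberately left out.

```go
package main

import "fmt"

// attempt and tx are simplified, hypothetical stand-ins for rows in
// evm.tx_attempts and evm.txes.
type attempt struct {
	ID    int64
	State string
}

type tx struct {
	ID       int64
	State    string
	Attempts []attempt
}

// Transaction states the visible part of the query selects on.
var receiptFetchStates = map[string]bool{
	"confirmed":                 true,
	"confirmed_missing_receipt": true,
	"fatal_error":               true,
}

// attemptsRequiringReceiptFetch returns the IDs of broadcast attempts whose
// transaction is in one of the states above. Because the result is built
// from attempts, a fatal transaction whose attempts were deleted contributes
// nothing and is never polled for a receipt.
func attemptsRequiringReceiptFetch(txs []tx) []int64 {
	var ids []int64
	for _, t := range txs {
		if !receiptFetchStates[t.State] {
			continue
		}
		for _, a := range t.Attempts {
			if a.State == "broadcast" {
				ids = append(ids, a.ID)
			}
		}
	}
	return ids
}

func main() {
	txs := []tx{
		// Fatal but with a broadcast attempt: a receipt may exist on-chain.
		{ID: 1, State: "fatal_error", Attempts: []attempt{{ID: 100, State: "broadcast"}}},
		// Fatal with its attempts deleted: never picked up.
		{ID: 2, State: "fatal_error"},
		// Not yet confirmed: outside the selected states.
		{ID: 3, State: "unconfirmed", Attempts: []attempt{{ID: 101, State: "broadcast"}}},
	}
	fmt.Println(attemptsRequiringReceiptFetch(txs)) // prints "[100]"
}
```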

dimriou · 2024-11-18T10:37:08Z

core/chains/evm/txmgr/evm_tx_store.go

-	}
-	_, err := orm.q.ExecContext(ctx, `UPDATE evm.tx_attempts SET broadcast_before_block_num = NULL, state = 'in_progress' WHERE id = $1`, attempt.ID)
-	return pkgerrors.Wrap(err, "updateEthTxAttemptUnbroadcast failed")
+func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {


nit:

Suggested change

-func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {
+func updateEthTxAttemptsUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {

dimriou · 2024-11-18T10:38:18Z

core/chains/evm/txmgr/evm_tx_store.go

-	}
-	_, err := orm.q.ExecContext(ctx, `UPDATE evm.txes SET state = 'unconfirmed' WHERE id = $1`, etx.ID)
-	return pkgerrors.Wrap(err, "updateEthTxUnconfirm failed")
+func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {


nit:

Suggested change

-func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {
+func updateEthTxsUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {

dimriou · 2024-11-18T10:53:20Z

core/chains/evm/txmgr/evm_tx_store.go

 }

-func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxID int64) (err error) {
+func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxIDs []int64) (err error) {
 	_, err = orm.q.ExecContext(ctx, `
 DELETE FROM evm.receipts
 USING evm.tx_attempts
 WHERE evm.receipts.tx_hash = evm.tx_attempts.hash


Probably out of scope but this WHERE condition seems irrelevant

Hmmm seems like it but I'd rather keep the query as is since it's carrying over from the previous code in case I'm misreading it.

dimriou · 2024-11-18T11:55:29Z

core/chains/evm/txmgr/finalizer.go

It's not in these changes, but can we update the following message:
"Received finalized block older than one already processed. This should never happen and could be an issue with RPCs."
It is not correct, as HeadTracker doesn't provide such a guarantee.

What if I just remove the "This should never happen" part? I think this helped catch an RPC issue recently, so I still want to warn about that possibility. So something like: "Received finalized block older than one already processed. There could be an issue with one or more configured RPCs."

I would drop the "There could be an issue with one or more configured RPCs." part and let us do the interpretation instead.

I've removed it in the latest commit

dimriou · 2024-11-18T12:01:39Z

core/chains/evm/txmgr/finalizer.go

-	receiptIDs := f.buildReceiptIdList(finalizedReceipts)
-
-	err = f.txStore.UpdateTxStatesToFinalizedUsingReceiptIds(ctx, receiptIDs, f.chainId)
+	err = f.txStore.UpdateTxStatesToFinalizedUsingTxHashes(ctx, txHashes, f.chainID)


dimriou · 2024-11-18T12:04:05Z

core/chains/evm/txmgr/finalizer.go

-	// Update lastProcessedFinalizedBlockNum after processing has completed to allow failed processing to retry on subsequent heads
-	// Does not need to be protected with mutex lock because the Finalizer only runs in a single loop
-	f.lastProcessedFinalizedBlockNum = latestFinalizedHead.BlockNumber()


You removed this by accident

huangzhen1997

Overall looks good to me, can you remind me the reason for moving minConfirmation callback logic to finalizer instead of confirmer?

amit-momin · 2024-11-18T19:10:27Z

Overall looks good to me, can you remind me the reason for moving minConfirmation callback logic to finalizer instead of confirmer?

Ya, the minConfirmation logic to resume pending task runs depends on transaction receipts, since we need the block number from them to determine when we can resume. Since we changed the confirmation logic in the Confirmer to use nonces, we moved the receipt fetching/storing to the Finalizer. So it made sense to move the minConfirmation logic there too, since we only need receipts for resuming runs and for finalization now.
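
The dependency on the receipt can be sketched as a single comparison. The function name and the exact inequality below are assumptions for illustration; the real check lives in a store query and may differ in detail.

```go
package main

import "fmt"

// readyToResume reports whether a pending task run can be resumed. It needs
// the block number from the transaction's receipt, which is why this logic
// sits next to receipt fetching. The inequality is an assumption made for
// this sketch.
func readyToResume(headNum, receiptBlockNum, minConfirmations int64) bool {
	return receiptBlockNum <= headNum-minConfirmations
}

func main() {
	// Receipt mined in block 100, task run requires 3 confirmations.
	fmt.Println(readyToResume(102, 100, 3)) // prints "false"
	fmt.Println(readyToResume(103, 100, 3)) // prints "true"
}
```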

* Updated confirmation logic to use the mined transaction count
* Added changeset
* Fixed linting
* Addressed feedback and fixed linting
* Fixed linting
* Addressed feedback
* Fixed VRF v2 integration tests
* Fixed linting
* Removed misleading log and fixed vrf v2 integration tests
* Updated receipt check for vrf v2 integration tests
* Fixed CCIP integration tests
* Updated tests for new head type
* Fixed linting
* Fixed VRF v2 integration tests
* Fixed pipeline integration tests and added logs to nonce tracker
* Fixed VRF e2e tests
* Fixed VRF e2e test
* Updated find attempts for receipt fetch query
* Addressed feedback
* Cleaned config interfaces and added back methods to txstore interface
* Updated confirmer description and txstore method name
* Changed logic to mark old transactions without receipts as fatal
* Removed chain ID from the SaveFetchedReceipts method
* Fixed tests and interfaces after merge conflict
* Updated finalizer logs
* Fixed linting
* Fixed linting
* Fixed linting
* Pre-allocate slices for linting
* Cleaned up some eventually's in vrf v2 tests
* Added finalized transaction count prom metric
* Fixed vrf v2 integration test
* Fixed vrf integration test
* Updated evm store mocks
* Generated new mock and clean linting
* Added no lint comments
* Fixed ifElseChain linting
* Changed filter timer in vrf smoke test
* Undid timer change and switched to hardcoded 10s filter timeout
* Extended randomness event timeout
* Reverted hardcoded timeouts
* Removed allowed logs
* Increased integration test finality depth
* Updated vrf integration test tomls and reverted the default toml
* Updated finality depth for simulated chain config defaults in integration tests
* Replaced gomega with require for two VRF test helper methods
* Added simualted chain node configs and reverted default toml changes
* Reduced vrf simulated test configs to fallback to defaults
* Updated VRF simulated test config to match smoke defaults
* Added HistoryDepth config to vrf smoke test configs
* Removed VRF test simulates chain configs
* Reverted vrf test changes
* Fixed flakey vrf bhs integration tests
* Updated finalization logic to delete stale receipts if detected
* Addressed feedback
* Updated log

amit-momin added 3 commits September 11, 2024 19:45

Updated confirmation logic to use the mined transaction count

d2a6193

Added changeset

39616c3

Fixed linting

914876f

amit-momin marked this pull request as ready for review September 12, 2024 01:19

amit-momin requested review from a team as code owners September 12, 2024 01:19

amit-momin requested review from EasterTheBunny and removed request for a team September 12, 2024 01:19

dimriou reviewed Sep 12, 2024

amit-momin added 2 commits September 12, 2024 15:02

Addressed feedback and fixed linting

78509ea

Merge branch 'develop' into BCI-4097-Update-the-TXM-confirmation-logi…

d77cc28

…c-to-use-the-mined-nonce

huangzhen1997 reviewed Sep 12, 2024

amit-momin added 5 commits September 12, 2024 16:34

Fixed linting

568bee3

Addressed feedback

7d61e80

Merge branch 'develop' into BCI-4097-Update-the-TXM-confirmation-logi…

444c9c7

…c-to-use-the-mined-nonce

Fixed VRF v2 integration tests

b1a8883

Merge branch 'develop' into BCI-4097-Update-the-TXM-confirmation-logi…

83b8a17

…c-to-use-the-mined-nonce

amit-momin requested a review from a team as a code owner September 13, 2024 19:23

amit-momin added 3 commits September 13, 2024 14:40

Fixed linting

d3dc084

Removed misleading log and fixed vrf v2 integration tests

3c1023b

Updated receipt check for vrf v2 integration tests

2adc3d6

dimriou reviewed Sep 23, 2024

Fixed CCIP integration tests

9906082

amit-momin requested a review from a team as a code owner September 23, 2024 23:25

smartcontractkit deleted a comment from github-actions bot Nov 14, 2024

dimriou reviewed Nov 18, 2024

amit-momin added 3 commits November 18, 2024 09:52

Addressed feedback

002d6a5

Updated log

ca38fc9

Merge branch 'develop' into BCI-4097-Update-the-TXM-confirmation-logi…

fe22dc4

…c-to-use-the-mined-nonce

huangzhen1997 reviewed Nov 18, 2024

jmank88 approved these changes Nov 18, 2024

winder approved these changes Nov 18, 2024

huangzhen1997 approved these changes Nov 18, 2024

dimriou added this pull request to the merge queue Nov 19, 2024

Merged via the queue into develop with commit 03115e8 Nov 19, 2024
164 checks passed

dimriou deleted the BCI-4097-Update-the-TXM-confirmation-logic-to-use-the-mined-nonce branch November 19, 2024 12:30

This was referenced Nov 19, 2024

[DO NOT MERGE] Changeset Release Preview - v2.19.0 #13148

Draft

[DO NOT MERGE] Changeset Release Preview - v2.19.0 Emirhan-Cavusoglu-sftw/chainlink#1

Draft

github-actions bot mentioned this pull request Nov 26, 2024

[DO NOT MERGE] Changeset Release Preview - v2.19.0 galactic-beyond/chainlink#1

Draft

github-actions bot mentioned this pull request Dec 18, 2024

[DO NOT MERGE] Changeset Release Preview - v2.19.0 petermetz/chainlink#1

Draft
