Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update TXM confirmation logic to use the mined transaction count #14405

Conversation

amit-momin
Copy link
Contributor

@amit-momin amit-momin commented Sep 12, 2024

BCFR-620

  • Confirmer changes

    • Confirmer uses the mined transaction count to determine if transactions have been re-org'd or confirmed
    • Confirmer no longer sets transaction states to confirmed_missing_receipt. This state is maintained in queries for backwards compatibility
  • Finalizer Changes

    • Finalizer now responsible for fetching and storing receipts for confirmed transactions
    • Finalizer now responsible for resuming pending task runs
    • Finalizer now responsible for marking old transactions without receipts broadcasted before the finalized head as fatal

@amit-momin amit-momin marked this pull request as ready for review September 12, 2024 01:19
@amit-momin amit-momin requested review from a team as code owners September 12, 2024 01:19
@amit-momin amit-momin requested review from EasterTheBunny and removed request for a team September 12, 2024 01:19
// CheckForConfirmation fetches the mined transaction count for each enabled address and marks transactions with a lower sequence as confirmed and ones with equal or higher sequence as unconfirmed
func (ec *Confirmer[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, R, SEQ, FEE]) CheckForConfirmation(ctx context.Context, head types.Head[BLOCK_HASH]) error {
for _, fromAddress := range ec.enabledAddresses {
// Returns the total transaction count for from address which is the highest mined sequence + 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's drop this comment. This RPC call is straightforward, and the functionality is already described at the top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, just added it in there because SequenceAt is sort of a misnomer

"chainlink": minor
---

Updated the TXM confirmation logic to use the mined transaction count to identify re-org'd or confirmed transactions. #internal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need to be updated. We'll have to collect all the changes and figure out the impact. For example, we're indirectly deprecating the ConfirmedMissingReceipt type.

@@ -137,14 +189,28 @@ func (f *evmFinalizer) DeliverLatestHead(head *evmtypes.Head) bool {
func (f *evmFinalizer) ProcessHead(ctx context.Context, head *evmtypes.Head) error {
ctx, cancel := context.WithTimeout(ctx, processHeadTimeout)
defer cancel()
// Fetch the latest finalized block
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We don't need this

common/txmgr/confirmer.go Show resolved Hide resolved
common/txmgr/confirmer.go Show resolved Hide resolved
common/txmgr/confirmer.go Show resolved Hide resolved
@amit-momin amit-momin requested a review from a team as a code owner September 13, 2024 19:23
query := `
SELECT evm.tx_attempts.* FROM evm.tx_attempts
JOIN evm.txes ON evm.txes.ID = evm.tx_attempts.eth_tx_id
WHERE evm.tx_attempts.state = 'broadcast' AND evm.txes.state IN ('confirmed', 'confirmed_missing_receipt', 'fatal_error') AND evm.txes.evm_chain_id = $1 AND evm.txes.ID NOT IN (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is confirmed_missing_receipt relevant anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is for now for backwards compatibility for existing transactions in this state. These changes should enable us to deprecate the confirmed_missing_receipt state but I didn't want to include that effort in this change.

if err := ec.txStore.MarkAllConfirmedMissingReceipt(ctx, ec.chainID); err != nil {
return fmt.Errorf("unable to mark txes as 'confirmed_missing_receipt': %w", err)
// Mark the transactions included on-chain with a purge attempt as fatal error with the terminally stuck error message
if err := ec.txStore.UpdateTxFatalError(ctx, purgeTxIDs, ec.stuckTxDetector.StuckTxFatalError()); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this functionality is better suited for the Finalizer instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would want to mark tx as fatal for being stuck as soon as possible which is why this was left in the Confirmer. Otherwise, we would only surface the fatal error after finalization which could take a long time. I also wanted to avoid marking a fatal stuck transaction as confirmed at any point to avoid any confusion with what that state implies.

var confirmedTxIDs []int64
for _, tx := range includedTxs {
isPurgeTx := false
for _, attempt := range tx.TxAttempts {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this could be a transaction method like isPurgable.

if receipt == nil {
// NOTE: This should never happen, but it seems safer to check
// regardless to avoid a potential panic
l.AssumptionViolation("got nil receipt")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AssumptionViolation seems a bit excessive here, as empty responses on certain fields can happen occasionally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this untouched from what we had previously in the Confirmer but I'm not against updating this. Is Error or Warn maybe better?

func (f *evmFinalizer) buildReceiptIdList(finalizedReceipts []Receipt) []int64 {
receiptIds := make([]int64, len(finalizedReceipts))
func (f *evmFinalizer) FetchAndStoreReceipts(ctx context.Context, head, latestFinalizedHead *evmtypes.Head) error {
attempts, err := f.txStore.FindAttemptsRequiringReceiptFetch(ctx, f.chainID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to separate an overflowed from a fatal transaction? Although they are both classified as fatal, the former has a receipt, meaning this check will eventually stop. For the latter, seems like we'll try to fetch receipts indefinitely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that's implicitly separated by us returning attempts. Other transactions marked as fatal in the Broadcaster delete their attempts so this query wouldn't pick those up.

@amit-momin amit-momin requested a review from a team as a code owner September 23, 2024 23:25
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
@smartcontractkit smartcontractkit deleted a comment from github-actions bot Nov 14, 2024
}
_, err := orm.q.ExecContext(ctx, `UPDATE evm.tx_attempts SET broadcast_before_block_num = NULL, state = 'in_progress' WHERE id = $1`, attempt.ID)
return pkgerrors.Wrap(err, "updateEthTxAttemptUnbroadcast failed")
func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
func updateEthTxAttemptUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {
func updateEthTxAttemptsUnbroadcast(ctx context.Context, orm *evmTxStore, attemptIDs []int64) error {

}
_, err := orm.q.ExecContext(ctx, `UPDATE evm.txes SET state = 'unconfirmed' WHERE id = $1`, etx.ID)
return pkgerrors.Wrap(err, "updateEthTxUnconfirm failed")
func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
func updateEthTxUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {
func updateEthTxsUnconfirm(ctx context.Context, orm *evmTxStore, etxIDs []int64) error {

}

func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxID int64) (err error) {
func deleteEthReceipts(ctx context.Context, orm *evmTxStore, etxIDs []int64) (err error) {
_, err = orm.q.ExecContext(ctx, `
DELETE FROM evm.receipts
USING evm.tx_attempts
WHERE evm.receipts.tx_hash = evm.tx_attempts.hash
Copy link
Collaborator

@dimriou dimriou Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably out of scope but this WHERE condition seems irrelevant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm seems like it but I'd rather keep the query as is since it's carrying over from the previous code in case I'm misreading it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not in these changes, but can we update the following message:
Received finalized block older than one already processed. This should never happen and could be an issue with RPCs."
It is not correct as HeadTracker doesn't provide such a guarantee.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I just remove the "This should never happen" part. Think this helped catch an RPC issue recently so I still want to warn for that possibility. So something like Received finalized block older than one already processed. There could be an issue with one or more configured RPCs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop the: There could be an issue with one or more configured RPCs. and let us do the interpretation instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it in the latest commit

receiptIDs := f.buildReceiptIdList(finalizedReceipts)

err = f.txStore.UpdateTxStatesToFinalizedUsingReceiptIds(ctx, receiptIDs, f.chainId)
err = f.txStore.UpdateTxStatesToFinalizedUsingTxHashes(ctx, txHashes, f.chainID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Comment on lines 213 to 215
// Update lastProcessedFinalizedBlockNum after processing has completed to allow failed processing to retry on subsequent heads
// Does not need to be protected with mutex lock because the Finalizer only runs in a single loop
f.lastProcessedFinalizedBlockNum = latestFinalizedHead.BlockNumber()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed this by accident

Copy link
Contributor

@huangzhen1997 huangzhen1997 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, can you remind me the reason for moving minConfirmation callback logic to finalizer instead of confirmer?

@amit-momin
Copy link
Contributor Author

Overall looks good to me, can you remind me the reason for moving minConfirmation callback logic to finalizer instead of confirmer?

Ya the minConfirmation logic to resume pending task runs are dependent on transaction receipts since we need the blocknum from them to determine when we can resume. Since we changed the confirmation logic in the Confirmer to use nonces, we moved the receipt fetching/storing to the finalizer. So it made sense to move the minConfirmation logic there too since we only need receipts for resuming runs and finalization now.

@dimriou dimriou added this pull request to the merge queue Nov 19, 2024
Merged via the queue into develop with commit 03115e8 Nov 19, 2024
164 checks passed
@dimriou dimriou deleted the BCI-4097-Update-the-TXM-confirmation-logic-to-use-the-mined-nonce branch November 19, 2024 12:30
cedric-cordenier pushed a commit that referenced this pull request Nov 20, 2024
)

* Updated confirmation logic to use the mined transaction count

* Added changeset

* Fixed linting

* Addressed feedback and fixed linting

* Fixed linting

* Addressed feedback

* Fixed VRF v2 integration tests

* Fixed linting

* Removed misleading log and fixed vrf v2 integration tests

* Updated receipt check for vrf v2 integration tests

* Fixed CCIP integration tests

* Updated tests for new head type

* Fixed linting

* Fixed VRF v2 integration tests

* Fixed pipeline integration tests and added logs to nonce tracker

* Fixed VRF e2e tests

* Fixed VRF e2e test

* Updated find attempts for receipt fetch query

* Addressed feedback

* Cleaned config interfaces and added back methods to txstore interface

* Updated confirmer description and txstore method name

* Changed logic to mark old transactions without receipts as fatal

* Removed chain ID from the SaveFetchedReceipts method

* Fixed tests and interfaces after merge conflict

* Updated finalizer logs

* Fixed linting

* Fixed linting

* Fixed linting

* Pre-allocate slices for linting

* Cleaned up some eventually's in vrf v2 tests

* Added finalized transaction count prom metric

* Fixed vrf v2 integration test

* Fixed vrf integration test

* Updated evm store mocks

* Generated new mock and clean linting

* Added no lint comments

* Fixed ifElseChain linting

* Changed filter timer in vrf smoke test

* Undid timer change and switched to hardcoded 10s filter timeout

* Extended randomness event timeout

* Reverted hardcoded timeouts

* Removed allowed logs

* Increased integration test finality depth

* Updated vrf integration test tomls and reverted the default toml

* Updated finality depth for simulated chain config defaults in integration tests

* Replaced gomega with require for two VRF test helper methods

* Added simualted chain node configs and reverted default toml changes

* Reduced vrf simulated test configs to fallback to defaults

* Updated VRF simulated test config to match smoke defaults

* Added HistoryDepth config to vrf smoke test configs

* Removed VRF test simulates chain configs

* Reverted vrf test changes

* Fixed flakey vrf bhs integration tests

* Updated finalization logic to delete stale receipts if detected

* Addressed feedback

* Updated log
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants