[Code Health] Improve Claim settlement processing scalability #1013

Open · 5 tasks
red-0ne opened this issue Dec 17, 2024 · 5 comments

red-0ne (Contributor) commented Dec 17, 2024

Objective

Ensure that Claim settlement processing scales well with the increasing number of claims to process

Origin Document

(Screenshots from the origin document omitted.)

Goals

  • Make sure the protocol is able to support and process a large number of sessions when they are all settled at the same block height.
  • Investigate and address any potential bottlenecks resulting from processing a large number of sessions.

Deliverables

A PR that:

  • Queries module parameters outside of the claim settlement loop (a combined sketch of the first two items follows this list), such as:
func (k Keeper) SettlePendingClaims(ctx cosmostypes.Context) {
   ...
+  targetNumRelays := k.serviceKeeper.GetParams(ctx).TargetNumRelays
   ...
   for _, claim := range expiringClaims {
      ...
-     targetNumRelays := k.serviceKeeper.GetParams(ctx).TargetNumRelays
  • Gathers settled services' data outside of the claim settlement loop:
func (k Keeper) SettlePendingClaims(ctx cosmostypes.Context) {
   ...
+  // Query mining difficulty only once for each service
+  serviceIdToRelayMiningDifficultyMap := getServicesRelayMiningDifficulties(claims)
   ...
   for _, claim := range expiringClaims {
      ...
-     relayMiningDifficulty, found := k.serviceKeeper.GetRelayMiningDifficulty(ctx, serviceId)
  • Performs proof validation at proof submission time. Currently, the validation is done at settlement time and involves signature verification.
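
For illustration, the first two items could be combined into something like the sketch below. The helpers getExpiringClaims and getServicesRelayMiningDifficulties, as well as the exact keeper and type names, are assumptions made for this example, not the final poktroll API.

// Sketch only: hoist per-pass queries out of the claim settlement loop.
// Helper and type names are illustrative assumptions, not the final poktroll API.
func (k Keeper) SettlePendingClaims(ctx cosmostypes.Context) error {
   expiringClaims := k.getExpiringClaims(ctx) // assumed helper

   // Read governance params once per settlement pass instead of once per claim.
   targetNumRelays := k.serviceKeeper.GetParams(ctx).TargetNumRelays

   // Query each service's relay mining difficulty exactly once, keyed by service ID.
   serviceIdToRelayMiningDifficultyMap := k.getServicesRelayMiningDifficulties(ctx, expiringClaims)

   for _, claim := range expiringClaims {
      serviceId := claim.SessionHeader.ServiceId
      relayMiningDifficulty := serviceIdToRelayMiningDifficultyMap[serviceId]

      // ... settle the claim using targetNumRelays and relayMiningDifficulty ...
      _ = targetNumRelays
      _ = relayMiningDifficulty
   }
   return nil
}

// getServicesRelayMiningDifficulties collects the relay mining difficulty of every
// service referenced by the expiring claims, querying each service only once.
func (k Keeper) getServicesRelayMiningDifficulties(
   ctx cosmostypes.Context,
   claims []prooftypes.Claim,
) map[string]servicetypes.RelayMiningDifficulty {
   difficulties := make(map[string]servicetypes.RelayMiningDifficulty)
   for _, claim := range claims {
      serviceId := claim.SessionHeader.ServiceId
      if _, seen := difficulties[serviceId]; seen {
         continue
      }
      if difficulty, found := k.serviceKeeper.GetRelayMiningDifficulty(ctx, serviceId); found {
         difficulties[serviceId] = difficulty
      }
   }
   return difficulties
}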

Non-goals / Non-deliverables

  • Rewrite or rethink how the protocol works
  • Implement language-specific optimizations or micro-optimizations

General deliverables

  • Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
  • Testing: Add new tests (unit and/or E2E) to the test suite.

Creator: @red-0ne
Co-Owners: @okdas @bryanchriswhite

red-0ne added the on-chain, code health, scalability, and proof labels Dec 17, 2024
red-0ne added this to the Shannon Beta TestNet Support milestone Dec 17, 2024
red-0ne self-assigned this Dec 17, 2024
red-0ne added this to Shannon Dec 17, 2024
red-0ne moved this from 📋 Backlog to 🔖 Ready in Shannon Dec 17, 2024
Olshansk (Member) commented
@red-0ne A few follow-up questions/comments:

  1. Querying module parameters outside of the claims loop.

Where/when would we be querying them? I don't fully understand the optimization here.

  2. Gathering settled services' data outside of the claim settlement loop.

Same as above.

If I were to pick up and implement this myself, I don't understand the direction and rationale.

  3. Performing proof validation at proof submission. Currently the validation is done at settlement time and involves signature verification.

We DO NOT want to do this for two reasons:

  1. Protocol safety:
    • An attacker would be able to hammer / rainbow attack if this was possible.
    • Independently of the point above, you can't validate whether a proof is correct until after the proof submission window closes, and you can only submit it BEFORE it closes. I would say this doesn't make sense.
  2. Idiomatic software engineering:
    • API calls have to be quick & lightweight.
    • This can make a simple API query require heavy processing and simply moves the processing from validators (which should be ready for heavy processing) to full nodes (which are not set up for this).
    • I would argue we have to add a #PUC next to the proof submission endpoint TO NOT do this.

red-0ne (Contributor, Author) commented Dec 18, 2024

@Olshansk,

Independently of the point above, you can't validate whether a proof is correct until after the proof submission window closes, and you can only submit it BEFORE it closes. I would say this doesn't make sense.

TL;DR: I think there's confusion between Claim submission (which binds the supplier to the SMT root) and Proof submission (which selects which branch to submit).

It is possible to validate proof correctness at submission time, since closest-path generation and validation use the hash at EarliestSupplierProofCommitHeight as a seed to select which branch to prove.

See: x/proof/keeper/proof_validation.go#L235-L250

This height comes before ProofWindowCloseHeight and is known by the supplier at proof submission time.

Note that this seed is unknown at Claim submission, which is what binds the supplier to a claim root.

More broadly, the supplier knows whether the proof it is submitting is valid, and there is no (and should not be any) secret revealed at (or after) the closing of the proof submission window.
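
To make this concrete, here is a rough sketch of what submission-time validation could look like. The helper names (getEarliestSupplierProofCommitHeight, GetBlockHash, verifyClosestProofPath) are assumptions for illustration only; the actual logic lives in x/proof/keeper/proof_validation.go.

// Conceptual sketch only; helper names are illustrative assumptions.
// The proof-path seed is the hash of the block at EarliestSupplierProofCommitHeight,
// which is already known when MsgSubmitProof arrives, so the expected branch can be
// recomputed and checked at submission time instead of at settlement time.
seedHeight := getEarliestSupplierProofCommitHeight(msg.GetSessionHeader()) // known before ProofWindowCloseHeight
seedBlockHash := k.sessionKeeper.GetBlockHash(ctx, seedHeight)             // assumed accessor

// Derive the branch the supplier is required to prove from the seed hash and the session ID,
// then check that the submitted closest merkle proof actually targets that branch.
expectedPath := protocol.GetPathForProof(seedBlockHash, msg.GetSessionHeader().GetSessionId())
if err := verifyClosestProofPath(msg.GetProof(), expectedPath); err != nil { // assumed helper
   return err
}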

An attacker would be able to hammer / rainbow attack if this was possible.

This is why we have to minimize the time a supplier has to submit a proof after the seed hash is known.

API calls have to be quick & lightweight.

Message submission responses are fast and do not wait for the tx to process (only basic verification is performed). The outcome of the transaction is only known if the tx gets included in a block that is committed.
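
For context, a minimal Cosmos SDK illustration of what "fast submission responses" means here, assuming a configured client.Context (clientCtx) and signed txBytes:

// BroadcastTxSync returns as soon as CheckTx's basic validation passes;
// it does not wait for the tx to be executed and committed in a block.
res, err := clientCtx.BroadcastTxSync(txBytes)
if err != nil {
   return err
}
// res.Code only reflects CheckTx. The actual proof-validation/settlement outcome
// is only known once the tx is included in a committed block (e.g. by querying
// the tx hash later).
fmt.Println("accepted into the mempool, txhash:", res.TxHash)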

cc @bryanchriswhite

Olshansk (Member) commented
This is why we have to minimize the time a supplier has to submit a proof after the seed hash is known.

ACK.

However, I don't want this to be a consideration / decision because whoever will manage the governance params 10 years from now will not have this thought process.

We need the submission windows to function independently, safely and securely without these limitations.

Olshansk (Member) commented
TL;DR: I think there's confusion between Claim submission (which binds the supplier to the SMT root) and Proof submission (which selects which branch to submit).

Noted, and I agree with your approach.

I concur that it is safe but am still worried about a potential grinding attack.

Will keep thinking...

Olshansk (Member) commented Dec 20, 2024

Message submission responses are fast and do not wait for the tx to process (only basic verification is performed). The outcome of the transaction is only known if the tx gets included in a block that is committed.

For the purposes of this message, let's remove the grinding attack as a risk vector, because I agree with your statement. I'm disregarding my two messages above.

Can you confirm that this is what you expect the flow to be?

  • If no: Can you provide a better explanation?
  • If yes: the greedy validation endpoint could potentially need a really long timeout, which is my concern for RPC nodes.
flowchart TD
    A[Client submits proof] --> B[RPC Node receives proof submission request]
    B --> C{Is 'greedy_validation' boolean field set?}
    C -->|greedy_validation=false| D[Store proof onchain w/ field 'validated=false']
    D --> E[[Return success]]
    C -->|greedy_validation=true| F[Validate proof]
    F --> G{Is proof valid?}
    G -->|No| H[Don't store on chain]
    H --> I[[Return error]]
    G -->|Yes| J[Store proof onchain w/ field 'validated=true']
    J --> K[[Return success]]
