Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to derive correct actions hash #1872

Open
aii23 opened this issue Oct 17, 2024 · 18 comments
Open

Failed to derive correct actions hash #1872

aii23 opened this issue Oct 17, 2024 · 18 comments

Comments

@aii23
Copy link

aii23 commented Oct 17, 2024

This issue was previously identified and fixed in this pull request. However, it still appears to persist.

o1js version: 1.8.0

The error can be reproduced using the following script with any of the provided addresses:

const network = Mina.Network({
  mina: 'https://api.minascan.io/node/devnet/v1/graphql',
  archive: 'https://api.minascan.io/archive/devnet/v1/graphql',
});
Mina.setActiveInstance(network);

const actions = await Mina.activeInstance.fetchActions(
  PublicKey.fromBase58(
    'B62qrHzs8Sn6rJW3Jkd8xvUkPKb4RBC4xsTgKnBd7KvVWnjhkXV7YKJ'
    // 'B62qmASMoUYbRA2TwbB6gDTcdaS9QxEQYghV67i8oMeqA5tsbfvkJ6P'
    // 'B62qio1AyjVw6tBqYCuEJqDz3ej3qVCyJjcQfTTbt3VgE6MZmtruiMJ'
  )
);

The order of actions within a single block remains problematic. We have updated the o1js fetch script to attempt different permutations of actions. This approach successfully identifies the correct order of actions, but for blocks with 7 or more actions, it has to check too many permutations.

Additionally, we have not found a logical way to sort the array of actions, and their order seems random for blocks with a broken sequence. The order of actions does not match the order of events, even though they are dispatched within the same function.

@45930
Copy link
Contributor

45930 commented Oct 17, 2024

Sorry this is happening again. Can you share the order that o1js is returning and the order that works when you try to re-order them?

Last time, the fix was to order them by account update id. Seeing the out-of-order list and the data in it should help us pinpoint the sorting issue.

@aii23
Copy link
Author

aii23 commented Oct 18, 2024

Sure.
'B62qrHzs8Sn6rJW3Jkd8xvUkPKb4RBC4xsTgKnBd7KvVWnjhkXV7YKJ':
Error for state 11553781833537412187998791170094246504179739906426596650283148421062134031747
Failed at permutation: [ '40612', '40613', '40614', '40615' ]
Right order: [ '40613', '40612', '40614', '40615' ]

'B62qmASMoUYbRA2TwbB6gDTcdaS9QxEQYghV67i8oMeqA5tsbfvkJ6P':
This account appears to be unaffected by the bug

'B62qio1AyjVw6tBqYCuEJqDz3ej3qVCyJjcQfTTbt3VgE6MZmtruiMJ'
Error for state 622068088931135211835608923279390386187946637639515815219699650452056633535
Failed at permutation: [ '41263', '41264' ]
Right order: [ '41264', '41263' ]

Error for state 13601174010644725465032541467597965984215487137161264129512131433609455661639
Failed at permutation: [ '41327', '41328' ]
Right order: [ '41328', '41327' ]

Error for state 24001492778780244006185811344983416316405407346861526263853195607046226805225
Failed at permutation: [ '41704', '41705', '41712', '41713', '41714', '41715' ]
Right order: [ '41715', '41713', '41705', '41712', '41704', '41714' ]

Error for state 5355154305568360296397934424492884898082969130147334729446373076991022178134
Failed at permutation: [ '41742', '41743' ]
Right order: [ '41743', '41742' ]

@mitschabaude
Copy link
Collaborator

mitschabaude commented Oct 18, 2024

The order of actions does not match the order of events, even though they are dispatched within the same function.

isn't this just because we re-order actions in o1js after receiving them, but not events?

Last time, the fix was to order them by account update id. Seeing the out-of-order list and the data in it should help us pinpoint the sorting issue.

A thought: the Mina node hashes actions lists in a certain order (after ordering blocks), but always the actions from one account update together. I find it problematic that the archive node API even returns the actions one by one instead of in chunks per account update. But ok.

Then, looking at the code I see this:

actionData = actionData.sort((a1, a2) => {
  return Number(a1.accountUpdateId) < Number(a2.accountUpdateId) ? -1 : 1;
});

It's extremely important that when reordering the actions here, the order within the same account update is maintained. The Array.sort() spec also says that sorting is stable, which means the order would be maintained if you treated two same account update ids as equal. However, since you never return 0 in your sorting function, you might not get stability, and break the order within the same account update?

(Do we have tests with many actions in per account update?)

Another thought: The proper fix in my view is to look at the entire pipeline of actions from Mina node to archive node to archive node graphql API to o1js and make sure that their order stays the same. Intuitively that shouldn't be too hard 😅

@aii23
Copy link
Author

aii23 commented Oct 18, 2024

isn't this just because we re-order actions in o1js after receiving them, but not events?

Seems not, because we had a service that generated a merkle list from events, and it was working fine, until this error showed up. So it worked fine for sometime with reverse actions.

A thought: the Mina node hashes actions lists in a certain order (after ordering blocks), but always the actions from one account update together.

Here there is another problem, as there is only one action in each account update. So it also has some algorithms to sort actions within one block. And it seems, that order is not the order of transaction inclusion.

All actions from different account updates:
Failed at permutation: [ '41704', '41705', '41712', '41713', '41714', '41715' ]
Right order: [ '41715', '41713', '41705', '41712', '41704', '41714' ]

@mitschabaude
Copy link
Collaborator

isn't this just because we re-order actions in o1js after receiving them, but not events?

Seems not, because we had a service that generated a merkle list from events, and it was working fine, until this error showed up. So it worked fine for sometime with reverse actions.

I'm not following why that is contrary to what I wrote

Here there is another problem, as there is only one action in each account update.

Right ok so we run into a different problem here!

@aii23
Copy link
Author

aii23 commented Oct 18, 2024

I'm not following why that is contrary to what I wrote

If actionData.sort was the problem (that actions order and events order do not match) we would have an error from our service way earlier, as it would generate wrong proof using events. But our service caused this error only when we got permuted actions using our quick fix.

@mitschabaude
Copy link
Collaborator

I'm not following why that is contrary to what I wrote

If actionData.sort was the problem (that actions order and events order do not match) we would have an error from our service way earlier, as it would generate wrong proof using events. But our service caused this error only when we got permuted actions using our quick fix.

Got it, so it seems o1js returns actions and events in an order that is consistent with each other, but that order doesn't match the onchain processing order.

From that, my suspicion would be that some part of of the node -> archive node -> graphql pipeline is messing up the order (in the same way for events and actions), and not the o1js step

@aii23
Copy link
Author

aii23 commented Oct 18, 2024

Looks like that. Should I create an issue in other repo too?

@45930
Copy link
Contributor

45930 commented Oct 18, 2024

@aii23

Seems not, because we had a service that generated a merkle list from events, and it was working fine, until this error showed up

Do you have a log of when it stopped working? Like was this working 100% of the time, then all at once it stopped working?

Should I create an issue in other repo too?

I think we can keep the issue here for now unless we dig up a specific error. I am still working on getting input from the node team because they're going to understand the Mina consensus order better than me. We need o1js to match the Mina consensus order exactly, but I don't know what that order is.

@aii23
Copy link
Author

aii23 commented Oct 18, 2024

It's hard to tell what is causing the issue. For example we used 3 contracts for testnet
'B62qrHzs8Sn6rJW3Jkd8xvUkPKb4RBC4xsTgKnBd7KvVWnjhkXV7YKJ' - Round 0
'B62qmASMoUYbRA2TwbB6gDTcdaS9QxEQYghV67i8oMeqA5tsbfvkJ6P' - Round 1
'B62qio1AyjVw6tBqYCuEJqDz3ej3qVCyJjcQfTTbt3VgE6MZmtruiMJ' - Round 2

And round 0 and round 2 have actions with wrong order, but Round 1 does not. So its not like an issue started to arise from some time, but more like a floating error

@deepthiskumar
Copy link

deepthiskumar commented Oct 18, 2024

There are four different orderings that we need to consider when fetching transaction data from the archive DB-

  1. Blocks: blocks need to be ordered as per the canonical chain. For the ones pending ones you’ll need to select based on consensus rules which I believe is handled in the archive node API
  2. Transactions within a block have a sequence. There is a sequence_no column in blocks_zkapp_commands table that gives you the ordering
  3. Account updates within a zkapp transaction have a sequence. The ids are stored as an array dictating the order. zkapp_account_updates_ids column in zkapp_commands table
  4. Events/actions within an account update have a sequence. Same as account updates, ids are stored as array in zke.element_ids column in zkapp_events table
  5. [Edit:] Each of these element ids is referring to an array (zkapp_field_array) of field elements (zkapp_field). An action/event is a array of field elements

It doesn’t seem like the queries in the archive API consider 2, 3, 4, and 5

@45930
Copy link
Contributor

45930 commented Oct 18, 2024

Thank you @deepthiskumar!

With that in mind, I'll open an issue in the archive node API to expose those columns, then we can get to work on it.

@45930
Copy link
Contributor

45930 commented Oct 23, 2024

@aii23 I want to add a progress update on this one. Unfortunately, the issue runs pretty deep into the stack. I began work on the fix to the archive node API today. If I'm lucky, there will be a fixed version of that ready for review this week. But after it is ready for review, there may still be some revisions required, and it will need to be released and deployed by the node operator. It's not the kind of thing that can be quickly resolved.

If you're interested, I would love your input on what test cases should be added to the archive node. Right now, we reduce some actions in the test suite, but one keypair ever emits any actions. I added a test where 3 different keypairs all emit actions on the same block. If you could give me an idea of how your system actually works and how you're generating these actions that end up out of order, I can add those types of cases to the test suite.

Thanks for your patience while we work on it. We will do it right this time so that it's truly fixed.

@aii23
Copy link
Author

aii23 commented Oct 24, 2024

Hi, sorry for the late response.

We use actions for ticket buying. Each time a user calls buyTicket, the cost of the ticket is transferred to our treasury, and an action with the ticket is dispatched. (https://github.com/ZkNoid/L1-lottery/blob/7d1c7e67d46fec5991554778702621970e9493e3/src/PLottery.ts#L180)

It's hard to pinpoint the issue. We haven't identified what is causing it. The problem occurs independently of actions in a block(so we have multiple blocks with 2 and 6 actions with wrong order, but also have a bunch of blocks with 2, 3, 4, 5, 6, 7, 8 blocks with right order). The same account dispatching actions multiple times per block also doesn’t seem to cause an error. However, we didn't encounter this issue during our own testing when the number of actions was relatively low; perhaps we were just lucky. The first time this error occurred was with approximately 60 dispatched actions. We also had a round with the same number of tickets, but no error occurred, so it appears to be random.

@45930
Copy link
Contributor

45930 commented Oct 24, 2024

The issue is two-fold

  1. The account update ID that we sort on is likely to be in the right order, but not guaranteed to be. So we've all been lucky so far in testing, but we are relying on this order, and we shouldn't be because it's unreliable
  2. Within a single action, there can be multiple event elements. These need to be ordered correctly as well and our sorting doesn't account for them. We've also been lucky that these happen to be in order most of the time.

We are working on the fix in the archive node API, but the correct values to sort on are not yet exposed by the API. That's why we need to fix the API first, then update o1js to use the new sorting features. You can check out the archive node API PR here.

@kadirchan
Copy link
Contributor

I got this Failed to derive correct actions hash too when I send 5 different tx at once and tried to createSettlementProof.

@45930
Copy link
Contributor

45930 commented Nov 21, 2024

@deepthiskumar I believe the PR is almost ready for a review and I have my mind around the sort order, but I want to check with you about the formatting for graphQL. Here's a sample action from my response payload:

{
  "accountUpdateId": "22",
  "data": [
    "96585",
    "1",
    "26168",
    "2470435971292875657009308640003844111586128169442618635694692389957473893101",
    "0"
  ],
  "transactionInfo": {
    "sequenceNumber": 2,
    "zkappAccountUpdateIds": [
      22,
      23,
      24
    ],
    "zkappEventElementIds": [
      19
    ],
    "zkappFieldArrayElementIds": [
      107,
      3,
      109,
      110,
      1
    ]
  }
},

So we are now exposing the transaction sequence number. We are continuing to expose the account update id within the transaction, so I believe block -> transaction -> account id are now ok.


That brings us to the final two sorting requirements. The way we expose the zkapp event field arrays on an account update is in the data property. data is a string[], not a string[][]. As you can see from the transactionInfo metadata, this account update only has one zkappEventElement (referring to a zkapp field array), and that field array has five elements. The values of the elements are correctly sorted and displayed in data.

So my question is, what is the circumstance under which a zkapp event will have multiple field arrays? I need to reproduce that case to confirm the current behavior and ensure that the final level of sorting is accurate.

@45930
Copy link
Contributor

45930 commented Nov 22, 2024

@aii23

I wanted to let you know that I have a fix running locally!

o1-labs/Archive-Node-API#107 is basically ready to go, I opened o1-labs/Archive-Node-API#110 to merge first to make the main PR slightly cleaner and easier to review.

Using #1917 as my o1js version, o1-labs/Archive-Node-API#107 as my archive node API version, and a copy of the devnet database, I was able to resolve all 3 of your examples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants