Description
Transaction 1023992588 on testnet is failing deserialization, causing whole batches to be skipped.
The batch-skipping logic was added in #352.
This approach doesn't actually skip batches; it moves the start of the next batch forward by 1 and then tries again.
For the sake of argument, let's say we have a bad transaction at version 3456 and the batch size is consistently 5000.
If we're processing a batch that goes from trx 2000 to trx 6999, the process will fail and restart, this time trying to process trx 2001 to trx 7000. It will then fail again and again until we get past trx 3456, and then processing will resume without errors.
The problem with that, continuing with the example above, is that if I'm interested in a transaction at version 3000, I'm never going to see it, because it will always fall inside a failing batch.
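To make that concrete, here is a small, self-contained simulation of the shift-by-1 retry behaviour using the numbers from the example above (the constants are illustrative, not taken from the indexer code):

```python
# Simulates the current "skip" behaviour: on a deserialisation failure the
# batch start moves forward by 1 and a same-sized window is retried.
BATCH_SIZE = 5000      # illustrative, matches the example above
BAD_VERSION = 3456     # version that always fails to deserialise
WANTED_VERSION = 3000  # version we actually care about

start = 2000
while True:
    batch = range(start, start + BATCH_SIZE)
    if BAD_VERSION in batch:
        start += 1     # the whole batch "fails"; shift by 1 and retry
        continue
    # First batch that would actually succeed; was the wanted version in it?
    print(WANTED_VERSION in batch)  # False: 3000 only ever appeared in failing batches
    break
```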
Ideally, the bad transaction would not cause a failure: it would either be deserialised properly or have its bad fields ignored.
We've had to add a "slow mode" in our code so that if we see a deserialisation failure, we restart the stream asking for 1 transaction only, until we fail again, at which point we know we've processed the actual bad transaction and we can restart the stream in full speed mode.
Repro
Run the python indexer starting a bit before transaction 1023992588.