Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor indexer version config #332

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

keyliaran
Copy link
Contributor

@keyliaran keyliaran commented Mar 25, 2024

Resolve issue #219.
Added starting_version_if_nothing_in_db to the worker config. implementation works as follows:

  • If starting_version exists in the config, it will override all existing values (both starting_version_if_nothing_in_db and the database value).
  • If starting_version is not configured, we'll take a look to the database value. If the database has no value and starting_version_if_nothing_in_db is configured, we'll use that value; otherwise, we'll default to 0.

Test plan:

  1. Drop database, so there is no starting version in it, create config with starting_version_if_nothing_in_db, check that indexer will start from this version
  2. Restart indexer, indexer should start from the db version
  3. Add starting_version to config, indexer should start from this version

@larry-aptos larry-aptos requested review from larry-aptos, banool, CapCap, bowenyang007 and just-in-chang and removed request for banool March 25, 2024 19:28
@larry-aptos
Copy link
Collaborator

nice work!

one thing to point out is we're trying to avoid breaking changes introduced to our indexer stack.

  • community needs to change their processor config, otherwise crash due todeny_unknown_fields

is it possible to add a new optional field like overriding_starting_version(default to true) and if it's present and true or not present, we'll keep the existing behavior and if false, do the new logic.

@keyliaran keyliaran force-pushed the indexer_version_cfg_refactor branch from 2ad40b6 to 7de84f4 Compare March 25, 2024 19:57
@keyliaran
Copy link
Contributor Author

@larry-aptos I've implemented support for backward compatibility in the configuration. Please take a look.

@@ -25,8 +25,10 @@ Indexer GRPC parser is to indexer data processor that leverages the indexer grpc
indexer_grpc_http2_ping_timeout_in_secs: 10
number_concurrent_processing_tasks: 10
auth_token: AUTH_TOKEN
starting_version: 0 # optional
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: update the document

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for being late on this.

a question about this: if starting_version is specific, does it also mean that processing start from this version regardless of db version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a new field called "override_starting_version." The logic is straightforward:

  1. If "override_starting_version" exists and "starting_version" is configured, use it regardless of the database version.
  2. If "override_starting_version" doesn't exist, use the database version.

Copy link
Collaborator

@rtso rtso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keyliaran the current implementation is not backwards compatible with how the ecosystem uses starting_version. The current assumption is starting_version will override whatever is it DB. We should keep that assumption so we don't break ecosystem indexers. I propose to add a starting_version_if_nothing_in_db instead of the override.

@keyliaran
Copy link
Contributor Author

@keyliaran the current implementation is not backwards compatible with how the ecosystem uses starting_version. The current assumption is starting_version will override whatever is it DB. We should keep that assumption so we don't break ecosystem indexers. I propose to add a starting_version_if_nothing_in_db instead of the override.

If the indexer is launched with starting_version set in the configuration, it will always resume from that version after each restart due to node crashes or other issues like panics. This behavior isn't ideal because, i guess, nobody wants to keep reindexing the entire blockchain from starting_version setted in config. To address this, I would need to start the indexer from the specified starting_version and then modify the configuration.

To improve this, you're suggesting adding a new field called starting_version_if_nothing_in_db. Here's how it would work: starting_version will override both the database value and the config value of starting_version_if_nothing_in_db. This means that if the database doesn't have a starting version stored, it will use starting_version from the config. So there are no usefull usecases of new field starting_version_if_nothing_in_db, it will be used only if there is nothing in database

@keyliaran
Copy link
Contributor Author

If override_starting_version has a default value of true, the current implementation will not be backwards compatible.

@keyliaran
Copy link
Contributor Author

@rtso Pls take a look to the updated implementation. It shoudln't break backward compatible.

@rtso
Copy link
Collaborator

rtso commented Apr 17, 2024

@keyliaran I think the issue is originally suggesting this approach:

  • If there's no latest version set in the db, it will use starting_version_if_nothing_in_db if set
  • Otherwise it will use starting_version if set
  • Otherwise, it will use the latest version in db

This makes it easier to develop the indexer. When you wipe the db, the config remembers the version the contract was deployed (starting_version_if_nothing_in_db ) and start indexing from that version. If you need to start indexing from specific version, you can set an arbitrary version in starting_version. You won't have to set/unset the flag and starting_version every time.

@rtso
Copy link
Collaborator

rtso commented Apr 17, 2024

starting_version_if_nothing_in_db, it will be used only if there is nothing in database

Yep you're absolutely correct! And if you want it to start from latest version in processor_status table, leave starting_version blank so that it starts from latest version in db.

@keyliaran
Copy link
Contributor Author

@rtso Made an update, pls take a look

0
});

let starting_version = self.starting_version.unwrap_or(starting_version_from_db);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the change! This line is still necessary to override with starting_version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Fixed

@rtso
Copy link
Collaborator

rtso commented Apr 18, 2024

Could you also share your test plan in the description? TY

@@ -27,6 +27,8 @@ pub struct IndexerGrpcProcessorConfig {
pub auth_token: String,
// Version to start indexing from
pub starting_version: Option<u64>,
// Version to start indexing from if nothing in db
pub starting_version_if_nothing_in_db: Option<u64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably want a better name..

also, for these two parameters, it's not clear which one will take precedence. we probably should either use an enum here, or ensure they are mutually exclusive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should avoid using enums to maintain backward compatibility. The current implementation works as follows:

  • If starting_version exists in the config, it will override all existing values (both starting_version_if_nothing_in_db and the database value).
  • If starting_version is not configured, we'll take a look to the database value. If the database has no value and starting_version_if_nothing_in_db is configured, we'll use that value; otherwise, we'll default to 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants