Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change how we store SKF for performance gains #256

Closed
wants to merge 1 commit into from

Conversation

michaeldjeffrey
Copy link
Contributor

SKF are stored with the timestamp of the last time they were used to make it more likely to hit a session key for a packet early. An ets ordered_set table was used to achieve this, but it follows total sorting, which means the timestamp needs to be the first element in the table if we want that to take precedence.

Part of removing, or updating, a SKF involved ets:select_delete/2 which requires a full table scan because we only had the SessionKey at the time of deletion. When maybe removes or updates come in from the config service, all of those updates would be spawned into their own process and contend for the SKF table at the same time. For sufficiently large ets tables (+10k) and sufficiently large batches of updates (1k), this would cause all schedulers to have trouble doing anything while tables scans were happening. This pausing would mean the grpc connection could not talk back to the config-service and negotiate a slower rate of updates, causing them to get out of order, and grpcbox would kill the connection. And it would all start again.

The structure of the table has been flattened, so we can tell ets the key position is the SessionKey in the second slot. It will still respect total ordering with the timestamp in the first position, but we can now do deletes in constant time with only the SessionKey. 💃

SKF are stored with the timestamp of the last time they were used to
make it more likely to hit a session key for a packet early. An ets
ordered_set table was used to achieve this, but it follows total
sorting, which means the timestamp needs to be the first element in the
table if we want that to take precedence.

Part of removing, or updating, a SKF involved `ets:select_delete/2`
which requires a full table scan because we only had the SessionKey at
the time of deletion. When maybe removes or updates come in from the
config service, all of those updates would be spawned into their own
process and contend for the SKF table at the same time. For sufficiently
large ets tables (+10k) and sufficiently large batches of updates (1k),
this would cause all schedulers to have trouble doing anything while
tables scans were happening. This pausing would mean the grpc connection
could not talk back to the config-service and negotiate a slower rate of
updates, causing them to get out of order, and grpcbox would kill the
connection. And it would all start again.

The structure of the table has been flattened, so we can tell ets the
key position is the SessionKey in the second slot. It will still
respect total ordering with the timestamp in the first position, but we
can now do deletes in constant time with only the SessionKey. 💃
@michaeldjeffrey
Copy link
Contributor Author

The test data I was using mislead me. It also sorts on keypos. :sigh:

0 -> lager:debug(MD, "inserted SKF");
_ -> lager:debug([{deleted, Deleted} | MD], "updated SKF")
end;
true = do_remove_skf(SKFETS, SKF),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make them return ok for both?

ordered_set,
{read_concurrency, true},
{write_concurrency, true},
{keypos, 2},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idk if this will work as the ordered_set will be on the key. Can you 2x check?

Copy link
Member

@macpie macpie Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one key, one object, ordered in Erlang term order

Maybe it's fine

@@ -572,7 +570,10 @@ do_insert_skf(SKFETS, SKF) ->
%% This is negative to make newest time on top
Now = erlang:system_time(millisecond) * -1,
true = ets:insert(SKFETS, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe create a record for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants