Rclone: stream files as they are being listed #4146
It does not make sense, as sstable dirs are flat. Moreover, using recursion makes it impossible to use internal rclone ListCB which improves performance and memory allocation.
Nice benchmark! LGTM! As I understand it, all the heavy lifting was done in the scylladb/rclone repository?
👍
Please monitor https://jenkins.scylladb.com/view/scylla-manager/job/manager-master/ SCT tests after merging.
It may even be worth starting the tests on just this branch.
Correct, but we treat it more as a playground, so it would be good to review vendored changes as a part of this PR.
Is this another private patch on top of our forked rclone?
yes
After the discussion with @VAveryanov8 it turns out that this issue/PR does not really make sense (thanks for the effort, you saved us from unnecessary When developing the patch, I was under the impression that we use non-recursive It turned out that we are using recursive Even though this PR "improves" flat dir listing, it's not connected to https://github.com/scylladb/scylla-enterprise/issues/4861, which was a driving force behind that, and we shouldn't risk questionable improvements connected to Anyway, I'm closing this PR, perhaps it will be revisited in the future.
I don't think there was any special reason that led us to choose the recursive approach over the flat one.
This PR changes the rclone version used by SM-agent server.
The goal is to improve listing of huge dirs by streaming the files to SM as the dir is being listed. Rclone lists files in chunks (1000 is the default value for s3), so we can start streaming those files after each chunk has been listed, not only after the whole dir has been listed.
This allows for more reliable timeout handling on SM side, and also decreases memory pressure on agent.
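The difference between the two approaches can be illustrated with a minimal sketch. It is not the actual patch and does not use rclone's API: `listChunks`, `listBuffered`, and `listStreamed` are hypothetical names, and the backend is simulated in memory. It only shows that forwarding each chunk through a callback avoids holding the whole listing at once.

```go
package main

import "fmt"

// chunkSize mirrors the 1000-entry chunk mentioned above for s3.
const chunkSize = 1000

// listChunks is a hypothetical stand-in for a remote backend: it yields
// file names chunk by chunk and stops early if the callback fails.
func listChunks(total int, cb func(chunk []string) error) error {
	for start := 0; start < total; start += chunkSize {
		end := start + chunkSize
		if end > total {
			end = total
		}
		chunk := make([]string, 0, end-start)
		for i := start; i < end; i++ {
			chunk = append(chunk, fmt.Sprintf("file-%d", i))
		}
		if err := cb(chunk); err != nil {
			return err
		}
	}
	return nil
}

// listBuffered collects the whole dir before returning anything,
// so the caller sees the first file only after the last chunk.
func listBuffered(total int) ([]string, error) {
	var all []string
	err := listChunks(total, func(chunk []string) error {
		all = append(all, chunk...)
		return nil
	})
	return all, err
}

// listStreamed hands each chunk to send as soon as it is listed,
// so at most one chunk is held in memory at a time.
func listStreamed(total int, send func(chunk []string) error) error {
	return listChunks(total, send)
}

func main() {
	sends, files := 0, 0
	err := listStreamed(5555, func(chunk []string) error {
		sends++
		files += len(chunk)
		return nil
	})
	fmt.Println(sends, files, err) // prints "6 5555 <nil>"
}
```

With 5555 files and chunks of 1000, the streamed variant calls `send` six times, and the receiver can start processing (or reset its timeout) after the first call instead of waiting for the full listing.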
Since this is a timeout/performance change, I didn't prepare a dedicated test for it, but I created a benchmark (see the last commit for details). It measures some interesting averages when listing a flat dir with 5555 files 1000 times. Note that this has been tested on my local docker setup without any rate limiting, but I still think that the results show what needs to be seen.
The results without the fix:
The results after the fix:
Importantly, this fix reduces the time needed to list the first item from ~146ms to ~26ms. Keep in mind that 5555 files is not a huge number for backed-up sstables, and that the local setup has faster connections than a real one. It also slightly reduces the total time needed for listing, but the values are too similar to say so for sure.
Fixes #4132