Added --recursive support to upload_object for s3 file storage #277
base: main
Conversation
Thanks for the PR!
Testing this locally, I see that folder structure is not preserved when using --recursive as it is implemented here.
Using a contrived example:
```
wsmith@linode::linode-cli[mechpaul/master]$ tree tmp/
tmp/
├── a
│   └── file2
├── b
│   └── c
│       ├── file3
│       └── file4
└── file1

3 directories, 4 files
```
I would expect to see a bucket like this:
```
wsmith@linode::linode-cli[mechpaul/master]$ python3 -m linodecli obj ls 277-test-s3cmd
DIR               a/
DIR               b/
2021-12-07 18:19  0  file1
```
What I end up with is this:
```
wsmith@linode::linode-cli[mechpaul/master]$ python3 -m linodecli obj ls 277-test
2021-12-07 18:24  0  file1
2021-12-07 18:24  0  file2
2021-12-07 18:24  0  file3
2021-12-07 18:24  0  file4
```
(the reference bucket above was created using the equivalent s3cmd upload action)
It looks like all that needs to be done to achieve the desired behavior is to keep the paths in addition to the filenames when collecting files, which shouldn't be terribly difficult. If you don't mind, I might tinker with this a little bit and push up a commit that does this.
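To make that concrete, a minimal sketch of collecting files with their paths kept intact (a hypothetical helper for illustration, not the plugin's actual code):

```python
import os

def collect_files(root):
    """Yield (absolute_path, object_key) pairs, keeping paths in the key."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            abs_path = os.path.join(dirpath, name)
            # Keep the path relative to `root`, so tmp/b/c/file3 uploads
            # as "b/c/file3" rather than a flattened "file3".
            yield abs_path, os.path.relpath(abs_path, root).replace(os.sep, "/")
```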
Working as intended. It was coded this way for the following reasons:
To continue with the critique of this: now that I'm running this against larger file sets, only the first 1400 files are uploaded and the rest are skipped. I'm wondering if there is some Linode limitation on the number of files uploaded in one login session.
There's no login session at work here - the CLI uses Object Storage keys to perform operations through this plugin, and these are effectively stateless. There is probably an upper limit to objects in a bucket, but it would be much higher than 1400.

I don't think that preserving folder structure is a break from old behavior; in the example above, I asked s3cmd to recursively upload the same directory, and it preserved the structure.

Supposing all structure was flattened, what would happen if I had two files with the same name in different directories during a recursive upload? Take for example:

```
# set up a new folder, tmp2, with one subdirectory and a file named "pet" in each
wsmith@linode::linode-cli[mechpaul/master]$ mkdir -p tmp2/a
wsmith@linode::linode-cli[mechpaul/master]$ echo "cat" > tmp2/a/pet
wsmith@linode::linode-cli[mechpaul/master]$ echo "dog" > tmp2/pet
# recursively upload tmp2
wsmith@linode::linode-cli[mechpaul/master]$ python3 -m linodecli obj put --recursive tmp2/ 277-test
pet
|####################################################################################################| 100.0%
pet
|####################################################################################################| 100.0%
Done.
# but only one file exists in the bucket - the second one overrode the first
wsmith@linode::linode-cli[mechpaul/master]$ python3 -m linodecli obj ls 277-test
2021-12-07 19:57 4 pet
# let's see which one it is
wsmith@linode::linode-cli[mechpaul/master]$ python3 -m linodecli obj get 277-test pet
|####################################################################################################| 100.0%
Done.
wsmith@linode::linode-cli[mechpaul/master]$ cat pet
dog
```

In the above, I lost my cat. Maybe this explains where some of the files you were attempting to upload went, too?

In object storage, there aren't really "folders": these are emulated client-side using prefixes (i.e., keys delimited on `/`).
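To make the prefix emulation concrete, here is a minimal boto3 sketch (the client setup, bucket, and keys are illustrative placeholders, not the plugin's internals):

```python
import boto3

# Hypothetical client; the endpoint and credentials here are placeholders.
client = boto3.client("s3", endpoint_url="https://us-east-1.linodeobjects.com")

# Every object lives in a flat namespace under its full key.
client.put_object(Bucket="277-test", Key="b/c/file3", Body=b"")

# "Folders" only appear when a listing is delimited on "/".
resp = client.list_objects_v2(Bucket="277-test", Delimiter="/")
for prefix in resp.get("CommonPrefixes", []):
    print("DIR", prefix["Prefix"])  # e.g. "DIR b/"
for obj in resp.get("Contents", []):
    print(obj["Key"])  # top-level keys such as "file1"
```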
Ahhhh, I didn't know those folders were just virtual folders in a flat space! The files I was trying to upload were about 1.2M PNG files - about 14 GB worth. There is no folder structure (all flat), and each file is named {sha256}.png. So no, there's no name clobbering here. You can reject the PR. I'll work on preserving folder structure and submit a new PR with the changes requested. Thanks for sharing your knowledge with me. Linode is my first cloud platform.
Thanks for contributing! I'm fine to leave this open if you want to push changes here. I also don't mind doing it myself if you don't feel like doing it.
Go ahead and leave it open. I'll resubmit by EOWeekend.
Hmmm... dealing with a lot of corner cases to support folders on the source computer and virtual folders in s3.

Pretty sure I have these figured out now. After working out those issues, I'm now trying to figure out another one; the error occurs on `subfolder\00000e5e66a365da9134ab537cf8125f.png`. Going to dig into this error to figure out what's going on. Looks like it's an error from Boto.

I passed in the following:

```
linode obj put d:\maplestory\imageslinode\ --recursive happychinchilla
```

I threw about 70 images in here, half of which are in the folder /subfolder/, which should be preserved in the S3 object key. It's throwing this error on the first image contained within a subfolder.
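One corner case at play here: Windows paths use `\` as a separator, while object keys conventionally use `/`. A minimal sketch of normalizing such a path (a hypothetical illustration, not the plugin's code):

```python
from pathlib import PureWindowsPath

# PureWindowsPath understands "\" separators regardless of the host OS;
# as_posix() rewrites them to "/", the conventional object-key delimiter.
key = PureWindowsPath(r"subfolder\00000e5e66a365da9134ab537cf8125f.png").as_posix()
print(key)  # subfolder/00000e5e66a365da9134ab537cf8125f.png
```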
I took a stab at this in mechpaul#1 (which is a PR against this branch) - let me know if it works for you.
I commented in the above-linked PR, but I can reproduce that SignatureDoesNotMatch error on the master branch; it seems to be a bug with how some characters in file names are treated (in your case, the `\`).
…ters

Related to #277

@mechpaul discovered that putting filenames with some special characters, such as `\` and `$`, using the `obj` plugin returns a SignatureDoesNotMatch error. To reproduce:

```bash
$ echo "test" > 'test$file\name'
$ python3 -m linodecli obj put 'test$file\name' some-bucket
```

This change properly quotes the URLs such that these characters are accepted in keys.
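For reference, this kind of quoting can be done with Python's standard library; a rough sketch of the idea (the actual fix lives in #278 and may differ):

```python
from urllib.parse import quote

key = "test$file\\name"
# Percent-encode everything except "/" so prefix delimiters survive,
# while characters like "$" and "\" are escaped in the request URL.
print(quote(key, safe="/"))  # test%24file%5Cname
```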
I put up a fix for the bug in #278; let me know if it works for you.
This commit adds --recursive and folder support to upload_object for s3 file storage.

If a user supplies a folder but does not supply --recursive, then only the files directly inside that folder are uploaded. If a user supplies a folder along with --recursive, then all files under that folder are uploaded recursively (see the sketch below).

I did not test my code against symbolic/hard links.
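A rough sketch of the behavior described above, assuming a hypothetical helper (not the exact patch):

```python
import os

def gather_uploads(path, recursive=False):
    """Return (local_path, object_key) pairs for a file or folder argument."""
    if os.path.isfile(path):
        return [(path, os.path.basename(path))]
    if not recursive:
        # Folder without --recursive: only files directly inside it.
        return [
            (os.path.join(path, name), name)
            for name in sorted(os.listdir(path))
            if os.path.isfile(os.path.join(path, name))
        ]
    # Folder with --recursive: every file underneath, keeping relative paths.
    uploads = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in sorted(filenames):
            local = os.path.join(dirpath, name)
            uploads.append((local, os.path.relpath(local, path).replace(os.sep, "/")))
    return uploads
```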