I am not sure whether this is intended; please let me know if it is.
When uploading a directory for the first time, the data is added in the correct spot, i.e. `hdfs-path/sub-folder`. However, when adding more data to the same place, it is written to `/hdfs_path/sub-folder/<local_name>/`.
If this is not the intended output, I believe the culprit is line 553 of hdfs/hdfs/client.py (at 5b40065), where `hdfs_path` and `local_name` are joined. I removed `local_name` from the join, and all data then seemed to upload into `hdfs_path` without creating any subfolders.
EDIT: After a bit more debugging, I found that if the path already exists in HDFS, the name of the folder the files come from is appended to it. I need the files to be added to the specified directory, not to the directory plus a subfolder. To remedy this I created a new variable called `use_existing`. When it is `True`, the upload uses the HDFS path alone rather than the HDFS path joined with `local_name`.
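To make the behavior described above concrete, here is a minimal sketch of the target-path decision. It is not the library's code: `resolve_upload_target` is a hypothetical helper, `remote_exists` stands in for a real existence check against the cluster, and `use_existing` is the flag proposed in this issue.

```python
import posixpath


def resolve_upload_target(hdfs_path, local_path, remote_exists, use_existing=False):
    """Return the remote directory an upload of `local_path` would land in.

    Hypothetical helper for illustration only. `remote_exists` replaces a
    real check against HDFS; `use_existing` is the proposed flag.
    """
    # Name of the local directory being uploaded, ignoring a trailing slash.
    local_name = posixpath.basename(local_path.rstrip("/"))
    if remote_exists and not use_existing:
        # Behavior reported in this issue: nest under the existing directory.
        return posixpath.join(hdfs_path, local_name)
    # First upload, or use_existing=True: write directly into hdfs_path.
    return hdfs_path


# First upload: the remote path does not exist yet.
print(resolve_upload_target("/data/sub-folder", "batch1", remote_exists=False))
# Second upload without the flag: nested under the local directory name.
print(resolve_upload_target("/data/sub-folder", "batch2", remote_exists=True))
# Second upload with the proposed flag: same directory as the first upload.
print(resolve_upload_target("/data/sub-folder", "batch2", remote_exists=True,
                            use_existing=True))
```

Defaulting `use_existing` to `False` in this sketch keeps the current behavior unchanged for existing callers.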
Again, let me know if my understanding is off, or if you would like a PR with the added variable.
Thanks for the detailed report. Your understanding is correct. It is implemented this way to be consistent with local commands:
# In an empty directory
$ mkdir src1 src2
$ cp -r src1 dst # Copies src1 as dst
$ cp -r src2 dst # Copies src2 as dst/src2
As you point out, there is a usability gap though. You can achieve what you are trying to do locally by globbing (cp -r src2/* dst) but there is no equivalent here, at least until #105. I think this justifies adding an option; if you send a PR I would be happy to review it.
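For comparison, the two local behaviors can be reproduced in Python with the standard library. This is only a rough local analogy, not anything the HDFS client does: `copy_like_cp` and `copy_contents` are hypothetical helper names, and unlike the shell glob, `copy_contents` also copies hidden files.

```python
import os
import shutil
import tempfile


def copy_like_cp(src, dst):
    """Mimic `cp -r src dst`: nest src under dst if dst already exists."""
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    shutil.copytree(src, dst)
    return dst


def copy_contents(src, dst):
    """Roughly mimic `cp -r src/* dst`: merge the contents of src into dst."""
    shutil.copytree(src, dst, dirs_exist_ok=True)  # requires Python 3.8+
    return dst


def demo():
    """Run both styles in a temporary directory and list what ends up in dst."""
    with tempfile.TemporaryDirectory() as root:
        for name, fname in (("src1", "a.txt"), ("src2", "b.txt")):
            os.makedirs(os.path.join(root, name))
            open(os.path.join(root, name, fname), "w").close()
        dst = os.path.join(root, "dst")
        copy_like_cp(os.path.join(root, "src1"), dst)   # src1 becomes dst
        copy_like_cp(os.path.join(root, "src2"), dst)   # src2 becomes dst/src2
        copy_contents(os.path.join(root, "src2"), dst)  # src2's files land in dst
        found = []
        for dirpath, _, files in os.walk(dst):
            for f in files:
                found.append(os.path.relpath(os.path.join(dirpath, f), dst))
        return sorted(found)


print(demo())
```

The second style is the one the proposed option would make available for uploads.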