[Feature Request] Method to have no paths when downloading #8

Open
MattSidney opened this issue Aug 3, 2022 · 4 comments

Comments

MattSidney commented Aug 3, 2022

Is there a method to specify no paths when downloading? I noticed there is the --nopaths option for verify, but I am wondering whether the same is possible when downloading.
Ideally I want to remove the top-level folder when downloading an item so that I can specify the final directory for the files.

Otherwise, can this be done with a search method, perhaps one that returns only specific file formats?

@ProximaNova

I haven't tested this, but here is a starting point:

https://github.com/john-corcoran/internetarchive-downloader/blob/main/ia_downloader.py#L410
dest_file_path = os.path.join(os.path.join(output_folder, identifier), ia_file_name)
->
dest_file_path = os.path.join(output_folder, ia_file_name)

https://github.com/john-corcoran/internetarchive-downloader/blob/main/ia_downloader.py#L1106
identifier_output_folder = os.path.join(output_folder, identifier)
if (
    os.path.isdir(identifier_output_folder)
    and len(file_paths_in_folder(identifier_output_folder)) > 0
):
->
if (
    os.path.isdir(output_folder)
    and len(file_paths_in_folder(output_folder)) > 0
):

Question:
There is os.path.join(output_folder, identifier), but would this work: os.path(output_folder)?
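
On the question above: os.path is a module rather than a function, so calling os.path(output_folder) would raise a TypeError. Using output_folder directly should be enough. Below is a minimal sketch of how the destination path could be built either way. The helper build_dest_path and its no_paths parameter are hypothetical names for illustration and are not part of ia_downloader.py.

```python
import os


def build_dest_path(output_folder, identifier, ia_file_name, no_paths=False):
    """Return the local path for a downloaded file.

    Hypothetical helper (not in ia_downloader.py): with no_paths=True the
    item identifier folder is left out of the path.
    """
    if no_paths:
        return os.path.join(output_folder, ia_file_name)
    return os.path.join(output_folder, identifier, ia_file_name)


# os.path.join with a single argument returns that argument unchanged,
# so no wrapper call is needed around output_folder.
assert os.path.join("downloads") == "downloads"

# os.path itself is a module, so it cannot be called like a function.
assert not callable(os.path)
```

Note that dropping the identifier folder means files from different items with the same name would land on the same path, which the original layout avoids.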

@ProximaNova

I commented out destdir=output_folder,:

                            internetarchive.download(
                                identifier,
                                files=[ia_file_name],
#-                                destdir=output_folder,
                                on_the_fly=True,
                            )

I commented out other things too, but the "destdir=output_folder" part seems to be what did it. See this file, a modified version of ia downloader that does what the OP asks (it might get moved or renamed elsewhere in the "ia-to-ipfs" repo):
https://github.com/ProximaNova/ia-to-ipfs/blob/main/ignore/ia_dl_working_errors.py

@ProximaNova

Can confirm that merely commenting out "destdir=output_folder," will do it (the files end up in such a folder), but you get errors. Implementing the code edits from #8 (comment) results in less severe errors/warnings. Todo: fix "ia_dl_working_errors.py" so it doesn't print the non-INFO lines below at the end of stdout:

2023-03-06 22:32:50 - INFO - Download phase complete for item '[ia_item_here]'
2023-03-06 22:32:50 - WARNING - No item folders were found in provided data folder '[ia_item_here]' - make sure the parent download folder was provided rather than the item subfolder (e.g. provide '/downloads/' rather than '/downloads/item/'
2023-03-06 22:32:50 - ERROR - No metadata found in cache - verification cannot be performed
2023-03-06 22:32:50 - WARNING - Script complete; 2 warnings/errors occurred requiring review (see log entries above, replicated in folder 'ia_downloader_logs')
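
A possible way to sidestep these verification messages, offered as an untested sketch rather than a fix to the script: leave the downloader unmodified, so that verification should still find the item subfolder, and move the files up one level afterwards. The function flatten_item_folder is a hypothetical helper and does not exist in ia_downloader.py.

```python
import os
import shutil


def flatten_item_folder(output_folder, identifier):
    """Move everything in output_folder/identifier up into output_folder.

    Hypothetical post-processing helper; returns the list of moved paths.
    Existing entries in output_folder are never overwritten.
    """
    item_folder = os.path.join(output_folder, identifier)
    moved = []
    for name in sorted(os.listdir(item_folder)):
        source = os.path.join(item_folder, name)
        target = os.path.join(output_folder, name)
        if os.path.exists(target):
            raise FileExistsError(f"Refusing to overwrite {target}")
        shutil.move(source, target)
        moved.append(target)
    # The item folder is empty at this point, so rmdir succeeds.
    os.rmdir(item_folder)
    return moved
```

It would be called once per item after the script finishes, for example flatten_item_folder("/downloads", "some_item"), where both arguments are placeholders.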

ProximaNova commented Mar 6, 2023

"I commented out other stuff too" refers to this version:
ProximaNova/ia-to-ipfs@f7baabd

"Todo - fix ..." - I guess this would not be a hard fix, as it could be done by removing the references to a certain parent folder.

Currently "ia_dl_working_errors.py" (ProximaNova/ia-to-ipfs@86621a1) is a replacement that behaves differently; it is not a modification that adds an option. The latter might be the best outcome if I or anyone else wants to make it happen.
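
For the "modification" route, the behaviour could be made opt-in through a command-line flag. The sketch below uses a stand-in argparse parser, not the script's real one. The --nopaths flag for downloading is hypothetical (borrowed from the existing verify option), and the positional identifier and -o/--output arguments are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a stand-in parser; the real script's parser is not reproduced here."""
    parser = argparse.ArgumentParser(description="Sketch of an opt-in flat download")
    parser.add_argument("identifier", help="Internet Archive item identifier")
    parser.add_argument("-o", "--output", default="downloads", help="Output folder")
    # Hypothetical flag, named after the existing --nopaths option of verify.
    parser.add_argument(
        "--nopaths",
        action="store_true",
        help="Write files directly into the output folder, without an item subfolder",
    )
    return parser


# Example invocation with placeholder values.
args = build_parser().parse_args(["some_item", "--output", "data", "--nopaths"])
```

The download and verification code would then branch on args.nopaths instead of always joining the identifier into the path, so the default behaviour stays unchanged.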

@john-corcoran john-corcoran changed the title Method to have no paths when downloading [Feature Request] Method to have no paths when downloading Oct 11, 2023