Add `-w`/`--wait` & `--random-wait` options which implement rate limiting #5
IMPORTANT: PR #4 is a prerequisite for this PR and should be merged first.
This implements rate limiting (see Issue #1) by adding:

- `-w`/`--wait` option, which accepts a number of seconds to pause/sleep between subsequent requests
- `--random-wait` option, which causes the `-w`/`--wait` seconds to be randomized by 0.5x-2x
- `WaybackMachineDownloader#wait` method, which implements the functionality using the aforementioned options

Specifically, `WaybackMachineDownloader#wait` is called only when requesting additional pages of results in `#get_all_snapshots_to_consider` and before downloading individual files in `#download_files`. This means that the first request for pages to download is not delayed, but all subsequent requests and actual page downloads are delayed.

Example usage:

./bin/wayback_machine_downloader --to 20120222134837 --wait 300 --random-wait http://www.folklore.org/

This should pause between requests for a random number of seconds from 150 (2.5 minutes) to 600 (10 minutes), since it's using both `--wait 300` (5 minutes) and `--random-wait`.
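
For reviewers who want to sanity-check the 150 to 600 second range, here is a minimal sketch of how such a delay could be computed. It is not the code from this PR: the helper names `wait_duration` and `rate_limit_pause` are hypothetical, and drawing the multiplier uniformly from 0.5 to 2.0 is an assumption, since the description only says the wait is randomized by 0.5x-2x.

```ruby
# Hypothetical helper (not the PR's actual code): returns how long to pause,
# in seconds, for a given base wait.
def wait_duration(wait_seconds, random_wait: false, rng: Random.new)
  # No wait configured (or a non-positive one) means no delay at all.
  return 0.0 if wait_seconds.nil? || wait_seconds <= 0
  # Without randomization, pause for exactly the configured time.
  return wait_seconds.to_f unless random_wait

  # Assumption: the multiplier is drawn uniformly from 0.5..2.0.
  wait_seconds * rng.rand(0.5..2.0)
end

# Hypothetical wrapper that sleeps for the computed duration and returns it,
# so a caller can log how long it paused.
def rate_limit_pause(wait_seconds, random_wait: false)
  duration = wait_duration(wait_seconds, random_wait: random_wait)
  sleep(duration) if duration > 0
  duration
end
```

With a base wait of 300 seconds, `wait_duration(300, random_wait: true)` stays within 300 * 0.5 = 150 and 300 * 2.0 = 600, which matches the range given for the example command.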