aiohttp-ip-rotator #6
Hi! Great stuff! I already installed the fork and am now going through the class Harry wrote. I am slightly lost, since I am also new to every other module in the project, but I think I'll eventually get it, emphasis on eventually. A usage example for the new class would be much appreciated and would explain what everything does at a surface level. Currently stuck at where…
Hey @ZOV-code, still very much a WIP and nothing publicly available for 2-4 months, I'm afraid!
@Ge0rg3 thank you for the information.
Hi! Is there any update on this? I am trying to run my code asynchronously, but I can't find a way to do it with this approach. Thank you!
Hi @jherrerogb98, I probably won't get around to implementing this for another couple of months at the very least. However, multithreading via threading, or otherwise parallel programming such as multiprocessing, should work fine. I hope this helps 😄
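To illustrate that suggestion, here is a minimal sketch using the standard threading module together with this project's ApiGateway adapter; the target site, thread count, and URL paths are placeholder assumptions, not anything from this repository:

import threading
import requests
from requests_ip_rotator import ApiGateway

# Placeholder target; substitute your own site.
site = "https://example.com"

# Create the rotating gateway and mount it as the transport adapter.
gateway = ApiGateway(site)
gateway.start()
session = requests.Session()
session.mount(site, gateway)

def fetch(path):
    # Each worker thread issues a blocking request through the gateway.
    response = session.get(site + path)
    print(path, response.status_code)

# Run a handful of requests in parallel.
threads = [threading.Thread(target=fetch, args=(f"/{i}",)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()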
Okay, thank you!
Hello! I also needed a similar asynchronous library, so I made an implementation, aiohttp-ip-rotator. You can find it here: https://github.com/D4rkwat3r/aiohttp-ip-rotator
Hi all, closing this issue, as the aiohttp code will not be merged into this project. If you are set on aiohttp, then the aiohttp-ip-rotator lib is probably a good fit. Depending on your use case, aiohttp may be faster than threading requests. However, you can also run async requests with this lib via:

import requests as rq
import concurrent.futures
from requests_ip_rotator import ApiGateway

site = "https://bbc.co.uk"

# Create and start the gateway, sized for plenty of concurrent connections.
gateway = ApiGateway(site)
gateway.pool_connections = 30
gateway.pool_maxsize = 30
gateway.start()

# Mount the gateway as the transport adapter for this site.
session = rq.Session()
session.mount(site, gateway)

with concurrent.futures.ThreadPoolExecutor(max_workers=25) as executor:
    futures_map = {}
    # Trigger 100 requests
    for i in range(100):
        url = site + "/" + str(i)
        future = executor.submit(session.get, url)
        futures_map[future] = url

    # Collect results
    for future in concurrent.futures.as_completed(futures_map):
        # Check for error
        error = future.exception()
        url = futures_map[future]
        if error:
            print(f"Error for {url}: {error}")
            continue
        # Get response
        response = future.result()
        print(f"{url} - {response.status_code}")
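As a side note (following the usual ApiGateway lifecycle from this project's README, so treat it as an assumption about your setup): once the run is finished, the AWS API Gateway endpoints created by start() can be removed so they do not stay provisioned in your account:

# Tear down the API Gateway endpoints created by start().
gateway.shutdown()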
Hey, I am back again :) I was happily parsing the last few days away and have already blown past the free tier, but I wanted to scale my operation up further to make it even faster. I am limited to a maximum of 60 workers in the pool, so I decided to rewrite it from multiprocessing to asynchronous concurrency. This is where I realized that I can't use the Requests module, but your module was made to work only with Requests.
How difficult would it be to rewrite it to make it compatible with aiohttp? Is there any way to make it work with Requests, even though the module is inherently blocking at the socket level? The higher the latency, the more beneficial it would be to move this work from multiprocessing to an asynchronous loop. I could try to work on it, but I am new to Python, so I appreciate any kind of feedback or advice you can give me.
I just found Dask and am looking into whether it could help me keep using Requests.
Another possibility is to rent a server that has enough virtual cores to go beyond 60 workers.
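For reference, a blocking Requests session can still be driven from an asyncio event loop by handing each call to a worker thread; the sketch below uses asyncio.to_thread (Python 3.9+) for that. The target site, path scheme, and concurrency limit are placeholder assumptions rather than anything from this project:

import asyncio
import requests
from requests_ip_rotator import ApiGateway

site = "https://example.com"  # placeholder target

# Set up the rotating gateway exactly as in the synchronous examples.
gateway = ApiGateway(site)
gateway.start()
session = requests.Session()
session.mount(site, gateway)

async def fetch(semaphore, path):
    async with semaphore:
        # Run the blocking requests call in a worker thread so the loop stays free.
        response = await asyncio.to_thread(session.get, site + path)
        return response.status_code

async def main():
    # Cap in-flight requests; real parallelism is still bounded by the default thread pool.
    semaphore = asyncio.Semaphore(60)
    results = await asyncio.gather(*(fetch(semaphore, f"/{i}") for i in range(100)))
    print(results)

asyncio.run(main())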