aiohttp-ip-rotator #6
Hi! Great stuff! I already installed the fork and am now going through the class Harry wrote. I am slightly lost, since I am also new to every other module in the project, but I think I'll eventually get it, emphasis on eventually. A usage example for the new class would be much appreciated and would explain what everything does at a surface level. Currently stuck at where…
Hey @ZOV-code, still very much a WIP and nothing publicly available for 2-4 months, I'm afraid!
@Ge0rg3 thank you for the information.
Hi! Is there any update on this? I am trying to run my code asynchronously, but I can't find a way to do it with this approach. Thank you!
Hi @jherrerogb98, I probably won't get around to implementing this for another couple of months at the very least. However, multithreading via threading, or otherwise parallel programming such as multiprocessing, should work fine. I hope this helps 😄
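To illustrate that suggestion, here is a minimal sketch using the standard threading module together with this project's ApiGateway adapter; the target site, thread count, and URL paths are placeholder assumptions, not anything from this repository:

import threading
import requests
from requests_ip_rotator import ApiGateway

# Placeholder target; substitute your own site.
site = "https://example.com"

# Create the rotating gateway and mount it as the transport adapter.
gateway = ApiGateway(site)
gateway.start()
session = requests.Session()
session.mount(site, gateway)

def fetch(path):
    # Each worker thread issues a blocking request through the gateway.
    response = session.get(site + path)
    print(path, response.status_code)

# Run a handful of requests in parallel.
threads = [threading.Thread(target=fetch, args=(f"/{i}",)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()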
Okay, thank you!
Hello! I also needed a similar asynchronous library, so I made an implementation, aiohttp-ip-rotator. You can find it here: https://github.com/D4rkwat3r/aiohttp-ip-rotator
Hi all, closing this issue, as the aiohttp code will not be merged into this project. If you are set on aiohttp, then the aiohttp-ip-rotator lib is probably a good fit. Depending on your use case, aiohttp may be faster than threading requests. However, you can also run async requests with this lib via:

import requests as rq
import concurrent.futures
from requests_ip_rotator import ApiGateway

site = "https://bbc.co.uk"

# Create and start the gateway, sized for plenty of concurrent connections.
gateway = ApiGateway(site)
gateway.pool_connections = 30
gateway.pool_maxsize = 30
gateway.start()

# Mount the gateway as the transport adapter for this site.
session = rq.Session()
session.mount(site, gateway)

with concurrent.futures.ThreadPoolExecutor(max_workers=25) as executor:
    futures_map = {}
    # Trigger 100 requests
    for i in range(100):
        url = site + "/" + str(i)
        future = executor.submit(session.get, url)
        futures_map[future] = url

    # Collect results
    for future in concurrent.futures.as_completed(futures_map):
        # Check for error
        error = future.exception()
        url = futures_map[future]
        if error:
            print(f"Error for {url}: {error}")
            continue
        # Get response
        response = future.result()
        print(f"{url} - {response.status_code}")
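As a side note (following the usual ApiGateway lifecycle from this project's README, so treat it as an assumption about your setup): once the run is finished, the AWS API Gateway endpoints created by start() can be removed so they do not stay provisioned in your account:

# Tear down the API Gateway endpoints created by start().
gateway.shutdown()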
Hey, I am back again :) I was happily parsing the last few days away and have already blown past the free tier, but I wanted to scale my operation up further to make it even faster. I am limited to a maximum of 60 workers in the pool, so I decided to rewrite it from multiprocessing to asynchronous concurrency. This is where I realized that I can't use the Requests module, but your module was made to work only with Requests.
How difficult would it be to rewrite it to make it compatible with aiohttp? Is there any way to make it work with Requests, even though the module is inherently blocking at the socket level? The higher the latency, the more beneficial it would be to move this work from multiprocessing to an asynchronous loop. I could try to work on it, but I am new to Python, so I appreciate any kind of feedback or advice you can give me.
I just found Dask and am looking into whether it could help me keep using Requests.
Another possibility is to rent a server that has enough virtual cores to go beyond 60 workers.
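For reference, a blocking Requests session can still be driven from an asyncio event loop by handing each call to a worker thread; the sketch below uses asyncio.to_thread (Python 3.9+) for that. The target site, path scheme, and concurrency limit are placeholder assumptions rather than anything from this project:

import asyncio
import requests
from requests_ip_rotator import ApiGateway

site = "https://example.com"  # placeholder target

# Set up the rotating gateway exactly as in the synchronous examples.
gateway = ApiGateway(site)
gateway.start()
session = requests.Session()
session.mount(site, gateway)

async def fetch(semaphore, path):
    async with semaphore:
        # Run the blocking requests call in a worker thread so the loop stays free.
        response = await asyncio.to_thread(session.get, site + path)
        return response.status_code

async def main():
    # Cap in-flight requests; real parallelism is still bounded by the default thread pool.
    semaphore = asyncio.Semaphore(60)
    results = await asyncio.gather(*(fetch(semaphore, f"/{i}") for i in range(100)))
    print(results)

asyncio.run(main())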