I've just encountered a project that responds 403 Forbidden to requests with FEDC's User-Agent. Once I realised this was what was happening, I experimented with a few different User-Agent strings for a few minutes, and then my IP address was blocklisted by the web server, which no longer responds to connection attempts from my address.
Searching the project's forum (via a VPN!), I came across this thread where the author refers to:
> automated third-party software in your network that disregards robots.txt or HTTP 403.
I checked the project's robots.txt and there is nothing there that the tools I was using (fedc, wget, and Debian's uscan) were disobeying. But it's true that FEDC does not check robots.txt. It probably should, with a per-domain cache that would ideally persist between runs of the tool.
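
For illustration, here is a minimal sketch of what such a check could look like, assuming a Python implementation: a hypothetical `allowed_by_robots` helper backed by a per-host cache under `~/.cache/fedc/robots` that persists between runs and is refreshed after a TTL. The names, paths, and TTL are all assumptions, not FEDC's actual code.

```python
import time
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser

# Hypothetical layout: FEDC's real code may organise this differently.
CACHE_DIR = Path.home() / ".cache" / "fedc" / "robots"
CACHE_TTL = 24 * 60 * 60  # refetch robots.txt at most once a day per host
USER_AGENT = "fedc"


def _cached_robots_txt(scheme: str, host: str) -> str:
    """Return the robots.txt body for a host, using an on-disk per-host cache."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cache_file = CACHE_DIR / host
    if cache_file.exists() and time.time() - cache_file.stat().st_mtime < CACHE_TTL:
        return cache_file.read_text()
    try:
        req = Request(f"{scheme}://{host}/robots.txt",
                      headers={"User-Agent": USER_AGENT})
        body = urlopen(req, timeout=10).read().decode("utf-8", "replace")
    except OSError:
        body = ""  # unreachable or missing robots.txt: treat as "allow everything"
    cache_file.write_text(body)
    return body


def allowed_by_robots(url: str) -> bool:
    """Check whether fetching `url` is permitted for our User-Agent."""
    parts = urlsplit(url)
    parser = RobotFileParser()
    parser.parse(_cached_robots_txt(parts.scheme, parts.netloc).splitlines())
    return parser.can_fetch(USER_AGENT, url)
```

The cache is keyed by host so a run that checks many URLs on one site only fetches robots.txt once, and because it lives on disk it also covers repeated runs of the tool within the TTL.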