nightlies and link validation failing because of repository.apache.org blockage #1585
Comments
I wonder if we could try ordering the resolvers in sbt. I've seen failures loading third-party jars because our sbt setup seems to check repository.apache.org before checking Maven Central. Ideally, repository.apache.org should be checked last.
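A minimal sketch of what such an ordering could look like in a `build.sbt`, assuming a single-project build that only needs Maven Central plus the Apache snapshots repository. The resolver name `apache-snapshots` is illustrative, and whether the dependency resolver in use actually queries repositories strictly in the declared order is an assumption that would need to be verified against the real build:

```scala
// build.sbt (sketch, not the project's actual configuration)
// Replace the full resolver chain so that Maven Central is listed
// before repository.apache.org.
externalResolvers := Seq(
  // Local Ivy repository, as in sbt's default chain
  Resolver.defaultLocal,
  // Maven Central
  Resolver.DefaultMavenRepository,
  // Listed last: only needed for Apache snapshot artifacts.
  // The name "apache-snapshots" is a placeholder chosen for this sketch.
  "apache-snapshots" at "https://repository.apache.org/content/repositories/snapshots/"
)
```

Setting `externalResolvers` rather than `resolvers` is a deliberate choice here: `resolvers` only adds to sbt's default chain, while `externalResolvers` replaces the chain and makes the order explicit. The trade-off is that any resolver added through `resolvers` by the build or by plugins is no longer picked up automatically.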
https://brettporter.wordpress.com/2009/06/16/configuring-maven-http-connections/ suggests you can set a custom user agent header for the requests. We could make use of this, if we come up with a standard format for denoting ASF projects. This would allow us to tailor rules to be more lenient in these cases, and to debug which projects or builds are causing issues.
I agree that would be a good thing to keep an eye on. 'Normal' CI builds shouldn't reference repository.a.o at all, though, right? And even when including
Yes (or arbitrary other headers). Pekko uses
@raboof one source of strain that we put on repository.apache.org is from https://github.com/pjfanning/sbt-pekko-build. This has logic to find the latest snapshot versions by scraping pages served by repository.apache.org.
We haven't seen GitHub Actions runners get blocked by the "too many 404's on repository.apache.org" rule since apache/ranger#435 was merged. I have now (ack'ed by infra) removed all those bans. That should help, but GitHub Actions runners are still being banned for Bugzilla scraping (> 800req/hr to
looks like this might be bingbot, filed https://issues.apache.org/jira/browse/INFRA-26405 to get a robots.txt in place
Our nightlies and link validation sometimes fail when they are run on a GitHub Actions runner that is blocked from repository.apache.org.
Infra seems open to creating per-project buckets for the abuse thresholds, but we'd have to add a header to the requests to identify ourselves.
Looks like this would depend on coursier/coursier#1203