Adding configuration elements to block based on user-agents #20

robin-francois · 2024-09-05T15:40:40Z

With the increase in traffic generated by generative AI bots, and since those bots do not respect robots.txt, user-agent filtering currently seems to be a good approach to prevent services from being overloaded.

This PR adds a new variable to configure a block list of user agents. This list is case sensitive.
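For illustration, a minimal sketch of what a rendered Caddyfile with such a block list might look like. The `@badbots` matcher name and the `abort` handler come from the template diff in this PR; the site address, the bot names, and the fallback handler are hypothetical placeholders, and the actual template may generate something different. As far as Caddy's documentation describes it, multiple values for the same header field in one matcher are OR'ed, and the values are compared case sensitively.

```
example.com {
  # Named matcher: any request whose User-Agent contains one of these substrings.
  # Bot names are hypothetical; "ExampleBot" would not match "examplebot".
  @badbots {
    header User-Agent *ExampleBot*
    header User-Agent *OtherCrawler*
  }

  # Close the connection without sending any response.
  handle @badbots {
    abort
  }

  # Placeholder for the regular site configuration.
  handle {
    respond "OK"
  }
}
```

The wildcards around each name make the match a substring match; without them, the whole User-Agent value would have to equal the configured string.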

tomcbe

Hi @robin-francois

I reviewed the PR for docuteam. I have a few smaller suggestions, but in general it looks good to me.

tomcbe · 2024-09-05T18:48:00Z

templates/Caddyfile.j2

+  }
+
+  handle @badbots {
+    abort


tomcbe · 2024-09-05

Wouldn't it be better to send an HTTP 403 Forbidden status code (https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403) instead of just closing the connection?

robin-francois · 2024-09-09

Well, I would say that if they respected us, we could respect them. But they do not. We want to be sure that they do not come back and do not try again.

tomcbe · 2024-09-12 (edited)

Artefactual sends a 403 status code by default in their Nginx role for AtoM:

https://github.com/artefactual-labs/ansible-nginx/blob/master/defaults/main/main.yml#L70

I still wonder whether sending a 403 is better than sending no answer at all for keeping them away.

robin-francois · 2024-09-11T12:18:17Z

Apart from the response to send, I think all points have been solved.

tomcbe

Apart from the discussion about whether to send a 403 HTTP status code or abort the connection, this PR looks good to me now.

robin-francois · 2024-09-13T09:10:22Z

We can easily reply with a 403 if you prefer. I just need to double-check how to do it.

robin-francois · 2024-09-13T09:48:10Z

@tomcbe I have tested how to reply with a 403 and some simple HTML content. I have made a new commit that returns a 403 instead of closing the TCP connection.
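As a rough sketch only, not the content of the actual commit: replying with a 403 and a small HTML body in a Caddyfile could look like the fragment below. The `@badbots` matcher name is taken from the template diff in this PR; the HTML body and the explicit `Content-Type` header are assumptions made for illustration.

```
handle @badbots {
  # Assumed for illustration: declare the body as HTML explicitly.
  header Content-Type "text/html; charset=utf-8"

  # Send a short HTML body with status 403 instead of using `abort`.
  respond "<html><body><h1>403 Forbidden</h1></body></html>" 403
}
```

Compared with `abort`, this gives well-behaved clients a clear signal that access is denied, at the cost of sending a small response to every blocked request.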

tomcbe · 2024-09-13T14:01:53Z

@robin-francois I have approved your PR now, but as this repository is managed by simplificator, I can't merge it.
I'll create a PR for our provisioning projects to use your fork of the role in the meantime.

@cedricwider @tizpuppi Can one of you review this PR and merge it if you agree with the proposed changes?

cedricwider · 2024-10-21T11:52:16Z

Closing this PR in favor of #22

Robin François added 3 commits September 5, 2024 16:45

Adding Caddyfile code blocks to generate user-agents block list

2cf3d5d

Documenting new variable in README

db40194

Adding converge and Caddyfile.expected examples

32409cb

tomcbe suggested changes Sep 9, 2024

Changes according to Thomas comments

eac7853

tomcbe reviewed Sep 12, 2024

Replying 403 code and basic access forbidden content, instead of closing TCP connection

a06df43

Adapting Caddyfile.excepted to new 403 response

1b14fb7

tomcbe approved these changes Sep 13, 2024

Removing wildcard in the ansible role. Wildcards will have to be added when defining the variables

fba99e3

tomcbe approved these changes Sep 25, 2024

cedricwider approved these changes Oct 21, 2024

cedricwider mentioned this pull request Oct 21, 2024

Robin francois/master: Adding configuration elements to block based on user-agents #22

Merged

cedricwider closed this Oct 21, 2024
