generated from discourse/discourse-plugin-skeleton
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
FEATURE: Add AI-powered spam detection for new user posts (#1004)
This introduces a comprehensive spam detection system that uses LLM models to automatically identify and flag potential spam posts. The system is designed to be both powerful and configurable while preventing false positives.

Key Features:

* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post

Technical Implementation:

* New database tables:
  - ai_spam_logs: Stores scan history and results
  - ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
  - Minimum 10-minute delay between rescans
  - Only scans significant edits (>10 char difference)
  - Maximum 3 scans per post
  - 24-hour maximum age for scannable posts
* Admin UI features:
  - Real-time testing capabilities
  - 7-day statistics dashboard
  - Configurable LLM model selection
  - Custom instruction support

Security and Performance:

* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts

---------

Co-authored-by: Keegan George <[email protected]>
Co-authored-by: Martin Brennan <[email protected]>
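
The rate limiting and safeguard rules in the commit message can be read as a single eligibility check. The following is a minimal sketch of that reading, not the plugin's code: `ScanCandidate`, `rescan_allowed?`, and the constants are hypothetical names, and the change in post length is used as a crude proxy for the ">10 char difference" rule, since the actual measure is not shown in this section.

```ruby
# Hypothetical sketch of the rescan safeguards listed in the commit message.
# None of these names come from the plugin; they only illustrate the rules.

MIN_RESCAN_DELAY = 10 * 60     # seconds: minimum 10-minute delay between rescans
MIN_EDIT_DIFFERENCE = 10       # characters: an edit must differ by more than this
MAX_SCANS_PER_POST = 3         # maximum 3 scans per post
MAX_POST_AGE = 24 * 60 * 60    # seconds: 24-hour maximum age for scannable posts

# Simplified stand-in for a post together with its scan history.
ScanCandidate = Struct.new(
  :created_at,
  :scan_count,
  :last_scanned_at,
  :last_scanned_length,
  :current_length,
  keyword_init: true
)

def rescan_allowed?(candidate, now: Time.now)
  # Posts older than 24 hours are never scanned.
  return false if now - candidate.created_at > MAX_POST_AGE
  # A post is scanned at most three times.
  return false if candidate.scan_count >= MAX_SCANS_PER_POST
  # A post that has never been scanned only needs to pass the checks above.
  return true if candidate.last_scanned_at.nil?
  # Rescans must be at least ten minutes apart.
  return false if now - candidate.last_scanned_at < MIN_RESCAN_DELAY

  # Assumption: length change stands in for "significant edit".
  (candidate.current_length - candidate.last_scanned_length).abs > MIN_EDIT_DIFFERENCE
end
```

Ordering the cheap, absolute limits (age, scan count) before the edit comparison keeps the check short-circuiting; whether the plugin orders its checks this way is not stated here.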
1 parent ae80494, commit 47f5da7
Showing 27 changed files with 1,801 additions and 6 deletions.
11 changes: 11 additions & 0 deletions
admin/assets/javascripts/discourse/routes/admin-plugins-show-discourse-ai-spam.js
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
import { service } from "@ember/service"; | ||
import { ajax } from "discourse/lib/ajax"; | ||
import DiscourseRoute from "discourse/routes/discourse"; | ||
|
||
export default class DiscourseAiSpamRoute extends DiscourseRoute { | ||
@service store; | ||
|
||
model() { | ||
return ajax("/admin/plugins/discourse-ai/ai-spam.json"); | ||
} | ||
} |
1 change: 1 addition & 0 deletions
admin/assets/javascripts/discourse/templates/admin-plugins/show/discourse-ai-spam.hbs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<AiSpam @model={{this.model}} /> |
112 changes: 112 additions & 0 deletions
app/controllers/discourse_ai/admin/ai_spam_controller.rb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
# frozen_string_literal: true | ||
|
||
module DiscourseAi | ||
module Admin | ||
class AiSpamController < ::Admin::AdminController | ||
requires_plugin "discourse-ai" | ||
|
||
def show | ||
render json: AiSpamSerializer.new(spam_config, root: false) | ||
end | ||
|
||
def update | ||
updated_params = {} | ||
if allowed_params.key?(:llm_model_id) | ||
llm_model_id = updated_params[:llm_model_id] = allowed_params[:llm_model_id] | ||
if llm_model_id.to_i < 0 && | ||
!SiteSetting.ai_spam_detection_model_allowed_seeded_models_map.include?( | ||
"custom:#{llm_model_id}", | ||
) | ||
return( | ||
render_json_error( | ||
I18n.t("discourse_ai.llm.configuration.invalid_seeded_model"), | ||
status: 422, | ||
) | ||
) | ||
end | ||
end | ||
updated_params[:data] = { | ||
custom_instructions: allowed_params[:custom_instructions], | ||
} if allowed_params.key?(:custom_instructions) | ||
|
||
if updated_params.present? | ||
# not using upsert cause we will not get the correct validation errors | ||
if AiModerationSetting.spam | ||
AiModerationSetting.spam.update!(updated_params) | ||
else | ||
AiModerationSetting.create!(updated_params.merge(setting_type: :spam)) | ||
end | ||
end | ||
|
||
is_enabled = ActiveModel::Type::Boolean.new.cast(allowed_params[:is_enabled]) | ||
|
||
if allowed_params.key?(:is_enabled) | ||
if is_enabled && !AiModerationSetting.spam&.llm_model_id | ||
return( | ||
render_json_error( | ||
I18n.t("discourse_ai.llm.configuration.must_select_model"), | ||
status: 422, | ||
) | ||
) | ||
end | ||
|
||
SiteSetting.ai_spam_detection_enabled = is_enabled | ||
end | ||
|
||
render json: AiSpamSerializer.new(spam_config, root: false) | ||
end | ||
|
||
def test | ||
url = params[:post_url].to_s | ||
post = nil | ||
|
||
if url.match?(/^\d+$/) | ||
post_id = url.to_i | ||
post = Post.find_by(id: post_id) | ||
end | ||
|
||
route = UrlHelper.rails_route_from_url(url) if !post | ||
|
||
if route | ||
if route[:controller] == "topics" | ||
post_number = route[:post_number] || 1 | ||
post = Post.with_deleted.find_by(post_number: post_number, topic_id: route[:topic_id]) | ||
end | ||
end | ||
|
||
raise Discourse::NotFound if !post | ||
|
||
result = | ||
DiscourseAi::AiModeration::SpamScanner.test_post( | ||
post, | ||
custom_instructions: params[:custom_instructions], | ||
llm_id: params[:llm_id], | ||
) | ||
|
||
render json: result | ||
end | ||
|
||
private | ||
|
||
def allowed_params | ||
params.permit(:is_enabled, :llm_model_id, :custom_instructions) | ||
end | ||
|
||
def spam_config | ||
spam_config = { | ||
enabled: SiteSetting.ai_spam_detection_enabled, | ||
settings: AiModerationSetting.spam, | ||
} | ||
|
||
spam_config[:stats] = DiscourseAi::AiModeration::SpamReport.generate(min_date: 1.week.ago) | ||
|
||
if spam_config[:stats].scanned_count > 0 | ||
spam_config[ | ||
:flagging_username | ||
] = DiscourseAi::AiModeration::SpamScanner.flagging_user&.username | ||
end | ||
spam_config | ||
end | ||
end | ||
end | ||
end |
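
The `test` action accepts either a bare post id or a topic URL, defaulting to post number 1 when the URL names no post. A minimal standalone sketch of that input handling follows. It does not use Discourse's `UrlHelper.rails_route_from_url`; `PostReference`, `TOPIC_PATH`, and `parse_post_reference` are hypothetical names, and the URL shapes `/t/<slug>/<topic_id>[/<post_number>]` and `/t/<topic_id>[/<post_number>]` are an assumption about topic routes.

```ruby
require "uri"

# Hypothetical value object: either a direct post id, or a topic id plus post number.
PostReference = Struct.new(:post_id, :topic_id, :post_number, keyword_init: true)

# Assumed topic paths: /t/<slug>/<topic_id>[/<post_number>] or /t/<topic_id>[/<post_number>].
# The negative lookahead keeps an all-digit segment from being treated as the slug.
TOPIC_PATH = %r{\A/t/(?:(?!\d+/)[^/]+/)?(\d+)(?:/(\d+))?/?\z}

def parse_post_reference(input)
  value = input.to_s.strip

  # An all-digit input is treated as a post id.
  return PostReference.new(post_id: value.to_i) if value.match?(/\A\d+\z/)

  path =
    begin
      URI.parse(value).path
    rescue URI::InvalidURIError
      nil
    end

  match = TOPIC_PATH.match(path.to_s)
  return nil unless match

  # Without an explicit post number, fall back to the first post of the topic.
  PostReference.new(topic_id: match[1].to_i, post_number: (match[2] || 1).to_i)
end
```

The sketch anchors its patterns with `\A` and `\z`, which match the start and end of the whole string; in Ruby, `^` and `$` match at line boundaries, so a multi-line input could satisfy a `^\d+$` check without being purely numeric.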
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# frozen_string_literal: true | ||
|
||
module Jobs | ||
class AiSpamScan < ::Jobs::Base | ||
def execute(args) | ||
return if !args[:post_id] | ||
post = Post.find_by(id: args[:post_id]) | ||
return if !post | ||
|
||
DiscourseAi::AiModeration::SpamScanner.perform_scan(post) | ||
end | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# frozen_string_literal: true | ||
class AiModerationSetting < ActiveRecord::Base | ||
belongs_to :llm_model | ||
|
||
validates :llm_model_id, presence: true | ||
validates :setting_type, presence: true | ||
validates :setting_type, uniqueness: true | ||
|
||
def self.spam | ||
find_by(setting_type: :spam) | ||
end | ||
|
||
def custom_instructions | ||
data["custom_instructions"] | ||
end | ||
end | ||
|
||
# == Schema Information | ||
# | ||
# Table name: ai_moderation_settings | ||
# | ||
# id :bigint not null, primary key | ||
# setting_type :enum not null | ||
# data :jsonb | ||
# llm_model_id :bigint not null | ||
# created_at :datetime not null | ||
# updated_at :datetime not null | ||
# | ||
# Indexes | ||
# | ||
# index_ai_moderation_settings_on_setting_type (setting_type) UNIQUE | ||
# |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# frozen_string_literal: true | ||
class AiSpamLog < ActiveRecord::Base | ||
belongs_to :post | ||
belongs_to :llm_model | ||
belongs_to :ai_api_audit_log | ||
belongs_to :reviewable | ||
end | ||
|
||
# == Schema Information | ||
# | ||
# Table name: ai_spam_logs | ||
# | ||
# id :bigint not null, primary key | ||
# post_id :bigint not null | ||
# llm_model_id :bigint not null | ||
# ai_api_audit_log_id :bigint | ||
# reviewable_id :bigint | ||
# is_spam :boolean not null | ||
# payload :string(20000) default(""), not null | ||
# created_at :datetime not null | ||
# updated_at :datetime not null | ||
# | ||
# Indexes | ||
# | ||
# index_ai_spam_logs_on_post_id (post_id) | ||
# |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# frozen_string_literal: true | ||
|
||
class AiSpamSerializer < ApplicationSerializer | ||
attributes :is_enabled, :llm_id, :custom_instructions, :available_llms, :stats, :flagging_username | ||
|
||
def is_enabled | ||
object[:enabled] | ||
end | ||
|
||
def llm_id | ||
settings&.llm_model&.id | ||
end | ||
|
||
def custom_instructions | ||
settings&.custom_instructions | ||
end | ||
|
||
def available_llms | ||
DiscourseAi::Configuration::LlmEnumerator | ||
.values(allowed_seeded_llms: SiteSetting.ai_spam_detection_model_allowed_seeded_models_map) | ||
.map { |hash| { id: hash[:value], name: hash[:name] } } | ||
end | ||
|
||
def flagging_username | ||
object[:flagging_username] | ||
end | ||
|
||
def stats | ||
{ | ||
scanned_count: object[:stats].scanned_count.to_i, | ||
spam_detected: object[:stats].spam_detected.to_i, | ||
false_positives: object[:stats].false_positives.to_i, | ||
false_negatives: object[:stats].false_negatives.to_i, | ||
} | ||
end | ||
|
||
def settings | ||
object[:settings] | ||
end | ||
end |
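
The serializer's `stats` method passes every counter through `to_i`. One plausible reason, offered here as an assumption rather than the authors' stated intent, is that aggregate columns can come back as `nil` or as strings when no rows match. The sketch below shows the effect in plain Ruby; `ReportRow` and `stats_payload` are hypothetical stand-ins for the object returned by `SpamReport.generate` and for the serializer method.

```ruby
# Hypothetical stand-in for the report object; the real one is not shown in this section.
ReportRow = Struct.new(:scanned_count, :spam_detected, :false_positives, :false_negatives)

# Mirrors the shape of the serializer's stats hash, coercing each counter to an Integer.
def stats_payload(row)
  {
    scanned_count: row.scanned_count.to_i,
    spam_detected: row.spam_detected.to_i,
    false_positives: row.false_positives.to_i,
    false_negatives: row.false_negatives.to_i,
  }
end

# nil and numeric strings both become Integers, so the admin UI always receives numbers.
empty_week = stats_payload(ReportRow.new(nil, nil, nil, nil))
busy_week = stats_payload(ReportRow.new("12", 3, 1, nil))
```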