A fluentd plugin to effectively filter black-listed keywords

fluent-plugin-filter-list

Want to efficiently filter fluentd messages that contain black-listed words? Use the fluent-plugin-filter-list plugin. It checks messages against a list of words you provide. You can either discard matching messages outright or process them in a different flow by retagging them.

Installation

Add this line to your application's Gemfile:

gem 'fluent-plugin-filter-list'

And then execute:

$ bundle

Or install it yourself as:

$ gem install fluent-plugin-filter-list

Usage

This repository contains two plugins, a filter plugin and an output plugin, one for each of the two main use cases.

Filter plugin

Use the filter_list filter. Configure fluentd as follows.

ACMatcher

ACMatcher is a matcher that uses the Aho–Corasick algorithm for fast multiple-pattern matching.
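To illustrate what Aho–Corasick matching does, here is a minimal Ruby sketch. It is not the plugin's actual implementation: the `ACSketch` and `Node` names are hypothetical, and the sketch only answers whether any pattern occurs in a text. It builds a trie of the patterns, adds failure links in breadth-first order, and then scans the text once.

```ruby
# Illustrative sketch only; ACSketch and Node are hypothetical names,
# not classes provided by fluent-plugin-filter-list.
class ACSketch
  class Node
    attr_accessor :children, :fail, :terminal

    def initialize
      @children = {}   # character => child node
      @fail = nil      # failure link, set in build_failure_links
      @terminal = false
    end
  end

  def initialize(patterns)
    @root = Node.new
    patterns.each { |pattern| insert(pattern) unless pattern.empty? }
    build_failure_links
  end

  # Returns true if any pattern occurs as a substring of text.
  def match?(text)
    node = @root
    text.each_char do |ch|
      # Follow failure links until a transition on ch exists or the root is reached.
      node = node.fail while !node.equal?(@root) && !node.children.key?(ch)
      node = node.children.fetch(ch, @root)
      return true if node.terminal
    end
    false
  end

  private

  def insert(pattern)
    node = @root
    pattern.each_char { |ch| node = (node.children[ch] ||= Node.new) }
    node.terminal = true
  end

  def build_failure_links
    queue = []
    @root.children.each_value do |child|
      child.fail = @root
      queue << child
    end
    until queue.empty?
      current = queue.shift
      current.children.each do |ch, child|
        link = current.fail
        link = link.fail while !link.equal?(@root) && !link.children.key?(ch)
        child.fail = link.children.fetch(ch, @root)
        # A node is also terminal if a shorter pattern ends at its failure target.
        child.terminal ||= child.fail.terminal
        queue << child
      end
    end
  end
end

matcher = ACSketch.new(%w[foo bar buzz])
puts matcher.match?("halbart") # prints "true"
puts matcher.match?("halbut")  # prints "false"
```

The text is scanned in a single pass regardless of how many patterns are loaded, which is why this approach suits large word lists.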

<filter pattern>
  @type filter_list

  filter AC
  key_to_filter x
  pattern_file_paths ["blacklist_1.txt", "blacklist_2.txt"]
  filter_empty true
</filter>

Suppose the pattern files together contain the following entries, one per line.

foo
bar
buzz

The following message is discarded because its x field contains the character sequence bar, which is in the list.

{
  "x": "halbart",
  "y": 1
}

The following message, by contrast, passes through, because the field specified in the config is x, not y.

{
  "x": 1,
  "y": "halbart"
}

Additionally, the following message is discarded because filter_empty is true. A value is considered empty when it is empty after trimming.

{
  "x": "   ",
  "y": "halbart"
}
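The three cases above can be summarized in a small Ruby sketch. The `discard?` helper is hypothetical and not part of the plugin's API; it uses a naive substring scan rather than Aho–Corasick, and letting non-string values pass through is an assumption based on the second example.

```ruby
# Hypothetical helper, not part of the plugin's API.
# Returns true when a record would be discarded under blacklist rules.
def discard?(record, key:, patterns:, filter_empty: false)
  value = record[key]
  # Assumption: non-string values (like the integer in the second example) pass through.
  return false unless value.is_a?(String)
  # With filter_empty enabled, a value that is blank after trimming is discarded.
  return true if filter_empty && value.strip.empty?
  # Naive substring scan, standing in for the plugin's Aho-Corasick matcher.
  patterns.any? { |pattern| value.include?(pattern) }
end

patterns = %w[foo bar buzz]
puts discard?({ "x" => "halbart", "y" => 1 }, key: "x", patterns: patterns, filter_empty: true)     # prints "true"
puts discard?({ "x" => 1, "y" => "halbart" }, key: "x", patterns: patterns, filter_empty: true)     # prints "false"
puts discard?({ "x" => "   ", "y" => "halbart" }, key: "x", patterns: patterns, filter_empty: true) # prints "true"
```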

All of these examples use blacklisting: a record is filtered out when its text matches a pattern. The plugin also provides the opposite mode, whitelisting, which filters out records that don't match any pattern. To enable whitelisting, set the action parameter (whose default value is blacklist) explicitly as follows.

<filter>
  @type filter_list

  filter AC
  key_to_filter foo
  pattern_file_paths blacklist.txt
  action whitelist
</filter>
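The difference between the two actions comes down to inverting the match result. A minimal Ruby sketch, assuming a hypothetical `filtered?` helper and a naive substring scan in place of the plugin's matcher:

```ruby
# Hypothetical helper, not part of the plugin's API.
# Returns true when the text would be filtered out under the given action.
def filtered?(text, patterns, action: :blacklist)
  matched = patterns.any? { |pattern| text.include?(pattern) }
  case action
  when :blacklist then matched   # filter out records that match a pattern
  when :whitelist then !matched  # filter out records that match no pattern
  else raise ArgumentError, "unknown action: #{action}"
  end
end

patterns = %w[foo bar buzz]
puts filtered?("halbart", patterns, action: :blacklist) # prints "true"
puts filtered?("halbart", patterns, action: :whitelist) # prints "false"
puts filtered?("hello", patterns, action: :whitelist)   # prints "true"
```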

IPMatcher

<filter pattern>
  @type filter_list

  filter IP
  key_to_filter ip
  pattern_file_paths blacklist.txt
</filter>

Suppose blacklist.txt contains the following entries.

192.168.1.0/24
127.0.0.1/24
255.255.0.0

The following message is discarded because its ip field exactly matches an IP address in the list.

{
  "ip": "255.255.0.0",
  "y": 1
}

The following message is also discarded because its ip field falls within a CIDR range in the list.

{
  "ip": "192.168.1.255",
  "y": 1
}

The following message, by contrast, passes through because its ip field matches no entry in the list.

{
  "ip": "192.168.2.0",
  "y": 1
}
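The exact and CIDR cases above can be reproduced with Ruby's standard `ipaddr` library. This is a sketch of the matching rule only; the `ip_listed?` helper is hypothetical, and the plugin's IPMatcher may be implemented differently.

```ruby
require "ipaddr"

# Hypothetical helper, not part of the plugin's API.
# Returns true when ip equals a listed address or falls inside a listed CIDR range.
def ip_listed?(ip, entries)
  address = IPAddr.new(ip)
  # An entry without a prefix length is treated as a single host.
  entries.any? { |entry| IPAddr.new(entry).include?(address) }
end

entries = ["192.168.1.0/24", "127.0.0.1/24", "255.255.0.0"]
puts ip_listed?("255.255.0.0", entries)   # prints "true"  (exact IP)
puts ip_listed?("192.168.1.255", entries) # prints "true"  (inside 192.168.1.0/24)
puts ip_listed?("192.168.2.0", entries)   # prints "false"
```

Note that `IPAddr.new` raises an error for strings that are not valid addresses, so a real matcher would need to decide how to treat malformed field values.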

Output plugin

The other use case is to filter messages in the same way, but process the filtered messages under a different tag. You need to tell the plugin how to retag both non-filtered and filtered messages.

Two mutually exclusive parameters are provided: tag and add_prefix. The tag parameter retags the message with exactly the value you provide. The add_prefix parameter retags the message with the original tag prefixed by the value you provide. So if the original message has the tag foo and you set add_prefix to filtered, the processed message has the tag filtered.foo (note that a period is inserted between the prefix and the original tag).

<match pattern>
  @type filter_list

  key_to_filter field_name_you_want_to_filter
  pattern_file_paths ["file_including_patterns_separated_by_new_line"]
  filter_empty true
  action blacklist

  <retag>
    add_prefix x # retag non-filtered messages as "x.your_tag"
  </retag>
  <retag_filtered>
    tag y # simply retag filtered (matched) messages with "y"
  </retag_filtered>
</match>
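The retagging rules can be sketched in a few lines of Ruby. The `retag` helper is hypothetical and only mirrors the tag and add_prefix semantics described above; returning the original tag when neither parameter is set is an assumption, not documented plugin behavior.

```ruby
# Hypothetical helper, not part of the plugin's API.
def retag(original_tag, tag: nil, add_prefix: nil)
  raise ArgumentError, "tag and add_prefix are mutually exclusive" if tag && add_prefix
  # tag replaces the original tag entirely.
  return tag if tag
  # add_prefix prepends the prefix and a period to the original tag.
  return "#{add_prefix}.#{original_tag}" if add_prefix
  # Assumption: with neither parameter set, the tag is left unchanged.
  original_tag
end

puts retag("foo", add_prefix: "filtered") # prints "filtered.foo"
puts retag("foo", tag: "y")               # prints "y"
```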

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yanana/fluent-plugin-filter-list. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.
