geoip: extract database manager to stand-alone feature #15348

Merged on Nov 6, 2023 (15 commits)
4 changes: 2 additions & 2 deletions config/logstash.yml
@@ -379,7 +379,7 @@
 #xpack.management.elasticsearch.sniffing: false
 #xpack.management.logstash.poll_interval: 5s

-# X-Pack GeoIP plugin
+# X-Pack GeoIP Database Management
 # https://www.elastic.co/guide/en/logstash/current/plugins-filters-geoip.html#plugins-filters-geoip-manage_update
-#xpack.geoip.download.endpoint: "https://geoip.elastic.co/v1/database"
 #xpack.geoip.downloader.enabled: true
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
Member Author

Yes, intentionally:

-#xpack.geoip.download.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"

To align with naming internally and with the similar feature in Elasticsearch's Geoip database manager. The deprecated pathway is included in geoip_database_management/extension.rb.
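
For readers unfamiliar with the pattern, a renamed setting with a deprecated pathway can be sketched roughly as follows. This is a minimal illustration, not the code in `geoip_database_management/extension.rb`: the `resolve_aliases` helper, the `DEPRECATED_ALIASES` table, and the choice to prefer the new name when both are set are all assumptions made for the example.

```ruby
# Hypothetical sketch of resolving a deprecated setting name.
# Only the two setting names come from this PR; everything else is illustrative.
DEPRECATED_ALIASES = {
  "xpack.geoip.download.endpoint" => "xpack.geoip.downloader.endpoint"
}.freeze

# Returns a copy of raw_settings in which deprecated names are mapped to
# their replacements; human-readable notices are appended to `warnings`.
def resolve_aliases(raw_settings, warnings = [])
  resolved = raw_settings.dup
  DEPRECATED_ALIASES.each do |old_name, new_name|
    next unless resolved.key?(old_name)

    old_value = resolved.delete(old_name)
    if resolved.key?(new_name)
      # Assumption: the new name wins when both are present.
      warnings << "both `#{old_name}` and `#{new_name}` are set; ignoring `#{old_name}`"
    else
      warnings << "`#{old_name}` is deprecated; use `#{new_name}` instead"
      resolved[new_name] = old_value
    end
  end
  resolved
end
```

With this sketch, a `logstash.yml` that still uses the old key resolves to the new key and yields one deprecation notice.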

Contributor

we can add poll interval here
#xpack.geoip.downloader.poll.interval: 24h

Contributor

Suggested change
-#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.poll.interval: 24h

Member Author

I didn't add the poll interval intentionally, and would prefer deferring that discussion until later. I found the poll interval very helpful in manual testing, but there are some very sharp edges (like our TimeValue setting not supporting upper/lower bounds on the total value or on granularity) that limit the value of exposing this to users.
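
To make the "sharp edges" concrete, here is a rough sketch of what a bounded interval setting could look like. It is not Logstash's `TimeValue` implementation: the `parse_bounded_interval` helper, its keyword arguments, and the reduced unit table are hypothetical and exist only to show the kind of bounds checking the comment above says is missing.

```ruby
# Hypothetical helper: parse "24h"-style values and enforce lower/upper bounds.
# The unit table is deliberately small; it is an assumption for this sketch.
UNIT_SECONDS = { "s" => 1, "m" => 60, "h" => 3600, "d" => 86_400 }.freeze

def parse_bounded_interval(text, min_seconds:, max_seconds:)
  match = /\A(\d+)([smhd])\z/.match(text.to_s.strip)
  raise ArgumentError, "unparseable interval: #{text.inspect}" unless match

  seconds = match[1].to_i * UNIT_SECONDS.fetch(match[2])
  unless (min_seconds..max_seconds).cover?(seconds)
    raise ArgumentError,
          "interval #{text.inspect} must be between #{min_seconds}s and #{max_seconds}s"
  end
  seconds
end

# Example: accept anything from one hour to one week.
parse_bounded_interval("24h", min_seconds: 3600, max_seconds: 7 * 86_400) # => 86400
```

A value such as `1s` would be rejected with an `ArgumentError` under these example bounds instead of producing a very aggressive polling loop.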

3 changes: 3 additions & 0 deletions docs/index.asciidoc
@@ -142,6 +142,9 @@ include::static/transforming-data.asciidoc[]
 // Deploying & Scaling
 include::static/deploying.asciidoc[]

+// GeoIP Database Management
+include::static/geoip-database-management.asciidoc[]

 // Troubleshooting performance
 include::static/performance-checklist.asciidoc[]

10 changes: 10 additions & 0 deletions docs/static/geoip-database-management.asciidoc
@@ -0,0 +1,10 @@
[[geoip-database-management]]
== Managing GeoIP Databases

Logstash provides GeoIP database management features to make it easier for you to
use plugins that require an up-to-date database to enrich events with geographic data.

- <<logstash-geoip-database-management, Feature Overview>>
- <<configuring-geoip-database-management, Configuration Guide>>

include::geoip-database-management/index.asciidoc[]
68 changes: 68 additions & 0 deletions docs/static/geoip-database-management/configuring.asciidoc
@@ -0,0 +1,68 @@
[role="xpack"]
[[configuring-geoip-database-management]]
=== Configure GeoIP Database Management

To configure
<<logstash-geoip-database-management>>:

. Verify that you are using a license that includes the geoip database management
feature.
+
--
For more information, see https://www.elastic.co/subscriptions and
{kibana-ref}/managing-licenses.html[License management].
--

. Specify
<<geoip-database-management-settings,geoip database management settings>> in the
`logstash.yml` file to tune the configuration as needed.

include::../settings/geoip-database-management-settings.asciidoc[]

[[configuring-geoip-database-management-offline]]
==== Offline and air-gapped environments

If Logstash does not have access to the internet, or if you want to disable the database manager, set the `xpack.geoip.downloader.enabled` value to `false` in `logstash.yml`.
When the database manager is disabled, plugins that require GeoIP lookups must be configured with their own source of GeoIP databases.

===== Using an HTTP proxy

If you can't connect directly to the Elastic GeoIP endpoint, consider setting up an HTTP proxy server.
You can then specify the proxy with the `http_proxy` environment variable.

[source,sh]
----
export http_proxy="http://PROXY_IP:PROXY_PORT"
----

===== Using a custom endpoint

If you work in an air-gapped environment and can't update your databases from the Elastic endpoint,
you can download databases from MaxMind and bootstrap the service yourself.

. Download both `GeoLite2-ASN.mmdb` and `GeoLite2-City.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].

. Copy both database files to a single directory.

. https://www.elastic.co/downloads/elasticsearch[Download {es}].

. From your {es} directory, run:
+
[source,sh]
----
./bin/elasticsearch-geoip -s my/database/dir
----

. Serve the static database files from your directory. For example, you can use
Docker to serve the files from an nginx server:
+
[source,sh]
----
docker run -p 8080:80 -v my/database/dir:/usr/share/nginx/html:ro nginx
----

. Specify the service's endpoint URL in Logstash using the
`xpack.geoip.downloader.endpoint: http://localhost:8080/overview.json` setting in `logstash.yml`.

Logstash gets automatic updates from this service.
19 changes: 19 additions & 0 deletions docs/static/geoip-database-management/index.asciidoc
@@ -0,0 +1,19 @@
[role="xpack"]
[[logstash-geoip-database-management]]
=== GeoIP Database Management

Logstash provides a mechanism for provisioning and maintaining GeoIP databases, which plugins can use to ensure that they have access to an always-up-to-date and EULA-compliant database for geo enrichment.
This mechanism requires internet access or a network route to an Elastic GeoIP database service.

If the database manager is enabled in `logstash.yml` (as it is by default), a plugin may subscribe to a database, triggering a download if a valid database is not already available.
Logstash checks for updates every day.
When an updated database is discovered, it is downloaded in the background and made available to the plugins that rely on it.

The GeoIP databases are separately licensed from MaxMind under the terms of an End User License Agreement, which prohibits a database from being used after an update has been available for more than 30 days.
When Logstash cannot reach the database service for 30 days or more to validate that a managed database is up-to-date, that database is deleted and made unavailable to the plugins that subscribed to it.
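
The expiry rule can be sketched as a small function over the number of days since the service was last reached. This is an illustration only: the `database_status` name is hypothetical, the status strings and the 25-day warning threshold are taken from the database metrics described in this PR, and treating both thresholds as inclusive is an assumption. The `init` status is left out for brevity.

```ruby
# Hypothetical sketch: map days since the last successful check to a status.
# Thresholds (25 and 30 days) follow the documentation; inclusivity is assumed.
def database_status(days_since_last_success)
  if days_since_last_success >= 30
    "expired"        # database is deleted and unavailable to plugins
  elsif days_since_last_success >= 25
    "to_be_expired"  # still usable, but close to the EULA limit
  else
    "up_to_date"
  end
end
```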

NOTE: GeoIP database management is a licensed feature of Logstash, and is only available in the Elastic-licensed complete distribution of Logstash.

include::metrics.asciidoc[]

include::configuring.asciidoc[]
56 changes: 56 additions & 0 deletions docs/static/geoip-database-management/metrics.asciidoc
@@ -0,0 +1,56 @@

[[logstash-geoip-database-management-metrics]]
==== Database Metrics

You can monitor the managed database's status through the <<node-stats-api,Node Stats API>>.

The following request returns a JSON document containing database manager stats,
including:

* database status and freshness
** `geoip_download_manager.database.*.status`
*** `init` : initial CC database status
*** `up_to_date` : using an up-to-date EULA database
*** `to_be_expired` : 25 days have passed without a successful call to the service
*** `expired` : 30 days have passed without a successful call to the service
** `geoip_download_manager.database.*.fail_check_in_days` : number of days Logstash has failed to reach the service since the last success
* information about download successes and failures
** `geoip_download_manager.download_stats.successes` : number of successful checks and downloads
** `geoip_download_manager.download_stats.failures` : number of failed checks or downloads
** `geoip_download_manager.download_stats.status`
*** `updating` : a check and download is in progress
*** `succeeded` : the last download succeeded
*** `failed` : the last download failed

[source,sh]
--------------------------------------------------
curl -XGET 'localhost:9600/_node/stats/geoip_download_manager?pretty'
--------------------------------------------------

Example response:

[source,js]
--------------------------------------------------
{
  "geoip_download_manager" : {
    "database" : {
      "ASN" : {
        "status" : "up_to_date",
        "fail_check_in_days" : 0,
        "last_updated_at" : "2021-06-21T16:06:54+02:00"
      },
      "City" : {
        "status" : "up_to_date",
        "fail_check_in_days" : 0,
        "last_updated_at" : "2021-06-21T16:06:54+02:00"
      }
    },
    "download_stats" : {
      "successes" : 15,
      "failures" : 1,
      "last_checked_at" : "2021-06-21T16:07:03+02:00",
      "status" : "succeeded"
    }
  }
}
--------------------------------------------------
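
A monitoring script might consume a response of this shape along the following lines. This is a minimal sketch: the `databases_not_up_to_date` helper is hypothetical, and the field names are assumed to match the example response above. Fetching the document (for instance with the `curl` request shown earlier) is left to the caller.

```ruby
require "json"

# Hypothetical helper: list databases whose status is anything other than
# "up_to_date", together with their fail_check_in_days counter.
def databases_not_up_to_date(stats_json)
  stats = JSON.parse(stats_json)
  databases = stats.fetch("geoip_download_manager", {}).fetch("database", {})
  databases
    .reject { |_name, info| info["status"] == "up_to_date" }
    .map { |name, info| [name, info["status"], info["fail_check_in_days"]] }
end
```

An empty result means every managed database reported `up_to_date` at the time of the request.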
26 changes: 26 additions & 0 deletions docs/static/settings/geoip-database-management-settings.asciidoc
@@ -0,0 +1,26 @@
[role="xpack"]
[[geoip-database-management-settings]]
==== GeoIP Database Management settings in {ls}
++++
<titleabbrev>GeoIP Database Management Settings</titleabbrev>
++++

You can set the following `xpack.geoip` settings in `logstash.yml` to configure the <<logstash-geoip-database-management, geoip database manager>>.
For more information about configuring Logstash, see <<logstash-settings-file>>.

`xpack.geoip.downloader.enabled`::

(Boolean) If `true`, Logstash automatically downloads and manages updates for GeoIP2 databases from the `xpack.geoip.downloader.endpoint`.
If `false`, Logstash does not manage GeoIP2 databases and plugins that need a GeoIP2 database must be configured to provide their own.

`xpack.geoip.downloader.endpoint`::

(String) Endpoint URL used to download updates for GeoIP2 databases.
For example, `https://mydomain.com/overview.json`.
Defaults to `https://geoip.elastic.co/v1/database`.
Note that Logstash will periodically make a GET request to `${xpack.geoip.downloader.endpoint}?elastic_geoip_service_tos=agree`, expecting the list of metadata about databases typically found in `overview.json`.

`xpack.geoip.downloader.poll.interval`::
(Time Value) How often Logstash checks for GeoIP2 database updates at the `xpack.geoip.downloader.endpoint`.
For example, `6h` to check every six hours.
Defaults to `24h` (24 hours).
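
When testing a custom endpoint, it can help to see the exact URL that will be requested. The sketch below builds it from the `xpack.geoip.downloader.endpoint` value and the `elastic_geoip_service_tos=agree` parameter described above. The `service_request_uri` helper is hypothetical, and preserving any query string already present on the endpoint is an assumption of this example rather than documented Logstash behavior.

```ruby
require "uri"

# Hypothetical helper: append the terms-of-service parameter to the endpoint.
# Any query parameters already on the endpoint are kept (an assumption).
def service_request_uri(endpoint)
  uri = URI.parse(endpoint)
  params = URI.decode_www_form(uri.query.to_s)
  params << ["elastic_geoip_service_tos", "agree"]
  uri.query = URI.encode_www_form(params)
  uri
end

service_request_uri("https://geoip.elastic.co/v1/database").to_s
# => "https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree"
```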
12 changes: 6 additions & 6 deletions logstash-core/lib/logstash/agent.rb
@@ -609,9 +609,9 @@ def update_successful_reload_metrics(action, action_result)

   def initialize_geoip_database_metrics(metric)
     begin
-      relative_path = ::File.join(LogStash::Environment::LOGSTASH_HOME, "x-pack", "lib", "filters", "geoip")
-      require_relative ::File.join(relative_path, "database_manager")
-      require_relative ::File.join(relative_path, "database_metric")
+      relative_path = ::File.join(LogStash::Environment::LOGSTASH_HOME, "x-pack", "lib", "geoip_database_management")
+      require_relative ::File.join(relative_path, "manager")
+      require_relative ::File.join(relative_path, "metric")

       geoip_metric = metric.namespace([:geoip_download_manager]).tap do |n|
         db = n.namespace([:database])
@@ -629,11 +629,11 @@ def initialize_geoip_database_metrics(metric)
         dl.gauge(:status, nil)
       end

-      database_metric = LogStash::Filters::Geoip::DatabaseMetric.new(geoip_metric)
-      database_manager = LogStash::Filters::Geoip::DatabaseManager.instance
+      database_metric = LogStash::GeoipDatabaseManagement::Metric.new(geoip_metric)
+      database_manager = LogStash::GeoipDatabaseManagement::Manager.instance
       database_manager.database_metric = database_metric
     rescue LoadError => e
-      @logger.trace("DatabaseManager is not in classpath")
+      @logger.trace("DatabaseManager is not in classpath", exception: e.message, backtrace: e.backtrace)
     end
   end
 end # class LogStash::Agent
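
The hunk above relies on a common Ruby pattern: attempt to load an optional feature and treat `LoadError` as "feature not installed" rather than as a failure. A stand-alone sketch of that pattern follows; the `try_require` helper and the array used as a log sink are hypothetical and stand in for the agent's logger.

```ruby
# Hypothetical sketch of optional-feature loading.
# Returns true when the feature loads, false when it is not available.
def try_require(feature, log = [])
  require feature
  true
rescue LoadError => e
  # Keep the underlying message so a typo in a path is distinguishable
  # from a feature that is genuinely absent.
  log << "optional feature #{feature.inspect} unavailable: #{e.message}"
  false
end
```

Recording the exception message, as the changed `@logger.trace` call now does, is what makes a wrong path visible when debugging.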
13 changes: 13 additions & 0 deletions logstash-core/lib/logstash/util.rb
@@ -51,6 +51,19 @@ def self.set_thread_plugin(plugin)
     Thread.current[:plugin] = plugin
   end

+  def self.with_logging_thread_context(override_context)
+    java_import org.apache.logging.log4j.ThreadContext

+    backup = ThreadContext.getImmutableContext()
+    ThreadContext.putAll(override_context)

+    yield

+  ensure
+    ThreadContext.removeAll(override_context.keys)
+    ThreadContext.putAll(backup)
+  end

   def self.thread_info(thread)
     # When the `thread` is dead, `Thread#backtrace` returns `nil`; fall back to an empty array.
     backtrace = (thread.backtrace || []).map do |line|
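
The `with_logging_thread_context` helper added in this hunk follows a backup, override, yield, restore shape. The sketch below shows the same shape in plain Ruby so it can run without JRuby or Log4j: it uses `Thread.current` storage in place of Log4j's `ThreadContext`, and the `with_thread_context` name is hypothetical.

```ruby
# Hypothetical plain-Ruby analogue of the pattern in this hunk.
# Overrides thread-local values for the duration of the block, then restores
# whatever was there before, even if the block raises.
def with_thread_context(override_context)
  backup = override_context.keys.map { |key| [key, Thread.current[key]] }.to_h
  override_context.each { |key, value| Thread.current[key] = value }

  yield
ensure
  # Keys that were previously unset are restored to nil.
  (backup || {}).each { |key, value| Thread.current[key] = value }
end
```

Placing the restore step in `ensure` is the important design choice: a failing block cannot leak its overridden context into later log lines on the same thread.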