geoip: extract database manager to stand-alone feature #15348

Merged: 15 commits, Nov 6, 2023
Changes from 1 commit
3 changes: 2 additions & 1 deletion config/jvm.options
@@ -46,7 +46,8 @@
#-Djna.nosys=true

# Turn on JRuby invokedynamic
--Djruby.compile.invokedynamic=true
+# TEMPORARILY DISABLED FOR DEV WHILE 9.4.3.0 is broken
+-Djruby.compile.invokedynamic=false
yaauie marked this conversation as resolved.

## heap dumps

4 changes: 2 additions & 2 deletions config/logstash.yml
@@ -379,7 +379,7 @@
#xpack.management.elasticsearch.sniffing: false
#xpack.management.logstash.poll_interval: 5s

-# X-Pack GeoIP plugin
+# X-Pack GeoIP Database Management
# https://www.elastic.co/guide/en/logstash/current/plugins-filters-geoip.html#plugins-filters-geoip-manage_update
-#xpack.geoip.download.endpoint: "https://geoip.elastic.co/v1/database"
#xpack.geoip.downloader.enabled: true
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
Member Author

Yes, intentionally:

-#xpack.geoip.download.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"

This aligns the setting with our internal naming and with the similar feature in Elasticsearch's GeoIP database manager. The deprecated pathway is included in `geoip_database_management/extension.rb`.

Contributor

We can add the poll interval here:
#xpack.geoip.downloader.poll.interval: 24h

Contributor

Suggested change
-#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.endpoint: "https://geoip.elastic.co/v1/database"
+#xpack.geoip.downloader.poll.interval: 24h

Member Author

I intentionally left out the poll interval, and would prefer to defer that discussion until later. I found the poll interval very helpful in manual testing, but there are some very sharp edges (like our TimeValue setting not supporting upper/lower bounds on the total value or on granularity) that limit the value of exposing this to users.

3 changes: 3 additions & 0 deletions docs/index.asciidoc
@@ -112,6 +112,9 @@ include::static/config-management.asciidoc[]

include::static/management/configuring-centralized-pipelines.asciidoc[]

+// Geoip Database Manager
+include::static/geoip/database-manager.asciidoc[]
yaauie marked this conversation as resolved.

// Working with Logstash Modules
include::static/modules.asciidoc[]

119 changes: 119 additions & 0 deletions docs/static/geoip/database-manager.asciidoc
@@ -0,0 +1,119 @@
[[logstash-geoip-database-management]]
=== GeoIP database management

Logstash provides a mechanism for provisioning and maintaining GeoIP databases, which plugins can use to ensure that they have access to an always-up-to-date and EULA-compliant database for geo enrichment.
This mechanism requires internet access or a network route to an Elastic GeoIP database service.

If the database manager is enabled in `logstash.yml` (as it is by default), a plugin may subscribe to a database, triggering a download if a valid database is not already available.
Logstash checks for updates every day.
When an updated database is discovered, it is downloaded in the background and made available to the plugins that rely on it.

The GeoIP databases are separately-licensed from MaxMind under the terms of an End User License Agreement, which prohibits a database from being used after an update has been available for more than 30 days.
When Logstash cannot reach the database service for 30 days or more to validate that a managed database is up-to-date, that database is deleted and made unavailable to the plugins that subscribed to it.
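
The expiry rule above can be sketched in a few lines of Ruby. This is only an illustration, not the manager's actual implementation: `eula_status` and the two constants are hypothetical names, the 25-day and 30-day thresholds are taken from the status list in the Database Metrics section of this file, and treating both thresholds as inclusive is an assumption.

```ruby
# Hypothetical helper for illustration only; not part of Logstash.
# Thresholds come from the status descriptions in this document;
# treating them as inclusive (>=) is an assumption.
TO_BE_EXPIRED_AFTER_DAYS = 25
EXPIRED_AFTER_DAYS = 30

# Maps the number of days since the last successful check-in
# to one of the documented status names.
def eula_status(fail_check_in_days)
  raise ArgumentError, "days must be non-negative" if fail_check_in_days.negative?
  return "expired" if fail_check_in_days >= EXPIRED_AFTER_DAYS
  return "to_be_expired" if fail_check_in_days >= TO_BE_EXPIRED_AFTER_DAYS

  "up_to_date"
end
```

Under these assumptions, a database that has gone 26 days without a successful check would report `to_be_expired`, leaving a few days to restore connectivity before it is removed.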

NOTE: GeoIP database management is a licensed feature of Logstash, and is only available in the Elastic-licensed complete distribution of Logstash.

[[logstash-geoip-database-management-offline]]
=== Offline and air-gapped environments

If Logstash does not have access to the internet, or if you want to disable the database manager, set the `xpack.geoip.downloader.enabled` value to `false` in `logstash.yml`.
When the database manager is disabled, plugins that require GeoIP lookups must be configured with their own source of GeoIP databases.

==== Using an HTTP proxy

If you can't connect directly to the Elastic GeoIP endpoint, consider setting up an HTTP proxy server.
You can then specify the proxy with the `http_proxy` environment variable.

[source,sh]
----
export http_proxy="http://PROXY_IP:PROXY_PORT"
----

==== Using a custom endpoint

If you work in an air-gapped environment and can't update your databases from the Elastic endpoint,
you can download databases from MaxMind and bootstrap the service.

. Download both `GeoLite2-ASN.mmdb` and `GeoLite2-City.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].

. Copy both database files to a single directory.

. https://www.elastic.co/downloads/elasticsearch[Download {es}].

. From your {es} directory, run:
+
[source,sh]
----
./bin/elasticsearch-geoip -s my/database/dir
----

. Serve the static database files from your directory. For example, you can use
Docker to serve the files from an nginx server:
+
[source,sh]
----
docker run -p 8080:80 -v my/database/dir:/usr/share/nginx/html:ro nginx
----

. Specify the service's endpoint URL in Logstash using the
`xpack.geoip.downloader.endpoint: http://localhost:8080/overview.json` setting in `logstash.yml`.

Logstash gets automatic updates from this service.

=== Database Metrics

You can monitor the managed database's status through the {logstash-ref}/node-stats-api.html#node-stats-api[Node Stats API].

The following request returns a JSON document containing database manager stats,
including:

* database status and freshness
** `geoip_download_manager.database.*.status`
*** `init` : initial CC database status
*** `up_to_date` : using an up-to-date EULA database
*** `to_be_expired` : 25 days without reaching the service
*** `expired` : 30 days without reaching the service
** `fail_check_in_days` : number of days that Logstash has failed to reach the service since the last success
* info about download successes and failures
** `geoip_download_manager.download_stats.successes` : number of successful checks and downloads
** `geoip_download_manager.download_stats.failures` : number of failed checks or downloads
** `geoip_download_manager.download_stats.status`
*** `updating` : a check and download is in progress
*** `succeeded` : last download succeeded
*** `failed` : last download failed

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9600/_node/stats/geoip_download_manager?pretty'
--------------------------------------------------

Example response:

[source,js]
--------------------------------------------------
{
  "geoip_download_manager" : {
    "database" : {
      "ASN" : {
        "status" : "up_to_date",
        "fail_check_in_days" : 0,
        "last_updated_at": "2021-06-21T16:06:54+02:00"
      },
      "City" : {
        "status" : "up_to_date",
        "fail_check_in_days" : 0,
        "last_updated_at": "2021-06-21T16:06:54+02:00"
      }
    },
    "download_stats" : {
      "successes" : 15,
      "failures" : 1,
      "last_checked_at" : "2021-06-21T16:07:03+02:00",
      "status" : "succeeded"
    }
  }
}
--------------------------------------------------
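
A response shaped like the one above could be checked from a monitoring script. The following Ruby sketch is an illustration only: `at_risk_databases` is a hypothetical helper, and the sample payload is a trimmed, altered copy of the example response (the `City` status is changed to `to_be_expired`) so that the filter has something to report.

```ruby
require "json"

# Status values that indicate a database is close to, or past, expiry,
# per the status list in this document.
AT_RISK_STATUSES = %w[to_be_expired expired].freeze

# Hypothetical helper: returns the names of databases whose status is at risk.
# `stats_json` is expected to be the body returned by
# GET /_node/stats/geoip_download_manager.
def at_risk_databases(stats_json)
  databases = JSON.parse(stats_json).dig("geoip_download_manager", "database") || {}
  databases.select { |_name, info| AT_RISK_STATUSES.include?(info["status"]) }.keys
end

# Sample payload for illustration; values are made up.
sample = <<~PAYLOAD
  {
    "geoip_download_manager": {
      "database": {
        "ASN":  { "status": "up_to_date",    "fail_check_in_days": 0 },
        "City": { "status": "to_be_expired", "fail_check_in_days": 26 }
      }
    }
  }
PAYLOAD

puts at_risk_databases(sample).inspect
```

The helper returns an empty array when the `geoip_download_manager` section is absent, which is the case this sketch assumes for distributions without the licensed feature.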

include::{log-repo-dir}/static/settings/geoip-database-manager-settings.asciidoc[]
26 changes: 26 additions & 0 deletions docs/static/settings/geoip-database-manager-settings.asciidoc
@@ -0,0 +1,26 @@
[role="xpack"]
[[geoip-database-manager-settings]]
==== Geoip database manager settings in {ls}
++++
<titleabbrev>Geoip Database Manager Settings</titleabbrev>
++++

You can set the following `xpack.geoip` settings in `logstash.yml` to configure the <<logstash-geoip-database-management, GeoIP database manager>>.
For more information about configuring Logstash, see <<logstash-settings-file>>.

`xpack.geoip.downloader.enabled`::

(Boolean) If `true`, Logstash automatically downloads and manages updates for GeoIP2 databases from the `xpack.geoip.downloader.endpoint`.
If `false`, Logstash does not manage GeoIP2 databases, and plugins that need a GeoIP2 database must be configured to provide their own.

`xpack.geoip.downloader.endpoint`::

(String) Endpoint URL used to download updates for GeoIP2 databases.
For example, `https://mydomain.com/overview.json`.
Defaults to `https://geoip.elastic.co/v1/database`.
Note that Logstash will periodically make a GET request to `${xpack.geoip.downloader.endpoint}?elastic_geoip_service_tos=agree`, expecting the list of metadata about databases typically found in `overview.json`.

`xpack.geoip.downloader.poll.interval`::
(Time Value) How often Logstash checks for GeoIP2 database updates at the `xpack.geoip.downloader.endpoint`.
For example, `6h` to check every six hours.
Defaults to `24h` (24 hours).
12 changes: 6 additions & 6 deletions logstash-core/lib/logstash/agent.rb
@@ -609,9 +609,9 @@ def update_successful_reload_metrics(action, action_result)

   def initialize_geoip_database_metrics(metric)
     begin
-      relative_path = ::File.join(LogStash::Environment::LOGSTASH_HOME, "x-pack", "lib", "filters", "geoip")
-      require_relative ::File.join(relative_path, "database_manager")
-      require_relative ::File.join(relative_path, "database_metric")
+      relative_path = ::File.join(LogStash::Environment::LOGSTASH_HOME, "x-pack", "lib", "geoip_database_management")
+      require_relative ::File.join(relative_path, "manager")
+      require_relative ::File.join(relative_path, "metric")

       geoip_metric = metric.namespace([:geoip_download_manager]).tap do |n|
         db = n.namespace([:database])
@@ -629,11 +629,11 @@ def initialize_geoip_database_metrics(metric)
         dl.gauge(:status, nil)
       end

-      database_metric = LogStash::Filters::Geoip::DatabaseMetric.new(geoip_metric)
-      database_manager = LogStash::Filters::Geoip::DatabaseManager.instance
+      database_metric = LogStash::GeoipDatabaseManagement::Metric.new(geoip_metric)
+      database_manager = LogStash::GeoipDatabaseManagement::Manager.instance
       database_manager.database_metric = database_metric
     rescue LoadError => e
-      @logger.trace("DatabaseManager is not in classpath")
+      @logger.trace("DatabaseManager is not in classpath", exception: e.message, backtrace: e.backtrace)
     end
   end
 end # class LogStash::Agent
13 changes: 13 additions & 0 deletions logstash-core/lib/logstash/util.rb
@@ -51,6 +51,19 @@ def self.set_thread_plugin(plugin)
     Thread.current[:plugin] = plugin
   end

+  def self.with_logging_thread_context(override_context)
+    java_import org.apache.logging.log4j.ThreadContext
+
+    backup = ThreadContext.getImmutableContext()
+    ThreadContext.putAll(override_context)
+
+    yield
+
+  ensure
+    ThreadContext.removeAll(override_context.keys)
+    ThreadContext.putAll(backup)
+  end
+
   def self.thread_info(thread)
     # When the `thread` is dead, `Thread#backtrace` returns `nil`; fall back to an empty array.
     backtrace = (thread.backtrace || []).map do |line|
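
The save, override, and restore pattern used by `with_logging_thread_context` can be tried outside JRuby with a plain-Ruby stand-in. The sketch below is an illustration under stated assumptions: `FakeThreadContext` and `with_fake_logging_context` are hypothetical names that mimic the shape of the helper in this diff using a per-thread Hash, and they do not touch log4j.

```ruby
# Hypothetical stand-in for log4j's ThreadContext, backed by a per-thread Hash.
module FakeThreadContext
  def self.store
    Thread.current[:fake_logging_context] ||= {}
  end
end

# Mirrors the shape of the helper in the diff: snapshot the context,
# apply the overrides, yield, then always restore the snapshot.
def with_fake_logging_context(override_context)
  backup = FakeThreadContext.store.dup
  FakeThreadContext.store.merge!(override_context)

  yield
ensure
  # Remove every overridden key first so keys that did not exist before
  # the call do not leak, then put back the original values.
  override_context.each_key { |key| FakeThreadContext.store.delete(key) }
  FakeThreadContext.store.merge!(backup)
end

FakeThreadContext.store["pipeline.id"] = "main"

seen_inside = with_fake_logging_context("pipeline.id" => "geoip", "plugin.id" => "abc") do
  FakeThreadContext.store.dup
end

puts seen_inside.inspect
puts FakeThreadContext.store.inspect
```

Because the restore step lives in `ensure`, the previous context comes back even when the block raises, which is the property that makes this kind of helper safe to wrap around background work such as a download thread.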