clean_consumed not functioning #49

Open
pbygrave-lucid opened this issue Apr 4, 2023 · 0 comments
pbygrave-lucid commented Apr 4, 2023

Logstash information:

Please include the following information:

  1. Logstash version : logstash 8.7.0
  2. Logstash installation source: Ansible install via geerlingguy roles
  3. How is Logstash being run (e.g. as a service/service manager): systemd
  4. How was the Logstash Plugin installed: Via ansible roles install
  5. Plugin version: logstash-input-dead_letter_queue **(2.0.0)**

JVM (e.g. java -version):

  1. Java version:
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu120.04.1, mixed mode, sharing)
  2. JVM installation source (e.g. from the Operating System's package manager, from source, etc): Geerlingguy ansible role

OS version (uname -a if on a Unix-like system):

20.04.1-Ubuntu

Description of the problem including expected versus actual behavior:

clean_consumed option does not clear used segments as per documentation.

ll data/dead_letter_queue/main/
total 84
drwxr-xr-x 2 logstash logstash  4096 Apr  4 19:07 ./
drwxr-xr-x 4 logstash logstash  4096 Apr  4 18:46 ../
-rw-r--r-- 1 logstash logstash     0 Apr  4 19:06 .lock
-rw-r--r-- 1 logstash logstash 23578 Apr  4 18:55 1.log
-rw-r--r-- 1 logstash logstash 23576 Apr  4 18:57 2.log
-rw-r--r-- 1 logstash logstash 23578 Apr  4 19:07 3.log
-rw-r--r-- 1 logstash logstash     1 Apr  4 19:07 4.log.tmp
-rw-r--r-- 1 logstash logstash     0 Apr  4 18:46 dlq_reader.lock

These segments each contain 3 log messages that were rejected by Elasticsearch with a 400 mapping error. They were correctly passed to the DLQ, edited by a filter, and re-processed, and I can now see them in the intended index. However, the consumed segments have not been cleaned from disk, which is the intended purpose of clean_consumed.

Steps to reproduce:

logstash.yml DLQ settings:

dead_letter_queue.enable: true
dead_letter_queue.max_bytes: 1024mb
dead_letter_queue.flush_interval: 5000
dead_letter_queue.storage_policy: drop_newer
path.dead_letter_queue: /usr/share/logstash/data/dead_letter_queue

pipelines.yml file:

- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 3
  dead_letter_queue.enable: true
- pipeline.id: dlq
  path.config: "/etc/logstash/dlq_conf.d/*.conf"
  pipeline.workers: 1

DLQ input file:

input {
  dead_letter_queue {
    path => "/usr/share/logstash/data/dead_letter_queue"
    pipeline_id => "main"
    clean_consumed => true
    commit_offsets => true
  }
}

Looking at the documentation here:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-dead_letter_queue.html

It says that when clean_consumed is set to true, then commit_offsets must also be set to true, which I've done. It also states that sincedb tracks the checkpoint of the DLQ, but I cannot find any trace of it writing any checkpointing files in <path.data>/plugins/inputs/dead_letter_queue:

$ ll /usr/share/logstash/data/
total 12
drwxr-xr-x  3 logstash logstash 4096 Apr  3 21:21 ./
drwxr-xr-x 12 root     root     4096 Apr  3 20:50 ../
drwxr-xr-x  4 logstash logstash 4096 Apr  4 18:46 dead_letter_queue/
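
The two checks above (segments still on disk, no sincedb files under the default location) can be repeated with a small script. This is only a sketch: the function name `dlq_state` is hypothetical, the default sincedb location is the one quoted from the plugin documentation, and the `<n>.log` / `<n>.log.tmp` naming is taken from the directory listing earlier in this report.

```python
import os
from pathlib import Path


def dlq_state(data_dir, pipeline_id="main"):
    """Summarize DLQ segments and sincedb files under a Logstash data directory.

    Assumes the layout shown in this report:
      <data_dir>/dead_letter_queue/<pipeline_id>/<n>.log
      <data_dir>/plugins/inputs/dead_letter_queue/   (documented sincedb default)
    """
    data = Path(data_dir)
    queue_dir = data / "dead_letter_queue" / pipeline_id
    sincedb_dir = data / "plugins" / "inputs" / "dead_letter_queue"

    # Numbered *.log files are the finished segments; "4.log.tmp" and the
    # lock files in the listing do not match this pattern.
    sealed = sorted(
        (p for p in queue_dir.glob("*.log") if p.stem.isdigit()),
        key=lambda p: int(p.stem),
    )

    # Collect every file below the sincedb directory, including dotfiles.
    # os.walk yields nothing if the directory does not exist.
    sincedb_files = []
    for root, _dirs, files in os.walk(sincedb_dir):
        sincedb_files.extend(os.path.join(root, name) for name in files)

    return {
        "sealed_segments": [p.name for p in sealed],
        "sincedb_dir_exists": sincedb_dir.is_dir(),
        "sincedb_files": sorted(sincedb_files),
    }
```

Calling `dlq_state("/usr/share/logstash/data")` against the directories listed in this report should show `1.log`, `2.log` and `3.log` as sealed segments and no sincedb directory, which matches the symptom described here.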

The DLQ is functioning correctly, but without cleaning up the consumed log segments it is not fit to be released into my production environment. The documentation here:

https://www.elastic.co/guide/en/logstash/current/dead-letter-queues.html#auto-clean

suggests there may be a formatting issue, but I'm not sure whether that's an error in the docs.

Provide logs (if relevant):

[2023-04-04T19:31:50,788][INFO ][logstash.javapipeline    ][dlq] Pipeline started {"pipeline.id"=>"dlq"}
[2023-04-04T19:31:50,800][DEBUG][logstash.javapipeline    ] Pipeline started successfully {:pipeline_id=>"dlq", :thread=>"#<Thread:0x5874ce7d run>"}
[2023-04-04T19:31:50,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:31:51,200][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,234][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,235][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,238][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,243][DEBUG][logstash.filters.mutate  ][dlq][1f58cd3c635a7b5f9974080cc749dc88db75d1104d78fcb6ce486f5b89c9cc7c] filters/LogStash::Filters::Mutate: removing field {:field=>"[app-log]"}
[2023-04-04T19:31:51,572][DEBUG][logstash.outputs.opensearch][dlq][570ab0ae3d602ac4e8f85245ef9b1b948e46d98a72b514ee9bd097e60bd9ea6d] Sending final bulk request for batch. {:action_count=>6, :payload_size=>17759, :content_length=>17759, :batch_offset=>0}
[2023-04-04T19:31:54,371][INFO ][logstash.agent           ] Pipelines running {:count=>2, :running_pipelines=>[:dlq, :main], :non_running_pipelines=>[]}
[2023-04-04T19:31:55,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:00,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:10,832][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:15,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:20,834][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:25,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.
[2023-04-04T19:32:30,831][DEBUG][org.logstash.execution.PeriodicFlush][dlq] Pushing flush onto pipeline.