no address for opensearch (Resolv::ResolvError) #147

Open
andreibrebene opened this issue Oct 29, 2024 · 2 comments
Comments

@andreibrebene

Steps to replicate

Provide example config and message
Fluentd config:

log_level debug

<source>
  @type tail
  path /var/log/containers/*.log
  exclude_path /var/log/containers/fluentd*.log
  tag kubernetes.*
  pos_file /var/log/fluentd.pos
  read_from_head true

  <parse>
    @type cri
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<filter kubernetes.wazuh>
  @type concat
  key message
  multiline_start_regexp /^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}\]|\berror\b/
  flush_interval 120s
</filter>

<filter kubernetes.icsenrich>
  @type concat
  key message
  multiline_start_regexp /^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}\]|\d{4}-\d{2}-\d{2}/
  flush_interval 120s
</filter>

<filter kubernetes.**>
  @type record_transformer
  remove_keys $['kubernetes']['labels'], $["kubernetes"]["pod_id"], $["kubernetes"]["container_image_id"], $["kubernetes"]["container_image"], $["kubernetes"]["master_url"], $["docker"]["container_id"], $["kubernetes"]["namespace_id"]
</filter>

<match kubernetes.**>
  @type opensearch
  hosts "#{ENV['ELASTICSEARCH_HOST']}"
  scheme https
  user "#{ENV['ELASTICSEARCH_USERNAME']}"
  password "#{ENV['ELASTICSEARCH_PASSWORD']}"
  ssl_verify false
  logstash_format true
  request_timeout 60s
  include_tag_key true
  logstash_prefix "logstash-${tag}"
  logstash_dateformat "%Y.%m.%d"

  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes
    chunk_limit_size 64m
    total_limit_size 2048m  
    flush_interval 10s      
    retry_max_interval 60s
    retry_forever true    
    flush_thread_count 10
    overflow_action block
  </buffer>
</match>

Logs:
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:00 +0000 chunk="6259344e524c5ea61ad251179f87a5b4" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>"192.168.10.10", :port=>9200, :scheme=>"https", :user=>"admin", :password=>"obfuscated"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="6259345bd61c3866dd7c45eaceee824a"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934532a531331f902d98dd2ccfd50"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:04:54 +0000 chunk="6259345bd61c3866dd7c45eaceee824a" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"192.168.10.10\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934532a531331f902d98dd2ccfd50" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>"192.168.10.10", :port=>9200, :scheme=>"https", :user=>"admin", :password=>"obfuscated"}): no address for opensearch-node (Resolv::ResolvError)"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1135:in `rescue in send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:1097:in `send_bulk'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:908:in `block in write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `each'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-opensearch-1.1.4/lib/fluent/plugin/out_opensearch.rb:907:in `write'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 plugin/output.rb:1310:update_retry_state: /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.17.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [debug]: #0 fluent/log.rb:341:debug: taking back chunk for errors. chunk="625934524b3a23f460de0b179a356bc5"
fluentd-54r5n fluentd 2024-10-29 09:03:54 +0000 [warn]: #0 fluent/log.rb:383:warn: failed to flush the buffer. retry_times=475 next_retry_time=2024-10-29 09:05:02 +0000 chunk="625934524b3a23f460de0b179a356bc5" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>"192.168.10.10", :port=>9200, :scheme=>"https", :user=>"admin", :password=>"obfuscated"}): no address for opensearch-node (Resolv::ResolvError)"

Expected Behavior or What you need to ask

Fluentd works for 10-12 hours and then starts logging this error. The opensearch-node hostname and address are both reachable from inside the fluentd pod, so connectivity itself works. If I restart the fluentd pods, flushing starts working again, and the error comes back after some hours.

Using Fluentd and OpenSearch plugin versions

  • OS version -> Debian GNU/Linux 12 (bookworm)

  • Bare Metal or within Docker or Kubernetes or others? -> Kubernetes Cluster

  • Fluentd v1.0 or later -> fluentd 1.17.1

  • OpenSearch plugin version -> fluent-plugin-opensearch (1.1.4)

  • paste boot log of fluentd or td-agent
    abbrev (default: 0.1.1)
    addressable (2.8.7)
    aws-eventstream (1.3.0)
    aws-partitions (1.965.0)
    aws-sdk-core (3.201.5)
    aws-sigv4 (1.9.1)
    base64 (0.2.0, default: 0.1.1)
    benchmark (default: 0.2.1)
    bigdecimal (default: 3.1.3)
    bundler (default: 2.4.19, 2.4.17)
    cgi (default: 0.3.6)
    concurrent-ruby (1.3.4)
    cool.io (1.8.1)
    csv (3.3.0, default: 3.2.6)
    date (default: 3.3.3)
    delegate (default: 0.3.0)
    did_you_mean (default: 1.6.3)
    digest (default: 3.1.1)
    domain_name (0.6.20240107)
    drb (2.2.1, default: 2.1.1)
    english (default: 0.7.2)
    erb (default: 4.0.2)
    error_highlight (default: 0.5.1)
    etc (default: 1.4.2)
    excon (0.111.0)
    faraday (2.10.1)
    faraday-excon (2.1.0)
    faraday-net_http (3.1.1)
    faraday_middleware-aws-sigv4 (1.0.1)
    fcntl (default: 1.0.2)
    ffi (1.17.0 x86_64-linux-gnu)
    ffi-compiler (1.3.2)
    fiddle (default: 1.1.1)
    fileutils (default: 1.7.0)
    find (default: 0.1.1)
    fluent-config-regexp-type (1.0.0)
    fluent-plugin-concat (2.5.0)
    fluent-plugin-detect-exceptions (0.0.15)
    fluent-plugin-grok-parser (2.6.2)
    fluent-plugin-json-in-json-2 (1.0.2)
    fluent-plugin-kubernetes_metadata_filter (3.5.0)
    fluent-plugin-multi-format-parser (1.0.0)
    fluent-plugin-opensearch (1.1.4)
    fluent-plugin-parser-cri (0.1.1)
    fluent-plugin-prometheus (2.1.0)
    fluent-plugin-record-modifier (2.1.1)
    fluent-plugin-rewrite-tag-filter (2.4.0)
    fluent-plugin-systemd (1.0.5)
    fluentd (1.17.1)
    forwardable (default: 1.3.3)
    getoptlong (default: 0.2.0)
    http (5.2.0)
    http-accept (1.7.0)
    http-cookie (1.0.7)
    http-form_data (2.3.0)
    http_parser.rb (0.8.0)
    io-console (default: 0.6.0)
    io-nonblock (default: 0.2.0)
    io-wait (default: 0.3.0)
    ipaddr (default: 1.2.5)
    irb (default: 1.6.2)
    jmespath (1.6.2)
    json (default: 2.6.3)
    jsonpath (1.1.5)
    kubeclient (4.12.0)
    llhttp-ffi (0.5.0)
    logger (1.6.0, default: 1.5.3)
    lru_redux (1.1.0)
    mime-types (3.5.2)
    mime-types-data (3.2024.0806)
    msgpack (1.7.2)
    multi_json (1.15.0)
    mutex_m (default: 0.1.2)
    net-http (default: 0.4.1)
    net-protocol (default: 0.2.1)
    netrc (0.11.0)
    nkf (default: 0.1.2)
    observer (default: 0.1.1)
    oj (3.15.1)
    open-uri (default: 0.3.0)
    open3 (default: 0.1.2)
    opensearch-ruby (3.4.0)
    openssl (default: 3.1.0)
    optparse (default: 0.3.1)
    ostruct (default: 0.5.5)
    pathname (default: 0.2.1)
    pp (default: 0.4.0)
    prettyprint (default: 0.1.1)
    prometheus-client (4.2.3)
    pstore (default: 0.1.2)
    psych (default: 5.0.1)
    public_suffix (6.0.1)
    racc (default: 1.6.2)
    rake (13.2.1)
    rdoc (default: 6.5.1.1)
    readline (default: 0.0.3)
    readline-ext (default: 0.1.5)
    recursive-open-struct (1.2.2)
    reline (default: 0.3.2)
    resolv (default: 0.2.2)
    resolv-replace (default: 0.1.1)
    rest-client (2.1.0)
    rexml (3.2.9)
    rinda (default: 0.1.1)
    ruby2_keywords (default: 0.0.5)
    securerandom (default: 0.2.2)
    serverengine (2.3.2)
    set (default: 1.0.3)
    shellwords (default: 0.1.0)
    sigdump (0.2.5)
    singleton (default: 0.1.1)
    stringio (default: 3.0.4)
    strptime (0.2.5)
    strscan (3.1.0, default: 3.0.5)
    syntax_suggest (default: 1.1.0)
    syslog (default: 0.1.1)
    systemd-journal (1.4.2)
    tempfile (default: 0.1.3)
    time (default: 0.2.2)
    timeout (default: 0.3.1)
    tmpdir (default: 0.1.3)
    tsort (default: 0.1.1)
    tzinfo (2.0.6)
    tzinfo-data (1.2024.1)
    un (default: 0.2.1)
    uri (0.13.0, default: 0.12.2)
    weakref (default: 0.1.2)
    webrick (1.8.1)
    yajl-ruby (1.4.3)
    yaml (default: 0.2.1)
    zlib (default: 3.0.0)

  • OpenSearch version -> v 2.11.1

@rwunderer

rwunderer commented Nov 5, 2024

I have the same problem with fluentd 1.17.1 and fluent-plugin-opensearch (1.1.5), both fluentd and opensearch running in Kubernetes. For me it happens every 2-3 hours, making the system effectively unusable.

Sample error line:

2024-11-05 11:21:42 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2024-11-05 11:21:43 +0000 chunk="626289ac7f7285beba39bb9c87054dee" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster ({:host=>\"opensearch-cluster.monitoring-platform.svc.cluster.local\", :port=>9200, :scheme=>\"https\", :user=>\"fluentduser\", :password=>\"obfuscated\", :path=>\"\"}): no address for opensearch-cluster-warm-2 (Resolv::ResolvError)"

What I find interesting is that, in the error above, ":host" names the configured output destination (a Kubernetes Service), while "no address for" names one of the individual OpenSearch nodes (a data-only node in this example, but it happens with all the nodes from time to time).
I have no idea why it would try to resolve an individual node (i.e. a pod).
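
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: the plugin's reload_connections option defaults to true, so the underlying OpenSearch client periodically refreshes its host list from the cluster and may then address nodes by the names they publish instead of the configured Service. If that is what happens here, an output section that pins the client to the configured endpoint might look roughly like the sketch below. The match pattern is a placeholder; host, port, and scheme are taken from the error line above.

```
<match kubernetes.**>
  @type opensearch
  # Configured endpoint: the Kubernetes Service named in :host above
  host opensearch-cluster.monitoring-platform.svc.cluster.local
  port 9200
  scheme https
  # Assumption: stop refreshing the host list from the cluster, so the
  # client keeps using the Service name and not individual node names
  reload_connections false
</match>
```

Whether this removes the error has not been tested in this issue; it only illustrates the setting that controls host-list reloading.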

@rwunderer

@andreibrebene not sure if it's too early to celebrate but I seem to have mitigated the problem by setting:

reconnect_on_error true

in the output section.

I still see the above error in the log from time to time, but now it recovers immediately on the next try!
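
For reference, a minimal sketch of where that setting would sit, reusing the output section from the original report. The remaining parameters and the buffer section are omitted for brevity, and this is the mitigation described above, not a confirmed fix.

```
<match kubernetes.**>
  @type opensearch
  hosts "#{ENV['ELASTICSEARCH_HOST']}"
  scheme https
  user "#{ENV['ELASTICSEARCH_USERNAME']}"
  password "#{ENV['ELASTICSEARCH_PASSWORD']}"
  # Mitigation reported above: reset the connection after any error so the
  # next send reconnects
  reconnect_on_error true
</match>
```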
