Add index pattern / date math support to the index => setting #49

Open
jakommo opened this issue Sep 6, 2016 · 12 comments
@jakommo

jakommo commented Sep 6, 2016

Currently the index => setting does not seem to support index patterns like logstash-%{+YYYY.MM.dd} the way the ES output does.
Index pattern or date math support could be useful in some cases, e.g. when Logstash is run for batch processing.

  • Config File (if you have sensitive info, please remove it):
input {
        elasticsearch {
                hosts => "localhost"
                index => "logstash-%{+YYYY.MM.dd}"
                query => '{ "_source": ["@timestamp"],
                "query": { "match_all": {} }
            }'
        }
}

output {
    stdout { codec => rubydebug }
}

Running with this config results in:

Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost"], index=>"logstash-%{+YYYY.MM.dd}" ....
@jordansissel
Contributor

Some background:

The %{+...} syntax is what logstash calls a sprintf format and is only possible if you have an event. When deciding what index to query, there is no event, so "%{+YYYY.MM.dd}" couldn't work.

While I agree what you propose would be nice, I'm not sure how to expose it to users. I feel that supporting the sprintf format would be confusing because, for example, a field reference like index => "logstash-%{foo}" wouldn't work, again because there's no event.

Instead, what about doing this:

index => "logstash-*"

And having your query include your desired time range, like `@timestamp:[now-1d TO now]`?
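Putting the wildcard index and the time-range query together, a sketch of that approach could look like the following (untested; wrapping the Lucene-style range in a query_string query is one way to express it, not necessarily the only one):

```
input {
  elasticsearch {
    hosts => "localhost"
    index => "logstash-*"
    query => '{
      "query": {
        "query_string": { "query": "@timestamp:[now-1d TO now]" }
      }
    }'
  }
}
```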

@untergeek
Contributor

With newer versions of Elasticsearch, we could add some field_stats API magic to determine which indices to query, couldn't we?

@phr0gz

phr0gz commented Oct 29, 2018

Hello, I just saw this discussion. From a user's point of view, it can be very useful to select data by the daily index it was inserted into, instead of relying on the timestamp inside the data.

E.g.
Imagine you have the following: devices --> Logstash1 --> ES --> Logstash2 --> "other infra"

You send the data (logs) from ES to "other infra" with Logstash2 using '@timestamp:[now-1d/d TO now/d]', and the plugin has the following parameter: schedule => "0 12 * * * America/Chicago"

Let's say that Logstash1 stops inserting for two days (or less), and then starts again.

In this case you will miss some data, or you will need to change the Logstash2 configuration and play with the timestamp, and in any case there is a big risk of duplicated or missing data in "other infra".
But if you could use an index pattern there would be no such issue: a new daily index is created in ES when Logstash1 starts again, and without changing anything, Logstash2 inserts all the missing data into "other infra".

@wols

wols commented Mar 28, 2019

In line with the statement in #92, you should additionally support the form

<static_name{date_math_expr{date_format|time_zone}}>

(see the Elasticsearch documentation on date math support in index names).

@untergeek
Contributor

Can confirm that using date math in the index argument works as expected right now, with no change to the Logstash code:

Given that I am in UTC-5, and I ran this test at 7:25AM on 2019-03-28, with:

output {
  elasticsearch {
    index => "<staticname-{now/d{YYYY.MM.dd|+12:00}}>"
  }
}

… the resulting index was created with the proper date math corrected time:

[2019-03-28T07:25:14,968][INFO ][o.e.c.m.MetaDataCreateIndexService] [testcluster] [staticname-2019.03.29] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []

Add 5 hours to 7:25AM, and UTC at time of execution would be 12:25PM. Add 12 hours to that and it's 2019-03-29, as the create index name indicates.
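The arithmetic above can be checked with a short Python snippet; the timezone offsets mirror the UTC-5 local time and the |+12:00 format modifier in the index name:

```python
from datetime import datetime, timedelta, timezone

# 7:25 AM on 2019-03-28 in UTC-5, the local time of the test above
local = datetime(2019, 3, 28, 7, 25, tzinfo=timezone(timedelta(hours=-5)))

# Convert to UTC: 12:25 PM
utc = local.astimezone(timezone.utc)

# Apply the |+12:00 time_zone modifier from the date math expression
shifted = utc.astimezone(timezone(timedelta(hours=12)))

print(shifted.strftime("%Y.%m.%d"))  # -> 2019.03.29, matching staticname-2019.03.29
```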

Please understand that this approach will force Elasticsearch to perform the date math calculation on every single event you send. With the sprintf format, Logstash does the work instead. I'm not sure what extra CPU cost this incurs, but it is a calculation, so there is some cost, even if it is negligible.

@untergeek
Contributor

I know, I shipped to Elasticsearch rather than reading from Elasticsearch. It should work the same, however. I'll test again right now, now that I've created the index in the future.

@untergeek
Contributor

Confirmed. I added another document:

PUT staticname-2019.03.29/doc/1
{
  "message": "This is a test",
  "@timestamp": "2019-03-29T00:41:00.000Z"
}

…and changed the Logstash config to:

input {
  elasticsearch {
    index => "<staticname-{now/d{YYYY.MM.dd|+12:00}}>"
    query => '{ "_source": ["@timestamp","message"],
      "query": { "match_all": {} }
    }'
  }
}

filter { }

output { stdout { codec => rubydebug } }

The results were clear:

{
    "@timestamp" => 2019-03-29T00:41:00.000Z,
       "message" => "This is a test",
      "@version" => "1"
}
{
    "@timestamp" => 2019-03-28T12:25:14.722Z,
       "message" => "This is a test",
      "@version" => "1"
}

Logstash configs can and do support date math in the index directives!

@wols

wols commented Mar 28, 2019

My unsuccessful attempts today used this config:

input {
    elasticsearch {
        schedule => "*/5 * * * *"
        hosts    => [ "127.0.0.1:9201" ]
#       index    => "ntpstats-live.*"
        index    => "<ntpstats-live.{now/d{YYYYMMdd}-30d}>"
        query    => '{
            "query": {
                "term": { "type": "loopstats" }
            },
            "sort": [ "stats_stamp" ]
        }'
    }
}

Can you check the multi-index

index => "<staticname-{now/d{YYYY.MM.dd}-1d}>,<staticname-{now/d{YYYY.MM.dd}}>"

please?

@untergeek
Contributor

I tried this config:

index => '<staticname-{now/d{YYYY.MM.dd}-1d}>,<staticname-{now/d{YYYY.MM.dd}}>'

…and this was the result:

Plugin: <LogStash::Inputs::Elasticsearch index=>"<staticname-{now/d{YYYY.MM.dd}-1d}>,<staticname-{now/d{YYYY.MM.dd}}>", query=>"{ \"_source\": [\"@timestamp\",\"message\"],\n      \"query\": { \"match_all\": {} }\n    }", id=>"c2cd62803e841502f477fa5a73696111ff358612a41699c3575c166e2326586a", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_c132fc01-fc8c-4230-95ba-57a4f63672a0", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"parse_exception","reason":"invalid dynamic name expression [now/d{YYYY.MM.dd}-1d]. missing closing `}` for date math format"}],"type":"parse_exception","reason":"invalid dynamic name expression [now/d{YYYY.MM.dd}-1d]. missing closing `}` for date math format"},"status":400}

When I try it with hard-coded names:

index => 'staticname-2019.03.27,staticname-2019.03.28'

…this is the result:

{
       "message" => "This is 2019-03-28T01:00:00.000Z",
      "@version" => "1",
    "@timestamp" => 2019-03-28T01:00:00.000Z
}
{
       "message" => "This is 2019-03-27T01:00:00.000Z",
      "@version" => "1",
    "@timestamp" => 2019-03-27T01:00:00.000Z
}

And when I try URL-encoding the date math:

index => '%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E'

…this is the result:

query=>"{ \"_source\": [\"@timestamp\",\"message\"],\n      \"query\": { \"match_all\": {} }\n    }", id=>"e996e32c317c784feb5e8853521c8a36d0363646c4b07f4ed80660c31dfe7307", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_a4555194-6ae9-4699-9eb9-652e0c6bdea6", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [404] {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E","index_uuid":"_na_","index":"%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E","index_uuid":"_na_","index":"%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E"},"status":404}
  Exception: Elasticsearch::Transport::Transport::Errors::NotFound
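For reference, the URL-encoded form above can be reproduced with Python's urllib.parse.quote; the 404 response echoes the percent-encoded string back as a literal index name, which suggests it was never decoded on the way to Elasticsearch:

```python
from urllib.parse import quote

raw = '<staticname-{now/d{YYYY.MM.dd}-1d}>,<staticname-{now/d{YYYY.MM.dd}}>'

# safe='' percent-encodes every reserved character, including '/' and ','
encoded = quote(raw, safe='')

print(encoded)
# %3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E
```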

@untergeek
Contributor

It should be noted, though, that this multi-index date math expression doesn't work when sent to Elasticsearch directly either, and gives the exact same error that Logstash does:

GET /%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D-1d%7D%3E%2C%3Cstaticname-%7Bnow%2Fd%7BYYYY.MM.dd%7D%7D%3E/_search

…results in:

{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "invalid dynamic name expression [now/d{YYYY.MM.dd}-1d]. missing closing `}` for date math format"
      }
    ],
    "type": "parse_exception",
    "reason": "invalid dynamic name expression [now/d{YYYY.MM.dd}-1d]. missing closing `}` for date math format"
  },
  "status": 400
}

This suggests to me that what you are asking for is not even supported by Elasticsearch.

@wols

wols commented Mar 28, 2019

Oh, my mistake:

Wrong position of -1d:

index => '<staticname-{now/d{YYYY.MM.dd}-1d}>,<staticname-{now/d{YYYY.MM.dd}}>'

Correct:

index => '<staticname-{now/d-1d{YYYY.MM.dd}}>,<staticname-{now/d{YYYY.MM.dd}}>'

Great, this rocks :-) Thanks a lot!
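For completeness, the corrected expression dropped into an input block would look like this (hosts and query are illustrative; the point is that the math modifier -1d sits before the {date_format} block):

```
input {
  elasticsearch {
    hosts => [ "127.0.0.1:9201" ]
    index => '<staticname-{now/d-1d{YYYY.MM.dd}}>,<staticname-{now/d{YYYY.MM.dd}}>'
    query => '{ "query": { "match_all": {} } }'
  }
}
```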

@anthy7154

This discussion helped me. Thank you. I appreciate it.
