Describe the bug
Under certain conditions, particularly during spikes in ingestion traffic or when an index is first created, quickwit rejects documents sent through the elasticsearch _bulk api with a document-level status of 500 and error type internal_exception, with the reason no shards available. A 500 tends to indicate that something has gone wrong on the server and that the document cannot be retried.
This can cause problems when using existing elasticsearch client libraries. Many of them implement logic for handling retries and document-level errors from the bulk api. However, the retry and rate-limiting logic generally only kicks in when the document-level status is a 429. This is problematic for existing applications that rely on that retry logic. With the current quickwit behavior, documents will generally be dropped on the assumption that the error is terminal, when it's really a transient warmup problem.
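To illustrate the client-side behavior described above, here is a minimal sketch of how a typical bulk helper might sort per-document results. It is not taken from any specific client library: the function name `partition_bulk_items`, the `RETRYABLE_STATUSES` set, and the sample response are assumptions made for illustration, following the general shape of an elasticsearch bulk response.

```python
from typing import Any

# Assumption: many clients only treat 429 as a retryable document-level status.
RETRYABLE_STATUSES = {429}


def partition_bulk_items(
    response: dict[str, Any],
) -> tuple[list[dict], list[dict], list[dict]]:
    """Split bulk response items into (succeeded, retryable, dropped)."""
    succeeded, retryable, dropped = [], [], []
    for item in response.get("items", []):
        # Each item is a single-key dict such as {"index": {...}}.
        result = next(iter(item.values()))
        status = result.get("status", 0)
        if 200 <= status < 300:
            succeeded.append(result)
        elif status in RETRYABLE_STATUSES:
            retryable.append(result)  # would be re-sent after a backoff
        else:
            dropped.append(result)  # treated as terminal and never retried
    return succeeded, retryable, dropped


# Hypothetical response mixing a success, the current 500, and the proposed 429.
example_response = {
    "errors": True,
    "items": [
        {"index": {"status": 201}},
        {"index": {"status": 500, "error": {
            "type": "internal_exception", "reason": "no shards available"}}},
        {"index": {"status": 429, "error": {
            "type": "no_shard_available_action_exception",
            "reason": "no shards available"}}},
    ],
}

succeeded, retryable, dropped = partition_bulk_items(example_response)
```

With logic like this, the document that came back with a 500 lands in `dropped` even though the underlying condition is transient, while the same document reported with a 429 would land in `retryable`.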
Expected behavior
The bulk api document errors should be a 429 when there are no shards available. It may also be helpful to return an error code that is more indicative of the problem than `internal_exception`:
{
  status: 429,
  error: {
    type: 'no_shard_available_action_exception', // elasticsearch has this error code, but it may mean something else in that context
    reason: 'no shards available'
  }
}