
Allow re-Committing offsets #1372

Open
edenhill opened this issue Aug 9, 2017 · 18 comments
@edenhill
Contributor

edenhill commented Aug 9, 2017

librdkafka currently ignores (application) commit requests if the offsets match those of the last known commit.
For the reasons stated here (commit expiry is shorter than the message interval) it is desirable for librdkafka to skip this check and let the commit pass through to the broker.

@AlexeyRaga Should this affect auto commit behaviour as well?
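To make the dedup behaviour concrete, here is a minimal Python sketch — not librdkafka's actual code — of the check described above, plus a hypothetical `allow_recommit` flag (a name invented for illustration) that would let identical offsets pass through to the broker:

```python
# Illustrative sketch only -- NOT librdkafka's implementation.
# A commit whose offsets equal the last committed offsets is silently
# dropped, so the broker never refreshes its offset-retention timer
# for the group. A hypothetical allow_recommit flag disables the check.

class CommitTracker:
    def __init__(self, allow_recommit=False):
        self.allow_recommit = allow_recommit
        self.last_committed = {}   # (topic, partition) -> offset
        self.broker_commits = []   # commits that actually reach the broker

    def commit(self, offsets):
        """offsets: dict mapping (topic, partition) -> offset."""
        if not self.allow_recommit and offsets == self.last_committed:
            return False  # dropped: nothing changed since last commit
        self.last_committed = dict(offsets)
        self.broker_commits.append(dict(offsets))
        return True

tracker = CommitTracker()
assert tracker.commit({("t", 0): 42}) is True    # first commit goes through
assert tracker.commit({("t", 0): 42}) is False   # identical offsets -> dropped

recommitter = CommitTracker(allow_recommit=True)
assert recommitter.commit({("t", 0): 42}) is True
assert recommitter.commit({("t", 0): 42}) is True  # re-commit reaches the broker
```

With the flag on, every commit refreshes the broker's retention timer even when the consumer has made no progress, which is exactly what idle or bursty topics need.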

@AlexeyRaga
Contributor

I never use auto commit to be honest, so I may miss some edge cases, but I personally would expect it to recommit.

@edenhill
Contributor Author

edenhill commented Sep 7, 2017

From @coberty in #1415:

I am not sure it is wrong; I would rather say unexpected.

We have a case like this:
our C++ client had not received any messages for 2 days, and after it was restarted it read everything from the very beginning (offset reset to earliest), which is something we want to avoid.

This wouldn't have happened with the Java client, since it retains offsets while it is running with auto commit.

@jrnt30

jrnt30 commented Oct 16, 2017

This has caused some head-scratching for us as well with somewhat bursty topic data; thanks for the work on this.

@AlexeyRaga
Contributor

To make it simpler, maybe instead of having a config option, it would be easier to implement a new function, like commitAlways or something like that?

@edenhill
Contributor Author

We want this behaviour to be used for auto commits too, so I think a config option is the easiest approach for users.

@DavidLiuXh

I think this feature is particularly necessary. We have hundreds of topics (no replicas) in a cluster; if a broker breaks, we need to restart all the clients :(

@AlexeyRaga
Contributor

@DavidLiuXh it is even more critical because a restart doesn't seem to help; the negative lag doesn't go away. So you have to choose between two options, both bad:

  1. Configure consumers to reset to the earliest offset: things will work, but the "broken" partitions will be consumed from the beginning every time you restart the job (or the job dies).
  2. Configure consumers to reset to the latest offset: every time your job restarts or dies, the broken partitions will continue from the high watermarks and you may lose (skip) messages.

I am in this crappy situation right now, sitting on option 1 and praying that the jobs don't restart until we figure out a solution :(

@edenhill
Contributor Author

There are two workarounds:

  • Shut down the consumers, reset the offsets (--to-offset ..) using the offset tool, then restart the consumers. They should resume from the reset offset position (no application logic needed).
  • Set auto.offset.reset=error and handle the resulting consumer error by seeking/assigning to a specific offset.
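The first workaround can be sketched with Kafka's stock offset tool, kafka-consumer-groups.sh (the --reset-offsets option is available since Apache Kafka 0.11). The group, topic, partition and offset values below are placeholders; the consumer group must be inactive for the reset to succeed:

```shell
# 1. Stop all consumers in the group first -- the reset only works
#    on an inactive group.

# 2. Dry-run to preview what would change (placeholder names/values):
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic:0 \
  --reset-offsets --to-offset 12345 --dry-run

# 3. Apply the reset for real:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic:0 \
  --reset-offsets --to-offset 12345 --execute

# 4. Restart the consumers; they resume from offset 12345.
```

This is an operational sketch, not something librdkafka does for you; on Kafka versions older than 0.11 the tool (or its protocol) may not support the reset, as noted in the next comment.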

@AlexeyRaga
Contributor

Oh, thanks, something to think about!
The offset reset tool doesn't work for me. It errors out with something like "protocol v2 vs. protocol v1", probably because we are still on Kafka 0.10.x.
But the second one could be an option.

@DavidLiuXh

@AlexeyRaga I wrote a reset-offset tool myself with librdkafka for Kafka 0.9.0.1; with it you basically don't need to restart a large number of clients.

@edenhill
Contributor Author

@DavidLiuXh Can you share that tool?

@DavidLiuXh

@edenhill I need a little time

@DavidLiuXh

@AlexeyRaga
Contributor

@edenhill has anything changed about this issue? This just hit us really hard: a couple of partitions that were receiving data infrequently suddenly lost their offsets...

@edenhill
Contributor Author

@AlexeyRaga This is still on the backburner, let's look into it after the v1.0.0 release.

@nick-zh
Contributor

nick-zh commented May 7, 2020

@edenhill any update on this?

@edenhill
Contributor Author

This is too big a change (risk-wise) to go into v1.5; we will address it after that release.

@AndrewKostousov

@edenhill One more ping regarding this issue.

We also have this annoying case with topics to which data is rarely written. The consumer runs 24/7, but after a restart it sometimes begins processing these topics from the beginning (due to the offsets.retention.minutes setting).

Re-committing the current offsets even when no new messages arrive would be a perfect solution for us.
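The retention arithmetic behind that idea can be sketched in a few lines of Python. This is a hypothetical helper, not part of librdkafka: it decides when an idle consumer should re-commit an unchanged offset so the broker's offsets.retention.minutes timer is refreshed before it expires (the `safety_factor` name and 0.5 default are illustrative choices):

```python
# Hypothetical helper (not part of librdkafka): re-commit once more
# than safety_factor of the broker's offset-retention window has
# elapsed since the last successful commit, so offsets never expire
# on an idle partition.

def should_recommit(last_commit_s, now_s, retention_minutes, safety_factor=0.5):
    """Return True when a fresh commit is needed to keep offsets alive."""
    retention_s = retention_minutes * 60
    return (now_s - last_commit_s) >= retention_s * safety_factor

# With the old broker default of offsets.retention.minutes=1440 (24 h),
# this re-commits after 12 h of silence -- well before expiry.
assert should_recommit(0, 13 * 3600, 1440) is True   # 13 h idle -> re-commit
assert should_recommit(0, 11 * 3600, 1440) is False  # 11 h idle -> still safe
```

A consumer loop would call this periodically and re-issue the last committed offsets when it returns True; with the current dedup check in librdkafka such a commit is dropped, which is precisely what this issue asks to change.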
