
Allow re-Committing offsets #1372

Open
edenhill opened this issue Aug 9, 2017 · 18 comments
@edenhill
Contributor

edenhill commented Aug 9, 2017

librdkafka currently ignores (application) commit requests if the offsets match those of the last known commit.
For the reasons stated here (commit expiry is shorter than the message interval) it is desirable for librdkafka to skip this check and let the commit pass through to the broker.

@AlexeyRaga Should this affect auto commit behaviour as well?
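To make the dedup behaviour concrete, here is a minimal Python sketch — not librdkafka's actual code — of the check described above, plus a hypothetical `allow_recommit` flag (a name invented for illustration) that would let identical offsets pass through to the broker:

```python
# Illustrative sketch only -- NOT librdkafka's implementation.
# A commit whose offsets equal the last committed offsets is silently
# dropped, so the broker never refreshes its offset-retention timer
# for the group. A hypothetical allow_recommit flag disables the check.

class CommitTracker:
    def __init__(self, allow_recommit=False):
        self.allow_recommit = allow_recommit
        self.last_committed = {}   # (topic, partition) -> offset
        self.broker_commits = []   # commits that actually reach the broker

    def commit(self, offsets):
        """offsets: dict mapping (topic, partition) -> offset."""
        if not self.allow_recommit and offsets == self.last_committed:
            return False  # dropped: nothing changed since last commit
        self.last_committed = dict(offsets)
        self.broker_commits.append(dict(offsets))
        return True

tracker = CommitTracker()
assert tracker.commit({("t", 0): 42}) is True    # first commit goes through
assert tracker.commit({("t", 0): 42}) is False   # identical offsets -> dropped

recommitter = CommitTracker(allow_recommit=True)
assert recommitter.commit({("t", 0): 42}) is True
assert recommitter.commit({("t", 0): 42}) is True  # re-commit reaches the broker
```

With the flag on, every commit refreshes the broker's retention timer even when the consumer has made no progress, which is exactly what idle or bursty topics need.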

@AlexeyRaga
Contributor

I never use auto commit to be honest, so I may miss some edge cases, but I personally would expect it to recommit.

@edenhill
Contributor Author

edenhill commented Sep 7, 2017

From @coberty in #1415:

I am not sure it is wrong; I would rather say unexpected.

We have a case like this:
our C++ client had not received any messages for 2 days, and after it was restarted it read everything from the very beginning (offset reset to earliest), which is something we want to avoid.

This wouldn't have happened with the Java client, since it retains offsets while it is running with auto commit.

@jrnt30

jrnt30 commented Oct 16, 2017

This has caused some head-scratching for us as well with somewhat bursty topic data; thanks for the work on this.

@AlexeyRaga
Contributor

To make it simpler, maybe instead of having a config option, it would be easier to implement a new function, like commitAlways or something like that?

@edenhill
Contributor Author

We want this behaviour to be used for auto commits too, so I think a config option is the easiest approach for users.

@DavidLiuXh

I think this feature is particularly necessary. We have hundreds of topics (no replicas) in a cluster; if a broker breaks, we need to restart all the clients :(

@AlexeyRaga
Contributor

@DavidLiuXh it is even more critical because a restart doesn't seem to help; the negative lag doesn't go away. So you have to choose between two options, both bad:

  1. Configure consumers to reset to the earliest offset: things will work, but the "broken" partitions will be consumed from the beginning every time you restart the job (or the job dies).
  2. Configure consumers to reset to the latest offset: every time your job restarts or dies, the broken partitions will continue from the high watermarks and you may lose (skip) messages.

I am in this crappy situation right now, sitting on option 1 and praying that the jobs don't restart until we figure out a solution :(

@edenhill
Contributor Author

There are two workarounds:

  • Shut down the consumers, reset the offsets (--to-offset ..) using the offset tool, then restart the consumers. They should resume from the reset offset position (no application logic needed).
  • Set auto.offset.reset=error and handle the resulting consumer error by seeking/assigning to a specific offset.
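The first workaround can be sketched with Kafka's stock offset tool, kafka-consumer-groups.sh (the --reset-offsets option is available since Apache Kafka 0.11). The group, topic, partition and offset values below are placeholders; the consumer group must be inactive for the reset to succeed:

```shell
# 1. Stop all consumers in the group first -- the reset only works
#    on an inactive group.

# 2. Dry-run to preview what would change (placeholder names/values):
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic:0 \
  --reset-offsets --to-offset 12345 --dry-run

# 3. Apply the reset for real:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-group --topic my-topic:0 \
  --reset-offsets --to-offset 12345 --execute

# 4. Restart the consumers; they resume from offset 12345.
```

This is an operational sketch, not something librdkafka does for you; on Kafka versions older than 0.11 the tool (or its protocol) may not support the reset, as noted in the next comment.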

@AlexeyRaga
Contributor

Oh, thanks, something to think about!
The offset reset tool doesn't work for me. It errors out with something like "protocol v2 vs. protocol v1", probably because we are still on Kafka 0.10.x.
But the second one could be an option.

@DavidLiuXh

@AlexeyRaga I wrote a reset-offset tool myself with librdkafka for Kafka 0.9.0.1; with it you basically don't need to restart a large number of clients.

@edenhill
Contributor Author

@DavidLiuXh Can you share that tool?

@DavidLiuXh

@edenhill I need a little time

@DavidLiuXh

@AlexeyRaga
Contributor

@edenhill has anything changed about this issue? This just hit us really hard: a couple of partitions that were receiving data infrequently suddenly lost their offsets...

@edenhill
Contributor Author

@AlexeyRaga This is still on the backburner, let's look into it after the v1.0.0 release.

@nick-zh
Contributor

nick-zh commented May 7, 2020

@edenhill any update on this?

@edenhill
Contributor Author

This is too big a change (risk-wise) to go into v1.5; we will address it after that release.

@AndrewKostousov

@edenhill One more ping regarding this issue.

We also have this annoying case with topics to which data is rarely written. The consumer runs 24/7, but after a restart it sometimes begins processing these topics from the beginning (due to the offsets.retention.minutes setting).

Re-committing the current offsets even when no new messages arrive would be a perfect solution for us.
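The retention arithmetic behind that idea can be sketched in a few lines of Python. This is a hypothetical helper, not part of librdkafka: it decides when an idle consumer should re-commit an unchanged offset so the broker's offsets.retention.minutes timer is refreshed before it expires (the `safety_factor` name and 0.5 default are illustrative choices):

```python
# Hypothetical helper (not part of librdkafka): re-commit once more
# than safety_factor of the broker's offset-retention window has
# elapsed since the last successful commit, so offsets never expire
# on an idle partition.

def should_recommit(last_commit_s, now_s, retention_minutes, safety_factor=0.5):
    """Return True when a fresh commit is needed to keep offsets alive."""
    retention_s = retention_minutes * 60
    return (now_s - last_commit_s) >= retention_s * safety_factor

# With the old broker default of offsets.retention.minutes=1440 (24 h),
# this re-commits after 12 h of silence -- well before expiry.
assert should_recommit(0, 13 * 3600, 1440) is True   # 13 h idle -> re-commit
assert should_recommit(0, 11 * 3600, 1440) is False  # 11 h idle -> still safe
```

A consumer loop would call this periodically and re-issue the last committed offsets when it returns True; with the current dedup check in librdkafka such a commit is dropped, which is precisely what this issue asks to change.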
