Uni.onFailure().withBackoff().retry() occasionally stalls #1388
Comments
Thanks for your report, we'll have a look in the coming days.
/cc @ozangunalp
I looked briefly yesterday. I was able to reproduce with 2.0.0 and 2.3.1, but not with 2.0.0.milestone1 (before the non-prefetch concatMap) or with 2.5.1 (latest). I'll continue to look today.
I can also reproduce it with the latest version. I see that the issue is with the use of the non-prefetch concatMap for the exponential backoff. However, I can't reproduce any hang within the Mutiny retry tests. @jponge I'll need some help with this one.
Sure, and thanks for the explorations, I'll schedule a call with you.
@ozangunalp, @jponge, is there any update on this?
We need to get back to this. We had a discussion with @ozangunalp but we couldn't pinpoint any obvious culprit. @ozangunalp did a great job putting together a reproducer branch: https://github.com/ozangunalp/smallrye-mutiny/tree/reproducer_retry_exponential_backoff
This fixes race conditions in concatMap and stream concatenation operators. Refs: #1388
Note: test this issue against #1448
Context
One of our unit tests became flaky after we upgraded our app from Quarkus 2.16 to Quarkus 3.2.
Long story short, we tracked this down to Mutiny. With Mutiny up to 2.0.0-milestone1 (inclusive) the test reliably passes. Starting with Mutiny 2.0.0-milestone2 the test sometimes fails.
The test uses Uni.onFailure().withBackoff().retry() to retry a gRPC call until the target service comes up.
Sometimes the retry() doesn't trigger a new upstream subscription as expected, nor does it emit any more downstream events.
The only non-trivial change in Mutiny 2.0.0-milestone2 is this one: #997. It changes the behavior of MultiFlatten.concatenate(), which is used internally in ExponentialBackoff.randomExponentialBackoffFunction(). This looks like the root cause of our problem.
Reproducer
https://github.com/vladykin/mutiny-retry-issue-reproducer
mvn test -Dmutiny.version=... always passes with Mutiny up to 2.0.0-milestone1 (inclusive), but sometimes fails with Mutiny from 2.0.0-milestone2 up to the latest 2.5.1.
Because the failure is intermittent, run the test in a loop to catch it.
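One way to run the test in a loop is a small shell helper along these lines. This is only a sketch, assuming a POSIX shell: the function name run_until_failure and the run count are hypothetical and not part of the reproducer repository, and the Mutiny version in the usage comment is just one of the affected versions mentioned above.

```shell
# Hypothetical helper: rerun a command until it fails or the run budget is used up.
# Usage: run_until_failure <max_runs> <command> [args...]
run_until_failure() {
  max_runs="$1"
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    echo "run $i of $max_runs"
    # Stop at the first failing run so its output stays at the end of the log.
    if ! "$@"; then
      echo "failed on run $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max_runs runs"
  return 0
}

# Example invocation (run count and version are placeholders):
# run_until_failure 100 mvn test -Dmutiny.version=2.5.1
```

Stopping at the first failure keeps the interesting log lines (the last onFailure without a following onSubscribe) easy to find.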
Additional details
In the output from a failed run, note that for Request #11 there is an onFailure, but no subsequent onSubscribe.