Connection not released when request is cancelled #140
Comments
We now ensure that no cancel signals interrupt preparation and connection acquisition by using the discardOnCancel(…) operator. [#140] Signed-off-by: Mark Paluch <[email protected]>
I applied a change that ensures cancel signals do not interrupt the allocation sequence of connections. Can you test against |
We repeated the tests against the new version, and the connection still gets stuck.
LOGFILE: https://github.com/jfrossetto/connection-stuck/blob/stress-cancel-pool088/log_connectionStuck2.txt
I'm still analyzing what is going on. My hypothesis right now is that we've uncovered a bigger issue that might span all involved components. I suspect that we've already addressed one issue in R2DBC Pool, but more is involved, such as Spring's transaction management, which can potentially lead to resource leaks when a cancel signal interrupts active flows.
I had a look at the involved components. Both, Spring's I think we did everything we could on the pooling side. Now it's time to tackle the issue within Spring Framework. |
Propagate subscriber context. Allow conditional cancellation propagation. [#140] Signed-off-by: Mark Paluch <[email protected]>
New issue on Spring Framework |
We created a new project without Spring dependencies, and the connection still gets stuck. You can find the source code here: https://github.com/jfrossetto/r2dbc-stress-cancel
The logfile for the tests: |
Bug Report
Versions
Current Behavior
This issue was first opened on r2dbc-postgresql, some discussion can be seen there.
pgjdbc/r2dbc-postgresql#459
We are facing some issues related to stuck connections in the pool.
After some investigation we identified the problem in our search flow. Our frontend has a combobox with a "debounce" that cancels requests while the user keeps typing and sends only one request to our server. When a request is cancelled right after the validation query has run (marking the connection as healthy) and before the real SQL query starts, the connection never gets released. This is a very specific moment in the chain, and it is hard to reproduce.
We found this post, which seems to be related, but the suggested answer doesn't really help, and the posted code sample does not reproduce our problem.
https://stackoverflow.com/questions/68407202/r2dbc-pool-connection-not-released-after-cancel
We could reproduce the behavior by calling any endpoint that fetches data from the database, with a breakpoint placed after the connection is established and before the query starts. We then force the request to cancel and release the breakpoint in our application. This way the connection always gets stuck.
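The timing window described above can be modeled with a minimal, self-contained sketch. This is not r2dbc-pool or Spring code: `ToyPool`, `runRequest`, and the `releaseOnCancel` flag are hypothetical names used only to illustrate how a cancel that lands between validation and the real query can leave a connection checked out if the cancel path does not release it.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CancelWindowSketch {

    /** Hypothetical stand-in for a pool: it only counts checked-out connections. */
    static final class ToyPool {
        private final AtomicInteger inUse = new AtomicInteger();
        void acquire() { inUse.incrementAndGet(); }
        void release() { inUse.decrementAndGet(); }
        int inUse() { return inUse.get(); }
    }

    /**
     * Simulates one request that is cancelled after validation and before the query.
     * Returns the number of connections still checked out afterwards.
     */
    public static int runRequest(boolean releaseOnCancel) {
        ToyPool pool = new ToyPool();
        AtomicBoolean cancelled = new AtomicBoolean(false);

        pool.acquire();          // the pool hands out a connection
        // the validation query ("SELECT 1") would succeed here
        cancelled.set(true);     // the client cancels in exactly this window

        if (cancelled.get()) {
            // The real query never starts. Without an explicit release on
            // this path, the connection stays checked out.
            if (releaseOnCancel) {
                pool.release();
            }
            return pool.inUse();
        }

        // Normal path: run the real query, then release.
        pool.release();
        return pool.inUse();
    }

    public static void main(String[] args) {
        System.out.println("unguarded flow, connections in use: " + runRequest(false));
        System.out.println("guarded flow, connections in use: " + runRequest(true));
    }
}
```

In this toy model the unguarded flow ends with one connection still in use and the guarded flow ends with none, which mirrors the stuck connection observed with the breakpoint.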
The last query run by the connection is always the validation query, in our case a simple "SELECT 1":
The breakpoint was placed in the class DefaultFetchSpec from org.springframework.r2dbc.core, inside the public Flux all() method, but this is not a consistent way to reproduce it.
Another way we were able to reproduce it was by "stressing" the application and simulating several consecutive "network failures" that force the request to cancel. The log file produced by this approach can be found here:
https://github.com/rogtejada/connection-stuck/blob/main/log_connectionStuck.txt
Analyzing the logs, it seems that when a connection gets "stuck", FluxDiscardOnCancel does not show up in the logs.
All stuck connections show SELECT 1 in the query field.
The following stack trace does not show up in our original implementation, but it does when we switch to creating the connection as shown in the pool README.
Stack trace (when creating the connection just as specified on driver README)
Table schema
Input Code
-- your SQL here;
Steps to reproduce
The timing of the request's cancellation seems to be what causes the bug, which makes it hard to reproduce. We have not managed to find a consistent way to reproduce it.
Input Code
Expected behavior/code
I would expect the connection to always be released, no matter at which moment the request gets cancelled.
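The expectation can be stated as an invariant: after every request has finished or been cancelled, the number of checked-out connections is zero. The sketch below checks that invariant on plain threads, assuming a hypothetical `inUse` counter in place of the pool's acquired-connection metric and `Future.cancel(true)` in place of a reactive cancel signal. It illustrates the property only, not how r2dbc-pool implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ReleaseAlwaysSketch {

    // Hypothetical counter standing in for the pool's "acquired" metric.
    static final AtomicInteger inUse = new AtomicInteger();

    /** One request: acquire, do some work, and release on every exit path. */
    static void handleRequest() {
        inUse.incrementAndGet();                 // acquire
        try {
            Thread.sleep(5);                     // validation + real query
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // cancelled mid-flight
        } finally {
            inUse.decrementAndGet();             // release, cancelled or not
        }
    }

    /** Submits requests, cancels every second one, returns connections left in use. */
    public static int stress(int requests) {
        inUse.set(0);
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<?>> futures = new ArrayList<>();
        Runnable task = ReleaseAlwaysSketch::handleRequest;
        for (int i = 0; i < requests; i++) {
            futures.add(executor.submit(task));
        }
        // Cancel at arbitrary points: some tasks are queued, some are running.
        for (int i = 0; i < futures.size(); i += 2) {
            futures.get(i).cancel(true);
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inUse.get();
    }

    public static void main(String[] args) {
        System.out.println("connections still in use: " + stress(50));
    }
}
```

Because the release sits in a `finally` block, the counter returns to zero whether a task completes, is interrupted while running, or is cancelled before it starts.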
Possible Solution
There was an interesting discussion running here
r2dbc/r2dbc-spi#35
Maybe it could help in this problem.
Additional context