-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent async calls timing out #216
Comments
Hi @andoband. Sorry for the slow response, but I've been on vacation and had limited Internet access. The ConfigureAwait fix is likely not the correct solution. It was investigated while looking into issue #130, but, a more sophisticated solution that properly used the .NET synchronization contexts rather than discarding them was eventually developed. Can you please provide a the test project demonstrating the problem that you mention using? That would help looking into this issue. |
No worries on the "slow" response...I appreciate the reply. I've put up the test project here. It's a page with two buttons...one for executing sync code and one for async code. If you run the Sync tests first, then the connection pool fills up with healthy connections. So to reproduce what I'm seeing (hopefully) just push the async button. When pushed it will send 20 requests to the server and report back the results. I saw the changes you made in response to issue #130 using |
Unfortunately, I started tunneling my connection to the rethink server over ssh and I saw the timeouts start happening again...even with the I looked a little harder at Connection.cs, and I didn't find any explicit Having found that out, I moved the Another solution I tried and which worked was to queue the call to ReadLoop onto the ThreadPool using Something else I tried was bypassing ReadLoop altogether and leaving all that thread management to Asp.Net and the .Net Framework. In this case, I commented out the call to ReadLoop and instead just await the response normally after sending the request. In other words, in InternalRunQuery() I got rid of the TaskCompletionSource code, the use of tokenResponse dictionary, the spin lock, and replaced the I suspected that monitoring data changes might need that ReadLoop, so I ran the unit tests and sure enough, I had problems with the RealtimePushTests. But then I tried running the example console app and that worked! So I dunno. I'm connecting over a tunnelled SSH connection to an Azure server, so that might be complicating things. I hope this info helps. |
Hm... interesting analysis. Great investigation work. :-) I don't believe it's 100% accurate to say that "the call to ReadLoop is synchronous". As it is an "async" method, invoking ReadLoop() returns a Task object immediately while the function is executed on the current SynchronizationContext. This behavior is actually questionable, though; depending on the SynchronizationContext, this might not execute the way the code is written to expect. We know exactly how we want ReadLoop to run (in an unaffiliated thread, without any synchronization), so the implicit use of the SynchronizationContext here is probably the source of the problem. Your experiment to remove ReadLoop entirely unfortunately violates the current design of the Connection class; it is currently thread-safe and can execute multiple queries simultaneously, routing the results to the correct task objects when they are available. It certainly would simplify the Connection class to alter that design, but, I don't think it's quite what I'd want to do. I'm thinking the best approach here would be to take control over how the ReadLoop is executed. My first thought is to create one thread per Connection and, once connected, run ReadLoop in that thread synchronously; that should definitely "fix" the problem. But, that will create one thread per connection which certainly isn't required. The second thought is to take an approach similar to RethinkDb.TaskUtilities; free ourselves from the current SynchronizationContext and then spin up ReadLoop, and then restore the synchronization context. This would let ReadLoop be free-threaded regardless of the synchronization context used for creating the connection, which should address the issue. But it worries me a bit because the library is now clearly designed to work around the SynchronizationContext, and doesn't honour it like I would expect a library to. I still haven't tried to reproduce your test case, but I'll try to do that, and then see if the second approach fixes it. If so, I'm marginally comfortable with that solution. |
This is an attempt to address issue #216, where the fact that ReadLoop is executed on the current SynchronizationContext can cause thread locking issues.
So, I've had trouble reproducing your failure case. I got your code to run fine, but the async button just always worked. I tried to increase the number of requests the webpage made from 20 to 200 and then 2000, but, never ran into the issue. I've gone ahead and theorized the fix I'd like to make in a branch called |
I tried out the issue_216 branch and it prevents the errors that I'm seeing. Awesome!! Have you tried reproducing the issue by accessing a remote (across internet) db? That's when I experienced it, and since I don't have any issues when I'm running against a local machine, I'm guessing it's a latency issue. I would still recommend the .Net built-in ConfigureAwait(false) mechanism for solving this, but I'll be happy with any fix. :) Regarding running parallel queries on a connection...I haven't reported this yet, but I can't do that via ASP.Net either. For example:
That code will cause the process to take up all the processor time and never complete nor timeout. I have to kill the process. I haven't investigated why that happens, but at least in the ASP.Net scenario, the connection can only be used for one userland operation at a time. ReadLoop() perhaps? Regardless of these issues Mathieu, thank you sooo much for the work you've put into creating this driver! I wouldn't have given RethinkDB a chance without it. |
I am running up against connection errors (timeouts mostly) when using rethinkdb-net in an ASP.Net MVC app asynchronously.
If I call this from a webpage one-at-a-time, everything works fine, but if I throw a few concurrent calls at it then rethinkdb requests start timing out. When the timeout happens the rethink connection that timed out will fail each time it's used from that point on (sync or async). This only happens when I have the rethinkdb-net connection pool turned on, so it must be something to do with reusing a connection (or the underlying networking).
I set up a test project pinging the MVC action above via 10 concurrent ajax requests using jquery (although jquery or the browser seemed to be sending only around 6 calls simultaneously).
I found a simple workaround was to request the connection synchronously instead of asynchronously, i.e.
ConnectionFactory.Get()
instead ofConnectionFactory.GetAsync()
. I could then successfully callconn.RunAsync(query)
on that connection from then on out.I dug into the rethink-db code and found that putting a ConfigureAwait(false) on Line 200 of Connection.cs solved the problem.
I see in a previous commit (d22f3c4), that you added ConfigureAwaits for all awaits in Connection.cs but then removed them.
Maybe this is one that could be put back in?
Note: I was testing against a RethinkDB instance I set up on Azure (not a local install). Same behavior happened when the website was deployed to Azure as well.
The text was updated successfully, but these errors were encountered: