Blocked queries #52
Comments
Hi, how can you be sure this is a concurrency problem? A simple way to check would be to run random chains of RPCs between nodes in a network and see whether the delays exceed what is expected from serving the events/messages currently being processed. Or do you mean we should add multi-threading support to Splay nodes? That would make developing each node much more complex, and I am not sure the benefit is worth it. Best
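To make "what is expected" concrete, here is a minimal sketch of a baseline model for the random-RPC-chain check. It is not Splay code: it assumes each node is a single-threaded FIFO server with a fixed per-message service time, that all queries start at time 0, and that each hop goes to a uniformly random node. The names `simulate`, `hops` and `service_time` are hypothetical. Latencies measured on a real deployment that sit well above this baseline would point at the runtime rather than at ordinary queueing.

```python
import heapq
import random


def simulate(num_nodes, queries_per_node, hops, service_time, seed=0):
    """Model RPC chains over single-threaded nodes; return {query_id: latency}.

    Assumptions (not taken from Splay): FIFO service per node, fixed
    service time per message, zero network delay, all queries start at t=0.
    """
    rng = random.Random(seed)
    free_at = [0.0] * num_nodes  # time at which each node becomes idle
    events = []                  # (arrival_time, seq, query_id, remaining_hops, node)
    seq = 0
    for qid in range(num_nodes * queries_per_node):
        heapq.heappush(events, (0.0, seq, qid, hops, rng.randrange(num_nodes)))
        seq += 1

    latencies = {}
    while events:
        arrival, _, qid, remaining, node = heapq.heappop(events)
        begin = max(arrival, free_at[node])  # wait while the node is busy
        done = begin + service_time
        free_at[node] = done
        if remaining == 1:
            latencies[qid] = done            # every query started at t=0
        else:
            # Forward the query to another random node for the next hop.
            heapq.heappush(
                events, (done, seq, qid, remaining - 1, rng.randrange(num_nodes))
            )
            seq += 1
    return latencies


if __name__ == "__main__":
    hops, service_time = 5, 0.001
    lat = simulate(num_nodes=60, queries_per_node=50, hops=hops,
                   service_time=service_time)
    ideal = hops * service_time
    worst = max(lat.values())
    print(f"ideal latency {ideal:.4f}s, worst modelled latency {worst:.4f}s")
```

The gap between the ideal latency and the modelled one is the delay that queueing alone explains under these assumptions; only the excess beyond that would need another explanation.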
I'm confident it's a concurrency issue because the same scenario with fewer concurrent queries (and thus fewer overlapping in-flight queries traversing the network) does not produce the behaviour shown in the attached Gantt chart (that is, queries getting slower and slower over time).
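The "slower and slower over time" observation can be quantified so that the high- and low-concurrency runs are compared with a number instead of by eye. The sketch below assumes each run yields a list of `(start_time, duration)` pairs per query; that format and the function name `duration_slope` are hypothetical. It computes the least-squares slope of duration against start time: a clearly positive slope in the high-concurrency run and a slope near zero in the low-concurrency run would support the concurrency explanation.

```python
def duration_slope(samples):
    """Least-squares slope of query duration versus query start time.

    `samples` is a list of (start_time, duration) pairs; the format is an
    assumption about how the Gantt data could be exported.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_start = sum(s for s, _ in samples) / n
    mean_dur = sum(d for _, d in samples) / n
    var_start = sum((s - mean_start) ** 2 for s, _ in samples)
    if var_start == 0:
        raise ValueError("all queries share the same start time")
    cov = sum((s - mean_start) * (d - mean_dur) for s, d in samples)
    return cov / var_start


if __name__ == "__main__":
    # Made-up illustrative numbers, not measurements from the actual runs.
    growing = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
    flat = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]
    print(duration_slope(growing), duration_slope(flat))
```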
This is true, but it would require a redesign that goes beyond what we can do. In particular, we would probably have to move away from Lua. Not sure this is worth the pain. Etienne
There are scenarios where the concurrency level allowed by the Splay runtime is not sufficient.
One of them is the execution of the following protocol: T-Kad, a gossip-based construction of the KAD DHT.
The problematic scenario occurs when the protocol is deployed over a cluster of 600 splayds using exactly 600 nodes. Each node issues 500 queries more or less concurrently.
The attached plot gantt.pdf shows that queries (on the y-axis) get slower and slower over time (longer blue bars along the x-axis).
We might need a simpler test case to identify the problem and possibly optimise the runtime.