Long-lasting queries in cluster may result in Clickhouse server bloat #11

michaelkl · 2017-11-18T21:05:06Z

A heavy Clickhouse request may take too long to finish. Sometimes it even exceeds the HTTPClient connection timeout. Here is what happens in detail:

1. We run a long-lasting query on a Clickhouse cluster, connecting like this: Clickhouse.establish_connection(urls: ['host1.lvh.me:8123', 'host2.lvh.me:8123'])
2. The Clickhouse server starts working hard on the query we issued, but it takes too long.
3. Clickhouse::Connection::Client#request receives a Faraday::TimeoutError exception.
4. Clickhouse::Cluster#method_missing retries the same request on another Clickhouse host from pond's pool, with the same result. If the servers are under really heavy load and the request lasts long enough, we run out of servers in the pool and... come back to the first one, which is still working on the heavy query!
5. Go to step 1.

As a result, each Clickhouse server in the cluster runs the same query over and over again, increasing the load.

The bloat ends only when a server reaches its maximum number of simultaneous queries, at which point Clickhouse refuses to accept another query.
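
The cycle described above can be reproduced without a real cluster. The following is only a minimal sketch under stated assumptions: `SlowHost` and `query_each_host_once` are hypothetical names that are not part of the clickhouse gem, `Timeout::Error` stands in for `Faraday::TimeoutError`, and the "try each host at most once" policy is one possible mitigation, not necessarily what PR #10 does.

```ruby
require "timeout"

# Hypothetical stand-in for a Clickhouse host whose queries always outlive
# the client timeout. Not part of the clickhouse gem.
class SlowHost
  attr_reader :url, :started_queries

  def initialize(url)
    @url = url
    @started_queries = 0
  end

  def query(_sql)
    # The server has accepted the query and keeps working on it,
    # even though the client is about to give up waiting.
    @started_queries += 1
    # Stand-in for Faraday::TimeoutError raised on the client side.
    raise Timeout::Error, "#{url} timed out"
  end
end

# Try every host at most once and re-raise the last timeout instead of
# wrapping around to the first host again.
def query_each_host_once(hosts, sql)
  last_error = nil
  hosts.each do |host|
    begin
      return host.query(sql)
    rescue Timeout::Error => e
      last_error = e
    end
  end
  raise(last_error || ArgumentError.new("no hosts given"))
end

hosts = [SlowHost.new("host1.lvh.me:8123"), SlowHost.new("host2.lvh.me:8123")]
begin
  query_each_host_once(hosts, "SELECT 1")
rescue Timeout::Error => e
  puts "gave up: #{e.message}"
end
# Each host received the query exactly once.
puts hosts.map(&:started_queries).inspect
```

With this policy the number of copies of a slow query is bounded by the number of hosts in the pool. The caller still sees the timeout and can decide whether to retry.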

michaelkl · 2017-11-18T21:06:43Z

@archan937 I've made PR #10 to fix this. It's not a perfect solution, but it does the job. Please take a look.
