-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clickhouse: new database engine (experimental) #1094
Conversation
Some installation instructions are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is still marked as "draft". Is it really ready for installation?
There are other changes in the code that seems to be unrelated to Clickhouse.
- I find no installation instructions. Zonemaster-Backend installation document needs to be updated.
- Also the Zonemaster-Backend configuration document needs to be updated.
lib/Zonemaster/Backend/DB.pm
Outdated
(SELECT count(*) FROM result_entries JOIN test_results ON result_entries.hash_id = test_results.hash_id AND level = ?) AS nb_critical, | ||
(SELECT count(*) FROM result_entries JOIN test_results ON result_entries.hash_id = test_results.hash_id AND level = ?) AS nb_error, | ||
(SELECT count(*) FROM result_entries JOIN test_results ON result_entries.hash_id = test_results.hash_id AND level = ?) AS nb_warning, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes related to the addition of support of Clickhouse engine?
There are other changes in this module where it is not obvious that they are related to the Clickhouse change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the changes to DB/Clickhouse.pm
to avoid updating any logic related to the other engines. So this PR can stand by itself an be an experiment that can be reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. That makes it easier to review. I think it looks fine, but Travis does not. Some kind of documentation is needed (installation and configuration).
Yes. I marked it as "ready for review" and updated the description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks fine.
I tried to test, and I successfully installed using the instructions in zonemaster/zonemaster#1228.
2023-12-07T16:27:29Z [622] [NOTICE] [Zonemaster::Backend::Config] Loading config: /home/tgreen/zonemaster/zonemaster-backend/backend_config.ini
2023-12-07T16:27:29Z [626] [NOTICE] [main] Daemon spawned
2023-12-07T16:27:30Z [626] [NOTICE] [Zonemaster::Backend::DB] Connecting to database 'DBI:mysql:database=zonemaster;host=127.0.0.1;port=9004' as user 'zonemaster'
However, as long as #1132 is not merged I can't go through with further testing:
$ ./script/zmtest zonemaster.net
error: method start_domain_test: 500 Internal Server Error at ./script/zmb line 678.
Release testing report – Success, no issues Rocky Linux 9.3/Clickhouse Merged For the unit tests, the procedure was incomplete. Unit tests needed a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I trust the testing from @marc-vanderwal, otherwise I have reviewed the rest and it looks good to me.
@pnax Before this is merged it should be rebased on latest develop (#1132 has now been merged), to be sure that CI passes |
Connections to Clickhouse are made through its MySQL interface. Clickhouse mostly supports standard SQL, however sometimes there are some deviations (for instance to UPDATE a row, or no native support for incremental indexes).
* Make UPDATE synchronous by setting `mutations_sync = 1` within the query, see <https://clickhouse.com/docs/en/operations/settings/settings#mutations_sync> <https://clickhouse.com/docs/en/sql-reference/statements/alter#synchronicity-of-alter-queries> * Manually compute the number of affected rows, because this number is incorrect in Clickhouse and this is a feature, see <ClickHouse/ClickHouse#50970 (comment)>
Clickhouse "batch_id" column is a non-nullable UInt32 to avoid performance issue with nullable column, see <https://clickhouse.com/docs/en/sql-reference/data-types/nullable>. In comparaison the other database engines store an empty or NULL value when no "batch_id" is defined and an integer otherwise. +-------------+-------------+ | no batch_id | batch_id | +------------------+-------------+-------------+ | Clickhouse | 0 | UInt32 >= 1 | | Other DB engines | empty/NULL | integer | +------------------+-------------+-------------+
Clickhouse does not have a UNIQUE constraint per column. Therefore the check for user uniqueness is performed manually.
There is no COMMIT with Clickhouse. However there is no restriction either on the number of bind parameters. Therefore we can pass all domains in one pass via bind parameters. This has been successfully tested with a batch made of 2 millions domains (with Clickhouse only).
To avoid potential memory exhaustion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming it passes CI.
I’m currently testing #1121 with the experimental Clickhouse database engine still enabled and it seems that I’m hitting issues related to the way Clickhouse implements ALTER TABLE … UPDATE that don’t make me feel comfortable at all with using a Clickhouse database backend in production. For example, I get warnings like the following:
That line 362 is the equality comparison of sub test_state {
my ( $self, $test_id ) = @_;
my ( $progress ) = $self->dbh->selectrow_array(
q[
SELECT progress
FROM test_results
WHERE hash_id = ?
],
undef,
$test_id,
);
if ( !defined $progress ) {
die Zonemaster::Backend::Error::Internal->new( reason => 'job not found' );
}
if ( $progress == 0 ) {
return $TEST_WAITING;
}
elsif ( 0 < $progress && $progress < 100 ) {
return $TEST_RUNNING;
}
elsif ( $progress == 100 ) {
return $TEST_COMPLETED;
}
else {
die Zonemaster::Backend::Error::Internal->new( reason => 'state could not be determined' );
}
} But these “argument isn’t numeric” errors should in theory never happen, because the code fetches exactly one row and column and the datum is, according to the database schema, an integer. How do we end up with a string? Grepping
Due to this, many tests either crash with “state could not be determined” or “illegal transition to COMPLETED”. It seems that Clickhouse doesn’t implement proper transaction isolation when updating a table. Yet, currently, the code does many updates to |
Not ready for inclusion yet, not even experimental.
More changes are required, in particular the way the queue is currently done in Backend. It should be separated and take into account the specifies of such a DBMS.
Purpose
Clickhouse is a vertical database engine that appears to be quite efficient when it comes to run Zonemaster with millions of domains
Context
Changes
How to test this PR
Setup a Clickhouse server
zonemaster
database and azonemaster
user:Configure and use Zonemaster-Backend
Adapt the
share/backend_config.ini
file and usezmb
orzmtest
.You could check any results by running some SQL commands with
clickhouse-client
.Run the unit tests
Unit test should work when using Clickhouse.