Use result entries table #856
Conversation
I think this looks like a good idea. I have some questions though:
- How will this affect the performance of the Test Agent when writing the results?
- How will this affect the performance of the RPCAPI when reading the results and creating JSON for the response?
- Does this have any effect on storage?
- The args column in the new table is still a blob. Why not a separate table for that, where each argument is one record? (See the sketch after this list.)
- This is a breaking change unless there is a migration script. Is that planned?
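For illustration, a fully normalized layout along the lines of that question might look like the following; the table and column names are hypothetical, not something in this PR:

$dbh->do(
    "CREATE TABLE result_entry_args (
        result_entry_id integer NOT NULL,
        name varchar(64) NOT NULL,
        value text,
        FOREIGN KEY ( result_entry_id ) REFERENCES result_entries ( id )
    )"
);
# Hypothetical sketch only: each message argument becomes its own row
# instead of being packed into a JSON blob.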
I will also do some tests.
I need to run some tests, but I currently have some issues when running batches on develop.
It should not affect the performance; the RPCAPI was already decoding JSON blobs, but I will test that too. As for the get_test_history method, we should see the performance improve, since we no longer need to parse / grep up to 200 JSON blobs.
Again, I haven't done extensive testing yet.
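For illustration, with a dedicated table a get_test_history-style lookup can filter in SQL instead of decoding blobs in Perl. A hedged sketch, where the "level" column name is an assumption:

# Count entries at or above a given level per test, entirely in SQL;
# previously this required decoding up to 200 JSON blobs in Perl.
my $sth = $dbh->prepare(
    "SELECT hash_id, COUNT(*) AS nb_entries
       FROM result_entries
      WHERE level >= ?
      GROUP BY hash_id"
);
$sth->execute( $numeric{ERROR} );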
I thought of totally splitting the args into a separate table.
Yes, I wanted to have some early feedback first.
I changed the implementation to do only one query to the database instead of one for each log entry; this way the performance is mostly unaffected. (Actually, a fair share of the time taken to insert the results into the database is spent in the grep that filters the log entries to a given minimum log level. This overhead can be reduced by avoiding Moose in the Logger::Entry packages. I have a working POC here that I am planning to integrate into the engine.)
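For reference, a minimal sketch of the single-query idea, assuming a DBI handle and entry objects with module / tag / args accessors (accessor and column names are illustrative, not necessarily the exact ones in this PR):

use JSON::PP qw( encode_json );

# Build one multi-row INSERT for all entries instead of issuing one
# statement per log entry.
sub insert_entries_in_one_query {
    my ( $dbh, $hash_id, @entries ) = @_;
    return unless @entries;

    # One "( ?, ?, ?, ? )" placeholder group per entry.
    my $rows = join ', ', ( '( ?, ?, ?, ? )' ) x @entries;
    my @bind = map { ( $hash_id, $_->module, $_->tag, encode_json( $_->args ) ) } @entries;

    $dbh->do( "INSERT INTO result_entries ( hash_id, module, tag, args ) VALUES $rows", undef, @bind );
}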
I will review again when the conflicts are resolved.
Module name, testcase ID and message tag will never reach the length of 255 characters. We could easily specify that they should be a maximum of 32, 32 and 64 characters, respectively. How much would we gain? If I understand "character varying" correctly, it can handle Unicode characters. Today these three codes have names in ASCII. Could we gain something by using a column type for strings of 8-bit characters?
varchar stores the data as length + data (without padding), so there won't be any performance gain from restricting the length of strings beyond the difference in the strings' lengths. What could be done is to store them as an enum or as a separate table + foreign key, but that can be done later, as I want to keep the number of changes brought by this PR to a minimum.
The database is encoded in utf8, so there is no overhead for ASCII characters; they are still stored as 8-bit bytes.
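To record the idea for later, a hedged sketch of the lookup-table variant (table and column names are illustrative, and this is intentionally not part of this PR):

# Store each distinct message tag once; the same pattern would apply
# to module and testcase.
$dbh->do(
    "CREATE TABLE message_tags (
        id integer AUTO_INCREMENT PRIMARY KEY,
        name varchar(64) NOT NULL UNIQUE
    )"
);
# result_entries would then carry a "tag_id integer NOT NULL" column
# with "FOREIGN KEY ( tag_id ) REFERENCES message_tags ( id )" instead
# of repeating the tag string on every row.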
I think this looks fine. We should improve things by having database-engine-independent documentation of the database structure, to ensure that we get consistency between the engines.
This looks really good to me. Though Travis is unhappy. I only have a couple of questions/suggestions.
$dbh->do(
    "CREATE TABLE result_entries (
        id integer AUTO_INCREMENT PRIMARY KEY,
        hash_id VARCHAR(16) not null,
Did you consider changing this into CHAR(16)? (Here as well as for the other database adapters.)
@@ -170,9 +173,10 @@ sub run {
        }
    }

-    $self->{_db}->test_results( $test_id, Zonemaster::Engine->logger->json( 'INFO' ) );
+    my @entries = grep { $_->numeric_level >= $numeric{INFO} } @{ Zonemaster::Engine->logger->entries };
Did you consider filtering the entries before we even buffer them, as opposed to before we store them? I'm thinking we could save some memory that way.
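A minimal, self-contained sketch of that idea, not the actual Zonemaster::Engine::Logger API: drop entries below the minimum level at buffering time so they never occupy memory.

package My::FilteringBuffer;
use strict;
use warnings;

sub new {
    my ( $class, $min_level ) = @_;
    return bless { min_level => $min_level, entries => [] }, $class;
}

# Only keep entries at or above the configured level; everything else
# is discarded instead of being buffered.
sub add {
    my ( $self, $entry ) = @_;
    push @{ $self->{entries} }, $entry
        if $entry->numeric_level >= $self->{min_level};
    return;
}

1;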
@blacksponge, I think this is interesting. If you e.g. test all domain names under a TLD, you might want to extract all domain names getting a specific message tag, or all message tags with a certain level, for each domain name. Today you have to open the JSON blob for each domain name. What are your plans? Comment 2023-07-13: Today I am better informed, and it appears that both MySQL and PostgreSQL have good support for extracting data from the JSON blob as if it were fields in a database table.
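For example, a hedged sketch assuming the old test_results.results blob, MySQL 5.7+ syntax, and a placeholder tag:

# MySQL: find all tests whose results array contains a given tag,
# without decoding any JSON in Perl. "SOME_TAG" is a placeholder.
my $sth = $dbh->prepare(
    "SELECT hash_id FROM test_results WHERE JSON_CONTAINS( results, ? )"
);
$sth->execute( '{"tag":"SOME_TAG"}' );

# PostgreSQL equivalent, assuming the column is (castable to) jsonb:
#   SELECT hash_id FROM test_results
#    WHERE results::jsonb @> '[{"tag":"SOME_TAG"}]';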
Replaced by #1092. The logic is kept; this is mainly a rebase to fix the conflicts.
Purpose

Move the test_results.results JSON array into a dedicated table.

Context

Was briefly mentioned at the last group meeting (2021-09-01).

Changes

- Adds a new table, result_entries.
- The get_test_result and get_test_history methods are modified to use the new table.
- DB::test_result is now only a getter; all write operations use the new database methods add_result_entry and add_result_entries (see the sketch below).
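A hedged illustration of the new write path; the exact method signatures are assumptions based on the description above:

# Filter the buffered log entries (as in the diff above), then store
# them through the new method. Whether it takes a list or an arrayref
# is an assumption here.
my @entries = grep { $_->numeric_level >= $numeric{INFO} }
    @{ Zonemaster::Engine->logger->entries };
$self->{_db}->add_result_entries( $test_id, \@entries );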
How to test this PR

It should work the same way as before.