Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix type coercion between the sides of an UNION #15340

Merged
merged 6 commits into from
Mar 8, 2024

Conversation

systay
Copy link
Collaborator

@systay systay commented Feb 23, 2024

Description

For the following SQL query, Vitess doesn't return the same results as MySQL:

select email, max(col) as max_col from user group by email union select email, avg(col) as avg_col from user group by email order by email desc
Vitess Results:
[VARCHAR("test") DECIMAL(2)]
[VARCHAR("test") DECIMAL(1.5000)]
[VARCHAR("d") DECIMAL(2.0000)]
[VARCHAR("d") DECIMAL(2)]
[VARCHAR("c3") DECIMAL(4.0000)]
[VARCHAR("c3") DECIMAL(4)]
[VARCHAR("c") DECIMAL(4)]
[VARCHAR("c") DECIMAL(4.0000)]
[VARCHAR("b") DECIMAL(3.0000)]
[VARCHAR("b") DECIMAL(3)]
[VARCHAR("Abc") DECIMAL(2.0000)]
[VARCHAR("Abc") DECIMAL(2)]
[VARCHAR("a") DECIMAL(1.5000)]
[VARCHAR("a") DECIMAL(2)]


MySQL Results:
[VARCHAR("test") DECIMAL(2.0000)]
[VARCHAR("test") DECIMAL(1.5000)]
[VARCHAR("d") DECIMAL(2.0000)]
[VARCHAR("c3") DECIMAL(4.0000)]
[VARCHAR("c") DECIMAL(4.0000)]
[VARCHAR("b") DECIMAL(3.0000)]
[VARCHAR("Abc") DECIMAL(2.0000)]
[VARCHAR("a") DECIMAL(2.0000)]
[VARCHAR("a") DECIMAL(1.5000)]

Fixing this bug was way more involved than we initially expected! Basically, we were not properly performing type aggregation nor coercion between the two sides of UNION queries. The results for queries like that were only correct when the two sides of the union have the same types (something which is quite common, which is why we hadn't found out yet), but with different types the results are off.

To fix the issue, this PR:

  • Implements proper type coercion in the evalengine, including changing the precision of DECIMAL when casting from a decimal with a smaller precision.
    • as part of this implementation, the old type coercion API has been removed, and with it several arithmetic APIs that were the only remaining user of the API and that have been superseded by the arithmetic aggregation operators we shipped in Vitess 19. 🔥
  • Implements a new API for incremental type aggregation, instead of the previous API that required collecting all the types to be aggregated in a slice upfront.
  • Implements a new API to convert between evalengine.Type and querypb.Field without losing information
  • Wires up these two new APIs to the UNION/DISTINCT engine primitives, so that we're properly aggregating the types between the two sides of the union.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

Copy link
Contributor

vitess-bot bot commented Feb 23, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsBackportReason If backport labels have been applied to a PR, a justification is required NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Feb 23, 2024
@github-actions github-actions bot added this to the v20.0.0 milestone Feb 23, 2024
@GrahamCampbell
Copy link
Contributor

Needs backporting to 18.0 and 19.0, presumably? Caused by #15192?

@GrahamCampbell
Copy link
Contributor

Are there similar issues with group by, or is this just an issue with distinct?

@vmg
Copy link
Collaborator

vmg commented Feb 28, 2024

@systay OK here's a more-or-less working fix for this (which includes several related cleanups, particularly removing evalengine APIs which we don't need anymore). You'll see that some plans have changed: we're not asking for weight_string to MySQL in several union cases. I think that's a good thing but I'd like to review it with you next week.

@vmg vmg force-pushed the evalengine-cmp-issue branch from 4936a31 to 21e4558 Compare March 7, 2024 16:47
@systay systay requested a review from deepthi as a code owner March 7, 2024 16:47
@vmg vmg added Type: Bug Component: Query Serving Component: Evalengine changes to the evaluation engine and removed NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request NeedsBackportReason If backport labels have been applied to a PR, a justification is required labels Mar 7, 2024
Signed-off-by: Vicent Marti <[email protected]>
Copy link

codecov bot commented Mar 7, 2024

Codecov Report

Attention: Patch coverage is 90.50633% with 15 lines in your changes are missing coverage. Please review.

Project coverage is 65.72%. Comparing base (6c73053) to head (ee0f9b4).
Report is 2 commits behind head on main.

Files Patch % Lines
go/vt/vtgate/engine/concatenate.go 93.75% 3 Missing ⚠️
go/vt/vtgate/engine/distinct.go 86.36% 3 Missing ⚠️
go/vt/vtgate/evalengine/eval_numeric.go 66.66% 2 Missing ⚠️
...o/vt/vtgate/planbuilder/operators/union_merging.go 71.42% 2 Missing ⚠️
go/vt/vtgate/semantics/table_collector.go 77.77% 2 Missing ⚠️
go/vt/vtgate/evalengine/api_type_aggregation.go 95.23% 1 Missing ⚠️
go/vt/vtgate/evalengine/compiler.go 95.00% 1 Missing ⚠️
go/vt/vtgate/evalengine/eval.go 95.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15340      +/-   ##
==========================================
- Coverage   65.72%   65.72%   -0.01%     
==========================================
  Files        1563     1562       -1     
  Lines      194027   193952      -75     
==========================================
- Hits       127529   127468      -61     
+ Misses      66498    66484      -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Vicent Marti <[email protected]>
@vmg
Copy link
Collaborator

vmg commented Mar 8, 2024

Fixing this bug was way more involved than we initially expected! The PR is green now and the planner changes have been reviewed with @systay. Basically, we were not properly performing type aggregation nor coercion between the two sides of UNION queries. The results for queries like that were only correct when the two sides of the union have the same types (something which is quite common, which is why we hadn't found out yet), but with different types the results are off.

Copy link
Member

@GuptaManan100 GuptaManan100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The title and the description need an update

@vmg vmg changed the title DISTINCT bug Fix type coercion between the sides of an UNION Mar 8, 2024
@vmg
Copy link
Collaborator

vmg commented Mar 8, 2024

Updated title and description with an actual list of changes for this PR. ❇️

@vmg vmg merged commit 983a3c8 into vitessio:main Mar 8, 2024
102 checks passed
@vmg vmg deleted the evalengine-cmp-issue branch March 8, 2024 09:57
@systay
Copy link
Collaborator Author

systay commented Mar 8, 2024

Are there similar issues with group by, or is this just an issue with distinct?

We didn't find any such issues, but you are correct in suspecting DISTINCT. It also needs to compare values. ORDER BY that is performed on the vtgate side would also be exposed to the same underlying problem.

This fix solved the issue in the underlying evalengine component, so it should apply to DISTINCT, GROUP BY and ORDER BY.

@vmg
Copy link
Collaborator

vmg commented Mar 8, 2024

@GrahamCampbell ☝️ what @systay said -- sorry I missed your question! The GitHub UI collapsed it. 😅

@dbussink
Copy link
Contributor

dbussink commented Mar 8, 2024

Needs backporting to 18.0 and 19.0, presumably? Caused by #15192?

This is not related to those changes and I don't think we need to backport here.

@vmg
Copy link
Collaborator

vmg commented Mar 8, 2024

Aye this is unrelated to any previous issue, and it's not blocking any users or customers afaict. The report was filed by @systay himself and he found the problem by fuzzing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants