This repository has been archived by the owner on Feb 20, 2023. It is now read-only.
Add support for aliases in GROUP BY. Add Postgres-style date_part.
Description
Add support for aliases in GROUP BY. Add Postgres-style date_part.
Note that the Postgres parser rewrites EXTRACT(YEAR FROM some_timestamp)
as date_part('year', some_timestamp).
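For anyone who wants to see the shape of that rewrite, here is a minimal Python sketch. It is not the Postgres parser or this PR's code: `rewrite_extract` and the Python `date_part` are hypothetical helpers, the regex only handles the simple `EXTRACT(field FROM expr)` form without nested parentheses, and only a handful of fields are supported.

```python
import re
from datetime import datetime

# Hypothetical pattern: matches only EXTRACT(field FROM expr) where expr
# contains no parentheses. A real parser works on the grammar, not on text.
_EXTRACT = re.compile(
    r"extract\s*\(\s*(\w+)\s+from\s+([^()]+?)\s*\)", re.IGNORECASE
)


def rewrite_extract(sql: str) -> str:
    """Rewrite EXTRACT(field FROM expr) as date_part('field', expr)."""
    return _EXTRACT.sub(
        lambda m: f"date_part('{m.group(1).lower()}', {m.group(2)})", sql
    )


# Assumption: a small subset of fields, enough to illustrate the idea.
_FIELDS = ("year", "month", "day", "hour", "minute", "second")


def date_part(field: str, ts: datetime) -> float:
    """Toy date_part: return the requested field of ts as a float."""
    name = field.lower()
    if name not in _FIELDS:
        raise ValueError(f"unsupported field: {field}")
    return float(getattr(ts, name))


print(rewrite_extract("SELECT extract(YEAR FROM o_entry_d) AS l_year"))
print(date_part("year", datetime(2021, 6, 2)))
```

The field name is lowercased during the rewrite so that `YEAR` and `year` end up calling the same function.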
Note that the optimizer gives us a bad OutputSchema when the GROUP BY expression
is on an aggregated column of the table, so this does not fix chbenchmark. It is currently
unclear if this is primarily a problem with the binder, optimizer, or translator.
Reasons:
- Translator: This is where the output schema kaboom happens.
- Optimizer: This produced the output schema. Maybe it never accounted for `select abs(a) as x ... group by x`.
- Binder: But maybe the optimizer would be fine if the binder wasn't swapping out expressions. Quite literally, the `group by x` child becomes the expression `abs(a)` in the above example. This seems to work for ORDER BY, fwiw.
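The alias swap described in the binder item can be sketched roughly as follows. This is an illustration only, not the binder's actual code: `SelectItem` and `resolve_group_by` are hypothetical names, and expressions are plain strings instead of expression trees. The sketch assumes the Postgres convention that a GROUP BY name matching an input column wins over an output alias.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class SelectItem:
    expr: str        # expression text, e.g. "abs(a)"
    alias: str = ""  # optional output alias, e.g. "x"


def resolve_group_by(
    select_list: List[SelectItem],
    group_by: List[str],
    input_columns: Set[str],
) -> List[str]:
    """Replace GROUP BY terms that name a SELECT alias with the aliased expression.

    A term that is also an input column is left untouched, so input columns
    take precedence over output aliases.
    """
    aliases = {item.alias: item.expr for item in select_list if item.alias}
    resolved = []
    for term in group_by:
        if term not in input_columns and term in aliases:
            resolved.append(aliases[term])  # alias -> underlying expression
        else:
            resolved.append(term)           # plain column or expression
    return resolved


select_list = [SelectItem("abs(a)", "x"), SelectItem("count(*)")]
print(resolve_group_by(select_list, ["x"], input_columns={"a"}))
```

After this step the GROUP BY clause holds `abs(a)` rather than `x`, which is the substitution the optimizer may not be accounting for.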
This is an example of the OutputSchema produced by the optimizer for the following query:
SELECT n_name,
       extract(YEAR FROM o_entry_d) AS l_year,
       sum(ol_amount) AS sum_profit
FROM item,
     stock,
     supplier,
     order_line,
     oorder,
     nation
WHERE ol_i_id = s_i_id
  AND ol_supply_w_id = s_w_id
  AND MOD((s_w_id * s_i_id), 10000) = su_suppkey
  AND ol_w_id = o_w_id
  AND ol_d_id = o_d_id
  AND ol_o_id = o_id
  AND ol_i_id = i_id
  AND su_nationkey = n_nationkey
  AND i_data LIKE '%bb'
GROUP BY n_name,
         l_year
ORDER BY n_name,
         l_year DESC;
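The semantics this query relies on, grouping and ordering by a SELECT alias, can be checked on a toy table. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine that accepts aliases in GROUP BY. The table `t` and its contents are made up for illustration, and nothing here exercises this PR's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(-1,), (1,), (2,), (-2,), (3,)])

# Group by the alias x ...
by_alias = conn.execute(
    "SELECT abs(a) AS x, count(*) FROM t GROUP BY x ORDER BY x DESC"
).fetchall()

# ... and by the underlying expression; both should give the same groups.
by_expr = conn.execute(
    "SELECT abs(a) AS x, count(*) FROM t GROUP BY abs(a) ORDER BY x DESC"
).fetchall()

print(by_alias)
print(by_alias == by_expr)
conn.close()
```

Both statements produce one row per distinct `abs(a)` value, which is the behaviour `GROUP BY l_year` is expected to have in the query above.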
lmwnshn added the in-progress and ready-for-ci labels on Jun 2, 2021.
lmwnshn added the ready-for-review label and removed the in-progress label on Jun 4, 2021.
2 participants
Further work
This is enough to merge on its own; seeing if it gets through CI.