[Coral-Hive] [Coral-Trino] Make named_struct a Coral IR operator and Migrate GenericProject Function #431
LGTM
@@ -21,7 +21,7 @@
  * If a column, colA, has a RelDataType, relDataTypeA, with a Trino type string, trinoTypeStringA = buildStructDataTypeString(relDataTypeA),
  * then the following operation is syntactically and semantically correct in Trino: CAST(colA as trinoTypeStringA)
  */
-class RelDataTypeToTrinoTypeStringConverter {
+public class RelDataTypeToTrinoTypeStringConverter {
May I know why these three classes, including TrinoMapTransformValuesFunction and TrinoStructCastRowFunction, are converted to public? I don't see any usage of these classes in this PR.
These classes are used in GenericProjectTransformer, for example here. Previously GenericProjectTransformer was in the same package as RelDataTypeToTrinoTypeStringConverter, but now it's moved to another package.
What changes are proposed in this pull request, and why are they necessary?
This PR covers two migrations: (1) named_struct() and (2) generic_project().
[1]
This PR uses code changes from open PR #412 and adds minor modifications on top of it to be compatible with the new API.
Summary from PR #412:
This patch removes the transformation from HiveConvertletTable that converts named_struct to CAST (ROW() AS ROW()). Instead, it makes named_struct a Coral IR operator. Engine translations on the RHS are also adapted to accommodate this change. This also eliminates the need to rewrite from CAST (ROW() AS ROW()) to named_struct on the Spark side, because named_struct is now maintained all along. CastToNamedStructTransformer on the Spark side will be removed in a future PR.
This PR also introduces a Trino transformer, NamedStructToCastTransformer, which converts the Coral IR operator named_struct to its equivalent Trino-compatible operator.
This PR should address #357 and also unblocks migration of the CONCAT operator here #378
[2]
This PR also migrates the Rel transformer GenericProjectToTrinoConverter to a SqlCall transformer, GenericProjectTransformer.
How was this patch tested?
./gradlew build
updated & added UTs
tested with production views for spark, avro, trino