-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[native] Fix bug when parsing SqlFunctionHandle #24059
base: master
Are you sure you want to change the base?
Conversation
CC : @czentgr @aditi-pandit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pdabre12
@@ -271,4 +271,8 @@ class VeloxBatchQueryPlanConverter : public VeloxQueryPlanConverterBase { | |||
}; | |||
|
|||
void registerPrestoPlanNodeSerDe(); | |||
void parseSqlFunctionHandle( | |||
std::shared_ptr<protocol::SqlFunctionHandle> sqlFunction, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use const&
break; | ||
} | ||
auto argumentType = functionId.substr(start + 1, pos - start - 1); | ||
rawInputTypes.push_back(stringToType(argumentType, typeParser)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add VELOX_CHECK that !argumentType.empty()
facebook::presto::parseSqlFunctionHandle( | ||
sqlFunctionHandle, actualRawInputTypes, typeParser); | ||
ASSERT_TRUE(actualRawInputTypes.empty()); | ||
EXPECT_EQ(expectedRawInputTypes.size(), actualRawInputTypes.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need the next 3 lines, since the expectedRawInputTypes is empty ?
std::vector<TypePtr> actualRawInputTypes; | ||
std::vector<TypePtr> expectedRawInputTypes; | ||
TypeParser typeParser; | ||
auto sqlFunctionHandle = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about what is the significance of the event_based_revenue argument in the json if the SQLFunction handle doesn't have any arguments ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I blindly copied the test cases from https://github.com/prestodb/presto/blob/master/presto-native-execution/presto_cpp/presto_protocol/tests/CallExpressionTest.cpp.
I just needed to have my hands on an instance of a SqlFunctionHandle
for the unit test, because the parseSqlFunctionHandle
function had a bug which failed in the parsing of the functionId
string.
Do you think this is a good way to test it? Please let me know if you have any better ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdabre12 : Where do you anticipate this json to be generated ? You can run a query with an aggregate function and pick up the json from the web-console.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will only be generated for functions in the new function namespace manager : #23358.
Below is the json for the array_constructor
function :
"@type" : "call",
"sourceLocation" : {
"line" : 1,
"column" : 74
},
"displayName" : "array_constructor",
"functionHandle" : {
"@type" : "native",
"signature" : {
"name" : "native.default.array_constructor",
"kind" : "SCALAR",
"typeVariableConstraints" : [ ],
"longVariableConstraints" : [ ],
"returnType" : "array(double)",
"argumentTypes" : [ "double" ],
"variableArity" : false
},
"functionId" : "native.default.array_constructor;double",
"version" : "1"
},
The problem is without #23358 landing, native execution does not understand the native
functionHandle type, I suppose I could pick the json but keep the functionHandle type as json_file
as we really only are testing the parsing of the functionId string because that's where the bug manifests, let me know if that sounds okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdabre12 : Yes, that is fine. Thanks.
facebook::presto::parseSqlFunctionHandle( | ||
sqlFunctionHandle, actualRawInputTypes, typeParser); | ||
ASSERT_FALSE(actualRawInputTypes.empty()); | ||
EXPECT_EQ(expectedRawInputTypes.size(), actualRawInputTypes.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make a small inline function to take expectedRawInputTypes and sqlFunctionHandle as inputs and do the parse and compare actualRowInputTypes and expectedRawInputTypes.
This snippet is repeated in all the functions.
"@type": "call", | ||
"arguments": [ | ||
{ | ||
"@type": "variable", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I'm confused since the type of this variable is real, but the SQLFunctionHandle has an integer parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look at the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add to the above comment,
This is the string that was present in the CallExpression unit tests.
std::string str = R"(
{
"@type": "call",
"arguments": [
{
"@type": "variable",
"name": "event_based_revenue",
"type": "real"
}
],
"displayName": "sum",
"functionHandle": {
"@type": "json_file",
"functionId": "json.x4.sum;INTEGER;INTEGER",
"version": "1"
},
"returnType": "double"
}
)"
ASSERT_NE(p.functionHandle, nullptr); | ||
|
||
std::vector<TypePtr> actualRawInputTypes; | ||
std::vector<TypePtr> expectedRawInputTypes{BIGINT(), BIGINT()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a function with all the different input types (like INT, BIGINT, REAL, DOUBLE, VARCHAR etc) to ensure they are all parsed correctly ?
How are complex types handled in this function ? Do you have an example ?
class PrestoToVeloxQueryPlanTest : public ::testing::Test {}; | ||
|
||
TEST_F(PrestoToVeloxQueryPlanTest, parseSqlFunctionHandleWithZeroParam) { | ||
std::string str = R"( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to add a test corresponding to the error that you saw previously that led you to debugging this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the error was very generic.
Whenever we tried to parse any SqlFunctionHandle with zero / multiple params, because of the wrong exit condition it lead to a never-ending loop, for eg: "functionId": "json.x4.sum;"
failed because of zero params, and "functionId": "json.x4.sum;bigint;bigint;"
will fail too.
Resolves: #24077
If release note is NOT required, use: