Quote local parquet CTAS queries #315
@@ -545,9 +545,7 @@ def test_ctas_empty_query_creation(expected, schema, table, cols, types):
 "expected,db_type,schema,table,cols,remote_types",
 [
 (
-"""CREATE EXTERNAL TABLE IF NOT EXISTS `test_athena`.`remote_table` (
-a String,
-b Int
+"""CREATE EXTERNAL TABLE IF NOT EXISTS `test_athena`.`remote_table` ( a String, b Int
 )
 STORED AS PARQUET
 LOCATION 's3://bucket/data/'
@@ -559,9 +557,7 @@ def test_ctas_empty_query_creation(expected, schema, table, cols, types):
 ["String", "Int"],
 ),
 (
-"""CREATE TABLE IF NOT EXISTS local_table AS SELECT
-a,
-b
+"""CREATE TABLE IF NOT EXISTS "local_table" AS SELECT "a", "b"
Comment on lines -562 to +560
So what is the risk you are guarding against in this PR? Like, that a table name wouldn't have a prefix and also be a reserved word? Cause I assume no table with a study prefix could be a problem. Should we instead/additionally enforce that there is a prefix (look for a dunder)?
Specifically, the LOINC groups dataset has a file just named 'Group', and columns named 'group'. I think you're right that, if a dunder was present, this would largely not be an issue. But since this is one of these static dataset situations that you want to use across multiple studies/datasources, I'm inclined to just live with the quoting here?
Ah yeah and the "dundering" there is done via a different namespace. Sure - quoting is fine, was just curious if we should go even further.
 FROM read_parquet('./tests/test_data/*.parquet')""",
 "duckdb",
 "test_duckdb",
Why only quote for duckdb and not all the time?
Athena didn't like that, irritatingly - the serde style queries follow a slightly different set of syntax rules, it seems?
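For readers following the thread, here is a minimal sketch of the behavior under discussion: quote identifiers only on the DuckDB path, so that a table named 'Group' or a column named 'group' does not collide with a reserved word, and leave the Athena path alone. The names `maybe_quote` and `build_local_ctas` are hypothetical and are not taken from this PR; the project builds its queries differently, so treat this as an illustration of the idea and nothing more.

```python
def maybe_quote(name: str, db_type: str) -> str:
    """Double-quote an identifier for DuckDB; return it unchanged otherwise.

    Hypothetical helper for illustration. Embedded double quotes are
    doubled, which is the standard SQL escape inside a quoted identifier.
    """
    if db_type == "duckdb":
        return '"' + name.replace('"', '""') + '"'
    # Assumption based on the review thread: other backends (Athena here)
    # are left unquoted.
    return name


def build_local_ctas(table: str, cols: list[str], parquet_glob: str) -> str:
    """Build a DuckDB CTAS over local parquet files with quoted names."""
    select_list = ", ".join(maybe_quote(col, "duckdb") for col in cols)
    # Escape single quotes so the path remains a valid SQL string literal.
    path_literal = parquet_glob.replace("'", "''")
    return (
        f"CREATE TABLE IF NOT EXISTS {maybe_quote(table, 'duckdb')} "
        f"AS SELECT {select_list}\n"
        f"FROM read_parquet('{path_literal}')"
    )


if __name__ == "__main__":
    # 'Group' / 'group' mirror the LOINC example from the review thread.
    print(build_local_ctas("Group", ["group", "code"], "./tests/test_data/*.parquet"))
```

Quoting unconditionally on the DuckDB path keeps the helper simple, at the cost of making identifiers case-preserving in the generated SQL text.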