[plsql] Fix for #2589 -- performance improvements #3653
general_element accounts for the majority of the time for long-running/aggregate01.sql. Not sure why yet. Working on getting more information from DiagnosticErrorListener.
What I really need is to parse some input from
@kaby76 thanks!
//| CHAR_STRING_PERL
| NATIONAL_CHAR_STRING_LIT
| DELIMITED_ID
I have a question. The string uses single quotes. Why can it match double quotes?
For example, in SELECT "C1" FROM "T1", the "C1" will be matched as a quoted_string.
DELIMITED_ID was added to quoted_string in order to get examples/create_table.sql to pass. But, the problem is lob_item.
Please open an Issue for tracking. I will adjust the grammar asap.
The fix will move the DELIMITED_ID alt to the lob_item rule.
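A toy sketch of the exchange above, written in Python rather than ANTLR. The regexes only approximate the CHAR_STRING and DELIMITED_ID lexer rules, and the alt sets are simplified assumptions listing only the alts relevant here, not the grammar's actual rules. It illustrates why "C1" is accepted by quoted_string while DELIMITED_ID is one of its alts, and what changes once that alt moves to lob_item.

```python
import re

# Approximate token shapes (assumptions, not the real lexer rules).
TOKEN_PATTERNS = {
    "CHAR_STRING": re.compile(r"'(?:[^'\r\n]|'')*'"),   # single-quoted literal
    "DELIMITED_ID": re.compile(r'"(?:[^"\r\n]|"")+"'),  # double-quoted identifier
}


def token_type(text):
    """Return the name of the first pattern matching the whole text, or None."""
    for name, pattern in TOKEN_PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    return None


# Simplified alt sets per parser rule; other alts are omitted on purpose.
RULES_BEFORE = {"quoted_string": {"CHAR_STRING", "DELIMITED_ID"}, "lob_item": set()}
RULES_AFTER = {"quoted_string": {"CHAR_STRING"}, "lob_item": {"DELIMITED_ID"}}


def rule_accepts(rules, rule, text):
    """True if the single token `text` is accepted by `rule` in this toy model."""
    return token_type(text) in rules[rule]


print(rule_accepts(RULES_BEFORE, "quoted_string", '"C1"'))  # alt present in quoted_string
print(rule_accepts(RULES_AFTER, "quoted_string", '"C1"'))   # alt moved away
print(rule_accepts(RULES_AFTER, "lob_item", '"C1"'))        # alt now lives in lob_item
```

In this model a double-quoted name stops matching quoted_string only after the alt is moved, which is the behavior the reviewer was asking about.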
This is a fix for #2589.
In this PR, I made a few changes to remove some ambiguity in the grammar. The changes improve the performance modestly, so there is more work to be done.
- atom rule. This change involved removing the table_element outer_join_sign and quoted_string alts, which are covered by the general_element outer_join_sign? and constant alts, respectively. This change made some of the examples in long-running/ ~3 times faster for the TypeScript/JavaScript ports.
- quoted_string rule. I removed the alt for variable_name from quoted_string and added an alt for DELIMITED_ID, which in itself is very weirdly named, as a double-quoted string should be called DOUBLE_QUOTED_STRING. It makes no sense that quoted_string would derive a variable name.
- general_element_part. I removed the ('.' id_expression)* syntax because the syntax is used with general_element_part throughout the grammar, creating ambiguity.

In addition, I changed the readme.md file to be standardized, containing links to all the relevant Oracle docs, with the latest version. It's not strictly "performance-related", but I found myself constantly referring back to the Oracle docs for this PR. Note, the docs are necessary reference material, but they are just horrible for finding anything. There are no good searching tools for HTML or PDF, e.g. for "column_alias" or "column alias". The HTML doc would be okay with a text editor, but the damn doc is spread over many HTML files. The webpages for the Oracle docs have a UI for searching, but it knows nothing about searching for strings that contain spaces.
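The kind of ambiguity removed from the atom rule can be illustrated with a small Python sketch. It is a toy model, not the parser: token-type strings stand in for lexer output, the token names ID, DOT, and OJ are made up for the example, and both table_element and general_element are assumed to reduce to a dotted identifier path.

```python
import re

# ID = identifier, DOT = ".", OJ = outer join sign "(+)" (hypothetical token names).
# Both element rules are modeled as dotted identifier paths, a deliberate simplification.
DOTTED_PATH = r"ID(?: DOT ID)*"

ATOM_ALTS_BEFORE = {
    "table_element outer_join_sign": re.compile(DOTTED_PATH + r" OJ"),
    "general_element outer_join_sign?": re.compile(DOTTED_PATH + r"(?: OJ)?"),
}

# After the change, only the general_element alt remains.
ATOM_ALTS_AFTER = {
    "general_element outer_join_sign?": ATOM_ALTS_BEFORE["general_element outer_join_sign?"],
}


def matching_alts(alts, token_types):
    """List every alt whose pattern matches the whole token-type sequence."""
    return [name for name, pattern in alts.items() if pattern.fullmatch(token_types)]


tokens = "ID DOT ID OJ"  # e.g. t1.c1(+)
print(matching_alts(ATOM_ALTS_BEFORE, tokens))  # two alts match: ambiguous
print(matching_alts(ATOM_ALTS_AFTER, tokens))   # a single alt matches
```

When two alts match the same input, the parser has to do extra prediction work to choose between them, which is where the time goes; with one alt left, the choice is immediate.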
Performance
Measured on a Ryzen 7 2700, 16 GB RAM, SSD, Windows 11, .NET SDK 7. Grouped parsing of the 11 files hw-examples/*.sql, repeated 10 times. Calculations performed using ChatGPT4, reported as Mean ± Margin of Error at a 95% confidence level.
raw-data.txt
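For anyone who wants to recompute the statistic from raw-data.txt without ChatGPT, here is a minimal Python sketch. It assumes the margin of error is the usual Student's t interval on the sample mean (the PR does not say which formula was used), and the timings below are hypothetical placeholders, not the PR's measurements.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 9 degrees of freedom (n = 10 runs).
T_CRIT_95_DF9 = 2.262


def mean_with_margin(samples, t_crit=T_CRIT_95_DF9):
    """Return (mean, margin of error) using the sample standard deviation."""
    n = len(samples)
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean, t_crit * std_err


# Hypothetical timings in seconds, NOT taken from raw-data.txt.
runs = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]
mean, margin = mean_with_margin(runs)
print(f"{mean:.2f} ± {margin:.2f} s")
```

The critical value is tied to 10 repetitions; a different number of runs needs the t value for n - 1 degrees of freedom.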