Very slow detection of ambiguities. #472
Comments
Can you provide a minimal parser that reproduces the infinite loop? While it does seem to make sense to have an ambiguity between range operator in an array |
Hi @bd82, After preparing the test case, I've discovered: With the invalid rule:
With the working rule:
The more rules I add, the more time it takes. |
This makes more sense. But combined with ambiguities and certain grammars, its run time can start growing exponentially. A user can work around this by reducing the maximum lookahead (part of the parser config options), but it is best if this is solved or mitigated by the framework. I will still need a grammar that reproduces the slowness.
|
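As a rough illustration of why lowering the maximum lookahead helps, here is a toy model of the growth. It is only a sketch: it assumes every grammar position can be followed by a fixed number of token types, and `countLookaheadPaths` is a hypothetical helper, not part of Chevrotain or its ambiguity checker.

```javascript
// Hypothetical toy model, not Chevrotain's implementation.
// Assumption: each position may be followed by any of `branching` token types,
// so the number of token sequences of length k is branching^k.
function countLookaheadPaths(branching, k) {
  if (k === 0) return 1; // one empty path
  let total = 0;
  for (let i = 0; i < branching; i++) {
    // each possible next token starts its own subtree of paths
    total += countLookaheadPaths(branching, k - 1);
  }
  return total;
}

for (const k of [3, 4, 5]) {
  console.log(`k=${k}: ${countLookaheadPaths(10, k)} paths`);
}
```

In this model each extra token of lookahead multiplies the number of paths to compare by the branching factor, which is why dropping the lookahead by one can shorten the analysis considerably.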
I've published the test case at https://github.com/daiyam/issues/tree/master/chevrotain/472 I have an old PC, so on a modern PC the slow |
Ok great. I can reproduce it using your example repo. There seem to be a great many ambiguity errors, perhaps another approach |
Just by commenting out lines 123-146 and uncommenting lines 147-185, there is no ambiguity. I hadn't looked at the number of errors. 10-20 errors should be good enough. |
Right, I've debugged this a bit. It is more complicated than I initially assumed.
I would have to refactor the whole ambiguity checking logic (if possible) to enable halting
Another thing I noticed is that when I set the "maxLookahead" option on the parser to 4 (the default is 5), it takes only 800ms to get the parser initialization error instead of 40 seconds. |
If |
Yes. I may even change it to 3 by default. But I still want to free some time for a deeper look into the ambiguity detection algorithm. An alternative approach would be to choose an alternative That way it would be very fast in the common use case of no ambiguities. |
I've played around with this idea. My plan is:
|
To re-enable the old behavior of five tokens of lookahead, pass the optional maxLookahead argument:
class MyParser extends Parser {
    constructor(input) {
        // Explicitly pass the maxLookahead argument with the old default value.
        super(input, ALL_TOKENS, {maxLookahead: 5})
        // grammar rules...
        Parser.performSelfAnalysis(this)
    }
} |
Decided to change the maxLookahead to 4 instead of 3. There is a common pattern of lambda function versus parenthesis expression that requires 4 tokens of lookahead:
// lambda expression, need to read ahead to find the fat arrow ("=>")
(x) => x
// parenthesis expression, identical prefix for the first 3 tokens of the lambda expression.
(x) |
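The four-token requirement can be checked with a small sketch. It assumes the two inputs are tokenized as shown in the arrays below, and `minDistinguishingLookahead` is a hypothetical helper written for illustration, not a Chevrotain API.

```javascript
// Hypothetical helper: the smallest k at which the first k tokens of two
// alternatives differ. Running past the end of one alternative counts as a difference.
function minDistinguishingLookahead(altA, altB) {
  const maxLen = Math.max(altA.length, altB.length);
  for (let i = 0; i < maxLen; i++) {
    if (altA[i] !== altB[i]) return i + 1;
  }
  return Infinity; // identical token sequences: no finite k separates them
}

// Assumed tokenization of the two expressions above.
const lambda = ["(", "x", ")", "=>", "x"];
const parenthesized = ["(", "x", ")"];

console.log(minDistinguishingLookahead(lambda, parenthesized)); // 4
```

With a maximum lookahead of 3 both alternatives look identical, so 4 is the smallest setting that separates them in this model.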
Hi @bd82,
I want to parse a script with array range and array comprehension.
The best syntax would be:
By doing so, I get an infinite loop.
It must be due to the needed backtracking between Expression and Operand.
Anyway, I was able to get a working syntax with: