Failed parsing #44
Comments
Thanks for reporting this, @GordonSmith. This makes me wonder again whether I could transition Chevrotain to use another regExp parser (one which I do not have to maintain), as more and more regExp features are added over time and I built this project mainly for use in Chevrotain.
I don't know offhand, but I do know I have been using it for several years now.
Out of curiosity, why does Chevrotain need to parse the RegEx? I would have thought using them as a "black box" would have been sufficient.
I didn't mention why this is important! I want to common up my VS Code "syntaxes" regexes with the ones I use in the Chevrotain lexer (with plans to automate the syncing). In VS Code the declaration looks like this (from a JSON file):
While in Chevrotain:
My current plan was to standardize the two to look like this:
At which point I would have some hope of auto syncing...
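As a rough sketch of what consuming a shared pattern on the JavaScript side could look like: the helper below is hypothetical (`toJsRegExp` and `closesAtEnd` are not part of Chevrotain, regexp-to-ast, or VS Code) and assumes the shared pattern string is wrapped entirely in a `(?i:` ... `)` group. In that case it drops the wrapper and sets the `i` flag instead; anything else is passed through unchanged, so a JavaScript engine without inline modifier support would still reject it.

```typescript
// Hypothetical helper, for illustration only.
// Converts a pattern source wrapped entirely in "(?i:" ... ")" into a
// JavaScript RegExp that uses the "i" flag instead of the inline modifier.
function toJsRegExp(source: string): RegExp {
  const prefix = "(?i:";
  if (source.indexOf(prefix) === 0 && closesAtEnd(source)) {
    // Drop the "(?i:" prefix and the final ")".
    return new RegExp(source.slice(prefix.length, -1), "i");
  }
  // Not a whole-pattern wrapper: leave the source untouched.
  return new RegExp(source);
}

// Returns true when the group opened at index 0 is closed by the last character.
function closesAtEnd(source: string): boolean {
  let depth = 0;
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "\\") {
      i++; // skip the escaped character
      continue;
    }
    if (inClass) {
      if (ch === "]") inClass = false; // parentheses inside [...] are literals
      continue;
    }
    if (ch === "[") {
      inClass = true;
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) return i === source.length - 1;
    }
  }
  return false;
}

// Example: the same string could live in the grammar JSON and feed the lexer.
const keyword = toJsRegExp("(?i:begin)");
```

Here `keyword` has the source `begin` with the `i` flag set, so it matches `BEGIN` as well. Inline modifiers that cover only part of a pattern are deliberately out of scope for this sketch.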
It is not mandatory, it is just for optimization purposes: by understanding which characters can match each token pattern, Chevrotain can save quite a lot of time during the lexing phase. See: https://sap.github.io/chevrotain/docs/guide/performance.html#ensuring-lexer-optimizations So I am not certain this issue should be a blocker for you.
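To illustrate the general idea (this is a minimal sketch, not Chevrotain's actual implementation): if the lexer knows which characters can start each token, it only has to try the patterns that can possibly match at the current offset. All names below (`TokenDef`, `startChars`, `buildDispatchTable`, `tokenize`) are hypothetical, and the start characters are written by hand here, whereas the comment above describes deriving that information by analysing the regExp.

```typescript
// Hypothetical token definition for this sketch.
interface TokenDef {
  name: string;
  pattern: RegExp;      // must be sticky ("y") so it only matches at lastIndex
  startChars: string[]; // characters that can begin a match (supplied by hand here)
}

interface Token {
  name: string;
  image: string;
  offset: number;
}

// Group token definitions by the characters that can start them.
function buildDispatchTable(defs: TokenDef[]): Record<string, TokenDef[]> {
  const table: Record<string, TokenDef[]> = {};
  for (const def of defs) {
    for (const ch of def.startChars) {
      const bucket = table[ch] || (table[ch] = []);
      bucket.push(def);
    }
  }
  return table;
}

function tokenize(text: string, defs: TokenDef[]): Token[] {
  const table = buildDispatchTable(defs);
  const tokens: Token[] = [];
  let offset = 0;
  while (offset < text.length) {
    // Only the patterns registered for this character are tried.
    const candidates = table[text.charAt(offset)] || [];
    let matched = false;
    for (const def of candidates) {
      def.pattern.lastIndex = offset;
      const m = def.pattern.exec(text);
      if (m !== null && m[0].length > 0) {
        tokens.push({ name: def.name, image: m[0], offset });
        offset += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error("Unexpected character at offset " + offset);
    }
  }
  return tokens;
}

const exampleDefs: TokenDef[] = [
  { name: "Int", pattern: new RegExp("[0-9]+", "y"), startChars: "0123456789".split("") },
  { name: "Plus", pattern: new RegExp("\\+", "y"), startChars: ["+"] },
  { name: "WS", pattern: new RegExp("\\s+", "y"), startChars: [" ", "\t", "\n"] },
];

const exampleTokens = tokenize("12 + 3", exampleDefs);
```

Without the table, every pattern would have to be tried at every offset. A pattern that cannot be analysed would presumably just lose this shortcut rather than stop working, which fits the point that the analysis is an optimization and not a requirement.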
Interesting - as a potential side project I could see a "chevrotain grammar -> VSCode Language Extension" utility being able to automate a huge % of the grunt work. My gut says there is a disconnect between the grammar and the CST (some of the semantic logic from the parser definition is lost) and that preserving this information in the CstNode would simplify the Visitor pattern somewhat. At the moment it feels like I have to write everything twice (but slightly differently), but if I knew that certain children were "OR" alternatives and what the sequential order of the children was, then I could simply walk the CST with a simpler visit pattern. (Sorry for nattering off topic.)
I think you may be describing something like Xtext.
I have created some editor logic utils specifically for the XML language. I find it hard to imagine how such logic would be generalized to a library. EDIT: you may want to look here: Chevrotain/chevrotain#921
Regarding the CST structure: it is intentionally very simple to allow fast construction and traversal. You may be able to override methods from the tree builder trait to change the CST structure being built. Feel free to share your results.
Re Xtext - yes, but 100% within JS (I actually had some experience with Xtext about 5 years ago, while writing a language extension for Eclipse - the same language I am partially implementing in Chevrotain now...).
Perhaps a prettier plugin is relevant for you, here are a couple of examples using Chevrotain: Although, as you mentioned, your main issue is parsing enough of a not-well-defined language to handle it properly. The Java Parser (in prettier-java) above uses a lot of backtracking as lookahead to stay close to the Java Spec, which is very well defined, just not LL(k)... perhaps such backtracking would be of use to you when handling your difficult grammar.
Specifically the case-insensitive part: ?i: