Use a separate lexer #836
bjorn3 started this conversation in Large Projects & Plans
Replies: 1 comment 1 reply
-
Hi @bjorn3! I think this is a lovely idea! However, it fits a discussion better than an issue. In our repo, issues are meant for things we are going to fix in a PR; discussions are a place to chat about things we do not yet know whether we want to do, or are not sure how to do.
-
This will make it easier to recover from lexer errors such as a newline in the middle of a quoted string. I think it will also make program repair easier: the repair step would have access to a list of tokens even if the program has a syntax error, instead of being handed a raw string and having to do the lexing itself as necessary. It could also allow the grammar to be agnostic to whitespace by handling all whitespace separation in the lexer.
There is a slight complication, though: lexing for Hedy will need to be context-sensitive because of unquoted strings. AFAIK these can contain words that would be keywords outside of an unquoted string.
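To make the two points above concrete, here is a minimal standalone sketch of a line-based, context-sensitive lexer for a toy Hedy-like language. It is not Hedy's actual grammar or implementation: the keyword set, the `Tok` class, and the token type names are all assumptions made up for illustration. A keyword is only recognised at the start of a line, so the same word inside unquoted text stays plain text, and an unterminated quote becomes an `ERROR` token instead of aborting the whole run.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keyword set for this toy language (not Hedy's real one).
KEYWORDS = {"print", "ask"}


@dataclass
class Tok:
    type: str   # "KEYWORD", "TEXT", "STRING" or "ERROR" (names are assumptions)
    value: str
    line: int


def lex(source: str) -> List[Tok]:
    """Lex line by line. Never raises; problems are reported as ERROR tokens."""
    tokens: List[Tok] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # whitespace-only lines are handled here, not in a grammar
        head, _, rest = stripped.partition(" ")
        rest = rest.strip()
        if head not in KEYWORDS:
            tokens.append(Tok("ERROR", stripped, lineno))
            continue
        tokens.append(Tok("KEYWORD", head, lineno))
        if rest.startswith("'"):
            if len(rest) >= 2 and rest.endswith("'"):
                tokens.append(Tok("STRING", rest[1:-1], lineno))
            else:
                # The newline arrived before the closing quote: emit an ERROR
                # token and resynchronise at the next line.
                tokens.append(Tok("ERROR", rest, lineno))
        elif rest:
            # Unquoted text: keywords appearing in here are just text.
            tokens.append(Tok("TEXT", rest, lineno))
    return tokens


tokens = lex("print hello print world\nprint 'oops\nask 'name?'")
for tok in tokens:
    print(tok)
```

A program repair step could then work on `tokens` directly, for example by looking for `ERROR` tokens that start with a quote, rather than re-lexing the raw source.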
Lark allows custom lexers through the `lexer` argument when constructing the parser. You need to pass a class implementing `Lexer` whose `lex` method accepts the parser input and yields `Token`s. https://lark-parser.readthedocs.io/en/latest/examples/advanced/custom_lexer.html