I can't seem to find any documentation on this, so I thought I'd try here.
In many grammar specifications, rules like NAME or NUMBER are used. I can see these defined in the Tokens file, but how do I define them myself? Is it safe to do:
identifier: characters_for_an_identifier
Or are there better ways of doing this? Since different languages define what an "identifier" is differently, I'm curious how this is handled and where these rules/tokens are actually defined.
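For concreteness, the kind of grammar fragment I mean looks something like this (made up, but representative; NAME and NUMBER appear as if they were predefined rules):

    atom: NAME | NUMBER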
Tokens like NAME or NUMBER come from the tokenizer, and the parser has no control over them. To change what constitutes an identifier, the tokenizer would have to be changed to produce NAME tokens differently.
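You can see this with the stdlib tokenize module, which is what drives the default behavior. A minimal sketch:

    import io
    import token
    import tokenize

    source = "spam = 42\n"

    # The tokenizer decides what counts as a NAME or a NUMBER before
    # the parser ever sees the input.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (token.NAME, token.NUMBER):
            print(token.tok_name[tok.type], repr(tok.string))
    # -> NAME 'spam'
    # -> NUMBER '42'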
pegen uses the Python tokenizer by default, which has a strict definition of what an identifier is, but you could pass a different tokenizer when instantiating a parser object if you really want to change that.
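For example, something along these lines (a sketch based on the usage pattern in pegen's README; GeneratedParser stands in for whatever parser class pegen generated from your grammar):

    import tokenize
    from pegen.tokenizer import Tokenizer

    with open("input.txt") as file:
        # Swap generate_tokens for your own generator of TokenInfo
        # tuples to change what arrives as NAME, NUMBER, etc.
        tokengen = tokenize.generate_tokens(file.readline)
        tokenizer = Tokenizer(tokengen)
        parser = GeneratedParser(tokenizer)
        tree = parser.start()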
@lysnikolaou I might need a different tokenizer, since the language I'm trying to parse has some unique lexical rules around strings and the like. The language is fully Unicode-aware, so I have that to deal with as well. Are there any examples of overriding/replacing the tokenizer, or should I just look at the default implementation?
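Something like this toy generator is roughly what I have in mind (purely illustrative; it yields tokenize.TokenInfo tuples so the existing parser machinery could consume them, but a real tokenizer for my language would need its own lexical rules):

    import io
    import re
    import token
    from tokenize import TokenInfo

    IDENT = re.compile(r"\w+")  # \w is Unicode-aware in Python 3

    def my_tokens(source):
        # Toy tokenizer: emit a NAME for every \w+ run, then ENDMARKER.
        # A real one would also handle strings, numbers, operators, etc.
        lineno = 0
        for lineno, line in enumerate(io.StringIO(source), start=1):
            for m in IDENT.finditer(line):
                yield TokenInfo(token.NAME, m.group(),
                                (lineno, m.start()), (lineno, m.end()), line)
        yield TokenInfo(token.ENDMARKER, "", (lineno + 1, 0), (lineno + 1, 0), "")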