RegexLexer
Olivier Duhart edited this page Feb 23, 2018
·
3 revisions
This lexer is a poor man's regex-based lexer, inspired by this post. It is not a very efficient lexer: when used, it is the bottleneck of the whole lexer/parser chain. It is, however, really flexible and easy to use.
The idea of a regex lexer is to associate a matching regex with every lexeme. A lexeme therefore takes 3 parameters:
- string regex: a regular expression that captures the lexeme
- boolean isSkippable (optional, default is false): true if the lexeme must be ignored (whitespace, for example)
- boolean isLineending (optional, default is false): true if the lexeme matches a line end (to allow line counting while lexing)
public enum ExpressionToken
{
// float number
[Lexeme("[0-9]+\\.[0-9]+")]
DOUBLE = 1,
// integer
[Lexeme("[0-9]+")]
INT = 3,
// the + operator
[Lexeme("\\+")]
PLUS = 4,
// the - operator
[Lexeme("\\-")]
MINUS = 5,
// the * operator
[Lexeme("\\*")]
TIMES = 6,
// the / operator
[Lexeme("\\/")]
DIVIDE = 7,
// a left parenthesis (
[Lexeme("\\(")]
LPAREN = 8,
// a right parenthesis )
[Lexeme("\\)")]
RPAREN = 9,
// a whitespace
[Lexeme("[ \\t]+",true)]
WS = 12,
// a line ending (skipped, and used for line counting)
[Lexeme("[\\n\\r]+", true, true)]
EOL = 14
}
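To make the three lexeme parameters concrete, here is a minimal sketch of how a regex-driven lexer could consume such definitions. It is not the library's implementation: the names LexemeRule, Token, and RegexLexerSketch are hypothetical, and the strategy (try each regex in declaration order at the current position and take the first match) is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; they are not part of the library.
public sealed class LexemeRule
{
    public string Name { get; }
    public Regex Pattern { get; }
    public bool IsSkippable { get; }
    public bool IsLineEnding { get; }

    public LexemeRule(string name, string regex, bool isSkippable = false, bool isLineEnding = false)
    {
        Name = name;
        // \G forces the match to start exactly at the index given to Regex.Match.
        Pattern = new Regex(@"\G(?:" + regex + ")");
        IsSkippable = isSkippable;
        IsLineEnding = isLineEnding;
    }
}

public sealed class Token
{
    public string Name { get; }
    public string Value { get; }
    public int Line { get; }

    public Token(string name, string value, int line)
    {
        Name = name;
        Value = value;
        Line = line;
    }
}

public static class RegexLexerSketch
{
    // Assumed strategy: rules are tried in order and the first match wins,
    // so more specific rules (DOUBLE) must be listed before general ones (INT).
    public static List<Token> Tokenize(string source, IReadOnlyList<LexemeRule> rules)
    {
        var tokens = new List<Token>();
        int position = 0;
        int line = 1;

        while (position < source.Length)
        {
            LexemeRule matchedRule = null;
            Match match = null;
            foreach (var rule in rules)
            {
                var candidate = rule.Pattern.Match(source, position);
                if (candidate.Success && candidate.Length > 0)
                {
                    matchedRule = rule;
                    match = candidate;
                    break;
                }
            }

            if (matchedRule == null)
            {
                throw new FormatException(
                    "Unexpected character '" + source[position] + "' at line " + line);
            }

            // Skippable lexemes (whitespace, line ends) produce no token.
            if (!matchedRule.IsSkippable)
            {
                tokens.Add(new Token(matchedRule.Name, match.Value, line));
            }

            // Line-ending lexemes advance the line counter.
            if (matchedRule.IsLineEnding)
            {
                line += CountLineBreaks(match.Value);
            }

            position += match.Length;
        }
        return tokens;
    }

    // Counts "\n", "\r\n", and lone "\r" as one line break each.
    private static int CountLineBreaks(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '\n') count++;
            else if (text[i] == '\r' && (i + 1 >= text.Length || text[i + 1] != '\n')) count++;
        }
        return count;
    }
}
```

Trying every regex at every position is what makes this approach simple but slow, which is consistent with the bottleneck noted above. The order of the rules matters in this sketch: with INT listed before DOUBLE, the input 1.5 would be split at the dot instead of being read as one number.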