GenericLexerCallbacks

Olivier Duhart edited this page Oct 10, 2024 · 10 revisions

Generic Lexer Callbacks

Sometimes it is useful to discriminate tokens after the lexer has scanned them. For example, Prolog distinguishes variable and atom identifiers based on the first letter (upper case for variables, lower case for atoms). GenericLexer does not allow multiple patterns for identifiers, so we can hook into the lexer's output before it is sent to the parser and subtype tokens as needed.

Defining callbacks

All callbacks are static methods of a single class. Each callback takes a Token<IN> as its parameter and returns a Token<IN>. Every callback method is tagged with the TokenCallback attribute. This attribute has one parameter: the int value of the token enum member the callback applies to (sadly, C# does not allow generics on attributes).

public enum PrologTokens
{

	[Lexeme(GenericToken.Identifier)] 
	IDENTIFIER = 1,

	VARIABLE = 2,

	ATOM = 3

}

public class PrologTokensCallbacks
{

	[TokenCallback((int)PrologTokens.IDENTIFIER)]
	public static Token<PrologTokens> TranslateIdentifier(Token<PrologTokens> token)
	{
		if (char.IsUpper(token.Value[0]))
		{
			token.TokenID = PrologTokens.VARIABLE;
		}
		else {
			token.TokenID = PrologTokens.ATOM;
		}
		
		return token;
	} 
	
}
Linking the lexer to callbacks

Once the callbacks are defined, we can link the lexer enum to its callback class. This is done with the CallBacks attribute on the lexer enum, which takes the type of the callbacks class as its parameter.

[CallBacks(typeof(TestCallbacks))]
public enum CallbackTokens
{
		...
}
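
To make the linking idea concrete, here is a minimal, self-contained sketch of how an attribute on an enum can point to a callback class whose tagged static methods are then discovered by reflection. It is not csly's actual implementation: the `Token<T>` class, both attribute classes, `DemoTokens`, `DemoCallbacks` and `CallbackResolver` are hypothetical stand-ins declared here so the sample compiles without the library.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-ins, declared locally so the sketch compiles on its own.
public class Token<T> where T : struct
{
    public T TokenID { get; set; }
    public string Value { get; set; }
}

[AttributeUsage(AttributeTargets.Enum)]
public class CallBacksAttribute : Attribute
{
    public Type CallbacksClass { get; }
    public CallBacksAttribute(Type callbacksClass) => CallbacksClass = callbacksClass;
}

[AttributeUsage(AttributeTargets.Method)]
public class TokenCallbackAttribute : Attribute
{
    public int EnumValue { get; }
    public TokenCallbackAttribute(int enumValue) => EnumValue = enumValue;
}

// Hypothetical token enum, linked to its callback class by the attribute.
[CallBacks(typeof(DemoCallbacks))]
public enum DemoTokens { WORD = 1, SHOUT = 2 }

public class DemoCallbacks
{
    // Retags an all-upper-case WORD as SHOUT; other tokens pass through unchanged.
    [TokenCallback((int)DemoTokens.WORD)]
    public static Token<DemoTokens> Retag(Token<DemoTokens> token)
    {
        if (token.Value == token.Value.ToUpperInvariant())
            token.TokenID = DemoTokens.SHOUT;
        return token;
    }
}

public static class CallbackResolver
{
    // Maps each tagged enum value to the static method that handles it.
    public static Dictionary<int, Func<Token<T>, Token<T>>> Resolve<T>() where T : struct
    {
        var callbacks = new Dictionary<int, Func<Token<T>, Token<T>>>();
        var link = typeof(T).GetCustomAttribute<CallBacksAttribute>();
        if (link == null) return callbacks; // enum has no callback class

        var methods = link.CallbacksClass.GetMethods(BindingFlags.Public | BindingFlags.Static);
        foreach (var method in methods)
        {
            var tag = method.GetCustomAttribute<TokenCallbackAttribute>();
            if (tag == null) continue; // untagged methods are ignored
            callbacks[tag.EnumValue] =
                (Func<Token<T>, Token<T>>)method.CreateDelegate(typeof(Func<Token<T>, Token<T>>));
        }
        return callbacks;
    }
}

public static class Program
{
    public static void Main()
    {
        var callbacks = CallbackResolver.Resolve<DemoTokens>();
        var token = new Token<DemoTokens> { TokenID = DemoTokens.WORD, Value = "HELLO" };
        token = callbacks[(int)token.TokenID](token);
        Console.WriteLine(token.TokenID); // prints "SHOUT"
    }
}
```

Keying the lookup on the int value mirrors the int parameter of the TokenCallback attribute described above.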
Full example for Prolog identifier management
[CallBacks(typeof(PrologTokensCallbacks))]
public enum PrologTokens
{
	[Lexeme(GenericToken.Identifier)] 
	IDENTIFIER = 1,

	VARIABLE = 2,

	ATOM = 3
}

public class PrologTokensCallbacks
{

	[TokenCallback((int)PrologTokens.IDENTIFIER)]
	public static Token<PrologTokens> TranslateIdentifier(Token<PrologTokens> token)
	{
		if (char.IsUpper(token.Value[0]))
		{
			token.TokenID = PrologTokens.VARIABLE;
		}
		else {
			token.TokenID = PrologTokens.ATOM;
		}
		
		return token;
	} 
	
}
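
The effect of the callback on individual tokens can be checked in isolation. The sketch below applies the same first-letter rule to a few sample identifiers. It does not reference the csly package: `Token<T>` is a hypothetical stand-in reduced to the two members the callback uses, the lexer attributes are omitted, and the sample words are made up for illustration.

```csharp
using System;

// Hypothetical stand-in for the library's token type (only TokenID and Value).
public class Token<T> where T : struct
{
    public T TokenID { get; set; }
    public string Value { get; set; }
}

// Same members as the page's enum; lexer attributes omitted in this sketch.
public enum PrologTokens { IDENTIFIER = 1, VARIABLE = 2, ATOM = 3 }

public static class PrologTokensCallbacks
{
    // Upper-case first letter means VARIABLE, anything else means ATOM.
    public static Token<PrologTokens> TranslateIdentifier(Token<PrologTokens> token)
    {
        token.TokenID = char.IsUpper(token.Value[0])
            ? PrologTokens.VARIABLE
            : PrologTokens.ATOM;
        return token;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (var text in new[] { "X", "likes", "Mary", "wine" })
        {
            // Every identifier starts out tagged IDENTIFIER, as the lexer would emit it.
            var token = new Token<PrologTokens> { TokenID = PrologTokens.IDENTIFIER, Value = text };
            token = PrologTokensCallbacks.TranslateIdentifier(token);
            Console.WriteLine($"{text} -> {token.TokenID}");
        }
    }
}
```

Run as written, this prints `X -> VARIABLE`, `likes -> ATOM`, `Mary -> VARIABLE` and `wine -> ATOM`, so the parser only ever sees VARIABLE or ATOM tokens, never IDENTIFIER.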