Return multiple tokens #47

MiSawa · 2022-02-19T08:35:42Z

Sometimes I want a lexer rule to be able to return multiple tokens, e.g. to emit a dummy token so the parser can use it as an end-marker for some syntax. Maybe I should just use Lexer -> Vec<MyToken> and flatten it later, though it would be great if the library supported this directly.
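For reference, the flattening workaround could look roughly like the sketch below. It does not use any of the library's API: `MyToken`, `LexError`, and `flatten_tokens` are hypothetical names, and the lexer is assumed to be any iterator whose items are `Result<Vec<MyToken>, LexError>`.

```rust
// Hypothetical token type; `BlockEnd` stands in for the dummy end-marker.
#[derive(Debug, Clone, PartialEq)]
enum MyToken {
    Ident(String),
    BlockEnd,
}

// Hypothetical error type, only here to keep the sketch self-contained.
#[derive(Debug, PartialEq)]
struct LexError;

/// Turns an iterator that yields a batch of tokens per rule into an
/// iterator that yields one token at a time. Errors are passed through.
fn flatten_tokens<I>(lexer: I) -> impl Iterator<Item = Result<MyToken, LexError>>
where
    I: Iterator<Item = Result<Vec<MyToken>, LexError>>,
{
    lexer.flat_map(|item| -> Vec<Result<MyToken, LexError>> {
        match item {
            // Each token in the batch becomes its own `Ok` item.
            Ok(tokens) => tokens.into_iter().map(Ok).collect(),
            // A failed rule becomes a single `Err` item.
            Err(err) => vec![Err(err)],
        }
    })
}
```

The parser then consumes `flatten_tokens(lexer)` as if every rule returned a single token. The cost is an allocated `Vec` per rule, even for rules that return exactly one token.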


osa1 · 2022-02-20T07:51:19Z

I needed this once, but I don't remember for what and how I worked around not having it.

We probably don't want to return a Vec in all semantic actions, as that would incur a runtime cost for lexers that don't need it. We could use SmallVec<[Token; 1]> to avoid allocation in the majority of cases, but even then the lexer main loop (the Iterator implementation) would have to store the vectors returned by semantic actions, yield their elements while any remain, and continue lexing once the vector is empty. This means next() would be slower whether or not you need to return multiple tokens.
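To make the extra work in next() concrete, here is a minimal sketch of such a buffering loop. It is not the generated lexer code: `Token`, `Buffered`, and the `step` closure (standing in for "run the next semantic action") are hypothetical, and a `VecDeque` is used in place of SmallVec to keep the example free of dependencies.

```rust
use std::collections::VecDeque;

// Hypothetical token type for the sketch.
#[derive(Debug, PartialEq)]
enum Token {
    Num(u32),
    End,
}

/// Iterator that buffers the tokens returned by each step.
/// `step` returns `None` at end of input, otherwise zero or more tokens.
struct Buffered<F> {
    step: F,
    pending: VecDeque<Token>,
}

impl<F> Iterator for Buffered<F>
where
    F: FnMut() -> Option<Vec<Token>>,
{
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        loop {
            // This check runs on every call, even if no rule ever
            // returns more than one token.
            if let Some(token) = self.pending.pop_front() {
                return Some(token);
            }
            // Buffer is empty: run the next step, stopping at end of input.
            let produced = (self.step)()?;
            // An empty batch simply sends us around the loop again.
            self.pending.extend(produced);
        }
    }
}
```

The buffer check and the store into `pending` happen on every call, which is the per-token overhead described above.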

Alternatively, we could provide a compile-time switch for this feature and only do this in lexers that need it.

osa1 added the feature New feature or request label Feb 20, 2022
