fix scanner and keyword #229

Draft
wants to merge 3 commits into
base: main
7 changes: 5 additions & 2 deletions src/scanner.c
@@ -239,8 +239,11 @@ bool tree_sitter_rescript_external_scanner_scan(
   if (lexer->lookahead == 'n') {
     advance(lexer);
     if (lexer->lookahead == 'd') {
-      // Ignore new lines before `and` keyword (recursive definition)
-      in_multiline_statement = true;
+      advance(lexer);
+      if (is_whitespace(lexer->lookahead)) {
Collaborator

I think it should check against valid identifier characters (a-z, A-Z, apostrophe, $) rather than whitespace, because `and ` (trailing space) is more or less the same as `and(` (trailing lparen) or `and/*comment*/` (a comment right after). What do you think?

Also, you consume an extra character, and I’m not sure that won’t break anything, for example when a comment directly follows the keyword: `and/*comment*/`.

Would you add some tests?
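
A minimal sketch of the identifier-character check suggested above, written against a plain C string rather than the scanner's lexer. `is_identifier_char` and `starts_with_and_keyword` are hypothetical names, not functions from this PR, and treating digits and `_` as identifier characters is an assumption beyond the characters listed above.

```c
#include <stdbool.h>

// Hypothetical helper (not part of this PR): true when `c` could continue
// an identifier. Digits and '_' are an assumption added for illustration.
static bool is_identifier_char(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9') || c == '_' || c == '\'' || c == '$';
}

// True when `s` begins with the keyword `and` and the following character
// does not extend it into a longer identifier such as `andd`.
static bool starts_with_and_keyword(const char *s) {
  if (s[0] != 'a' || s[1] != 'n' || s[2] != 'd') {
    return false;
  }
  return !is_identifier_char(s[3]);
}
```

With this check, `and `, `and(` and `and/*comment*/` all count as the keyword, while `andd[0]` does not.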

Collaborator Author

@aspeddro aspeddro Jun 10, 2023

> I think it should rather check against valid identifier characters (a-z, A-Z, apostrophe, $) rather than whitespace, because `and ` (trailing space) is more or less the same as `and(` (trailing lparen) or `and/*comment*/` (a comment right after). What do you think?

I think we should scan for whitespace, because at this point the scanner is just past a line break (`\n`): if `and` followed by whitespace is found, then `in_multiline_statement` is set to true.

Note that `and` is a keyword, so `and` followed by any character is a syntax error.
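
A minimal sketch of the whitespace check described in this reply, written against a plain C string rather than the scanner's lexer. `is_whitespace_char` and `and_then_whitespace` are hypothetical names; the PR's own `is_whitespace` is not shown in this diff, so the definition here is an assumption.

```c
#include <stdbool.h>

// Assumed definition for illustration; the scanner's real is_whitespace
// helper is not visible in this diff.
static bool is_whitespace_char(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// True only when `s` begins with `and` immediately followed by whitespace.
static bool and_then_whitespace(const char *s) {
  return s[0] == 'a' && s[1] == 'n' && s[2] == 'd' &&
         is_whitespace_char(s[3]);
}
```

Under this check `andd[0]` is not mistaken for the keyword, and `and(` is not treated as the keyword either.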

+        // Ignore new lines before `and` keyword (recursive definition)
+        in_multiline_statement = true;
+      }
     }
   }
 }
13 changes: 12 additions & 1 deletion test/corpus/expressions.txt
@@ -1388,6 +1388,8 @@ Subscript expressions

 myArray[42]
 myObj["foo"]
+andd[0]
+andd[1]

--------------------------------------------------------------------------------

@@ -1400,7 +1402,16 @@ myObj["foo"]
     (subscript_expression
       (value_identifier)
       (string
-        (string_fragment)))))
+        (string_fragment))))
+
+  (expression_statement
+    (subscript_expression
+      (value_identifier)
+      (number)))
+  (expression_statement
+    (subscript_expression
+      (value_identifier)
+      (number))))

================================================================================
Variants