Add support for ES6 Unicode code point escapes #11
return createEscaped('unicode', res[1], 1);
} else if (res = matchReg(/^u\{([0-9a-fA-F]{1,6})\}/)) {
Wondering: should there be an option flag to enable ES6 features for the parser? I would be happy with a global flag that is set on the parser object directly rather than on every call to the parse function, as I assume most people will use the same mode (ES5/ES6) across all calls to parse.
Does that sound reasonable to you?
Ideally it should just work. If you’re authoring in ES5, there won’t be any \u{xxxxxx}-style escape sequences anyway. If you’re authoring in ES6 there might be, but I don’t see why we’d need a flag to parse them. For opt-in Unicode regex features, we already have the u flag in the input anyway.
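For illustration, here is a minimal sketch of how such an escape could be recognized without any extra flag. It reuses the pattern from the diff; the helper name `readCodePointEscape` and its return shape are hypothetical and not part of regjsparser, and the upper-bound check is an assumption rather than something this PR implements.

```javascript
// Hypothetical helper (not regjsparser API): reads an ES6 code point escape
// from the start of `input`, where `input` begins right after the backslash.
function readCodePointEscape(input) {
  // Same pattern as in the diff: "u{", 1 to 6 hex digits, "}".
  var res = /^u\{([0-9a-fA-F]{1,6})\}/.exec(input);
  if (!res) {
    return null; // not a \u{...} escape, e.g. a classic \uXXXX escape
  }
  var codePoint = parseInt(res[1], 16);
  if (codePoint > 0x10FFFF) {
    return null; // assumption: reject values above the Unicode maximum
  }
  // `length` counts the matched characters, excluding the leading backslash.
  return { hex: res[1], codePoint: codePoint, length: res[0].length };
}

var escape = readCodePointEscape('u{1F4A9}');
console.log(escape.hex, escape.codePoint, escape.length);
```

An ES5-style input such as `uD83D` simply does not match the pattern, so the helper returns `null` and a classic `\uXXXX` branch could handle it instead.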
SGTM.
Thanks a lot Mathias. I need to take a closer look at this tomorrow and read up on the ES6 Unicode code point/grammar changes, but I have the feeling you know what you're doing ;)
Hi folks! First of all, thx for the great work!
parse('[\uD83D\uDCA9-\uD83D\uDCAB]', 'u').terms[0].classRanges[0].min
Object {type: "escape", name: "codePoint", value: "1F4A9", from: 1, to: 3, raw: "uD83DuDCA9"}
As you can see, I combine the first (lead) surrogate with the second (trail) surrogate into a single code point, but only for RegExps with the 'u' flag.
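For reference, the combination of a surrogate pair into one code point is the standard UTF-16 arithmetic. The sketch below is only illustrative: the function name `combineSurrogates` is hypothetical and is not taken from the patch discussed here.

```javascript
// Hypothetical helper: combines a lead and a trail surrogate (as numbers)
// into a single code point, or returns null if they do not form a pair.
function combineSurrogates(lead, trail) {
  var isLead = lead >= 0xD800 && lead <= 0xDBFF;
  var isTrail = trail >= 0xDC00 && trail <= 0xDFFF;
  if (!isLead || !isTrail) {
    return null;
  }
  // Each surrogate carries 10 bits; the result is offset by 0x10000.
  return (lead - 0xD800) * 0x400 + (trail - 0xDC00) + 0x10000;
}

var codePoint = combineSurrogates(0xD83D, 0xDCA9);
console.log(codePoint.toString(16).toUpperCase()); // prints "1F4A9"
```

This matches the `value: "1F4A9"` shown in the output above for `\uD83D\uDCA9`.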
@termi, thanks for your comment! Would you mind opening a new PR with the changes 1-4 you mentioned so that I can pull them in? I would be really happy to have them in the core library :)
@jviereck I'll do it one of these days.
  return createEscaped('unicode', res[1], 1);
} else if (res = matchReg(/^u\{([0-9a-fA-F]{1,6})\}/)) {
  // RegExpUnicodeEscapeSequence (ES6 Unicode code point escape)
  return createEscaped('codePoint', res[1], 3);
I guess it should be
return createEscaped('codePoint', res[1], 4);
to calculate the correct values of 'from' and 'raw' like in #12
Unicode surrogate pair support | ClassRange.from fix
Add support for ES6 Unicode code point escape sequences.
Diff without whitespace changes: https://github.com/jviereck/regjsparser/pull/11/files?w=1 (Some indentation was incorrect)