Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ES6 Unicode code point escapes #11

Merged
merged 1 commit into from
Apr 6, 2014

Conversation

mathiasbynens
Copy link
Collaborator

Add support for ES6 Unicode code point escape sequences.

Diff without whitespace changes: https://github.com/jviereck/regjsparser/pull/11/files?w=1 (Some indentation was incorrect)

return createEscaped('unicode', res[1], 1);
} else if (res = matchReg(/^u\{([0-9a-fA-F]{1,6})\}/)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering: Should there be a option-flag to enable ES6 features for the parser? I am happy if this is a global flag, that can/needs to be set on the parser object directly and not for every call to the parse function, as I assume most people will use the same mode (es5/es6) over all calls to parse.

Does that sounds reasonable to you?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it should just work. If you’re authoring in ES5, there won’t be any \u{xxxxxx}-style escape sequences anyway. If you’re authoring in ES6 there might be, but I don’t see why we’d need a flag to parse them. For opt-in Unicode regex features, we already have the u flag in the input anyway.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

@jviereck
Copy link
Owner

Thanks a lot Mathias. I need to take a closer look at this tomorrow and read up on the ES6 unicode points/gramma changes but I have the feeling you know what you're doing ;)

@termi
Copy link
Contributor

termi commented Mar 21, 2014

Hi folks! First of all, thx for the great work!
But I have to ask you: What the status of this branch?
Recently I needed to the es6 RegExp parser and the only one a have found is the one that present in this branch. But, unfortunately it's incomplete. So I had to do my own version of this branch. The reason for this:

  1. Lack of Unicode surrogate support. change1 and change2. I am not sure that I am did it in a right way, but it works for me. Example:
parse('[\uD83D\uDCA9-\uD83D\uDCAB]', 'u').terms[0].classRanges[0].min
Object {type: "escape", name: "codePoint", value: "1F4A9", from: 1, to: 3, raw: "uD83DuDCA9"}

As you can see I am combined first unicode code point with second surrogate value. But only for RegExp with 'u' flag.
2. Incorrect from value as well as 'raw' value for 'characterClassRange' change3. I am sure this is a bug.
3. Due #3 still not closed I had to add my own option change4

@jviereck
Copy link
Owner

@termi, thanks for your comment! Would you mind to make a new PR with the changes 1-4 you have mentioned such that I can pull them in? Would really be happy to have them in the core library :)

@termi
Copy link
Contributor

termi commented Mar 25, 2014

@jviereck I'll do it in one of these days

@termi
Copy link
Contributor

termi commented Mar 27, 2014

@jviereck #13

return createEscaped('unicode', res[1], 1);
} else if (res = matchReg(/^u\{([0-9a-fA-F]{1,6})\}/)) {
// RegExpUnicodeEscapeSequence (ES6 Unicode code point escape)
return createEscaped('codePoint', res[1], 3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it should be

return createEscaped('codePoint', res[1], 4);

to calculate the correct values of 'from' and 'raw' like in #12

Unicode surrogate pair support | ClassRange.from fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants