jrjackson fails to parse valid JSON in UTF-16 and UTF-32 #72

Open
@yaauie yaauie commented Mar 6, 2019

> 8.1.  Character Encoding
>
>    JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.  The default
>    encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
>    interoperable in the sense that they will be read successfully by the
>    maximum number of implementations; there are many implementations
>    that cannot successfully read texts in other encodings (such as
>    UTF-16 and UTF-32).
>
> -- [RFC7159 §8.1](https://tools.ietf.org/html/rfc7159)

yaauie commented Mar 6, 2019

There are at least two layers of failure here, which is why I'm submitting this as a failing-tests PR instead of proposing a fix.

  1. JrJackson::Json::is_time_string?(Str) raises an exception from the Regexp library if the given string's encoding isn't ASCII-compatible, since the regexp is ASCII.
  2. If we attempt to fall through to UTF-16 or UTF-32, with appropriate regexps built for those cases, parsing fails in very strange ways, re-emitting previously captured yet unrelated tokens.
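
The first failure can be reproduced in plain Ruby without jrjackson. The sketch below is illustrative only: `TIME_LIKE` and `time_like?` are hypothetical stand-ins, not jrjackson's actual regexp or method. It shows an ASCII regexp raising on a UTF-16 string, and one possible workaround that transcodes to UTF-8 before matching, assuming the input bytes are valid in their declared encoding.

```ruby
# Hypothetical stand-in for a time-string pattern; NOT jrjackson's real regexp.
TIME_LIKE = /\A\d{4}-\d{2}-\d{2}/

# Hypothetical helper: transcode strings in non-ASCII-compatible encodings
# (UTF-16, UTF-32) to UTF-8 before matching against the ASCII regexp.
# Assumes the string's bytes are valid for its declared encoding.
def time_like?(str)
  candidate = str.encoding.ascii_compatible? ? str : str.encode(Encoding::UTF_8)
  !(TIME_LIKE =~ candidate).nil?
end

utf16 = "2019-03-06T00:00:00Z".encode(Encoding::UTF_16LE)

# Matching the ASCII regexp directly against the UTF-16 string raises.
raised = begin
  TIME_LIKE =~ utf16
  false
rescue Encoding::CompatibilityError
  true
end

puts "direct match raised: #{raised}"
puts "transcoded match: #{time_like?(utf16)}"
```

Transcoding only sidesteps the regexp error; it does not address the second failure, where the parser itself mishandles UTF-16 or UTF-32 input.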
