Confusing codepoints for Unicode strings #29

Open
jishnub opened this issue Aug 28, 2024 · 1 comment
Labels: enhancement (New feature or request)

Comments

jishnub commented Aug 28, 2024

julia> about("abα")
4-codeunit String (mutable) (<: AbstractString <: Any), occupies 4B directly (referencing 12B in total)
 • Character set: Unicode

 ┌╴'a'╶─┐┌╴'b'╶─┐┌╴'Î'╶─┐┌╴'±'╶─┐ 4 codepoints
 01100001011000101100111010110001

I found the breakdown into Î and ± confusing, as it's unclear how they relate to α.
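For readers following along, here is a minimal sketch of where Î and ± most likely come from. It assumes the display labels each UTF-8 code unit (byte) as if it were a codepoint of its own; the variable names are illustrative only.

```julia
s = "abα"

# Raw UTF-8 code units (bytes) of the string: 0x61 0x62 0xce 0xb1.
units = codeunits(s)

# Treating every byte as a separate codepoint turns 0xce into 'Î' (U+00CE)
# and 0xb1 into '±' (U+00B1), which matches the labels in the output above.
per_byte = [Char(u) for u in units]

# Iterating the string decodes UTF-8 instead and yields three characters.
chars = collect(s)

# Number of code units each decoded character occupies.
widths = [ncodeunits(c) for c in chars]

println(per_byte)   # ['a', 'b', 'Î', '±']
println(chars)      # ['a', 'b', 'α']
println(widths)     # [1, 1, 2]
```

So the string has 4 code units but only 3 codepoints, and the last two labels together make up the single character α.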

tecosaur added the enhancement label Oct 18, 2024
tecosaur (Owner) commented

So, I'd like to show that α spans two codeunits. This is a bit tricky, though, given the way that the display code is implemented.

This string display actually re-uses the dense vector display, and that's built around constant-size elements.
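As a rough sketch of what variable-size grouping could look like, the snippet below pairs each character with the code units it spans. `codeunit_groups` is a hypothetical helper written for illustration; it is not part of the package's actual display code and does not touch the dense vector display.

```julia
# Hypothetical helper: pair each character of a String with its code units.
function codeunit_groups(s::String)
    groups = Tuple{Char, Vector{UInt8}}[]
    for i in eachindex(s)              # valid character start indices only
        stop = nextind(s, i) - 1       # last code unit of this character
        bytes = UInt8[codeunit(s, j) for j in i:stop]
        push!(groups, (s[i], bytes))
    end
    return groups
end

# One line per character, with the bits of every code unit it occupies.
for (c, bytes) in codeunit_groups("abα")
    println(repr(c), " => ", join(bitstring.(bytes), " "))
end
```

With groups like these, a label for α could be drawn across 16 bits rather than two 8-bit cells, at the cost of no longer having constant-size elements.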

PRs welcome, or I'll get to this eventually 🙂
