- Where: zoom.us
- When: October 22, 16:00-17:00 UTC
- Location: link on calendar invite
- Contact:
- Name: Dan Gohman
- Email: [email protected]
Note: This meeting will be hosted by the subgroup co-chair Sam Clegg (@sbc100)
None required if you've attended before. Email Dan Gohman to sign up if it's your first time. The meeting is open to CG members only.
The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.
- Opening, welcome and roll call
- Please help add your name to the meeting notes.
- Please help take notes.
- Thanks!
- Review of action items from previous meeting(s).
- Vote to move wasi-nn to stage 2; see witx here.
- Character encodings
- A presentation on: Summarizing several current discussions around character encodings
- For details, see the individual issues:
- WebAssembly/wasi-filesystem#5 "case-sensitivity of filesystem apis"
- https://github.com/WebAssembly/WASI/issues/8 "Specify UTF-8 as the character set"
- WebAssembly/gc#145 "GC story for strings?"
- Discuss about how to move forward with a WASI Logo. Proposals:
- We would never like to change the current logo.
- We would like to keep the current logo for now, and change it in the future when WASI is more mature. TODO: Ask the risks of changing the logo short-term and see if there is any way to address them. TODO: define what "mature" means (objectively) and how the process will be then
- We would like to start researching on a new logo now. TODO: define how the process will be
- Other ideas?
- Sam Clegg
- Thomas Lively
- Martin Duke
- Syrus Akbary
- Pat Hickey
- Tanya Crenshaw
- Alon Zakai
- Andrew Brown
- Mingqiu Sun
- Alex Crichton
- Dan Gohman
- Steven Dabell
- David Piepgrass
- Johnnie Birch
- Martin Duke
- Yong
- Sergey Rubanov
- Arun Purushan
TL volunteers
PH seconds
DG: Next time ??? is in the meeting we will check up on ???
AB: We proposed wasi-nn a while back. Have a witx spec. I’ve been creating a POC to see if this works. When looking at docs, realized we are essentially in stage 2. Moving to phase 2 would reflect reality. Asking for feedback on API.
SC: English spec text requirement doesn’t apply?
DG: Right, we only have witx.
AB: Also generates md, which might count.
SC: Implementation and test writing starts in stage 2. Probably don’t need a full vote. Does anyone have any comments or objections?
PH: I followed it and I’m happy with how it looks.
SC: Unanimous consent achieved to move to stage 2.
AB: I will submit a PR updating which table it is in.
DG: Several threads about encodings going on. Want feedback about direction.
TODO: Add link to slides
SC: Won’t the C program truncate on strcpy?
DG: Yes, but then you’ll get a file not found error.
SC: Where does this ARF encoding come from?
DG: I invented this after a lot of tries.
SA: Is this for filenames only or also for contents?
DG: File names, env vars, commandline args, but not file contents.
SC: Anywhere a known string crosses the WASI boundary?
DG: There might be other places. Where invalid strings should be roundtrippable without trapping. Some APIs trap, others wouldn’t. Some tools will want different things as well.
TL: wasi seems like it may be the wrong layer to solve encoding problems since it is not related to capability-based security. Can we leave this to an optional userspace virtualization layer?
DG: That seems reasonable since virtualization is an option.
PH: When IT describe interfaces, it would be great to use strings for these types, and IT guarantees unicode. ARF lets us use the IT string type.
AC: In windows, filenames are lists of 16-bit values. How do we go from 16-bit to 8-bit strings? We need an extra layer.
DG: We can extend ARF to handle Windows character space as well. ARF would be a little different on Windows and Linux.
SC: But there are filesystem out there with different filename encodings. e.g. Shift-JIS in asia.
DG: wasi engines can translate host character encodings into unicode.
AC: What would the users look like? Not raw API users, but e.g. rust stdlib users. Would the entire ARF string be stored in Rust’s Path object?
DG: Two cases: in do nothing case, use all the standard APIs. In ARF encoding case, get string as ARF encoded OSString or similar. WASI will either detect ARF string and not validate or would get file not found. There is a Rust library that can take an OSString and be ARF-aware to do things.
SC: But most programs won’t need to do this because these files don’t exist in practice.
AC: Not sure about this. Rust stdlib would want to do the proper decoding to show better error messages. It seems only extremely low-level C developers would be in the do-nothing case.
SC: Do-nothing only prints first half, right?
AC: Only in C. When there is a pointer and length, the second half gets printed too.
DG: Good point. Don’t have an answer for that.
SC: Curious about prior art.
DG: Perl 6 has UTF8-C8. Uses highest private use code point as an escape char.
AC: Using nul character seems reasonable given that native filesystems already disallow it.
DG: Python uses lone surrogates. Not great because you end up with invalid unicode.
SC: To be clear, status quo is passing through raw bytes from OS with no validation?
DG: Some implementations already have some validation and different behavior.
Slides: https://speakerdeck.com/syrusakbary/wasi-logo-proposal
WASI issue: WebAssembly#3 (comment)
SA: I’m the founder and CEO of Wasmer, want to see what the risks are about moving forward with a different logo and what are the issues with the current logo.
SA: Show we replace the logo at some point?
TL: Agree with identified issues.
SC: Another options would be to fix the problems in the current logo.
TS: It’s clear that the current logo is a draft. I actually like this property because WASI itself is also in a draft stage. For example, the proposed logo uses an office CLI metaphor, which is not what WASI has shaped up to be about.
SA: One idea would be to replace the current logo with a short term logo that fixes some of the issues and is more accessible. Then separately it would be good to organize a contest for a new WASI logo. Back to slides…
TS: It would be good to mention reasons on the issue so others can weigh in.
SA: Will follow up on the issue.
TL: What if we just fix all issues except for wasted space in the current logo?
SA: That would be an improvement, but it would be good to fix the wasted space as well.
SC: Any objections to incremental fix?
PH: Don’t want to spend any more time on this.
SA: I can handle this and release the results with a copyleft license to resolve any other issues.
TS: I think we still need to check what the situation is with IP rights just in case.
SA: I understand those concerns and want to make this as easy as possible.