-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] WASM support #272
base: master
Are you sure you want to change the base?
[Draft] WASM support #272
Conversation
I'll consider this one after completing the current TODOs. To anyone interested in this PR, please react to it with 👍. |
2af7b1f
to
2f7889d
Compare
3c6de2a
to
5ac3dd3
Compare
Add wasm support
I've updated the PR to target wasmtime 25, so it is theoretically mergable now. Just for some process documentation, here's how I get the included tests to pass:
This isn't great. An even bigger problem that I'm seeing is that wasmtime is a fast-moving target. In the 2 versions released since I initially put up this PR, the API changed enough that the trampoline had to be updated. This means that there will be a potentially high maintenance overhead for this optional dependency if a fully automated solution can't be developed. |
Yeah this seems like more trouble than it's worth. Do we really need all the trampoline functions? |
So, one of the goals I had was to use wasmtime as an optional dependency, rather than a mandatory one. If we link the native code of py-tree-sitter against wasmtime the normal way, then it isn't going to be an optional dependency because loading the py-tree-sitter native library would fail with missing symbols, even if the user never intends to use the wasm features. So, we cannot link against wasmtime when building the native library, and instead have to use So, in order to properly resolve this, I think we need a fully automatic way of specifying these trampolines. This is easy to do if the library supports it, which wasmtime does not (currently). For example, sqlite extensions work exactly like this, where the sqlite library will populate the trampolines for the extension, and sqlite guarantees ABI compatibility. A second problem: it's clear that wasmtime doesn't care about ABI compatibility between major versions (they changed the width of a pointer in an existing method). I don't know their stance on ABI compatibility within a major version. What I'm realizing now: the only way that dynamic linking against wasmtime can work is for a version with ABI compatibility. This means that even if py-tree-sitter uses wasmtime as an optional dependency, the version of that dependency is going to be tied to the specific version of wasmtime that py-tree-sitter was built against. This is annoying but manageable. However, a potentially simpler solution is to include a complete copy of wasmtime statically linked into the py-tree-sitter library. This would change my proposed API a bit, but more over, means that it cannot be an optional dependency. It would need to be a separate native library and would weigh ~25 MiB because it completely includes wasmtime. It would not have a runtime dependency on any other wasmtime, though. Given all of this, would it be interesting to include a second package in this repository, say, py-tree-sitter-wasm, which could be installed separately and included the wasm features? It could be implemented using Otherwise, it would be a fork of this repository, basically. Or am I missing a better solution? |
Hello, I've been experimenting with what it would take to add wasm support to py-tree-sitter and wanted to get some feedback. Is there interest in merging this feature with py-tree-sitter?
I've made the top-level API similar to what a user of wasmtime would expect it to look like, basically you just pass the wasm module bytes and the engine into a separate constructor for Language (
Language.from_wasm
). After that, the usage is identical to normal Language objects. I've modified Parser to automatically manage the wasm store object, since it doesn't have any configurable parameters.My goal with this change is to keep wasmtime as an optional dependency. Wasmtime's binary wheels are about 5 MiB, whereas tree-sitter's are only 0.5 MiB, so for that reason alone it seems like this niche feature should be optional rather than mandatory. Supporting optional dependencies at compile-time is not supported, so in order to add support I have to create trampoline functions which get populated at runtime by pulling them out of wasmtime's ffi module (
wasmtime.c
handles this).There isn't a good way to land this change until wasmtime v24 lands, because it isn't possible to clone a reference to an Engine before I added a method to do so. Without that ability, there can only be a single language per engine, and only a single parser per language.
So, if there is interest in getting this into the main distribution, I can help close out the remaining issues with this PR. If there isn't, I can fork the project and make a separate variant which supports the functionality. Thanks for your consideration!
In order to land this PR:
Language.from_wasm
,Language.is_wasm
)