---
layout: "@/layouts/global.astro"
title: To WASM, or not to WASM
author: kixelated
description: Have our benevolent W3C overlords allowed us to use Rust in the browser yet?
cover: "/blog/to-wasm/duck.jpeg"
date: 2024-10-24
---

# To WASM, or not to WASM

I'm losing sleep over whether the web client should be written in Rust or TypeScript.
I need your opinion, my beautiful little rubber duckies.

<figure>
![Rubber Ducky](/blog/to-wasm/duck.jpeg)
<figcaption>you irl</figcaption>
</figure>

## Frontend

But first, I'm going to spew my own opinion.
The UI layer will absolutely be written in TypeScript using a flavor-of-the-month web framework.
I'm not going to use [Yew](https://yew.rs/) or any other React clone written in Rust.

Why?
It's pretty simple: I need frontend contributors.
I don't think it's fair to ask a frontend savant to learn Rust and deal with the restrictions imposed by the language.
Unpopular opinion: a UI should be flashy and not _safe_.

Additionally, JavaScript web frameworks are more mature and widely used in production.
I'm not suggesting that you can't use a Rust frontend library, just that they won't be nearly as polished or feature complete.

"@kixelated ur dumb" if you disagree.

My plan is to create a `<moq-karp>` [custom element](https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_custom_elements) that abstracts away the underlying implementation; similar to the `<video>` tag.
Then you can use whatever frontend framework you want to interface with the UI-less element.
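To make that concrete, here's roughly how I picture consuming it from plain TypeScript; the element name is real, but the attribute is a placeholder for illustration:

```typescript
// Hypothetical usage: the element has no UI of its own, so plain DOM (or any framework) can drive it.
const player = document.createElement("moq-karp");
player.setAttribute("src", "https://relay.example.com/demo"); // attribute name is a placeholder
document.body.appendChild(player);
```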

What is up for debate is the language used to power the media experience behind this custom element.
And yes, there's going to be a LOT of links to MDN in this post.

## Background

Before we get started, let's talk about the code that has already been written.

- [moq-rs](https://github.com/kixelated/moq-rs) is a Rust library that handles networking and media containers.
- [moq-js](https://github.com/kixelated/moq-js) is a TypeScript library that does networking, media containers, encoding/decoding, and capture/rendering. It also contains this blog post because splitting the library from the website was too difficult for me.
- [moq-wasm](https://github.com/kixelated/moq-wasm) is a Rust proof-of-concept that uses WebAssembly to decode and render media. It barely works.

The decision that needs to be made is whether to continue with `moq-js` or switch to `moq-wasm`.
Can you help me decide?

<figure>
![Slide](/blog/to-wasm/slide.png)
<figcaption>
[One of my
slides](https://docs.google.com/presentation/d/1GNJhmuUIzT_8vhR2QZJG6FBUMjuY-HW1DIqa5qedW00/edit?usp=sharing)
from [Demuxed 2024](https://2024.demuxed.com/). Shame you couldn't make it.
</figcaption>
</figure>

## Threads

I know "thread" is a spooky word to Javascript developers, but the library will _need_ to run in multiple threads:

1. [Main](https://developer.mozilla.org/en-US/docs/Glossary/Main_thread) Thread
2. [WebWorker](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API) Thread
3. [AudioWorklet](https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet) Thread

The **Main** Thread needs to initialize the **WebWorker** and **AudioWorklet** of course.
Unfortunately, [AudioContext](https://developer.mozilla.org/en-US/docs/Web/API/AudioContext) also requires the main thread for some reason.
This is a bit frustrating because we will occasionally need to reinitialize it too, like when changing the sample rate (or device?).

The **WebWorker** Thread will perform any networking, decoding/encoding, and rendering.
This is done via [WebTransport](https://developer.mozilla.org/en-US/docs/Web/API/WebTransport_API), [WebCodecs](https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_API), and [OffscreenCanvas](https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas), respectively.
Our application code doesn't need to be fast because the heavy processing (e.g. encoding) is handled by browser APIs.
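Here's a rough sketch of what that worker pipeline looks like in TypeScript today; the URL, codec string, and "one frame per datagram" framing are all placeholders, not the real moq-js code:

```typescript
// worker.ts — receive (WebTransport) → decode (WebCodecs) → render (OffscreenCanvas).
const transport = new WebTransport("https://relay.example.com/demo");
await transport.ready;

// In practice the OffscreenCanvas is transferred in from the main thread;
// we create one here just to keep the sketch self-contained.
const canvas = new OffscreenCanvas(1280, 720);
const ctx = canvas.getContext("2d")!;

const decoder = new VideoDecoder({
	output: (frame) => {
		ctx.drawImage(frame, 0, 0); // WebCodecs hands us an (often GPU-backed) frame
		frame.close();
	},
	error: (err) => console.error(err),
});
decoder.configure({ codec: "avc1.640028" });

// Feed encoded chunks from the network straight into the decoder.
const reader = transport.datagrams.readable.getReader();
for (;;) {
	const { value, done } = await reader.read();
	if (done) break;
	decoder.decode(new EncodedVideoChunk({ type: "key", timestamp: 0, data: value }));
}
```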

Finally, the **AudioWorklet** Thread is a nightmare.
In order to render audio with minimal latency, you use an **AudioWorklet** that runs on the audio thread.
This is completely sandboxed and triggers a callback every ~2ms to request the next 128 audio samples.

## Main Thread

Let's cover the easiest decision first.

The small amount of code backing the custom element should be written in TypeScript.
It will initialize the **WebWorker**, **AudioContext**, and **AudioWorklet**.
The custom element API will be a simple shim that performs a [postMessage](https://developer.mozilla.org/en-US/docs/Web/API/Worker/postMessage) to the worker.

The benefit of TypeScript here is that it's simple.
It also means we can offload the slow WASM initialization to the background.
In theory, we could even establish the connection to the MoQ server while the WASM module is loading.
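Here's a sketch of the shim I have in mind; the element name is real, but the message shapes and file names are made up:

```typescript
// main.ts — the thin TypeScript shim behind <moq-karp>.
class MoqKarp extends HTMLElement {
	static observedAttributes = ["src"]; // placeholder attribute

	// The worker (and the WASM inside it, if we go that route) starts loading immediately.
	#worker = new Worker(new URL("./worker.ts", import.meta.url), { type: "module" });
	#audio = new AudioContext();

	async connectedCallback() {
		// AudioWorklet modules have to be registered from the main thread.
		await this.#audio.audioWorklet.addModule(new URL("./render-worklet.ts", import.meta.url));

		// Hand the worker a canvas so it can render without ever touching the DOM again.
		const canvas = document.createElement("canvas");
		this.appendChild(canvas);
		const offscreen = canvas.transferControlToOffscreen();
		this.#worker.postMessage({ type: "init", canvas: offscreen }, [offscreen]);
	}

	attributeChangedCallback(name: string, _old: string | null, value: string | null) {
		// The element's "API" is just messages; the worker does the real work.
		this.#worker.postMessage({ type: "attribute", name, value });
	}
}

customElements.define("moq-karp", MoqKarp);
```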

## WebWorker Thread

Now this decision is trickier.

WASM in a WebWorker makes a lot of sense because the worker is in a sandbox already.
The API you use to communicate with a WebWorker is very similar to the API you'd use to communicate with a WASM module.

My biggest concern is the performance.
WASM will undoubtedly be _slower_ even when written in the 🦀 language.

Why?
Well, one thing you can do with WebWorkers but can't do with WASM yet is [zero-copy transfers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects).
That means every byte received over the network (WebTransport) would need to be copied into WASM memory and then immediately copied back out to decode (WebCodecs).
We might even need to perform _another_ round-trip copy in order to render each frame, although fortunately I think this can be avoided thanks to WebCodecs' hardware offload.
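To make the copy concern concrete, here's the difference in a sketch; `demux_frame` and the wasm module are hypothetical stand-ins for whatever wasm-bindgen would generate:

```typescript
import { demux_frame } from "./moq_wasm"; // hypothetical wasm-bindgen output

// Staying in JS: a transfer list *moves* the buffer between threads, copying nothing.
function sendToWorker(worker: Worker, bytes: Uint8Array) {
	worker.postMessage({ type: "frame", data: bytes.buffer }, [bytes.buffer]);
	// bytes.buffer is now detached here: ownership moved, zero bytes copied.
}

// Going through WASM: a &[u8] argument gets copied into WASM linear memory on the way in,
// and the demuxed payload gets copied back out before WebCodecs can touch it.
function sendToWasm(bytes: Uint8Array) {
	const payload = demux_frame(bytes); // copy in...
	return payload; // ...and copy back out
}
```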

But, am I being paranoid and over-optimizing?
Almost certainly.
There are other reasons to keep the worker in TypeScript too: more contributors and faster startup times.
Not to mention that interfacing with JavaScript from Rust is still pretty dreadful as there are some things, like callbacks, that don't translate well.

But if written in Rust, then much of the code would be shareable with native apps.
We would still use native APIs to perform networking, encoding, and rendering, but the logic would be the same.
Surely there's justification for a more complicated but reusable web library?

Another "@kixelated ur dumb" in chat please.

## AudioWorklet Thread

Finally, my least favorite component: _audio_.
I barely understand how beeps and boops are encoded into bytes and samples.
But I do know how to output audio samples on the web, and it's painful.

<figure>
![Spooky](/blog/to-wasm/spooky.jpg)
<figcaption>
You have been spooked by the **Pumpkin of AudioWorklet**. Recite the words "I dare not touch it" three times or
you will be 👻🎃 **CURSED** 🎃👻 with accelerated toenail growth.
</figcaption>
</figure>

To achieve minimum latency, you need to create an AudioWorklet that runs on a dedicated audio thread.
You can think of it like a WebWorker but even more sandboxed.
This worklet runs a [callback](https://developer.mozilla.org/en-US/docs/Web/API/AudioWorkletProcessor/process) every ~2ms to gather the next 128 audio samples.
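For reference, this is the shape of that callback; a do-nothing sketch that just outputs silence:

```typescript
// render-worklet.ts — runs inside the AudioWorkletGlobalScope, not the worker or main thread.
class Renderer extends AudioWorkletProcessor {
	process(_inputs: Float32Array[][], outputs: Float32Array[][]): boolean {
		// Called once per 128-sample render quantum. We have to fill the output *now*;
		// blocking or allocating here causes audible glitches.
		for (const channel of outputs[0]) {
			channel.fill(0); // silence for this sketch; the real samples come from a ring buffer
		}
		return true; // keep the processor alive
	}
}

registerProcessor("renderer", Renderer);
```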

So we first need to get decoded audio samples to this AudioWorklet.
There is a [postMessage](https://developer.mozilla.org/en-US/docs/Web/API/AudioWorkletProcessor/port) API but apparently the lowest latency approach is to code a ring buffer... in JavaScript.
I'm not kidding, you use [SharedArrayBuffer](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer) and [Atomics](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics) in a language that doesn't even have a dedicated integer type.
Fun fact: `number` is secretly a `float64`, so math gets wonky above 2^53 unless you use a [BigInt](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt).
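For the curious, this is roughly what that ring buffer looks like; a bare-bones single-producer/single-consumer sketch, not the actual moq-js code. Both sides receive the same SharedArrayBuffers once via postMessage, and the indices fit comfortably in an Int32Array so no BigInt is needed:

```typescript
// A minimal SPSC ring buffer: the worker writes decoded samples, the worklet reads 128 at a time.
export class RingBuffer {
	#state: Int32Array; // [0] = read index, [1] = write index
	#samples: Float32Array;

	constructor(state: SharedArrayBuffer, samples: SharedArrayBuffer) {
		this.#state = new Int32Array(state); // 8 bytes: two Int32 indices
		this.#samples = new Float32Array(samples);
	}

	// Worker side: append decoded samples, dropping anything that doesn't fit.
	write(src: Float32Array) {
		const cap = this.#samples.length;
		const r = Atomics.load(this.#state, 0);
		const w = Atomics.load(this.#state, 1);
		const free = cap - ((w - r + cap) % cap) - 1;
		const n = Math.min(src.length, free);
		for (let i = 0; i < n; i++) this.#samples[(w + i) % cap] = src[i];
		Atomics.store(this.#state, 1, (w + n) % cap);
	}

	// Worklet side: fill one render quantum, padding any underrun with silence.
	read(dst: Float32Array) {
		const cap = this.#samples.length;
		const r = Atomics.load(this.#state, 0);
		const w = Atomics.load(this.#state, 1);
		const n = Math.min(dst.length, (w - r + cap) % cap);
		for (let i = 0; i < n; i++) dst[i] = this.#samples[(r + i) % cap];
		dst.fill(0, n);
		Atomics.store(this.#state, 0, (r + n) % cap);
	}
}
```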

Right now, we render the samples unmodified.
However, in the future, we'll need to perform some audio processing for features like echo cancellation and AI voice changers (gotta raise funding somehow).
This part seems miserable to do in JavaScript as it involves [Uint8Array](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array) and [DataView](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/DataView), which are surprisingly terrible APIs.
I'm sure there are audio processing libraries written in Rust or C/C++ that we could leverage instead.

From my limited research, it does seem possible to [run WASM inside of an AudioWorklet](https://developer.chrome.com/blog/audio-worklet-design-pattern/).
However, **SharedArrayBuffer** and **Atomics** are not yet supported in WASM so some additional latency is unavoidable.
But I imagine `postMessage` only incurs a few milliseconds at most, and it likely doesn't matter when the network jitter buffer is already closer to 100 milliseconds.

In fact, this might be a problem turned solution, because SharedArrayBuffer requires some quite annoying [cross-origin isolation headers](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer#security_requirements) that would be nice to remove.
This requirement might be lifted in the future, but I wouldn't hold my breath.
It's also just a huge headache to interface with a fixed-size SharedArrayBuffer given how unevenly data arrives over the network.

So I'm definitely leaning towards Rust here.
"@kixelated ur dumb"

## Decision Time

I'm leaning towards Rust for the WebWorker and AudioWorklet threads, and using TypeScript for the UI and Main thread.
What do you think?

<figure>
![Thonk](/blog/to-wasm/thonk.png)
<figcaption>thonking</figcaption>
</figure>

Good luck leaving a comment on this blog post.
I haven't bothered to code that feature yet.
So instead [join the Discord](https://discord.gg/FCYF3p99mr).
You already know what to type.

<img src="/blog/kixelCat.png" class="inline w-16" />
