Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add worker api #36

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft

Add worker api #36

wants to merge 8 commits into from

Conversation

jebeck
Copy link
Contributor

@jebeck jebeck commented Mar 5, 2020

Here's a very minimal proof-of-concept for an additional build that exposes an async API on the model of the normal UMAP API but putting all the compute in a Web Worker. So usage is like this:

import { UMAP } from "umap-js/lib/umap-worker-js";

const umap = new UMAP();
// even .fit() becomes async bc of the Web Worker message-passing behind the scenes
const embedding = await umap.fit(data); 

On this approach, I think it will be possible to mirror all of the normal UMAP API, though with some functionality exceptions. For example, while it should be possible to get iteration feedback with some additional messages from the Worker for fitAsync(), we can't actually pass a callback through to a Worker (functions are not structured clone-able), so terminating the compute won't be possible with a return false; from a callback, though we could make a slightly different API for it based on a threshold epochNumber.

The obvious alternative here is to just expose a UMAPWorker with its own separate API defined as types of messages you can pass to the Worker and receive back, but I thought it'd be interesting to try the approach here that's easier on the user since it doesn't require users to implement message handlers and posting, though again it comes with the downside of a nearly but not quite identical API...

Would love to hear your thoughts! And no sweat if you're not that interested in adding this to published umap-js. I've been working for a while on a React wrapper around UMAP (and t-SNE, originally) to make it easier to drop a UMAP viz into a React app, and if you're not a fan of this approach, I'll probably do it anyway there either as the implementation details behind the custom hook that'll be the main export or even as an additional export API as well.

Some comments inline because the code deserves a few to clarify the currently-hackiest bits 🙃

case 'fit': {
const umap = new UMAP();
const embedding = await umap.fit(args[0]);
// @ts-ignore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeScript is incorrectly seeing the postMessages here as window.postMessages instead of Worker.postMessages and thus wanting a targetOrigin 2nd argument. This is almost certainly related to the @ts-ignore required to used the worker-loader on the module (see comment below). Getting the Worker to bundle via Webpack with TypeScript was the hardest part of this. I've bundled Workers into libs many times with the worker-loader and not had any problems in plain JS but Googling around confirms that this is indeed a common issue. (I'm also not sure how cross-browser friendly this bundling approach will be, either, since I'm not sure it works without { fallback: false, inline: true } and that may exclude some browsers...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had to deal with this once before and I wound up opting just to redefine a type for postMessage on the window object.... kind of equally hacky but at least you get type hints on that method. As far as the browser compatibility goes I think we can cross that bridge when we get there.

@@ -0,0 +1,59 @@
import { v4 as uuid } from 'uuid';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all wedded to pulling in this dep to keep track of promises, just part of the proof-of-concept nature of what's currently in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally, I'd prefer to keep this lib as dependency-free as possible (mainly for google-internal compatibility reasons) - I usually just use a start-at-one-and-increment id generating system for stuff like this.

}
}

const worker = new UMAPWorker();
Copy link
Contributor Author

@jebeck jebeck Mar 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a worker instance should be in the constructor of the UMAPWorkerWrapper but I was failing to type it correctly (probably related to the @ts-ignore to get the import working) and moved it out to move forward with the proof-of-concept 🙃

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -0,0 +1,59 @@
import { v4 as uuid } from 'uuid';

// @ts-ignore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried many variations on worker-loader's advice for using it with TypeScript, but it never bundled correctly (didn't fail out during build, but the bundle would error in my example in the browser). Only ignoring all advice and using the config I've used for other projects with this @ts-ignore got it working...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is this issue open on the worker-loader, though that problem isn't exactly what I was seeing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.... sounds awful and I'm not at all opposed to just circumventing TS if it winds up causing too much trouble.

Copy link
Member

@cannoneyed cannoneyed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fantastic. As far as the other methods / callbacks go, it's just a matter of setting up the worker wrapper to accept a callback and then setting up the communication channel between it and the worker... I did this once before in this (languishing) PR, in case you want to check out any strategies (ignore the wild-hacky script stringifcation stuff, it's all to satisfy some truly archaic tensorboard build restrictions):

tensorflow/tensorboard#2173

This all looks very good and I think you're very much on the right track using worker-loader. Can't wait to see the rest of it working!

case 'fit': {
const umap = new UMAP();
const embedding = await umap.fit(args[0]);
// @ts-ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had to deal with this once before and I wound up opting just to redefine a type for postMessage on the window object.... kind of equally hacky but at least you get type hints on that method. As far as the browser compatibility goes I think we can cross that bridge when we get there.

@@ -0,0 +1,59 @@
import { v4 as uuid } from 'uuid';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally, I'd prefer to keep this lib as dependency-free as possible (mainly for google-internal compatibility reasons) - I usually just use a start-at-one-and-increment id generating system for stuff like this.

@@ -0,0 +1,59 @@
import { v4 as uuid } from 'uuid';

// @ts-ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.... sounds awful and I'm not at all opposed to just circumventing TS if it winds up causing too much trouble.

}
}

const worker = new UMAPWorker();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jebeck
Copy link
Contributor Author

jebeck commented Mar 12, 2020

Ack, sorry for the slow reply! Glad we're basically on the same page; will keep chipping away at this and let you know when I've got substantially more progress to look at!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants