Wukong meet Cubed #7

Open
TomNicholas opened this issue Nov 14, 2023 · 4 comments

@TomNicholas

Hi @Scusemua - I'm raising this because I couldn't find your email anywhere!

@tomwhite and I have been working on Cubed, which is extremely similar to Wukong in its goals.

If I understand correctly, Wukong aims to create a general-purpose serverless DAG execution framework for data science workloads by building on top of dask.distributed.

Cubed also aims to be a serverless DAG execution framework for data science workloads inspired by Dask, but restricts the problem domain to numpy-like array computations, and does not directly use dask (only borrows some of its API/abstractions). Cubed also uses the cloud-native array storage format Zarr to store state (intermediate arrays) between operations.

Both projects cite PyWren as an inspiration explicitly.

Some of the problems you mention with PyWren are solved by Cubed's approach. In particular, taking the points on this slide of your talk on Wukong: rapid scaling is handled by serverless frameworks like Lithops, excessive data movement is handled by writing to Zarr, and per-function resource limitations are not an issue because each function only needs to process a single chunk.
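
To make the "each function only needs to process a single chunk" point concrete, here is a minimal sketch of the chunk-at-a-time pattern. It is not Cubed's API: the dict-based `store` is a hypothetical stand-in for a Zarr store, the thread pool stands in for a serverless executor such as Lithops, and all function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a Zarr store: (array_name, chunk_index) -> list of values.
store = {}


def write_chunked(name, values, chunk_size):
    """Split values into fixed-size chunks and store each under its own key."""
    n_chunks = 0
    for start in range(0, len(values), chunk_size):
        store[(name, n_chunks)] = values[start:start + chunk_size]
        n_chunks += 1
    return n_chunks


def add_one_chunk(src, dst, idx):
    """One task: read a single chunk, process it, write the result back."""
    chunk = store[(src, idx)]
    store[(dst, idx)] = [x + 1 for x in chunk]
    return len(chunk)  # number of items this task held in memory


def read_all(name, n_chunks):
    """Concatenate the chunks of an array in index order."""
    out = []
    for idx in range(n_chunks):
        out.extend(store[(name, idx)])
    return out


n = write_chunked("a", list(range(10)), chunk_size=4)

# Each chunk is processed by an independent task; no task sees the whole array.
with ThreadPoolExecutor() as pool:
    sizes = list(pool.map(lambda idx: add_one_chunk("a", "b", idx), range(n)))

result = read_all("b", n)
```

Because every task touches exactly one chunk and communicates only through the store, the memory needed per task is bounded by the chunk size rather than by the array size.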

Of possible interest to you:

@TomNicholas
Author

Probably also worth mentioning the existence of Coiled Functions, which is also "serverless Dask"

https://docs.coiled.io/user_guide/usage/functions/index.html?utm_source=medium&utm_medium=parallel-coiled-functions

@Scusemua
Contributor

Hi Tom!

Thank you for reaching out! 🙂 I will take a look at Cubed as well as Coiled Functions, as I'm not familiar with these. I will also share them with my Wukong colleagues. I'll also attend your talk if I've got time to do so.

> If I understand correctly, Wukong aims to create a general-purpose serverless DAG execution framework for data science workloads by building on top of dask.distributed.

Yes, that's correct!

@TomNicholas
Author

> I will take a look

Great!

Talk slides here, and the recording should be posted here soon.

@Scusemua
Contributor

Great, sounds good. Thank you. :) I'll check out the slides + recording!
