I know it's a fair bit after you asked your question, but in terms of including this in the README: are you thinking of an example of dbfs:// reads, or metrics for a speed comparison?
I'm curious about both, but I'm pretty sure this project only extends as far as providing dask cluster management, so I wouldn't (at least currently) expect it to perform differently from regular dask.
Definitely a good idea to include an example. I might find some time to double-check it all works and put in a PR.
Side note: I use fsspec basically every day, and dbfs:// a fair bit, and somehow never realised until now that dbfs:// was an fsspec protocol 😅
Fun project!
I remember nerd sniping @martindurant to work on https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/dbfs.py (https://github.com/fsspec/filesystem_spec/blob/master/fsspec/registry.py#L152) when I was using Databricks a few years ago.
This may be a good test for parallel reads/writes of Parquet files to the Databricks file system. I'm curious whether it gets a speedup compared to S3, for example.
Given that this repo is slim, it could be added to the README once tested.
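For reference, a README example might look roughly like the untested sketch below. It assumes the fsspec `dbfs` implementation takes `instance` and `token` as storage options, that credentials live in the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, and that DBFS paths should be absolute. The helper names (`dbfs_url`, `dbfs_storage_options`, `parquet_roundtrip`) are hypothetical and not part of this project.

```python
import os


def dbfs_url(path: str) -> str:
    """Build a dbfs:// URL for a DBFS path (assumption: paths are absolute)."""
    return "dbfs:///" + path.lstrip("/")


def dbfs_storage_options(env=os.environ) -> dict:
    """Collect the options passed through to fsspec's DatabricksFileSystem.

    Assumption: DATABRICKS_HOST may include a scheme such as https://,
    while fsspec expects the bare instance hostname.
    """
    instance = env["DATABRICKS_HOST"].split("://")[-1].rstrip("/")
    return {"instance": instance, "token": env["DATABRICKS_TOKEN"]}


def parquet_roundtrip(ddf, path: str, env=os.environ):
    """Write a Dask DataFrame to DBFS as Parquet, then lazily read it back."""
    import dask.dataframe as dd  # imported here so the helpers above stay dependency-free

    url = dbfs_url(path)
    opts = dbfs_storage_options(env)
    ddf.to_parquet(url, storage_options=opts)  # one file per partition, written in parallel
    return dd.read_parquet(url, storage_options=opts)
```

Timing `parquet_roundtrip` against the same call with an `s3://` URL would be one way to get at the speed question, though that comparison has not been run here.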