Support for spatial data frames - {sf} package format #112
Comments
Hello, I have the same issue with the … It would be nice if there was some way to inherit the class of the object that is being passed to the cluster. This is a great package, by the way. The … Thanks!!
Definitely second the {sf} package inheritance request. I may be wrong, but multidplyr is an incredible opportunity to make massive computations more efficient.
I would also like to see support for sf (or, more generally, other "specialised" tibble classes). As an aside, to work with sf in a parallel pipe:
grid_sf3 <- grid_sf2 %>%
  multidplyr::partition(cluster) %>%
  dplyr::mutate(
    dist = as.numeric(sf::st_distance(geometry, coast))
  ) %>%
  dplyr::collect() %>%
  sf::st_sf()
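The snippet above leaves the setup implicit. A minimal, self-contained sketch of the same idea follows. It assumes that the workers need {sf} attached and need their own copy of `coast`; the toy `pts` and `coast` objects are made up for illustration and are not from the original comment. It has not been checked against every {multidplyr}/{sf} version combination.

```r
library(dplyr)
library(sf)
library(multidplyr)

# Toy data (hypothetical): four points and a straight "coast" along y = 0,
# in a projected CRS so that distances are plain planar distances.
pts <- st_as_sf(
  data.frame(id = 1:4, x = c(1, 2, 3, 4), y = c(1, 2, 3, 4)),
  coords = c("x", "y"),
  crs = 3857
)
coast <- st_sfc(st_linestring(rbind(c(0, 0), c(10, 0))), crs = 3857)

cluster <- new_cluster(2)

# Assumption: each worker needs sf attached and its own copy of `coast`,
# because the mutate() expression is evaluated on the workers.
cluster_library(cluster, "sf")
cluster_assign(cluster, coast = coast)

result <- pts %>%
  partition(cluster) %>%
  mutate(dist = as.numeric(sf::st_distance(geometry, coast))) %>%
  collect() %>%
  st_sf()  # make sure the collected result is an sf object again
```

The final `st_sf()` call is what makes the collected result an sf object regardless of what `collect()` hands back, so sf methods can be used on `result` afterwards.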
Hello! Does anyone have a better approach? Unfortunately, the only thing I can think of is to process with …
The {multidplyr} package changes the class of the object distributed to workers to multidplyr_party_df. This causes a loss of the "special sauce" that the {sf} package provides for spatial datasets (special interpretation of the geometry column, and information about the coordinate reference system).
It would be advantageous for spatial data processing to allow parallelization of some tasks, such as the point-in-polygon operation demonstrated in the reprex below.
To do so would likely require keeping the class of the distributed object unchanged (or perhaps re-implementing the sf methods, in which case the issue would likely fall outside the scope of the {multidplyr} package).
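The class change described here can be seen with a small sketch. The toy `pts` object is hypothetical and stands in for any sf data frame; this is an illustration of the symptom, not the original reprex.

```r
library(sf)
library(multidplyr)

# Hypothetical toy sf object: four points with a geographic CRS.
pts <- st_as_sf(
  data.frame(id = 1:4, x = c(1, 2, 3, 4), y = c(1, 2, 3, 4)),
  coords = c("x", "y"),
  crs = 4326
)

cluster <- new_cluster(2)
party <- partition(pts, cluster)

class(pts)             # class vector of the original object includes "sf"
class(party)           # the partitioned handle is a "multidplyr_party_df"
inherits(party, "sf")  # FALSE: the handle itself is not an sf object
```

Because `party` does not inherit from `sf`, S3 generics that dispatch on the sf class (for example `st_join()`) have no method to apply to it directly.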