arrange all data #366
arnaud-feldmann
started this conversation in
Ideas
Do you mean this?
library(disk.frame)
setup_disk.frame() # not needed for small datasets but just for show
a = as.disk.frame(iris)
# this will arrange (i.e. sort) the disk.frame on disk by Species
# WARNING: extremely slow for large datasets
a = disk.frame::hard_arrange(a, Species)
a_collected = collect(a, parallel = FALSE)
is.unsorted(a_collected$Species) # FALSE; so it's sorted
Thanks. I had seen the function but mistook it for a grouping function rather than a sorting one.
Hi,
Is an implementation of a global arrange still planned for your package disk.frame?
I understand that it's one of the more complicated things to design, given the preference for not loading the data too many times.
Still, it's one of the things I would need if I stored my data that way.
Thanks for your work.
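To illustrate why a global arrange is harder than a per-chunk one, here is a minimal sketch in base R. It makes no disk.frame calls; the three small vectors are hypothetical stand-ins for chunks stored on disk. Sorting each chunk separately does not give a globally sorted result, so a global sort has to look at data across all chunks.

```r
# Hypothetical stand-ins for three on-disk chunks (not real disk.frame objects)
chunks <- list(c(5, 1, 9), c(4, 8, 2), c(7, 3, 6))

# Sort each chunk independently: cheap, one chunk in memory at a time
chunk_sorted <- lapply(chunks, sort)

# Reading the chunks back in order is still unsorted overall
per_chunk_result <- unlist(chunk_sorted)
per_chunk_is_unsorted <- is.unsorted(per_chunk_result)

# A global sort needs values from every chunk at once
global_result <- sort(unlist(chunks))
global_is_unsorted <- is.unsorted(global_result)
```

Here `per_chunk_is_unsorted` is TRUE while `global_is_unsorted` is FALSE, which is the gap a global arrange has to close.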