susabi edited this page Feb 24, 2019 · 7 revisions

Welcome to the disk.frame wiki!

How does disk.frame work?

Many of disk.frame's functions, such as map.disk.frame and delayed, are just convenience functions that let you perform the same operation on each chunk. The convenience is that they load each chunk into a data.table/data.frame for you and automatically save the result to disk as .fst files.
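
The chunk-wise pattern described above can be sketched in a few lines of plain R. This is only an illustration of the idea, not disk.frame's actual implementation: map_chunks, write_chunk and read_chunk are hypothetical helpers invented for this example, and the sketch falls back to .rds files if the fst package is not installed.

```r
# Illustrative sketch only: these helpers are NOT part of disk.frame.
# Use fst for chunk files when available, otherwise fall back to RDS.
use_fst <- requireNamespace("fst", quietly = TRUE)
ext <- if (use_fst) "fst" else "rds"
write_chunk <- function(df, path) if (use_fst) fst::write_fst(df, path) else saveRDS(df, path)
read_chunk <- function(path) if (use_fst) fst::read_fst(path) else readRDS(path)

# Apply `fn` to every chunk file in `indir`, one chunk at a time,
# and write each result to a file of the same name in `outdir`.
map_chunks <- function(indir, outdir, fn) {
  dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
  for (f in list.files(indir, full.names = TRUE)) {
    chunk <- read_chunk(f)                                  # load one chunk into memory
    write_chunk(fn(chunk), file.path(outdir, basename(f)))  # save the result to disk
  }
  invisible(outdir)
}

# Usage: split mtcars into 4 chunk files, then add a column to each chunk.
indir <- tempfile("cars_in")
outdir <- tempfile("cars_out")
dir.create(indir)
chunks <- split(mtcars, rep(1:4, length.out = nrow(mtcars)))
for (i in seq_along(chunks)) {
  write_chunk(chunks[[i]], file.path(indir, paste0(i, ".", ext)))
}
map_chunks(indir, outdir, function(df) transform(df, kpl = mpg * 0.425))

# Collect the processed chunks back into a single data.frame.
result <- do.call(rbind, lapply(list.files(outdir, full.names = TRUE), read_chunk))
```

Only one chunk is held in memory at a time inside map_chunks, which is what makes the approach workable for data larger than RAM; the final rbind is there only to inspect the small example.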

Key Priorities

  1. Tests covering all user-facing functions (2019 03 03)
  2. Implement #50
  3. Submit to CRAN

Working time

I only work on disk.frame on Sunday mornings, to keep it from eating into my other (paid) work, so progress can be slow. If you would like to speed things up, feel free to contact me for consulting services.

TODO: convenience disk.frame syntax proposal

disk.frame_code({
  libname(a, path1)
  libname(b, path2)

  a~disk.frame2 = delayed(b~disk.frame1, some_fn)
})

Scaling out as a cluster

future has backends for clusters, so with a little bit (a lot) of work we can scale out to multiple computers. First, though, we need to identify a simple way to set up these servers on AWS or on a local network, so this is not going to happen for a while.
