Add report ordering #13

thibautjombart · 2019-05-25T10:51:43Z

To handle dependencies between reports, it would be useful to implement an optional ordering of reports. This could be stored in a file .order or .reports_order at the root of a factory. The order would relate to the file names, without the dates, and default to alphanumeric. I would imagine:

get_order(): returns (undated) Rmd files in their order of compilation, defaulting to alphanumeric
set_order(x): sets the order of compilation of the reports; x could be a vector of names, which then needs validation against the names of existing reports, or a vector of integers, in which case this is applied to the output of get_order(); the output will be saved in .order
reset_order(): resets the order of compilation of the reports to default, i.e. removes the file .order
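
To make the proposal concrete, here is a rough sketch of how the three functions could behave. Everything in it is an assumption for illustration rather than existing package code: the `factory` argument, the helper names `undated_names()` and `order_file()`, the recursive search for `.Rmd` files under the factory root, and report names ending in `_YYYY-MM-DD.Rmd`.

```r
# Hypothetical sketch of the proposed functions; not an existing API.
# Assumes reports are .Rmd files somewhere under the factory root,
# named <base>_<YYYY-MM-DD>.Rmd (the date format is an assumption).

# Base names of all reports, dates stripped, in default (alphanumeric) order
undated_names <- function(factory) {
  files <- list.files(factory, pattern = "\\.Rmd$", recursive = TRUE)
  base <- sub("\\.Rmd$", "", basename(files))
  base <- sub("_[0-9]{4}-[0-9]{2}-[0-9]{2}$", "", base)
  sort(unique(base), method = "radix")
}

# Path of the file storing the order, at the root of the factory
order_file <- function(factory) file.path(factory, ".order")

get_order <- function(factory = ".") {
  reports <- undated_names(factory)
  path <- order_file(factory)
  if (!file.exists(path)) return(reports)
  saved <- readLines(path)
  # Keep saved entries that still exist, then append reports not listed
  c(intersect(saved, reports), setdiff(reports, saved))
}

set_order <- function(x, factory = ".") {
  current <- get_order(factory)
  if (is.numeric(x)) {
    # Integers are positions in the output of get_order()
    stopifnot(all(x %in% seq_along(current)), !anyDuplicated(x))
    x <- current[x]
  }
  unknown <- setdiff(x, current)
  if (length(unknown) > 0) {
    stop("Unknown report(s): ", paste(unknown, collapse = ", "))
  }
  writeLines(x, order_file(factory))
  invisible(get_order(factory))
}

reset_order <- function(factory = ".") {
  # Removing .order restores the default alphanumeric order
  unlink(order_file(factory))
  invisible(get_order(factory))
}
```

In this sketch, reports missing from `.order` are appended in alphanumeric order, so a partial `set_order()` call still yields a complete compilation order.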

Comments and ideas welcome. I may be able to get a head start on this if @zkamvar is really not keen on it, unless we can get help from others?

sgetalbo · 2019-11-04T12:00:47Z

Hey @thibautjombart I believe this falls under Locke Data's remit now. Could I get a quick rundown of a use case for this? I'm not sure I understand the process - e.g.

Why do you want to implement it, and in what scenario might it be used?
When you refer to handling dependencies, do you mean the data that was used in the report?

From what I understand, get_order() is simply a list of .Rmd files in a factory, ordered by their compilation date, Y/N?

Would it be useful to have any more information provided, such as the version of the data that was used etc?

TIA.

thibautjombart · 2019-11-06T13:45:10Z

Hey @sgetalbo

This boils down to outputs of some reports being used as inputs of others. The simplest use case I have encountered is:

clean_data_[date].Rmd processes raw data and outputs some clean data in a specific folder, e.g. data/clean/my_data.rds
analyse_data_[date].Rmd runs some analyses on the clean dataset, reading it from data/clean/my_data.rds

However, when calling update_reports() the order is by default alphanumeric, so that the analyses would be done before the cleaning; in this case we'd like to specify the order of these files. Note that it is not predicated on the [date], only on the base name of the report, e.g.:

set_order(c("clean_data", "analyse_data"))

Currently the workaround is to rename clean_data... to aaa_clean_data....
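
As an illustration of matching on the base name rather than the date, the sketch below reorders a vector of dated file names according to a vector of base names. The helper `apply_order()` and the `YYYY-MM-DD` date format are assumptions made for this example, not existing package code.

```r
# Hypothetical helper; not part of the package.
# Reorders dated report files according to a vector of undated base names.
# Assumes file names of the form <base>_<YYYY-MM-DD>.Rmd.
apply_order <- function(files, report_order) {
  # Strip the date and extension to recover the base name
  base <- sub("_[0-9]{4}-[0-9]{2}-[0-9]{2}\\.Rmd$", "", basename(files))
  # Position of each file in the requested order; unlisted reports go last
  rank <- match(base, report_order)
  rank[is.na(rank)] <- length(report_order) + 1L
  # Ties (same rank) fall back to alphanumeric order of the file name
  files[order(rank, files, method = "radix")]
}

files <- c("analyse_data_2019-11-06.Rmd",
           "clean_data_2019-11-06.Rmd",
           "zzz_extra_2019-11-06.Rmd")
apply_order(files, c("clean_data", "analyse_data"))
```

Here the cleaning report would come first, then the analysis, then any report not named in the order.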

thibautjombart · 2021-01-14T10:54:39Z

This might be something worth looking into in the future. To adapt it to the current implementation, we could think of having priorities defined in the config file, e.g.:

compile_first:
  get_data
  clean_data

So that list_reports() would return reports with files matching the regexp get_data first, then clean_data, then the rest.
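
A minimal sketch of that prioritisation, assuming the `compile_first` entries have already been read from the config file into a character vector of regular expressions. The function name `prioritise_reports()` is hypothetical, and this is not how `list_reports()` is currently implemented.

```r
# Hypothetical sketch; not the current list_reports() implementation.
# `compile_first` is a character vector of regular expressions, in priority order.
prioritise_reports <- function(reports, compile_first = character(0)) {
  # Reports matching no pattern share the lowest priority
  rank <- rep(length(compile_first) + 1L, length(reports))
  # Loop from last to first so that the earliest matching pattern wins
  for (i in rev(seq_along(compile_first))) {
    rank[grepl(compile_first[i], reports)] <- i
  }
  # Within a priority level, keep alphanumeric order
  reports[order(rank, reports, method = "radix")]
}

reports <- c("analyse_2021-01-14.Rmd",
             "clean_data_2021-01-14.Rmd",
             "get_data_2021-01-14.Rmd")
prioritise_reports(reports, c("get_data", "clean_data"))
```

With no `compile_first` entries, the function simply returns the reports in alphanumeric order, which matches the existing default.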
