Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

117 ensure fresh install works even if ifadv missing #121

Merged

Conversation

PabRod
Copy link
Contributor

@PabRod PabRod commented Aug 29, 2024

ifadv is currently hosted only in GitHub. Due to license incompatibility, we cannot just add this dataset to our package. Moreover, CRAN doesn't allow dependencies that live outside of CRAN, so we have no choice but to access it in a less elegant way.

@PabRod PabRod linked an issue Aug 29, 2024 that may be closed by this pull request
But still causing odd problems: tests go fine on testing, but fail on checking
For some reason, native R data files (.rda) are less comfortable for R than the good-old csv. I pushed a csv copy of the rda file to https://github.com/elpaco-escience/ifadv/tree/csv, and that's the one we are using now.
@PabRod
Copy link
Contributor Author

PabRod commented Aug 30, 2024

Context

Dear @mdingemanse, after wrestling with the ifadv problem for a while, I found the following solution:

  1. Loading data from a .csv is, oddly enough, significantly easier than from R's native format .rda. This is particularly true when your data is located on a URL (if you feel curiosity, compare the (buggy) rda version with the csv one). Long story short, loading a .rda was forcing me to pre-download it, and that shouldn't be necessary.
  2. So I pushed a .csv copy of the dataset to elpaco-escience/ifadv. It is redundant, I know, so I kind of hid it in its own branch.

Request

The tests and vignettes keep running. The snapshots didn't change. But I consider this a potentially critical change in the dataset at its very origin, so it will be great if you can check the results still make sense.

@PabRod PabRod self-assigned this Aug 30, 2024
@PabRod PabRod marked this pull request as ready for review August 30, 2024 13:15
@PabRod
Copy link
Contributor Author

PabRod commented Aug 30, 2024

Last and perhaps least, a reflection:

The original IFADV dataset was published under a permissive license. Without getting into the details, this indicates a spirit of open science and data sharing. It is ironic that this triggered an incompatibility problem with our also permissive license and ultimately forced us to apply a suboptimal solution.

And a coda:
Probably all our code, plus all our public conversations, are already used to train IA models anyway.

Copy link
Contributor

@mdingemanse mdingemanse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and tests are stable

@mdingemanse
Copy link
Contributor

Yes I quite agree that it would be lovely to redistribute; I remember us spending too much time early 2023 discussing the licensing constraints, and I think the IFADV makers would be scratching their heads about it. I think the current solution works around it in the most CRAN-ready way, which is our proximate goal here so approving

@mdingemanse mdingemanse merged commit 5aef8bf into main Aug 31, 2024
3 checks passed
@mdingemanse
Copy link
Contributor

I went ahead and merged, let me know if you prefer me to leave this to you as PR author next time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ensure fresh install works even if ifadv missing
2 participants