rethinkdb required for seasonal flu builds to work #3
Comments
I looked into what we'd actually need to include rethinkdb in this environment. Some important factors are:
To include the specific rethinkdb version we need for fauna in this base environment, it looks like we need to create our own conda package for this version. We could host it in the Nextstrain channel. Bioconda is not an appropriate place for it and I don't think we want to support a conda-forge package for rethinkdb (or give the impression we are responsible for rethinkdb).
@tsibley Does this summary make sense? I'd love to learn how to use our Nextstrain channel through Anaconda, so I could try setting up the rethinkdb package.
Thanks for the great description here, @huddlej! I was wondering how fast Line 39 in c0c6f67 was going to come back to haunt me. Pretty fast it turns out! 🙃
Agreed that the thing to do here given the constraints you outlined is to produce our own Conda package for the Python RethinkDB bindings and host it in our channel. I think this should be relatively straightforward (if involving some minor tedium), and I'd be happy to help guide you through it. Some things to consider:
Cool, let's do it. Maybe end of next week? This feels like a nice Friday afternoon kind of task... I like |
It's a plan!
Yeah, the Docker runtime intentionally locates Fauna's source at |
@tsibley Based on our other in-person discussion yesterday, I wonder if we should focus instead on migrating the remaining fauna-based workflows to our S3-hosted data approach. This is always the issue of deciding how long to keep supporting a legacy system that everyone relies on, but if I had to choose between a) running the seasonal flu workflow as it is with the managed conda environment and b) running the seasonal flu workflow with S3-hosted data, I would pick the latter.
@huddlej Ah, indeed, my preference would be to advocate for (b) too, but I guess I don't see this as having to be either-or. I don't think it'll take very long to make the |
I'd rather push for the S3-hosted data instead, unless anyone else on the team has a strong desire to run fauna-based builds with the managed Conda environment right now. This might just be @joverlee521 and @j23414?
I only run the fauna-based builds with Docker, so I have no desire to include fauna/rethinkdb here. I would be happy to push for (b).
Closing this as won't fix. We can re-open if (b) doesn't come to pass in a reasonable time and we decide to just make |
Came here from nextstrain/docker-base#222. A few thoughts:
I don't see why one couldn't create a conda-forge package and include this particular version, as well as more recent versions. Creating a conda-forge package in no way implies responsibility for underlying software. With rethinkdb only releasing new versions very sporadically, it wouldn't be hard to keep it up to date - there's also no obligation to do so. So in principle I don't see why one couldn't add fauna to bioconda and the dependencies to conda-forge/bioconda as required.
Current Behavior
Using the Nextstrain CLI with the managed conda environment, I tried to download data from fauna (the first step of the seasonal flu builds) like so:
nextstrain build --conda . --forceall --configfile profiles/nextflu-private.yaml -p data/h1n1pdm/who_cell_hi_titers.tsv
This command failed with a module not found error because fauna needs the rethinkdb module and it is not installed in the managed conda environment. In contrast, the following command works as expected:
nextstrain build --docker . --forceall --configfile profiles/nextflu-private.yaml -p data/h1n1pdm/who_cell_hi_titers.tsv
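One way to confirm which runtime is missing the module is a small import check run with the Python interpreter of each environment. This is only a sketch: the helper name missing_modules is hypothetical and is not part of fauna or the Nextstrain CLI.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the current interpreter cannot import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup failures the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "rethinkdb" is the module fauna needs according to this issue.
    print(missing_modules(["rethinkdb"]))
```

An empty list means the module is importable in that environment; a list containing rethinkdb matches the failure described above.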
Expected behavior
Managed conda environment should behave like Docker environment.
How to reproduce
Steps to reproduce the current behavior:
trs/conda/nextstrain-base-package branch of the GitHub repo
NEXTSTRAIN_CONDA_CHANNEL=nextstrain/label/pull-1 nextstrain setup conda, NEXTSTRAIN_CONDA_CHANNEL=nextstrain/label/branch-initial nextstrain update conda, and nextstrain setup --set-default conda)
git checkout refactor-workflow
nextstrain build --conda . --forceall --configfile profiles/nextflu-private.yaml -p data/h1n1pdm/who_cell_hi_titers.tsv
Possible solution
Since I have fauna installed in the parent directory of my workflow (where the workflow expects to find it), installing rethinkdb should be enough to fix this issue. We should not need to install fauna.
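The two prerequisites named here (a fauna checkout next to the workflow, plus an importable rethinkdb module) could be checked before starting a build. The following is a rough sketch under assumptions: the function name fauna_prerequisites and the directory name "fauna" are illustrative guesses, since the exact path the workflow expects is not stated in this thread.

```python
import importlib.util
from pathlib import Path


def fauna_prerequisites(workflow_dir, fauna_dirname="fauna", module="rethinkdb"):
    """Report whether a fauna checkout sits beside the workflow and the module imports.

    fauna_dirname is an assumed name for the checkout in the parent directory.
    """
    workflow = Path(workflow_dir).resolve()
    fauna_dir = workflow.parent / fauna_dirname
    return {
        "fauna_checkout": fauna_dir.is_dir(),
        "module_importable": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    # Run from inside the workflow directory.
    print(fauna_prerequisites("."))
```

If fauna_checkout is true and module_importable is false, the situation matches this report: only the rethinkdb module is missing.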