
Using conda for dependencies? #204

Closed
cyrilcros opened this issue Jan 18, 2021 · 5 comments

@cyrilcros

Hi, is it possible to use conda to install tool dependencies? I realize the underlying Docker images for this chart use virtualenv, which has compatibility issues according to https://docs.galaxyproject.org/en/master/admin/framework_dependencies.html .
What I have tried is adding an extraInitCommand to download miniconda, remove /galaxy/server/.venv, and replace it with a virtualenv created inside a _galaxy_ conda environment. This follows the logic of the script in scripts/common_startup.sh, and I end up with a working install.
However, the dependency resolvers aren't working, even after adding a config file. Am I missing something?

@almahmoud
Member

almahmoud commented Jan 19, 2021

By default, the job conf installed by the chart uses the Kubernetes runner: each job is dispatched as a separate k8s job in its own container. Tools that have corresponding BioContainers (built from their conda dependencies) will use those, and if no BioContainer is found, the default Galaxy container is used. With the Kubernetes runner you do not need to install any dependencies (you can uncheck all of the boxes when installing a tool from the admin panel), and Tool Shed tools should work with the mulled containers without any dependencies.
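For reference, the "BioContainer first, default container as fallback" behavior described above is driven by Galaxy's container resolver configuration. A minimal illustrative sketch (the resolver type names come from Galaxy's container resolver docs; the fallback image and file name here are placeholders, not the chart's actual config):

```yaml
# container_resolvers_conf.yml -- illustrative sketch, not the chart's shipped file
- type: explicit          # use a container named explicitly in the tool XML
- type: cached_mulled     # reuse a mulled image already cached locally
- type: mulled            # look up a prebuilt mulled image on quay.io/biocontainers
- type: fallback          # placeholder: image used when nothing else matches
  identifier: "galaxy/galaxy-min:latest"
```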
Using extraInitCommand to swap out the .venv for a conda one sounds like a good path if your goal is to change the job conf so it no longer uses the Kubernetes runner. We've previously tried doing this, with a Galaxy running on k8s dispatching jobs to a Slurm cluster, but it required a shared filesystem to hold the conda dependencies so that the resolver could make packages available to the cluster as well. That requires a bit more setup but is quite doable. Alternatively, for an easier setup, you could have the job handler act as a local runner (and potentially scale up the job handlers), but I think even in that scenario you need a shared filesystem on which the resolvers would install conda packages, since the dependencies must be available across handlers. I have not tried the latter, though (and, for full disclosure, I have very little experience with non-k8s runners, so it is entirely possible I'm missing a much easier solution).
If you wanted to accomplish the latter, you could use extraVolumes and extraVolumeMounts to request a new NFS volume mounted at /galaxy/server/.venv across containers (or change the path, potentially putting it under the original shared persistence mount and specifying it explicitly in the conf), then install the env with extraInitCommand as you've done. You should then be good to go with a local runner using conda, unless I'm missing something.
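To make that concrete, here is a hedged sketch of what those values might look like. The PVC name is an assumption (you would create an RWX, e.g. NFS-backed, claim yourself), and the exact keys should be checked against the chart's values.yaml:

```yaml
# values.yaml fragment -- illustrative; names are placeholders
extraVolumes:
  - name: conda-venv
    persistentVolumeClaim:
      claimName: galaxy-conda-venv   # an RWX (e.g. NFS-backed) PVC you create
extraVolumeMounts:
  - name: conda-venv
    mountPath: /galaxy/server/.venv  # shared across the Galaxy containers
```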

@nuwang @natefoo fact check me please if/when you can :)

@cyrilcros
Author

Ok, I was using the persistence option with an RWX PVC. I used the /galaxy/server/database/deps directory to store my conda config (in a _conda folder, like Galaxy does), and I set XDG_CACHE_HOME to a folder in that same directory so pip would use it as a shared cache. I also had to edit the chart to activate conda before running the python command in the job and workflow deployments.
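For anyone following along, the setup described above might look roughly like this in the chart values. Every path, URL, and key here is an assumption reconstructed from the description (mirroring the logic of scripts/common_startup.sh), not tested configuration:

```yaml
# values.yaml fragment -- illustrative sketch of the approach described above
extraInitCommand: |
  # fetch miniconda and replace the stock virtualenv (all paths are assumptions)
  wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh
  bash /tmp/miniconda.sh -b -p /galaxy/server/database/deps/_conda
  rm -rf /galaxy/server/.venv
  /galaxy/server/database/deps/_conda/bin/conda create -y -n _galaxy python virtualenv
extraEnv:
  - name: XDG_CACHE_HOME
    value: /galaxy/server/database/deps/_cache  # shared pip cache on the RWX volume
```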

In my case, I want to convert a novel Python utility into a conda package and install it in Galaxy. How could I do that with mulled and involucro? I understand how I could build a container from conda and publish it to quay.io (via https://galaxy-lib.readthedocs.io/en/latest/topics/mulled.html#build-test-and-push-containers-to-your-own-quay-io-repository), but I don’t know how to configure mulled in my galaxy-helm install to search for my image there.
Thanks for your help!

@afgane
Contributor

afgane commented Jan 19, 2021

You can specify which container to use for a tool in the container mapper config map, like so:

- tool_ids:
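The snippet above appears truncated; a fuller mapping entry might look something like the following. The tool id pattern and image name are placeholders, and the exact key names should be verified against the chart's container mapper config:

```yaml
# container mapper entry -- illustrative sketch; verify keys against the chart
mappings:
  - tool_ids:
      - 'toolshed.g2.bx.psu.edu/repos/myorg/mytool/.*'  # placeholder tool id pattern
    container:
      docker_container_id_override: 'quay.io/myorg/mytool:1.0'  # placeholder image
```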

@cyrilcros
Author

Ok, that makes a lot of sense. I can just write my tool definition file and link it to my own container.
Thanks!

@almahmoud
Member

I think the mapping is the ideal solution for this case, but just for the record in case it's helpful in the future, re:

"How could I do that with mulled and involucro?"

If you were looking to enable auto-mulling in Galaxy when an image is not found, you can do so by adding the docker-in-docker container and enabling the option in the conf, following the changes and comments in #59 (although it's been over a year since I last tried that; I can help debug issues if they come up when you try).
