Facilitated storage compute access #29
Conversation
Hi @rokroskar @Panaetius! As agreed, here is the proposal. I left some fields blank since they will be clarified later, after a meeting. Does it make sense? Should we also discuss it at our monthly meeting this Wednesday?
Thanks @volodymyrss for contributing this RFC! From my PoV it's still a bit too vague - can you add some details about specific services, or point to existing implementations that you are thinking of? In principle something like the JH services you mention could be possible, i.e. we already run proxies in this sort of mode.
I listed the services very briefly, in brackets, in the first sentence of the first paragraph. I have now also added them as a list in this section. I avoid putting actual endpoints to reduce exposure.
Ok, good to hear. How would they be selected - during session start? Would there be a catalog of these services, core and contributed?

What about another possible solution, where renku just sets some variables specifying an external service endpoint and credentials? This could be easier.

edit: it is also the case that, since the services might be both restricted and domain-specific, they should probably be visible only to some limited community. I think you mentioned you were thinking about making some domain/project-specific resource allocation? Would specialized services visible only to some projects/domains work in the same way?

These jupyterhub services differ from processes running in the session in that they have higher privileges, so they are managed by the hub administrator. We currently use one service like that for finding and downloading some data.

An example is an ARC cluster. It uses some specialized client software which can be installed in the service container but may be tricky to keep in all user sessions. A "side-kick" service would receive a simple HTTP request from the user in the session and transform it into an ARC job (a minimal sketch of such a service follows below).

For cases like WebDAV, an extra "side-kick" service is useful only to transform credentials somehow, but the credentials could also be provided to the session in environment variables.
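To make the "side-kick" idea more concrete, here is a minimal sketch of such a service, assuming the ARC client tools are installed in the sidecar image; the endpoint paths, the xRSL template, the cluster URL, and the exact `arcsub` flags (which depend on the ARC client version) are all illustrative assumptions, not an existing implementation:

```python
# Sketch of a "side-kick" sidecar: turn a simple HTTP request from the user
# session into an ARC job submission. All names and paths are hypothetical.
import json
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer

ARC_CLUSTER = "https://arc.example.org"  # hypothetical cluster endpoint

XRSL_TEMPLATE = """&(executable="{executable}")
(arguments="{arguments}")
(stdout="stdout.txt")
(jobname="renku-sidekick")"""


class SideKickHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Healthcheck endpoint, e.g. something amalthea could watch.
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def do_POST(self):
        # The user session only needs to speak plain HTTP; the specialized
        # ARC client software lives in this container, not in every session.
        if self.path != "/submit":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        xrsl = XRSL_TEMPLATE.format(
            executable=payload["executable"],
            arguments=" ".join(payload.get("arguments", [])),
        )
        with tempfile.NamedTemporaryFile("w", suffix=".xrsl", delete=False) as f:
            f.write(xrsl)
            job_file = f.name
        # Exact flags vary between ARC client versions; "-C <cluster>" is
        # an assumption here.
        result = subprocess.run(
            ["arcsub", "-C", ARC_CLUSTER, job_file],
            capture_output=True, text=True,
        )
        self.send_response(200 if result.returncode == 0 else 502)
        self.end_headers()
        self.wfile.write(result.stdout.encode())


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), SideKickHandler).serve_forever()
```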
I think both solutions/use-cases, credentials storage and custom sidecar services, have merit and are feasible, and I could see us implementing both to support different use-cases.

We are already exploring using Vault to store credentials, and we should be able to inject them into sessions as e.g. environment variables. But I think what would be nice is to have a dynamic egress proxy that allows for credentials injection. So you could set up rules per project/user like "Requests to example.com should get the token from secret my-token injected in the Authorization: Bearer .. header", with the secret coming from Vault. I think this could be done in a way that works for most uses, with users (or admins?) being able to define the rules per project. This would also be nice in that it could allow anonymous sessions access to restricted data, if set up by the project owner, without exposing secrets. This would differ a bit from what is proposed here in that we'd have a single, dedicated sidecar container that handles proxying for all kinds of requests, instead of just injecting secrets or having a sidecar per request type.

For more generic sidecar containers that actually perform actions (the thing similar to jupyterhub services), we probably don't want users to be able to roll their own, but having something platform-wide like we have with project templates would also just end up being noise for users, so having it defined by admins/superusers makes sense to me. The resource access control service we're currently working on could be a nice fit; it has some very similar behavior (admins define what resources a user has access to, and the user can pick from those when launching a session). So we could extend that with custom sidecars, or have a separate service with essentially the same functionality for custom sidecars. Then an admin could say "User X has access to custom sidecars Y and Z" and the user can pick on session launch (or just by default?) whether to start those. We would need to define some API that custom sidecars need to follow - at the most basic, having a healthcheck endpoint for amalthea to watch, plus whatever is needed by the persistent-sessions changes currently being worked on (so the sidecar can be shut down/started alongside the session as appropriate). But it'd be up to communities to write these custom sidecars. I would limit these to just being able to specify a Dockerfile and maybe some fixed settings for a sidecar.

There is probably a third class of use-cases that can't be solved by the above, like having to install some plugin in the cluster to mount some specific, not officially supported storage in a session. We don't want administrators/users to be able to make these kinds of customizations, for platform stability reasons. So these we'd have to check on a case-by-case basis.

But I think the generic proxy and admin-defined custom sidecars are both feasible and both useful. I'd probably go for implementing the proxy first, since we have a lot of the parts already.
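A rough sketch of how such per-project egress rules and credential injection might look, assuming an entirely hypothetical rule schema and a stubbed-out secret lookup where a Vault client would go (this is not an existing Renku API):

```python
# Sketch of credential-injecting egress proxy rules. The rule schema and
# the secret store are hypothetical illustrations of the idea above.

# Per-project rules as a project owner/admin might define them: requests to
# a matching host get a header injected, built from a named secret.
RULES = [
    {
        "host": "example.com",
        "header": "Authorization",
        "template": "Bearer {secret}",
        "secret_name": "my-token",
    },
]


def fetch_secret(name: str) -> str:
    # Placeholder for a Vault lookup; a real proxy would query the
    # secret store here and cache the result.
    return "s3cr3t-value"


def inject_credentials(host: str, headers: dict) -> dict:
    """Return request headers with credentials injected for matching hosts.

    The session never sees the secret: injection happens inside the proxy,
    which is what would allow even anonymous sessions to reach restricted
    data without exposing the token.
    """
    for rule in RULES:
        if host == rule["host"]:
            headers = dict(headers)
            headers[rule["header"]] = rule["template"].format(
                secret=fetch_secret(rule["secret_name"])
            )
    return headers


# Example: an outgoing request to example.com picks up the bearer token.
print(inject_credentials("example.com", {"Accept": "application/json"}))
```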
Thank you @Panaetius for the analysis, it makes sense to me. I wonder about this concept of renku superusers - does it exist already? How do we proceed to assess the effort and possible timeline? We'll discuss formal aspects on Thursday, so it's very good that we have this technical basis progressing.

Just a comment on mounting: this is an option some people like to see, since it is familiar. But in practice it is possible to get a similar experience by exploring storage through an API; even in a shell we sometimes use a kind of pseudo-ls. Sometimes this can even be advantageous, since it requires more purposeful data transfers.
I wonder if this should be adapted, given that #31 will provide additional features which can be relied upon. Or should it remain as it is, since it primarily explains the use case, which remains the same?
@volodymyrss I read through the RFC but I still have a lot of questions I want to clarify. Here is a list of user stories I extracted from the RFC and our meetings. I hope you can review and answer the questions, and also let me know if the limitations I posted here are acceptable. As a SmartSky user I want to:
Questions and limitations:
We want to support several, with a plugin interface: several kinds of clusters, and also multiple actual clusters. This is pointed out in here. Please also feel free to make comments on the text, with a PR or however you like, if you find that something is missing or unclear!
Most of the cases quoted in the text will be; only rucio does not fit.
I think this is understandable, if there is no other choice.
Some compute backends (e.g. ARC) will fetch the data from any compatible remote storage by URL.
When a user authorizes renku to access a comprehensive compute backend, like ARC, the backend can also access storage on the user's behalf.
Responsibility where? Are you referring to the previous two items in this list?
The environment would be defined in a container - by default, the same container as used in the renku session, which is already built on renku, but with a modified entrypoint (we do something similar already, and I know from the UG meeting that other users do too; maybe we can reach out to them). A minimal sketch of such an entrypoint follows after these replies.
Would be nice if feasible.
The results will be stored in a storage at the end of the execution.
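To illustrate the modified-entrypoint idea mentioned above: the sketch below shows a replacement entrypoint that, instead of starting the interactive session server, reads a job spec and runs the requested command in the same image. The `JOB_SPEC` variable, spec format, and default paths are hypothetical assumptions for illustration only.

```python
# Hypothetical replacement entrypoint for the session image: run a batch
# job in exactly the environment the user already built on renku, instead
# of launching the interactive session server.
import json
import os
import subprocess
import sys


def main() -> int:
    # Where the batch system would place the job description (assumed path).
    spec_path = os.environ.get("JOB_SPEC", "/job/spec.json")
    with open(spec_path) as f:
        spec = json.load(f)
    # Execute the user command in the unchanged session environment.
    result = subprocess.run(
        spec["command"],
        cwd=spec.get("workdir", "/home/jovyan"),
    )
    return result.returncode


if __name__ == "__main__":
    sys.exit(main())
```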
Hi @olevski, did you have a chance to consider my responses? Should I incorporate them as further changes to the RFC?