gardenctl v2 - SSH controller proposal - old #508
Comments
Thanks, a few comments:
|
thanks @rfranzke for your comments. I updated the description accordingly
done
right, I have removed it from the example
sounds good, done
Very good point! Restricting this to the current namespace makes the validation obsolete
Yes, with your recommendation from (4) it makes things much easier and it would not be required. I have
Not needed anymore because of (4). We needed to do this with the
I'm also not sure what to call it. However, it still has a minimalistic heartbeat/lifetime controller that takes care of deleting the
Yes that's also the task to evaluate what can be re-used. I hope the @gardener/mcm-maintainers (and others) can give feedback on what can be re-used as I lack experience in this area
I have added a
yes, makes sense. Or it's just always set/overwritten by the mutating webhook
Currently, every admin in the project can read the secrets. So every admin can read the ssh credentials for another user/admin in the project.
Yes good point, makes sense |
@holgerkoser proposed to rename the resource to (👍 / 👎). If there are no objections, I will rename it accordingly. |
Hm, not sure, I'd rather vote for
Yes, OK, right.
👍🏻
OK, yes, true, but is this really a problem? I would think it's good enough to have one dedicated SSH keypair per
Ah, ok, makes sense. However, I would rename this to make it more clear what this is about. Probably it could be helpful to align with the |
@petersutter Thank you for picking this topic up. Here are a few comments:
I miss the component that actually sees the public key in the seed and reconfigures
Why is the SSH request/resource not node-specific? I guess it was included, but then @rfranzke wrote (and you removed it):
Now it's generally open to all nodes? Do we really want/need that? Here are my thoughts: Pro:
Con:
In principle, we must avoid changes that involve all nodes in parallel, because we can never know whether the sudden change, executed in parallel on all nodes, has impact (a bug that breaks them, or even only temporarily causes a hiccup, as we have seen in the recent past with the
No, I would not forbid that. CPM trumps everything (we don't do it for fun, but because an operator has invoked the highest emergency measure possible: DR protocol to save a cluster). Besides, where is the problem? Everything that happens on the nodes continues as before (nodes, their workload, here the bastion and the
No, that’s not his concern and nothing good can come out of it. Why ask? It always has to be the right seed holding the control plane and that's our job to manage.
I think so. Why not reuse @mvladev's idea here: gardener/gardener#1433 (comment) to implement something like this: When
Where is that coming from? Isn’t that hard to obtain (on the client, |
Why do we have to generate new node credentials at all? I would give the client responsibility for that. By this, we would lock down the bastion host to a single user (the one who requested it) and we wouldn't need to generate/store/transport any private credentials.
This doesn't sound like cloud-native security to me. Using client IP address restrictions is just bandaid and rather indicates a lack of proper authentication mechanisms. |
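To make the "client generates its own keypair" idea above concrete, here is a minimal sketch of a hypothetical gardenctl-side helper (not part of the proposal): the private key never leaves the client, only the authorized_keys line would be handed over.

```go
// Hypothetical sketch: the client generates an ed25519 keypair locally and
// only the public key (authorized_keys format) would leave the machine.
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"

	"golang.org/x/crypto/ssh"
)

func main() {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}

	// Keep the private key strictly local, e.g. in a protected temp dir.
	der, err := x509.MarshalPKCS8PrivateKey(priv)
	if err != nil {
		panic(err)
	}
	pemBytes := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der})
	_ = os.WriteFile("id_gardenctl", pemBytes, 0o600)

	// Only this public key line would be sent to the controller / written to the resource.
	sshPub, err := ssh.NewPublicKey(pub)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(ssh.MarshalAuthorizedKey(sshPub)))
}
```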
That's what we actually don't want @timebertt , because then we cannot control how well they were picked, whether they are rotated and stored safely (the user hasn't leaked her private key), etc. At no point we can/should trust the end user/operator, I believe.
I believe that is not quite true. As with security in general, it is always about multiple layers of security. There is no single measure that, if done properly, makes everything suddenly safe. It is always the combination of many things. That said, I am not and never was a friend of IP whitelisting, but in some cases it is a valid means, e.g. our infrastructure clusters can be (additionally) protected this way (and just because it's additional doesn't mean it's not also increasing our security posture), and I would not reject the thought (hiding endpoints often helps not becoming prey in the first place).
Yes, I have actually technical doubts here (speaking against). |
Thanks @petersutter for looking into this topic.
To help with spinning up a Bastion VM via MCM, the following would be the flow:
Lastly, we can help you out with this part. Once we have the proposal cemented, @AxiomSamarth and I can help you with the implementation details.
Regarding this, we will also have to create the SSH key pair prior to trying to create the Bastion VM. Probably something you have already thought of, but just thought to mention. |
@prashanth26 In regards to the "SSH key pair" on infrastructure level: it is the purpose of this BLI to completely eliminate them/have none/drop that code. It is inherently unsafe to have them in the first place as they are neither personal nor can they be rotated (at least not in AWS, I believe to remember; there they are immutable once the EC2 instance is created). So, we want to get rid of them for good and take full control of sshd ourselves, have no means of access unless a controller on the machine detects the authorised request for SSH access and adds the public key(s) (and removes it/them later). No key pairs in the infrastructure anymore. |
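As a purely illustrative sketch of the "controller on the machine adds/removes the public key" part (the file path and function names are assumptions, not part of the proposal):

```go
// Hypothetical node agent logic: add an authorized public key for the
// duration of an approved SSH request and remove it again afterwards.
package agent

import (
	"bytes"
	"os"
)

const authorizedKeysPath = "/home/gardener/.ssh/authorized_keys" // assumed path

// AddKey appends the given public key (authorized_keys line) if not yet present.
func AddKey(pubKey []byte) error {
	existing, err := os.ReadFile(authorizedKeysPath)
	if err != nil && !os.IsNotExist(err) {
		return err
	}
	if bytes.Contains(existing, bytes.TrimSpace(pubKey)) {
		return nil // already authorized
	}
	f, err := os.OpenFile(authorizedKeysPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(append(bytes.TrimSpace(pubKey), '\n'))
	return err
}

// RemoveKey drops the given public key again, e.g. after the SSH request expired.
func RemoveKey(pubKey []byte) error {
	existing, err := os.ReadFile(authorizedKeysPath)
	if err != nil {
		return err
	}
	var kept [][]byte
	for _, line := range bytes.Split(existing, []byte("\n")) {
		trimmed := bytes.TrimSpace(line)
		if len(trimmed) == 0 || bytes.Equal(trimmed, bytes.TrimSpace(pubKey)) {
			continue // skip empty lines and the key to be removed
		}
		kept = append(kept, line)
	}
	out := append(bytes.Join(kept, []byte("\n")), '\n')
	return os.WriteFile(authorizedKeysPath, out, 0o600)
}
```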
Hmm, ok. That wasn't exactly clear to me. But how would it be better to hand out credentials of the same kind to the user/operator on the fly? They could also immediately leak them and we have to trust them not to do so. The only thing that would make it less risky is that the credentials are fresh, right? If the controller creates them on the fly, there is similarly no way to rotate them. I guess the only way to invalidate them would be to shut down the bastion host (which would also work for the flow I described).
Ah ok, that one's new to me. I didn't see it in this proposal. If we want to do this, it would of course totally change the game. It would give us means to invalidate credentials without shutting down the host.
Hmm, ok. Maybe it can be done as an additional security measure. But I can already picture myself cursing this restriction, when I switch from my unstable ISP connection to LTE, then later realizing that I need to connect to the VPN and thereby switching client IPs multiple times a day. Still, my point was that in the above proposal, the client IP restriction is the only thing making the bastion host (or say SSHSession) personal, because SSH credentials generated by the controller are basically shared and are stored in the infrastructure (i.e. garden cluster). If we can't trust a public key given by the user, then let's at least somehow make the credentials actually personal by encrypting them on the fly or doing some ElGamal/DH-like key exchange per user and per ssh session or something similar. |
Why would they leak "immediately"? Doesn't the generation on-the-fly have two very evident key properties:
I don't understand your point. They are fresh, again and again (and old ones are dropped automatically), because they are generated on-the-fly (see description above). Where is the problem in this proposal?
Ah, but yes. It's the purpose of this BLI to improve the SSH handling, and the IaaS key pairs are a primitive means at best that have certain unacceptable limitations. We cannot continue to use them and still meet the high security requirements.
Yes, that's why I also believe it's not worth it. It isn't practical and I see technical challenges (where is the IP coming from was my question: neither the client can send it as it doesn't know it, nor can we be sure that the client when it reaches the backend still has it).
No, they aren't (shared) and that's what is written above (always personal). Why do you think anything is shared? Only the public key is used to configure bastion and node. Only the user of
I do not understand. Why would we not trust them and why are they not personal? I am at a loss here. Maybe we continue out-of-band @timebertt ... |
* Hmm, ok. That wasn't exactly clear to me.
See https://github.com/gardener-security/security-backlog#goals: What we cannot get rid of or replace with dynamic tokens has to be rotated, and credentials in the hands of operators are a problem in that regard.
|
Yep, please. There are some misunderstandings at play here. |
Okay. Got it. Will keep that in mind. |
Ok, there were indeed some misunderstandings happening here. @vlerenc and I had a chat to clarify. Originally (based on the initial proposal from @petersutter), I assumed that the SSH controller would store the generated credentials in some secret in the garden. That's why I thought of the credentials as "shared", as they would be accessible by every project member / gardener operator. The proposal from @mvladev to make credentials available via some temporary subresource or a similar mechanism will require an extension API server and won't work with CRDs. It seems undesirable to us to reuse gardener-apiserver for it (we rather want to decouple gardenctl from g/g's core components) or to add another extension API server, so we would need to take another approach. Another option, which I tried to describe above, for making the credentials only accessible to a specific user (the requestor of the "SSH session"), would be to include the user's public GPG key in the
Other questions/topics that are unclear to us / that we are missing in the initial proposal:
@vlerenc Please add if I forgot something. |
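To make the "credentials only accessible for a specific user" option above more concrete, here is a sketch under assumptions: instead of GPG tooling it uses a plain RSA public key of the requester with OAEP, just to show the shape of the flow (the controller encrypts something that only the requester can decrypt); all names are illustrative.

```go
// Hypothetical sketch: the controller encrypts a small secret (e.g. a symmetric
// key that in turn protects a freshly generated SSH private key) for exactly
// one requester, using a public key taken from the SSH resource.
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"encoding/base64"
	"encoding/pem"
	"fmt"
)

// encryptForRequester returns base64 ciphertext that could be stored in the
// resource status; only the owner of the matching private key can read it.
// Note: OAEP can only encrypt small payloads directly, so a real implementation
// would wrap a symmetric key rather than the full SSH private key.
func encryptForRequester(requesterPubPEM, secret []byte) (string, error) {
	block, _ := pem.Decode(requesterPubPEM)
	if block == nil {
		return "", fmt.Errorf("no PEM block found")
	}
	pub, err := x509.ParsePKIXPublicKey(block.Bytes)
	if err != nil {
		return "", err
	}
	rsaPub, ok := pub.(*rsa.PublicKey)
	if !ok {
		return "", fmt.Errorf("expected an RSA public key")
	}
	ciphertext, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, rsaPub, secret, nil)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(ciphertext), nil
}
```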
I didn't understand the remark about the network setup. The heartbeat is coming from whatever client, e.g.
If the landscape admin configures the TTL high enough (like 8h), the client would not necessarily send a heartbeat, however I would plan that
My initial thinking was that it's not reused. If it is reused, there MUST be means to extend its lifetime. If the max lifetime is about to be reached (e.g. it expires in less than 30 minutes) a new Bastion should be created.
Good point, I was actually citing @vlerenc on this one from the gardenctlv2 issue, assuming that there are already existing concepts on gardener that can be reused or that there are some ideas on that already. Seems not to be the case
@timebertt Are you suggesting to have different (temporary) credentials for the bastion and the node? Otherwise I did not get the question |
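A small sketch of the TTL/heartbeat bookkeeping discussed here; all names and thresholds are illustrative, not part of the proposal. With such checks, a client like gardenctl could decide whether to keep renewing the existing bastion or request a new one before the maximum lifetime is reached.

```go
// Hypothetical lifetime checks for a Bastion-like object, following the
// TTL/heartbeat discussion above. All fields and thresholds are examples.
package lifetime

import "time"

type BastionInfo struct {
	CreatedAt     time.Time
	LastHeartbeat time.Time
}

// ShouldDelete reports whether the controller should garbage-collect the
// bastion because no client renewed it within the heartbeat TTL.
func ShouldDelete(b BastionInfo, heartbeatTTL time.Duration, now time.Time) bool {
	return now.Sub(b.LastHeartbeat) > heartbeatTTL
}

// NeedsReplacement reports whether a client should rather create a new bastion
// because the maximum lifetime is about to be reached (e.g. < 30m remaining).
func NeedsReplacement(b BastionInfo, maxLifetime, minRemaining time.Duration, now time.Time) bool {
	expiresAt := b.CreatedAt.Add(maxLifetime)
	return expiresAt.Sub(now) < minRemaining
}
```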
@petersutter I was coming from the practical side and how to use this feature in conjunction with custom SSH clients, etc. If the heart-beat is not short-lived, but rather 4h or 8h (and can be extended once or twice), great. :-)
Personally, I would rather handle it this way:
Bastions are heavy-weight. I would definitely reuse them/not recreate them again and again if multiple operators need access. This is helpful not only in the case of multiple operators accessing one shoot, but also if we want fine-grained access to individual nodes. You wouldn't want to create a bastion to access node A and another one to access node B. |
I would once again come back to the proposal from @timebertt of using the user's public key for the authentication, however with some slight adjustments. @vlerenc you said that
However, the validating webhook could take a look at the algorithm used, the validity bounds (
Is that information in a plain (SSH RSA) private/public key? Aren't these fields only available in certs? Also, if we require strong/quick pseudo-rotation/validity, how would that help? |
Oh right, I was not thinking of that when proposing to use an SSH public key. This does not work then.
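To illustrate the distinction from this exchange (SSH certificates carry validity bounds, plain public keys do not), a hypothetical helper in Go:

```go
// Hypothetical check: only an *ssh.Certificate carries validity bounds;
// a plain public key parsed from authorized_keys format does not.
package webhook

import (
	"fmt"
	"time"

	"golang.org/x/crypto/ssh"
)

func checkValidityBounds(authorizedKey []byte, now time.Time) error {
	key, _, _, _, err := ssh.ParseAuthorizedKey(authorizedKey)
	if err != nil {
		return fmt.Errorf("cannot parse key: %w", err)
	}
	cert, ok := key.(*ssh.Certificate)
	if !ok {
		// A plain public key: there simply are no validity fields to inspect.
		return fmt.Errorf("key of type %q is not a certificate and has no validity bounds", key.Type())
	}
	validAfter := time.Unix(int64(cert.ValidAfter), 0)
	validBefore := time.Unix(int64(cert.ValidBefore), 0)
	if now.Before(validAfter) || now.After(validBefore) {
		return fmt.Errorf("certificate not valid at %s (valid from %s until %s)", now, validAfter, validBefore)
	}
	return nil
}
```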
How does the exit/clean-up mechanism for the bastion work in v2? Maybe it was already discussed somewhere and I missed it?
That will bring a dilemma: the cleanup process from user A will destroy the bastion host and the firewall rules (public IP A), and then the process breaks when removing the security group because public IP B still exists. User B then loses the session because the bastion host is gone, while the security group (public IP B) remains at the hyperscaler. |
@tedteng That's what is meant by the reference counter for the bastion above. The controller in the seed "keeps book" of all "intents" for this bastion and deletes the bastion (ideally delayed, but that's a bit more effort) after the last "intent" (= usage) is gone. |
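A sketch of that bookkeeping, with made-up names, just to illustrate that the bastion is only deleted once the last referencing request is gone:

```go
// Hypothetical bookkeeping for bastion reuse: count all SSH requests ("intents")
// that still reference a given bastion and only allow deletion when none remain.
package bookkeeping

// SSHRequest is a stand-in for the SSH/bastion-usage resource discussed above.
type SSHRequest struct {
	Name        string
	BastionName string
	Deleted     bool
}

// ActiveIntents returns how many live requests still reference the bastion.
func ActiveIntents(requests []SSHRequest, bastionName string) int {
	count := 0
	for _, r := range requests {
		if !r.Deleted && r.BastionName == bastionName {
			count++
		}
	}
	return count
}

// MayDelete reports whether the seed controller may clean up the bastion.
// A real controller would likely also add a grace period before deletion.
func MayDelete(requests []SSHRequest, bastionName string) bool {
	return ActiveIntents(requests, bastionName) == 0
}
```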
Will open a new issue with a new proposal with a different approach as a result of "offline" discussions -> #510 |
Motivation
`gardenctl` (v1) has the functionality to set up SSH sessions to the targeted shoot cluster. For this, infrastructure resources like VMs, firewall rules etc. have to be created. `gardenctl` will clean up the resources after the SSH session. However, there were issues in the past where these infrastructure resources did not get cleaned up properly, e.g. due to some error that was not retried. Hence the proposal to have a dedicated controller (for each infrastructure) that manages the infrastructure resources. `gardenctl` also re-used the SSH node credentials, which were used by every user. Instead, SSH credentials should be created for each user / SSH session.
Assumption
The SSH controller on the seed has access to the garden cluster and is allowed to read/watch `SSH` resources created by the user.
New Components involved in SSH Flow (naming to be defined)
- `gardenctlv2`
- `SSHConfig`: contains provider-specific configuration that is embedded into the `SSH` resource
- … (`SSH` resource)
SSH Flow
- `gardenctlv2` creates the `SSH` resource in the garden cluster (see resource example below)
- … `SSH` resource: `spec.seedName`, `spec.providerType`
- Validating Webhook (Central SSH Controller) ensures:
  - that there is a validating webhook configuration for the `SSH` resource
  - that the user has a certain role in the project where the referenced cluster is (or has the permission to read the secret)? What about the garden namespace not having a corresponding project? See question (6) at the bottom
  - that the `SSH` resource was created AFTER the validating webhook configuration was created
- … `SSH` resources for own seed
- … `cloudprovider` credentials from seed-shoot namespace
- … `SSH` resource
- … `SSH` resource gets deleted, all created resources should be cleaned up by the provider-specific SSH controller
Example
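A hypothetical sketch of what the `SSH` resource types could look like in Go; only `spec.seedName`, `spec.providerType` and `spec.clientIP` are taken from this proposal, all other field and type names are assumptions.

```go
// Hypothetical Go types for the proposed SSH resource. Only SeedName,
// ProviderType and ClientIP are mentioned in the proposal text; the
// remaining fields and names are illustrative assumptions.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

type SSH struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   SSHSpec   `json:"spec,omitempty"`
	Status SSHStatus `json:"status,omitempty"`
}

type SSHSpec struct {
	// SeedName is the seed whose SSH controller acts on this resource.
	SeedName string `json:"seedName"`
	// ProviderType selects the provider-specific SSH controller.
	ProviderType string `json:"providerType"`
	// ClientIP restricts access to the requester's external IP (see open questions).
	ClientIP string `json:"clientIP,omitempty"`
	// ProviderConfig carries the embedded provider-specific SSHConfig.
	ProviderConfig *runtime.RawExtension `json:"providerConfig,omitempty"`
}

type SSHStatus struct {
	// Conditions or connection details would be reported here by the controller.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```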
Open questions
- … `SSH` resource?
- … `spec.seedName` in the `SSH` resource?
  a) "static" kubeconfig is set in the controller registrations' `providerConfig.values`, see https://github.com/gardener/gardener/blob/master/docs/extensions/controllerregistration.md#scenario-1-deployed-by-gardener
  b) Sync with @mvladev regarding the concept of authentication between clusters without static credentials
- `spec.clientIP` in the `SSH` resources exposes the external IP of the user to an attacker that has access to the garden cluster with read permission for the `SSH` resources
- How to ensure that the user is allowed to create an SSH session for the respective shoot? With the `terminal-controller-manager`, the approach is to check if the user has the permission to read the referenced kubeconfig secret. If allowed, the `Terminal` resource is created and the controller acts upon it. The proposal above was to check if the user has a certain role in the project; however, operators currently have no role in a user's project. So there must be other means to check this. Ideas?