gardenctl v2 (overhaul it, make it safer, and more extensible) #499
Comments
Thanks for summarizing, @vlerenc! Here are some additional thoughts from my side. They don't have to be discussed here, just wanted to put them on the record somewhere for future discussions.
I think we have multiple options for realizing this on the executable/plugin level:
Probably the decision here is about "simple with overhead vs. complex with clean abstraction". Also, I'm still struggling with how the
Probably the same questions apply to something like
I thought about this a bit again, and I personally would actually like … Also, I would like to see the
One thing I would also like to see as a targeting mechanism is jumping between clusters/targets in the hierarchy.
Hi,
@timebertt Yes, from experience with /diag in the bot, I can only agree that we should have only one command that executes a multitude of checks, some of which are IaaS-specific. They should all adhere to the same contract, so that the findings can be uniformly aggregated, assessed and visualised (json, yaml, table, markdown).
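For illustration only, a shared contract for such checks could look roughly like the following Go sketch; all type and field names (Check, Finding, the severity values) are assumptions made up for this example, not an existing gardenctl API.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"
)

// Severity classifies how urgent a finding is.
type Severity string

const (
	SeverityInfo     Severity = "info"
	SeverityWarning  Severity = "warning"
	SeverityCritical Severity = "critical"
)

// Finding is the uniform result every check returns, so that findings can be
// aggregated and rendered as json, yaml, table or markdown.
type Finding struct {
	Check          string   `json:"check"`
	Severity       Severity `json:"severity"`
	Description    string   `json:"description"`
	Consequence    string   `json:"consequence"`
	Recommendation string   `json:"recommendation"`
}

// Check is the contract every (possibly IaaS-specific) check implements.
type Check interface {
	Name() string
	Run(ctx context.Context) ([]Finding, error)
}

// webhookCheck is a dummy check used only to illustrate the contract.
type webhookCheck struct{}

func (webhookCheck) Name() string { return "webhooks" }

func (webhookCheck) Run(ctx context.Context) ([]Finding, error) {
	return []Finding{{
		Check:          "webhooks",
		Severity:       SeverityWarning,
		Description:    "validating webhook intercepts kube-system resources",
		Consequence:    "cluster-critical components may be blocked",
		Recommendation: "exclude kube-system via namespaceSelector",
	}}, nil
}

func main() {
	checks := []Check{webhookCheck{}}
	var all []Finding
	for _, c := range checks {
		findings, err := c.Run(context.Background())
		if err != nil {
			fmt.Fprintf(os.Stderr, "check %s failed: %v\n", c.Name(), err)
			continue
		}
		all = append(all, findings...)
	}
	// One of several possible output formats (here: json).
	_ = json.NewEncoder(os.Stdout).Encode(all)
}
```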
Also my favorite, see examples.
That I do not understand. I would still target the shoot and use --control-plane or whatever as an argument to specify that I want to get to the control plane of that shoot cluster; but it's still the shoot I am interested in, and that its control plane is on a seed is an implementation detail.
Hmm… “seed” mixes things for me. There are two reasons to visit a seed:
But I get your point. How about: … But I am not too happy about this either. Some "switch" functionality would be nice. Usually though, I am not switching. In most cases I open two panes, one on the control plane and one on the cluster (I then still have the "coordinates", e.g. the dashboard URL, in my clipboard).
Yes, something like this would work for me. And is probably even cleaner semantically...
I would add one more reason, which also partly motivates the request for switching between Control Plane and shoot.
Yes, that's also often the case for me, which brings me to another question that we haven't covered until now:
Use case a: I'm targeting some control plane and now want to analyze the control plane and the shoot cluster side-by-side in multiple panes. Continuing the session by default is dangerous, so we would need some command to continue the last session on demand, meaning targeting the last cluster again. Then operators can directly jump wherever they want from there on.
Use case b: I'm targeting some control plane and now want to analyze another control plane on the same seed side-by-side in multiple panes. This can probably be supported either by some session mechanism in gardenctl directly or something like the kubeconfig cloning.
Yes, I know the problem and also hope the history feature might help. Something similar to … In the end, all of us will anyhow create aliases for the most useful cases, but if the CLI supports this generally in a clean way, it would be good.
Use case B is (maybe) harder and generally another problem. It's also bugging me and my workflows. Generally, I would very much like to have a solution that provides me with the targetting, but side-effect free. If I target a cluster in one pane and the same in another and then
Well, I can imagine different flows for this:
Or even all of them, so everyone can choose the workflow they like. Personally, I would really like to see support for this, as I heavily use this cloning approach and I guess other folks will find this useful, too.
@timebertt Hmm... isn't a general … The second option, … Therefore, I am clearly for option #3. Every time you target something or retarget it, it's another "instance" of the … The trouble is, this is only because of the namespace-gets-smeared-into-the-kubeconfig problem. :-(
Yeah, I would never clone it to the current working directory. That's apparently quite dangerous. Yes, I would employ a similar mechanism to what I currently do with the terminal-session-specific temp dir. We will anyway need some local cache management for the smart targeting, credentials encryption and so on, so this shouldn't be too much overhead.
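As a rough sketch of that idea (not an existing gardenctl feature): clone the currently targeted kubeconfig into a per-terminal-session directory and point KUBECONFIG at the copy, so that later changes never leak into other panes. The session-ID environment variable and directory layout below are invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// cloneKubeconfig copies src into a session-scoped directory and returns the
// path of the clone, so that later modifications (e.g. a changed namespace or
// context) never affect other terminal panes using the original file.
func cloneKubeconfig(src, sessionID string) (string, error) {
	dir := filepath.Join(os.TempDir(), "gardenctl-sessions", sessionID)
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", err
	}
	dst := filepath.Join(dir, filepath.Base(src))

	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()

	out, err := os.OpenFile(dst, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0o600)
	if err != nil {
		return "", err
	}
	defer out.Close()

	if _, err := io.Copy(out, in); err != nil {
		return "", err
	}
	return dst, nil
}

func main() {
	// "GCTL_SESSION_ID" is a hypothetical per-terminal identifier, e.g. set by
	// the shell integration; it is not an existing gardenctl variable.
	session := os.Getenv("GCTL_SESSION_ID")
	if session == "" {
		session = fmt.Sprint(os.Getppid())
	}
	clone, err := cloneKubeconfig(os.Getenv("KUBECONFIG"), session)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The caller (e.g. a shell hook) would eval this to point at the clone.
	fmt.Printf("export KUBECONFIG=%s\n", clone)
}
```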
@timebertt Hmm... not sure. I like to have a robust solution with only minimal assumptions and integration into "my personal shell environment". The
What is the "local cache management for the smart targeting"? I am torn between my wish to not have side effects and having a shared/global history, so that I can refer to targetted clusters in new/other shells. Shells have the same issue with their history. What's our take here? E.g. looking here at shell features such as:
Hmm... I thought there is no urgent need for a/the tmp folder approach anymore, because of our other measures (OIDC, transient access, local encryption). Anyway, I am not totally against it. If it can be done nicely, OK. I am just saying that a slim solution would be much appreciated, where people can work with the plugins without much ceremony to get it set up first.
I was just saying that we will anyway need some local directory structure for caching the topology detection results and also to store the kubeconfigs with the encrypted credentials somewhere. And if that structure and those mechanisms are already in place, we can also add a temp directory to that, where we can store cloned kubeconfigs.
Sure, thanks @timebertt.
👍
@mvladev You were mentioning that we have another/better option to access the shoots than temporary service accounts (which, incidentally, is also the way the web terminals get access)? Could you please share the link here? |
I was thinking: how about implementing an operation CRD somewhere and exposing functions to support API calls (e.g. HTTP/REST/GraphQL)? The credentials of the shoot cluster/seed cluster would be processed and communicated internally inside the cluster. The API would expose seed/shoot functions which are used by … It would generate output for GitHub comments when called from
I have some thoughts regarding the plugins (e.g.
That means the plugins would not generate the KUBECONFIG file directly. The core of the targetting logic would still remain in … To handle … plugins would target e.g. … Which leads me to a final thought, a similar idea as the API call above: what about making … Then the authentication and authorization would be set up in the API, integrated with SSO or other third-party tools, and by default the API would be used to fetch the garden kubeconfig, seed kubeconfig, and shoot kubeconfig when targeting. The current
In regards to the targetting topic, @danielfoehrKn suggested using https://github.com/danielfoehrKn/kubeswitch, i.e. to split off the
That would not be required. If you like, we can set up a short sync.
Sounds good.
@petersutter and I had a quick sync regarding reusing the targeting functionality and have a very rough idea of how it could look.
Hi @danielfoehrKn, could you please forward the meeting request to me? The SRE team would like to be involved in this too, if appropriate :)
@danielfoehrKn Please add me, I am also interested in this topic and
Sure, will do. |
After thinking about it again, we decided, for now, not to invest in integrating gardenctl v2 with kubeswitch to reuse functionality. Reason:
As a result, Gardenctl needs to implement its own basic kube-context switching functionality. This should not be much effort and can also be copied from / inspired by kubeswitch (it is not much code, actually).
The contract between Gardenctl and another tool such as kubeswitch would be the current kubeconfig pointing to a Shoot/Garden/Seed cluster.
Determining the Garden cluster for a Shoot/Seed relies on the cluster identity config map present in Shoot clusters, Shooted seeds, and the Garden cluster. To sum it up: We think that Gardenctl and kubeswitch can be used in a complementary fashion but do not necessarily need to be integrated. We propose to narrow the scope of Gardenctl to exclude fuzzy targeting.
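For illustration, looking up that cluster identity could work roughly as follows; the ConfigMap and data key names follow the Gardener convention as far as I remember it and should be treated as assumptions to double-check.

```go
package main

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// clusterIdentity reads the identity of the cluster behind the given
// kubeconfig from the cluster-identity ConfigMap in kube-system.
func clusterIdentity(ctx context.Context, kubeconfigPath string) (string, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return "", err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return "", err
	}
	cm, err := client.CoreV1().ConfigMaps("kube-system").Get(ctx, "cluster-identity", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// Data key assumed; verify against the Gardener docs.
	return cm.Data["cluster-identity"], nil
}

func main() {
	id, err := clusterIdentity(context.Background(), os.Getenv("KUBECONFIG"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// A mapping from cluster identity to the configured gardens would then
	// tell gardenctl which garden to talk to for the currently targeted cluster.
	fmt.Println("cluster identity:", id)
}
```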
However @tedteng @neo-liang-sap you are more than welcome to approach me if you are interested in a quick introduction to kubeswitch (for fuzzy search over all landscapes, history targeting, aliasing, etc.). |
Motivation
gardenctl v1 was written well before Gardener extensibility and kubectl plugins. It also has a lax handling of kubeconfigs, as it uses admin kubeconfigs and doesn't rewrite them with OIDC kubeconfigs if possible. Therefore, we'd like to suggest a v2 overhaul of gardenctl. Furthermore, there was usability feedback in regards to the targetting that we want to address with v2 as well.
Proposal
- Let's build gardenctl v2 as kubectl plugins, as this speaks to the community and already features an extension concept that we need for IaaS-specific subcommands.
- One (target) or multiple (garden, seed, project, shoot) plugins that could/would deal with managing the kubeconfigs, i.e. what is called targetting in gardenctl v1.
- Plugins for the other gardenctl commands, e.g. logs, shell, etc.
- Plugins for ssh, resources, orphans, the various CLIs, etc.
- Operators can pull the gardenctl plugin configuration from GitHub, Vault or wherever they hold the garden configurations.
- The gardenctl config should be as minimal as possible: gardenctl should cache information such as domains and identities locally in a gardenctl local config folder. That will be useful for smart and context-aware targetting (see below).
- Installation via brew/krew.
- Shell integration (e.g. $PROMPT_COMMAND) to inject the new kubeconfig into the parent process (the shell), so that one can directly work with the cluster with standard tools (such as, again, kubectl and others).
- Hierarchical targetting via garden, then seed or project, then shoot "steps" (classical gardenctl approach).
- Direct targetting via garden/project/seed, e.g. gardenctl shoot -g prod -p core funny-cluster, that people can then put behind their own shell aliases, so that switching over to v2 gets simple.
- History targetting, e.g. going back to the last kubeconfig, the N'th last kubeconfig, or selecting from a list of last targetted clusters.
- OIDC kubeconfigs for garden and seed clusters.
- kubeconfigs encrypted and decrypted on-the-fly with personal credentials, e.g. the user's GPG key. This can be done by providing a binary/command that gets injected into the kubeconfig (like it is the case for the oidc-plugin, see https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins) that gets invoked and can either even directly access our infrastructure (slow) or decrypt a previously gardenctl-encrypted token on-the-fly (a rough sketch follows below).
- Rather than keeping kubeconfigs on disks (even if encrypted and placed into a RAM disk that gets cleaned up), let's implement a controller in the seed that generates personalised kubeconfigs with admin service accounts in the shoot that get removed again automatically after 8h or whatever. This way, operators would have auditable and personalised kubeconfigs that last only for a given time.
- The cleanup for the ssh and shell commands shall not happen within the CLI (in case it panics, loses connectivity, etc.), but be executed by another controller in the seed that always takes care (safely) of the cleanup.
- Credentials for ssh are created by the above seed controller on-the-fly as well, are then fetched by the node and injected into the sshd configuration, and get removed again automatically after 8h or whatever. This way, operators would have auditable and personalised ssh credentials that last only for a given time.
- Let's open gardenctl up to end users as well, e.g.:
  - shell to schedule a regular or privileged pod in a cluster (on any node or a particular node); possibly not necessary if solutions like https://github.com/kvaps/kubectl-node-shell could replace it (however, it would require supplying a configurable image like our ops-toolbelt image)
  - aliyun|aws|az|gcloud|openstack to invoke the CLIs, so we should continue to support these
  - ssh into a node, but it should be reimplemented using the SDKs directly instead of invoking the CLIs
  - diag (merge with orphan) to run a self-diagnostic on a shoot (much like the robot does on /diag, see below), but it should work for operators (full diagnosis, as they have access to the seed clusters, which the command should silently obtain if possible) and end users (who have only access to the shoot cluster)
  - logs to fetch the logs for a given time window and component (not individual pod) from Loki, which helps tremendously with rescheduled or deleted pod logs and which we therefore should continue to support
  - info to get landscape information, though this information should also be available in the garden monitoring
  - ls to list gardens, seeds, projects, shoots, and issues
  - download/terraform to download/execute everything that's necessary for the infrastructure bring-up if the extension is using the default TF support in Gardener
- Let's retire the robot shoot probes and instead have the probes in gardenctl implemented in Go using the Kubernetes client and possibly native SDKs; the latter should be extensible, as the robot shoot probes are. Like the robot shoot probes, the probes should not only list resources (as is the case today in gardenctl with the diag command, which is not helpful), but check for actual issues and assess the situation with "severity", "description", "consequence", and "recommendation" (see the robot shoot probes, e.g. web hooks or PDBs).

Note: Ideally, this description is moved to a docs PR, something like a GEP, to facilitate collaboration and detailing it out.
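To make the encrypted-kubeconfig item from the list above a bit more concrete, here is a rough sketch of writing a kubeconfig that delegates credential retrieval to an exec credential plugin, as described in the linked client-go docs. The gardenctl subcommand and flag used as the exec command are purely hypothetical.

```go
package main

import (
	"fmt"
	"os"

	"k8s.io/client-go/tools/clientcmd"
	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

// writeExecKubeconfig writes a kubeconfig whose user entry holds no static
// credentials; instead, kubectl/client-go invokes the configured command on
// demand. The "gardenctl decrypt-token" command is invented for illustration.
func writeExecKubeconfig(path, server, name string) error {
	cfg := clientcmdapi.NewConfig()
	cfg.Clusters[name] = &clientcmdapi.Cluster{Server: server}
	cfg.AuthInfos[name] = &clientcmdapi.AuthInfo{
		Exec: &clientcmdapi.ExecConfig{
			APIVersion: "client.authentication.k8s.io/v1beta1",
			Command:    "gardenctl",
			Args:       []string{"decrypt-token", "--cluster", name},
		},
	}
	cfg.Contexts[name] = &clientcmdapi.Context{Cluster: name, AuthInfo: name}
	cfg.CurrentContext = name
	return clientcmd.WriteToFile(*cfg, path)
}

func main() {
	if err := writeExecKubeconfig("shoot.kubeconfig", "https://api.example.invalid", "funny-cluster"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```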
Targetting
Hierarchical, direct, domain, fuzzy, and history targetting should be possible. Here are some examples of how this could look. Some expressions are lengthy, but if they offer all options, these can then be put behind personal shell aliases (or functions) that operators already use today, which should help with broad adoption, which is in our interest to remove the security hazards of v1 with v2. The targetting should be smart and context-aware, e.g. if the currently active kubeconfig is for a seed or shoot in the prod landscape, that's your targetted/context garden. If you then target a project or another seed or shoot, it should automatically happen within this garden cluster. It is not yet clear how fluent this should be, e.g. if a cluster is not found on one garden, are then really all gardens included in a fuzzy search?
Hierarchical Targetting
Note: Targetting a project targets the backing namespace in the corresponding garden cluster.
Direct Targetting
Note: Targetting a project targets the backing namespace in the corresponding garden cluster.
Domain Targetting
Domain targetting extracts the domain from an API server or Gardener Dashboard URL and matches it against the domain secrets in the garden namespaces of all configured gardens (pre-fetched, of course) or accesses the cluster-identity configmap in the kube-system namespace of the shoot if the shoot uses a custom, i.e. unknown domain (slower, therefore only second option).
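A minimal sketch of the fast path (matching the host of a pasted URL against a prefetched set of known domains); the cache structure and example domain are assumptions, and a real implementation would fall back to the cluster-identity lookup for unknown domains.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// target identifies a cluster resolved from a domain.
type target struct {
	Garden, Project, Shoot string
}

// byDomain stands in for the prefetched cache mapping known cluster domains
// (from the domain secrets in the garden namespaces) to targets.
var byDomain = map[string]target{
	"api.funny-cluster.core.shoot.prod.example.com": {Garden: "prod", Project: "core", Shoot: "funny-cluster"},
}

// resolve extracts the host from an API server or dashboard URL and looks it
// up in the cache.
func resolve(raw string) (target, bool) {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return target{}, false
	}
	t, ok := byDomain[strings.ToLower(u.Hostname())]
	return t, ok
}

func main() {
	t, ok := resolve("https://api.funny-cluster.core.shoot.prod.example.com")
	fmt.Println(t, ok)
}
```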
Fuzzy Targetting
If a cluster is targetted via one of the above means and the selection is unambiguous, access is fastest. If, however, fuzzy search is directly invoked or a selection is ambiguous, a cluster metadata cache is accessed and the cache is refreshed in the background (possibly even while typing). If the cache is empty, it always needs to be built up/refreshed first; that usually takes only a few seconds, though. However, only the list of seeds and shoots is retrieved and nothing else, especially not sensitive data like kubeconfigs (only metadata).
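One possible shape for such a metadata-only cache, sketched with static data instead of real garden listings; all names are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// clusterMeta holds only non-sensitive metadata, never kubeconfigs.
type clusterMeta struct {
	Garden, Project, Name, Kind string // Kind is "shoot" or "seed"
}

// metaCache is a tiny in-memory stand-in for the on-disk metadata cache.
type metaCache struct {
	mu       sync.RWMutex
	clusters []clusterMeta
}

// refresh would list seeds and shoots from the configured gardens; here it
// installs static data to keep the sketch self-contained.
func (c *metaCache) refresh() {
	fresh := []clusterMeta{
		{Garden: "prod", Project: "core", Name: "funny-cluster", Kind: "shoot"},
		{Garden: "prod", Name: "aws-eu1", Kind: "seed"},
	}
	c.mu.Lock()
	c.clusters = fresh
	c.mu.Unlock()
}

// search returns all cached clusters whose name contains the query; a real
// fuzzy matcher would rank candidates while the cache refreshes in the background.
func (c *metaCache) search(query string) []clusterMeta {
	c.mu.RLock()
	defer c.mu.RUnlock()
	var hits []clusterMeta
	for _, m := range c.clusters {
		if strings.Contains(m.Name, query) {
			hits = append(hits, m)
		}
	}
	return hits
}

func main() {
	cache := &metaCache{}
	cache.refresh()
	fmt.Println(cache.search("funny"))
}
```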
History Targetting
Requires something like a gardenctl cache of last targetted clusters by type (seed or shoot).
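Such a history cache could be as simple as an append-only file per target type in a gardenctl cache directory; the file layout and names below are assumptions, not the actual design.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
)

// historyFile returns the path of the history file for one target type
// ("shoot" or "seed") inside an assumed gardenctl cache directory.
func historyFile(kind string) (string, error) {
	base, err := os.UserCacheDir()
	if err != nil {
		return "", err
	}
	dir := filepath.Join(base, "gardenctl")
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", err
	}
	return filepath.Join(dir, kind+"-history"), nil
}

// appendTarget records one targeted cluster so it can later be offered as
// "last", "N'th last" or in a selection list.
func appendTarget(kind, target string) error {
	path, err := historyFile(kind)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintln(f, target)
	return err
}

// lastTargets returns the most recent entries, newest first.
func lastTargets(kind string, n int) ([]string, error) {
	path, err := historyFile(kind)
	if err != nil {
		return nil, err
	}
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var all []string
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		all = append(all, scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	var out []string
	for i := len(all) - 1; i >= 0 && len(out) < n; i-- {
		out = append(out, all[i])
	}
	return out, nil
}

func main() {
	_ = appendTarget("shoot", "prod/core/funny-cluster")
	last, _ := lastTargets("shoot", 3)
	fmt.Println(last)
}
```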