[Azure] After some days etcd-main, etcd-events & kops-controller pods of Azure KOPS clusters filled with 401 errors while trying to access kops storage account #16839
/kind bug
After a few days, the etcd-main, etcd-events, and kops-controller pods of Azure kops clusters start logging 401 errors when they access the kops storage account.
We have seen this in multiple clusters.
After some more days, the pods start reporting:
AuthenticationErrorDetail: Lifetime validation failed. The token is expired
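To confirm that the failures come from an expired token rather than a missing role assignment, the expiry of a captured token can be inspected. This is a minimal sketch, assuming the access token is a JWT (as Azure AD access tokens are) and that a copy of it is available. The helper names `jwt_expiry` and `is_expired` are hypothetical and are not part of kops.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT. The signature is NOT verified."""
    payload_b64 = token.split(".")[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(claims["exp"])


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """True if the token's `exp` claim is at or before `now` (defaults to the current time)."""
    current = time.time() if now is None else now
    return current >= jwt_expiry(token)
```

If `is_expired` returns True for the token a pod is still presenting, the pod is reusing a stale credential, which would match the "Lifetime validation failed" message.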
Temporary fix: the kops-controller pod can be fixed by deleting it; the new pod comes up fine. For the etcd pods, we have to restart the control-plane machine.
Expected: Token refresh should happen automatically for system identity.
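The expected behavior can be sketched as a token source that caches the token and fetches a new one shortly before it expires, instead of holding the first token for the lifetime of the process. This is only an illustration of the pattern, not how kops or the Azure SDK implement it. `Token`, `RefreshingTokenSource`, and the `fetch` callable are hypothetical names, and the 300-second refresh margin is an assumed value.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # Unix timestamp in seconds


class RefreshingTokenSource:
    """Caches a token and fetches a new one shortly before the cached one expires."""

    def __init__(
        self,
        fetch: Callable[[], Token],
        skew_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch        # hypothetical callable that obtains a fresh token
        self._skew = skew_seconds  # assumed margin: refresh this long before expiry
        self._clock = clock        # injectable so the behavior can be tested
        self._cached: Optional[Token] = None

    def get(self) -> str:
        """Return a token value that is valid for at least `skew_seconds` more."""
        now = self._clock()
        if self._cached is None or now >= self._cached.expires_at - self._skew:
            self._cached = self._fetch()
        return self._cached.value
```

With a source like this, every storage request would call `get()` and long-running pods would pick up a new token without a restart.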
1. What kops version are you running? The command kops version will display this information.
v1.28.5
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
v1.28.11
3. What cloud provider are you using?
azure
4. What commands did you run? What is the simplest way to reproduce this issue?
kubectl -n kube-system logs -f etcd-manager-main-control-plane-eastus2-3000005
5. What happened after the commands executed?
The stack trace mentioned above appears in the logs, with 401 errors when accessing the kops storage account.
6. What did you expect to happen?
No error
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else do we need to know?
The kops-controller pod can be fixed by deleting it; the new pod comes up fine. For the etcd pods, we have to restart the control-plane machine.
After some more days, it is filled with
cc: @hakman