What happened?
After installing the latest changes of the Katib control plane, I ran kubectl get pod -n kubeflow. The result is:
root@k8master:~# kubectl get pod -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-controller-86fbb67df-5mgpx 0/1 CrashLoopBackOff 52 (4m39s ago) 5h49m
katib-db-manager-7c8745f44b-4tzm5 0/1 CrashLoopBackOff 56 (54s ago) 5h49m
katib-mysql-77b9495867-fqb5l 0/1 Pending 0 5h49m
katib-ui-5d9c77cfc4-4bfzl 1/1 Running 0 5h49m
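For anyone triaging from the pasted listing, here is a minimal sketch that flags which pods are not healthy. It assumes the plain-text output of kubectl get pod has been copied into a string; the helper names parse_pod_table and unhealthy are hypothetical and not part of Katib or kubectl.

```python
from typing import NamedTuple


class PodRow(NamedTuple):
    name: str
    ready: int   # containers currently ready
    total: int   # containers expected
    status: str


def parse_pod_table(text: str) -> list[PodRow]:
    """Parse whitespace-separated `kubectl get pod` output into rows."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header, shell prompts, and anything without a READY column.
        if len(fields) < 3 or fields[0] == "NAME" or "/" not in fields[1]:
            continue
        ready, total = fields[1].split("/")
        rows.append(PodRow(fields[0], int(ready), int(total), fields[2]))
    return rows


def unhealthy(rows: list[PodRow]) -> list[PodRow]:
    """Return pods that are not Running or not fully ready.

    Assumption: every pod here is a long-running service, so a status such
    as Completed would also be reported as unhealthy.
    """
    return [r for r in rows if r.status != "Running" or r.ready < r.total]


# Listing copied from the report above.
SAMPLE = """\
NAME READY STATUS RESTARTS AGE
katib-controller-86fbb67df-5mgpx 0/1 CrashLoopBackOff 52 (4m39s ago) 5h49m
katib-db-manager-7c8745f44b-4tzm5 0/1 CrashLoopBackOff 56 (54s ago) 5h49m
katib-mysql-77b9495867-fqb5l 0/1 Pending 0 5h49m
katib-ui-5d9c77cfc4-4bfzl 1/1 Running 0 5h49m
"""

if __name__ == "__main__":
    for row in unhealthy(parse_pod_table(SAMPLE)):
        print(f"{row.name}: {row.status} ({row.ready}/{row.total} ready)")
```

On this listing the sketch reports three pods: the controller and db-manager in CrashLoopBackOff, and katib-mysql still Pending.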
Then I ran kubectl describe pod katib-controller-86fbb67df-5mgpx -n kubeflow. The result is:
Name: katib-controller-86fbb67df-5mgpx
Namespace: kubeflow
Priority: 0
Service Account: katib-controller
Node: k8node02/192.168.100.12
Start Time: Thu, 10 Oct 2024 02:20:03 +0000
Labels: katib.kubeflow.org/component=controller
katib.kubeflow.org/metrics-collector-injection=disabled
pod-template-hash=86fbb67df
Annotations: prometheus.io/port: 8080
prometheus.io/scrape: true
sidecar.istio.io/inject: false
Status: Running
IP: 10.244.0.3
IPs:
IP: 10.244.0.3
Controlled By: ReplicaSet/katib-controller-86fbb67df
Containers:
katib-controller:
Container ID: docker://ec8cfc87a2c33a75ae61fd2d7ac906ccf52800fb49159e6e6253f129c0fd86bf
Image: docker.io/kubeflowkatib/katib-controller:latest
Image ID: docker-pullable://kubeflowkatib/katib-controller@sha256:103962f0810467fc5f6edcb46b8343387a289dd113dce38933ab15d3b0713261
Ports: 8443/TCP, 8080/TCP, 18080/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Command:
./katib-controller
Args:
--katib-config=/katib-config.yaml
State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 10 Oct 2024 08:10:54 +0000
Finished: Thu, 10 Oct 2024 08:11:24 +0000
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 10 Oct 2024 08:04:52 +0000
Finished: Thu, 10 Oct 2024 08:05:22 +0000
Ready: False
Restart Count: 53
Liveness: http-get http://:healthz/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:healthz/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
KATIB_CORE_NAMESPACE: kubeflow (v1:metadata.namespace)
Mounts:
/katib-config.yaml from katib-config (ro,path="katib-config.yaml")
/tmp/cert from cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s4x2k (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
cert:
Type: Secret (a volume populated by a Secret)
SecretName: katib-webhook-cert
Optional: false
katib-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: katib-config
Optional: false
kube-api-access-s4x2k:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 36m (x39 over 4h20m) kubelet (combined from similar events): Successfully pulled image "docker.io/kubeflowkatib/katib-controller:latest" in 20.234160626s (20.234172377s including waiting)
Warning Unhealthy 6m18s (x261 over 4h49m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 85s (x1164 over 4h48m) kubelet Back-off restarting failed container katib-controller in pod katib-controller-86fbb67df-5mgpx_kubeflow(c1cd3096-6bcc-4db2-969b-8f0ac265ae05)
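As a small aid for reading the State and Last State blocks above, this sketch computes how long each container run lasted from the Started and Finished timestamps. The timestamps are copied from the describe output; the helper name run_seconds is hypothetical, and the format string assumes an English locale for the day and month abbreviations.

```python
from datetime import datetime

# Timestamp layout used by `kubectl describe`, e.g. "Thu, 10 Oct 2024 08:10:54 +0000".
FMT = "%a, %d %b %Y %H:%M:%S %z"


def run_seconds(started: str, finished: str) -> float:
    """Seconds between a container's Started and Finished timestamps."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


# (Started, Finished) pairs from State and Last State in the output above.
RUNS = [
    ("Thu, 10 Oct 2024 08:10:54 +0000", "Thu, 10 Oct 2024 08:11:24 +0000"),
    ("Thu, 10 Oct 2024 08:04:52 +0000", "Thu, 10 Oct 2024 08:05:22 +0000"),
]

if __name__ == "__main__":
    for started, finished in RUNS:
        print(f"{started} -> ran for {run_seconds(started, finished):.0f}s")
```

Both recorded runs lasted 30 seconds before exiting with code 1. What the controller is waiting on during that window is not visible in this output.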
Thanks!
What did you expect to happen?
I expected kubectl get pod -n kubeflow to show every pod as Running:
root@k8master:~# kubectl get pod -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-controller-86fbb67df-5mgpx 1/1 Running 52 (4m39s ago) 5h49m
katib-db-manager-7c8745f44b-4tzm5 1/1 Running 56 (54s ago) 5h49m
katib-mysql-77b9495867-fqb5l 1/1 Running 0 5h49m
katib-ui-5d9c77cfc4-4bfzl 1/1 Running 0 5h49m
Environment
Kubernetes version:
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.0", GitCommit:"1b4df30b3cdfeaba6024e81e559a6cd09a089d65", GitTreeState:"clean", BuildDate:"2023-04-11T17:10:18Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.16", GitCommit:"cbb86e0d7f4a049666fac0551e8b02ef3d6c3d9a", GitTreeState:"clean", BuildDate:"2024-07-17T01:44:26Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
Katib controller version:
docker.io/kubeflowkatib/katib-controller:latest
Katib Python SDK version:
Name: kubeflow-katib
Version: 0.17.0
Summary: Katib Python SDK for APIVersion v1beta1
Home-page: https://github.com/kubeflow/katib/tree/master/sdk/python/v1beta1
Author: Kubeflow Authors
Author-email: [email protected]
License: Apache License Version 2.0
Location: /root/miniconda3/lib/python3.10/site-packages
Requires: certifi, grpcio, kubernetes, protobuf, setuptools, six, urllib3
Required-by: