This repository has been archived by the owner on Jan 24, 2023. It is now read-only.

Error while deploying openness release 20.12 for edge node. #77

Open
pushpraj527 opened this issue Dec 21, 2020 · 11 comments

Comments

@pushpraj527

Hi,
We tried deploying openness-experience-kit (v20.12). We were able to deploy the controller successfully, but while deploying the node we faced the issue mentioned below. I am also attaching the deployment log here (
2020-12-21_14-31-23_ansible.log
). Please let me know if anything else is required.

2020-12-21 14:53:50,037 p=17432 u=root n=ansible | TASK [openness/node : wait for Kafka CA and User secrets] **********************
2020-12-21 14:53:50,037 p=17432 u=root n=ansible | task path: /home/centos/aman/openness-experience-kits/roles/openness/node/tasks/prebuild/kafka_certs.yml:23
2020-12-21 14:53:50,892 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (60 retries left).
2020-12-21 14:54:52,117 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (59 retries left).
2020-12-21 14:55:53,300 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (58 retries left).
2020-12-21 14:56:54,469 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (57 retries left).
2020-12-21 14:57:58,617 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (56 retries left).
2020-12-21 14:59:00,383 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (55 retries left).
2020-12-21 15:00:01,794 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (54 retries left).
2020-12-21 15:01:03,138 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (53 retries left).
2020-12-21 15:02:04,407 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (52 retries left).
2020-12-21 15:03:05,760 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (51 retries left).
2020-12-21 15:04:07,640 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (50 retries left).
2020-12-21 15:05:09,128 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (49 retries left).
2020-12-21 15:06:10,436 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (48 retries left).
2020-12-21 15:07:11,940 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (47 retries left).
2020-12-21 15:08:15,510 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (46 retries left).
2020-12-21 15:09:16,786 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (45 retries left).
2020-12-21 15:10:18,205 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (44 retries left).
2020-12-21 15:11:19,646 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (43 retries left).
2020-12-21 15:12:20,947 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (42 retries left).
2020-12-21 15:13:22,296 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (41 retries left).
2020-12-21 15:14:23,920 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (40 retries left).
2020-12-21 15:15:25,327 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (39 retries left).
2020-12-21 15:16:26,751 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (38 retries left).
2020-12-21 15:17:28,155 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (37 retries left).
2020-12-21 15:18:32,401 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (36 retries left).
2020-12-21 15:19:33,872 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (35 retries left).
2020-12-21 15:20:36,335 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (34 retries left).
2020-12-21 15:21:37,702 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (33 retries left).
2020-12-21 15:22:39,036 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (32 retries left).
2020-12-21 15:23:40,375 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (31 retries left).
2020-12-21 15:24:41,715 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (30 retries left).
2020-12-21 15:25:43,159 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (29 retries left).
2020-12-21 15:26:45,648 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (28 retries left).
2020-12-21 15:27:47,039 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (27 retries left).
2020-12-21 15:28:50,458 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (26 retries left).
2020-12-21 15:29:52,203 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (25 retries left).
2020-12-21 15:30:53,609 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (24 retries left).
2020-12-21 15:31:54,969 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (23 retries left).
2020-12-21 15:32:56,339 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (22 retries left).
2020-12-21 15:33:57,995 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (21 retries left).
2020-12-21 15:34:59,529 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (20 retries left).
2020-12-21 15:36:00,889 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (19 retries left).
2020-12-21 15:37:02,278 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (18 retries left).
2020-12-21 15:38:03,674 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (17 retries left).
2020-12-21 15:39:10,494 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (16 retries left).
2020-12-21 15:40:11,852 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (15 retries left).
2020-12-21 15:41:14,020 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (14 retries left).
2020-12-21 15:42:15,470 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (13 retries left).
2020-12-21 15:43:16,845 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (12 retries left).
2020-12-21 15:44:18,169 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (11 retries left).
2020-12-21 15:45:19,598 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (10 retries left).
2020-12-21 15:46:20,973 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (9 retries left).
2020-12-21 15:47:22,383 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (8 retries left).
2020-12-21 15:48:23,708 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (7 retries left).
2020-12-21 15:49:28,033 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (6 retries left).
2020-12-21 15:50:29,519 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (5 retries left).
2020-12-21 15:51:31,098 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (4 retries left).
2020-12-21 15:52:32,647 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (3 retries left).
2020-12-21 15:53:34,163 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (2 retries left).
2020-12-21 15:54:35,627 p=17432 u=root n=ansible | FAILED - RETRYING: wait for Kafka CA and User secrets (1 retries left).
2020-12-21 15:55:37,504 p=17432 u=root n=ansible | fatal: [node01 -> 192.168.0.16]: FAILED! => {
"attempts": 60,
"changed": false,
"cmd": "restartCounts=kubectl get pods -n kafka -o json | jq -r '.items[] | [.status.containerStatuses[].restartCount] | @sh'\nfor restartCount in $restartCounts; do\n if [ $((restartCount + 0)) -gt 10 ]; then\n exit -1\n fi\ndone\nkubectl get secret cluster-cluster-ca-cert -n kafka && kubectl get secret eaa-kafka -n kafka\n",
"delta": "0:00:00.713457",
"end": "2020-12-21 15:55:37.251058",
"rc": 1,
"start": "2020-12-21 15:55:36.537601"
}

STDOUT:

NAME TYPE DATA AGE
cluster-cluster-ca-cert Opaque 3 56m

STDERR:

jq: error (at :1756): Cannot iterate over null (null)
Error from server (NotFound): secrets "eaa-kafka" not found

MSG:

non-zero return code
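
For reference, the failing check is the shell fragment embedded in the "cmd" field above; unescaped (and with the command substitution that the log rendering appears to have dropped restored), it is roughly the following, and can be rerun by hand on the controller to reproduce the wait condition:

restartCounts=$(kubectl get pods -n kafka -o json | jq -r '.items[] | [.status.containerStatuses[].restartCount] | @sh')
for restartCount in $restartCounts; do
  # give up if any Kafka container is crash-looping (more than 10 restarts)
  if [ $((restartCount + 0)) -gt 10 ]; then
    exit -1
  fi
done
# both secrets must exist for the task to pass; per the STDOUT/STDERR above,
# cluster-cluster-ca-cert exists but eaa-kafka was never created
kubectl get secret cluster-cluster-ca-cert -n kafka && kubectl get secret eaa-kafka -n kafka

The jq error in STDERR ("Cannot iterate over null") just means one of the kafka pods had no containerStatuses yet; the actual failure is that the eaa-kafka secret was never created.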

@amr-mokhtar
Contributor

Hi @pushpraj527! This might be happening due to slow download speed. Can you give it another try and see if the issue still persists?
If it persists, please include the status of all the pods.
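A minimal way to collect that status (plain kubectl, nothing OpenNESS-specific assumed):

kubectl get pods --all-namespaces -o wide
kubectl get secrets --all-namespaces
kubectl get events -n kafka --sort-by=.lastTimestamp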

@pushpraj527
Author

Hi @amr-mokhtar, I tried two times and got this issue both times. The pod status and secret status are listed below.

podStatus.txt

NAMESPACE NAME READY STATUS RESTARTS AGE
cdi cdi-apiserver-85ff78c47c-mkwfx 1/1 Running 0 15h
cdi cdi-deployment-5c947c965f-pqb26 1/1 Running 0 15h
cdi cdi-operator-7466c8c6b-vgkhw 1/1 Running 2 18h
cdi cdi-uploadproxy-6887998f8d-kll8g 1/1 Running 0 15h
harbor harbor-app-harbor-chartmuseum-6cbd5c5bbb-vqxsr 1/1 Running 0 18h
harbor harbor-app-harbor-clair-779df4555b-jx4tz 2/2 Running 0 18h
harbor harbor-app-harbor-core-7cd94df459-p6xk2 1/1 Running 0 18h
harbor harbor-app-harbor-database-0 1/1 Running 0 18h
harbor harbor-app-harbor-jobservice-864f675bfc-zcqdx 1/1 Running 0 18h
harbor harbor-app-harbor-nginx-7dcd9fbc86-8frtx 1/1 Running 0 18h
harbor harbor-app-harbor-notary-server-7945945b9d-tq65h 1/1 Running 0 18h
harbor harbor-app-harbor-notary-signer-7556c8b697-zkm8l 1/1 Running 0 18h
harbor harbor-app-harbor-portal-fd5ff4bc9-8zzf7 1/1 Running 0 18h
harbor harbor-app-harbor-redis-0 1/1 Running 0 18h
harbor harbor-app-harbor-registry-6c66d95768-pt26k 2/2 Running 0 18h
harbor harbor-app-harbor-trivy-0 1/1 Running 0 18h
kafka cluster-kafka-0 0/2 ContainerCreating 0 6m44s
kafka cluster-zookeeper-0 1/1 Running 0 15h
kafka strimzi-cluster-operator-68b6d59f74-229tw 0/1 Running 19 18h
kube-system coredns-f9fd979d6-c6mrm 1/1 Running 0 18h
kube-system coredns-f9fd979d6-x2dc7 1/1 Running 0 18h
kube-system descheduler-cronjob-1608563400-zccqz 0/1 Completed 0 14h
kube-system descheduler-cronjob-1608563520-5pm6p 0/1 RunContainerError 0 14h
kube-system descheduler-cronjob-1608563520-8qj94 0/1 RunContainerError 0 14h
kube-system descheduler-cronjob-1608563520-crlv2 0/1 Completed 0 14h
kube-system descheduler-cronjob-1608568320-j84f8 0/1 Completed 0 13h
kube-system descheduler-cronjob-1608576720-nmbwr 0/1 RunContainerError 0 11h
kube-system descheduler-cronjob-1608576720-nqzrt 0/1 ContainerCreating 0 7h9m
kube-system etcd-edgecontroller2 1/1 Running 0 18h
kube-system kube-apiserver-edgecontroller2 1/1 Running 0 18h
kube-system kube-controller-manager-edgecontroller2 1/1 Running 2 18h
kube-system kube-ovn-cni-fq65t 1/1 Running 0 18h
kube-system kube-ovn-cni-rt85w 1/1 Running 2 15h
kube-system kube-ovn-controller-76d6bd7c8d-tzgk6 1/1 Running 0 18h
kube-system kube-ovn-pinger-6f897 1/1 Running 0 18h
kube-system kube-ovn-pinger-gdcqc 1/1 Running 0 15h
kube-system kube-proxy-8gz8g 1/1 Running 0 18h
kube-system kube-proxy-b2l79 1/1 Running 0 15h
kube-system kube-scheduler-edgecontroller2 1/1 Running 0 18h
kube-system ovn-central-5845ddffb5-rcnv7 1/1 Running 0 18h
kube-system ovs-ovn-7v8xg 1/1 Running 1 18h
kube-system ovs-ovn-zkp7j 1/1 Running 3 15h
kubevirt virt-api-666f7455c8-4nvln 1/1 Running 0 15h
kubevirt virt-api-666f7455c8-9vc24 1/1 Running 0 15h
kubevirt virt-controller-9c85d9794-n2x9l 1/1 Running 19 15h
kubevirt virt-controller-9c85d9794-qb45t 1/1 Running 18 15h
kubevirt virt-handler-hpx4z 1/1 Running 0 15h
kubevirt virt-operator-6699ff65f4-45zmc 1/1 Running 13 18h
kubevirt virt-operator-6699ff65f4-m6fnb 1/1 Running 11 18h
openness certsigner-6cb79468b5-t5qpm 0/1 ErrImageNeverPull 0 18h
openness eaa-69c7bb7b5d-h2mtf 0/1 Init:0/2 0 18h
openness edgedns-mrhz7 0/1 Init:ErrImageNeverPull 0 15h
openness interfaceservice-f5vhg 0/1 Init:ErrImageNeverPull 0 15h
openness nfd-release-node-feature-discovery-master-6d978d7668-4flfg 1/1 Running 0 18h
openness nfd-release-node-feature-discovery-worker-djn4q 1/1 Running 3 15h
telemetry cadvisor-lhhln 2/2 Running 0 15h
telemetry collectd-sxv46 2/2 Running 0 15h
telemetry custom-metrics-apiserver-55bdf684ff-xhchz 1/1 Running 0 18h
telemetry grafana-76867c586-h7hlm 2/2 Running 0 17h
telemetry otel-collector-f9b9d494-d5s8q 2/2 Running 25 18h
telemetry prometheus-node-exporter-4sg9m 1/1 Running 0 15h
telemetry prometheus-server-8656f6bf98-p84f9 3/3 Running 0 18h
telemetry telemetry-aware-scheduling-554db589c4-2xqdv 2/2 Running 0 18h
telemetry telemetry-collector-certs-gzggr 0/1 Completed 0 18h
telemetry telemetry-node-certs-8frdk 1/1 Running 0 15h

secrets.txt
NAMESPACE NAME TYPE DATA AGE
cdi cdi-api-signing-key Opaque 2 15h
cdi cdi-apiserver-server-cert Opaque 2 15h
cdi cdi-apiserver-signer Opaque 2 15h
cdi cdi-apiserver-token-lj95d kubernetes.io/service-account-token 3 15h
cdi cdi-operator-token-n8t27 kubernetes.io/service-account-token 3 18h
cdi cdi-sa-token-rnrrm kubernetes.io/service-account-token 3 15h
cdi cdi-uploadproxy-server-cert Opaque 2 15h
cdi cdi-uploadproxy-signer Opaque 2 15h
cdi cdi-uploadproxy-token-txlwj kubernetes.io/service-account-token 3 15h
cdi cdi-uploadserver-client-cert Opaque 2 15h
cdi cdi-uploadserver-client-signer Opaque 2 15h
cdi cdi-uploadserver-signer Opaque 2 15h
cdi default-token-jfzz9 kubernetes.io/service-account-token 3 18h
default ca-certrequester Opaque 1 18h
default default-token-mwslf kubernetes.io/service-account-token 3 18h
default sh.helm.release.v1.prometheus-adapter.v1 helm.sh/release.v1 1 18h
harbor default-token-pwlj6 kubernetes.io/service-account-token 3 18h
harbor harbor-app-harbor-chartmuseum Opaque 1 18h
harbor harbor-app-harbor-clair Opaque 3 18h
harbor harbor-app-harbor-core Opaque 8 18h
harbor harbor-app-harbor-database Opaque 1 18h
harbor harbor-app-harbor-jobservice Opaque 2 18h
harbor harbor-app-harbor-nginx Opaque 3 18h
harbor harbor-app-harbor-notary-server Opaque 5 18h
harbor harbor-app-harbor-registry Opaque 3 18h
harbor harbor-app-harbor-trivy Opaque 2 18h
harbor sh.helm.release.v1.harbor-app.v1 helm.sh/release.v1 1 18h
kafka cluster-clients-ca Opaque 1 15h
kafka cluster-clients-ca-cert Opaque 3 15h
kafka cluster-cluster-ca Opaque 1 15h
kafka cluster-cluster-ca-cert Opaque 3 15h
kafka cluster-cluster-operator-certs Opaque 4 15h
kafka cluster-kafka-brokers Opaque 4 14h
kafka cluster-kafka-token-8g8zq kubernetes.io/service-account-token 3 14h
kafka cluster-zookeeper-nodes Opaque 4 15h
kafka cluster-zookeeper-token-rv9lv kubernetes.io/service-account-token 3 15h
kafka default-token-8pg7g kubernetes.io/service-account-token 3 18h
kafka sh.helm.release.v1.strimzi.v1 helm.sh/release.v1 1 18h
kafka strimzi-cluster-operator-token-2j82w kubernetes.io/service-account-token 3 18h
kube-node-lease default-token-c47zh kubernetes.io/service-account-token 3 18h
kube-public default-token-k44np kubernetes.io/service-account-token 3 18h
kube-system attachdetach-controller-token-rrc8z kubernetes.io/service-account-token 3 18h
kube-system bootstrap-signer-token-jnvmq kubernetes.io/service-account-token 3 18h
kube-system bootstrap-token-bcg7hx bootstrap.kubernetes.io/token 6 18h
kube-system certificate-controller-token-65h2t kubernetes.io/service-account-token 3 18h
kube-system clusterrole-aggregation-controller-token-rcmfg kubernetes.io/service-account-token 3 18h
kube-system coredns-token-4dkp7 kubernetes.io/service-account-token 3 18h
kube-system cronjob-controller-token-tn82f kubernetes.io/service-account-token 3 18h
kube-system daemon-set-controller-token-wd6jw kubernetes.io/service-account-token 3 18h
kube-system default-token-f79hc kubernetes.io/service-account-token 3 18h
kube-system deployment-controller-token-brvf6 kubernetes.io/service-account-token 3 18h
kube-system descheduler-sa-token-glrdj kubernetes.io/service-account-token 3 18h
kube-system disruption-controller-token-n59hd kubernetes.io/service-account-token 3 18h
kube-system endpoint-controller-token-tdkfs kubernetes.io/service-account-token 3 18h
kube-system endpointslice-controller-token-5kpjj kubernetes.io/service-account-token 3 18h
kube-system endpointslicemirroring-controller-token-wrp9c kubernetes.io/service-account-token 3 18h
kube-system expand-controller-token-wq44z kubernetes.io/service-account-token 3 18h
kube-system generic-garbage-collector-token-gxt5p kubernetes.io/service-account-token 3 18h
kube-system horizontal-pod-autoscaler-token-xqpx9 kubernetes.io/service-account-token 3 18h
kube-system job-controller-token-qh9p2 kubernetes.io/service-account-token 3 18h
kube-system kube-proxy-token-jm4dv kubernetes.io/service-account-token 3 18h
kube-system namespace-controller-token-r6bxp kubernetes.io/service-account-token 3 18h
kube-system node-controller-token-b5fwt kubernetes.io/service-account-token 3 18h
kube-system ovn-token-f9drj kubernetes.io/service-account-token 3 18h
kube-system persistent-volume-binder-token-8ztw4 kubernetes.io/service-account-token 3 18h
kube-system pod-garbage-collector-token-bph5c kubernetes.io/service-account-token 3 18h
kube-system pv-protection-controller-token-qv755 kubernetes.io/service-account-token 3 18h
kube-system pvc-protection-controller-token-6wm4j kubernetes.io/service-account-token 3 18h
kube-system replicaset-controller-token-xkf89 kubernetes.io/service-account-token 3 18h
kube-system replication-controller-token-8vwkz kubernetes.io/service-account-token 3 18h
kube-system resourcequota-controller-token-989jw kubernetes.io/service-account-token 3 18h
kube-system service-account-controller-token-8cbgj kubernetes.io/service-account-token 3 18h
kube-system service-controller-token-xj9bt kubernetes.io/service-account-token 3 18h
kube-system statefulset-controller-token-v9q8r kubernetes.io/service-account-token 3 18h
kube-system token-cleaner-token-bgbq4 kubernetes.io/service-account-token 3 18h
kube-system ttl-controller-token-wz6bw kubernetes.io/service-account-token 3 18h
kubevirt default-token-zwcml kubernetes.io/service-account-token 3 18h
kubevirt kubevirt-apiserver-token-k5rzv kubernetes.io/service-account-token 3 15h
kubevirt kubevirt-controller-token-9qvc4 kubernetes.io/service-account-token 3 15h
kubevirt kubevirt-handler-token-42khh kubernetes.io/service-account-token 3 15h
kubevirt kubevirt-operator-certs Opaque 3 15h
kubevirt kubevirt-operator-token-sfrb7 kubernetes.io/service-account-token 3 18h
kubevirt kubevirt-virt-api-certs Opaque 3 15h
kubevirt kubevirt-virt-handler-certs Opaque 3 15h
openness ca-certrequester Opaque 1 18h
openness certgen Opaque 2 18h
openness csr-signer-token-kt9z4 kubernetes.io/service-account-token 3 18h
openness default-token-wr2pv kubernetes.io/service-account-token 3 18h
openness eaa-token-h7fq2 kubernetes.io/service-account-token 3 18h
openness edgedns-token-kqpqh kubernetes.io/service-account-token 3 18h
openness interfaceservice-token-4c4rl kubernetes.io/service-account-token 3 18h
openness nfd-master-token-c2t2z kubernetes.io/service-account-token 3 18h
openness nfd-release-node-feature-discovery-master-cert Opaque 2 18h
openness nfd-release-node-feature-discovery-worker-cert Opaque 2 18h
openness root-ca Opaque 2 18h
openness sh.helm.release.v1.nfd-release.v1 helm.sh/release.v1 1 18h
telemetry certgen Opaque 2 18h
telemetry cm-adapter-serving-certs kubernetes.io/tls 2 18h
telemetry custom-metrics-apiserver-token-kqh9b kubernetes.io/service-account-token 3 18h
telemetry default-token-zzkff kubernetes.io/service-account-token 3 18h
telemetry extender-secret kubernetes.io/tls 2 18h
telemetry grafana Opaque 3 18h
telemetry grafana-test-token-l7977 kubernetes.io/service-account-token 3 18h
telemetry grafana-token-h6zrl kubernetes.io/service-account-token 3 18h
telemetry prometheus-node-exporter-token-zgpqx kubernetes.io/service-account-token 3 18h
telemetry prometheus-server-token-bnk6v kubernetes.io/service-account-token 3 18h
telemetry root-ca Opaque 2 18h
telemetry sh.helm.release.v1.cadvisor.v1 helm.sh/release.v1 1 18h
telemetry sh.helm.release.v1.collectd.v1 helm.sh/release.v1 1 18h
telemetry sh.helm.release.v1.grafana.v1 helm.sh/release.v1 1 18h
telemetry sh.helm.release.v1.otel-collector.v1 helm.sh/release.v1 1 18h
telemetry sh.helm.release.v1.prometheus.v1 helm.sh/release.v1 1 18h
telemetry telemetry-aware-scheduling-service-account-token-v26m7 kubernetes.io/service-account-token 3 18h
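
A minimal sketch for digging into the pods that stand out in the listing above (pod names taken from it; the exact suffixes will differ between runs):

kubectl describe pod cluster-kafka-0 -n kafka
kubectl logs strimzi-cluster-operator-68b6d59f74-229tw -n kafka --previous
kubectl describe pod certsigner-6cb79468b5-t5qpm -n openness

The Events section at the end of each describe output normally shows why a container is stuck in ContainerCreating or ErrImageNeverPull (scheduling, image pull policy, missing local image, and so on).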

@tomaszwesolowski
Contributor

Hi @pushpraj527
It looks like you might have some issues with the Docker daemon. Can you check the Docker version and double-check that everything works correctly? Also, always make sure that the machines you work on are clean and were not used before.
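A quick, non-OpenNESS-specific way to sanity-check the Docker daemon on both machines (standard Docker and systemd commands only):

docker version
docker info
systemctl status docker
journalctl -u docker --since "1 hour ago" --no-pager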

@pushpraj527
Author

Hi @tomaszwesolowski, I tried it on a fresh machine only; nothing was done there manually.

@tomaszwesolowski
Contributor

Hi, could you share with us the Openness_experience_kit_archive.tar.gz file that was created after deployment?

@Jorge-Sasiain

Jorge-Sasiain commented Jan 13, 2021

Hello @tomaszwesolowski

I have the same problem as the issue starter when deploying the edge node (currently at the same step, with 34 retries left and counting).

Please, find my Openness_experience_kit_archive.tar.gz here: 2021_01_13_11_44_59_Openness_experience_kit_archive.tar.gz

In case it's relevant, here's the "describe pod" output of the only pod I found in the kafka namespace: https://pastebin.com/raw/gPZsTYp4
It says that 0/2 nodes are available because of taints the pod didn't tolerate.

My knowledge of Kubernetes is limited, but according to the output below my edge node seems to be ready and has no taints:

[root@openness-controller centos]# kubectl describe node openness-edgenode
(...)
Events:
  Type    Reason                   Age   From        Message
  ----    ------                   ----  ----        -------
  Normal  Starting                 31m   kubelet     Starting kubelet.
  Normal  NodeHasSufficientMemory  31m   kubelet     Node openness-edgenode status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    31m   kubelet     Node openness-edgenode status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     31m   kubelet     Node openness-edgenode status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  31m   kubelet     Updated Node Allocatable limit across pods
  Normal  Starting                 31m   kube-proxy  Starting kube-proxy.
  Normal  NodeReady                29m   kubelet     Node openness-edgenode status is now: NodeReady
[root@openness-controller centos]# kubectl describe node openness-edgenode | grep Taints
Taints:             <none>

On the edge node, the Docker logs of the only container matching "docker container ls | grep kafka" are empty.

I would appreciate any pointers towards finding a solution for this issue. Thanks in advance.

Edit: It seems to be an underlying networking issue on my part, unrelated to taints and tolerations, where nodes just can't reach pods on a different node. I'll edit again to confirm once I can fix it.
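
A rough sketch for checking cross-node pod connectivity (pod name and IP are placeholders; pick one pod on the controller and one on the edge node from the wide listing, and make sure the source pod actually has ping available):

kubectl get pods -A -o wide                                            # note each pod's IP and node
kubectl exec -n kube-system <pod-on-controller> -- ping -c 3 <ip-of-pod-on-edge-node>

If the ping fails only across nodes, the problem is in the underlay/CNI networking rather than in Kafka or the EAA components.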

@iamtiresome

My solution is to run "kubectl taint nodes --all node-role.kubernetes.io/master-" and then run the deploy.py file again.
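
If you try that, you can confirm the taint is gone the same way as shown earlier in this thread (node name is whatever your own cluster uses):

kubectl describe node <node-name> | grep Taints    # should print: Taints: <none>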

@NishankK

My solution is to run "kubectl taint nodes --all node-role.kubernetes.io/master-" and then run the deploy.py file again.

I am facing the exact same issue. Can you please elaborate on what you did?

  1. Where did you run it?
  2. What did you do after running it? (Since it is a "click to deploy" kind of thing, right?) So how do we:
    a. Run deploy.py again, and where is it located?
    b. What are the steps after that?

Details would be very much appreciated.

Thanks,
Nishank

@Jorge-Sasiain

@NishankK In my case the error was caused simply because the pods on my controller node had no connectivity with the pods on the edge node. I would check whether that's your case too. (I don't remember exactly right now, but in my case I had the controller in an OpenStack VM and I think it was an issue with port security rules.)

If I'm understanding correctly, removing the master taint from all nodes as suggested above would allow the EAA pod (or whatever it is) to be scheduled on the controller instead of the edge node, which would place it on the same node as the Kafka pod, but I don't think that's the correct fix (I could be missing something, sorry in advance if that's the case).
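
For anyone hitting the same OpenStack port-security situation mentioned above, a rough sketch of how to inspect it with the standard OpenStack CLI (instance name and port ID are placeholders; disabling port security should only ever be a temporary test):

openstack port list --server <controller-instance-name>
openstack port show <port-id>                                               # check port_security_enabled and attached security groups
openstack port set --no-security-group --disable-port-security <port-id>    # temporary test only; re-enable afterwards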

@NishankK

@NishankK In my case the error was caused simply because the pods on my controller node had no connectivity with the pods on the edge node. I would check whether that's your case too. (I don't remember exactly right now, but in my case I had the controller in an OpenStack VM and I think it was an issue with port security rules.)

If I'm understanding correctly, removing the master taint from all nodes as suggested above would allow the EAA pod (or whatever it is) to be scheduled on the controller instead of the edge node, which would place it on the same node as the Kafka pod, but I don't think that's the correct fix (I could be missing something, sorry in advance if that's the case).

Thanks for replying, Jorge. Actually, I am getting this exact error while deploying OpenNESS on Azure, which is why I got confused. In the Azure deployment case, both the edge and controller nodes are supposed to be deployed on Azure with the click of a button.

Okay, I have one other query, if you can help, please. Is it possible to deploy the OpenNESS controller and edge on VMs rather than servers? If yes, what is the recommended VM configuration, i.e. how many GB of memory, how many CPUs, etc.? I can't find a clear answer to this anywhere.

Many thanks

@Jorge-Sasiain

I'm no expert by any means, so I'll just link you to this thread in OpenNESS-dev (Question number 2 covers that): https://mail.openness.org/archives/developer/2021-January/000225.html
