Error: unhealthy cluster - https://localhost:2379 #3
DNS setup:
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-16.P2.el7_8.6 <<>> axfr @192.168.168.10 snc.test
; (1 server found)
;; global options: +cmd
snc.test. 10800 IN SOA dc1.sa.out.ba. root.sa.out.ba. 18 28800 7200 604800 3600
snc.test. 900 IN NS ns.snc.test.
api.okd4-snc.snc.test. 900 IN A 192.168.168.164
api.okd4-snc.snc.test. 900 IN A 192.168.168.165
_etcd-server-ssl._tcp.okd4-snc.snc.test. 900 IN SRV 0 10 2380 etcd-0.okd4-snc.snc.test.
okd4-snc-host.snc.test. 900 IN A 192.168.168.160
api-int.okd4-snc.snc.test. 900 IN A 192.168.168.164
api-int.okd4-snc.snc.test. 900 IN A 192.168.168.165
*.apps.okd4-snc.snc.test. 900 IN CNAME okd4-snc-master.snc.test.
okd4-snc-bootstrap.snc.test. 900 IN A 192.168.168.164
ns.snc.test. 900 IN A 192.168.168.10
etcd-0.okd4-snc.snc.test. 900 IN A 192.168.168.165
okd4-snc-master.snc.test. 900 IN A 192.168.168.165
snc.test. 10800 IN SOA dc1.sa.out.ba. root.sa.out.ba. 18 28800 7200 604800 3600
[root@hp-144 okd4-snc]# cat ~/bin/setSncEnv.sh
export SNC_DOMAIN=snc.test
export SNC_HOST=192.168.168.160
#export SNC_NAMESERVER=${SNC_HOST}
export SNC_NAMESERVER=192.168.168.10
export SNC_NETMASK=255.255.255.0
export SNC_GATEWAY=192.168.168.254
export INSTALL_HOST_IP=${SNC_HOST}
export INSTALL_ROOT=/usr/share/nginx/html/install
export INSTALL_URL=http://${SNC_HOST}/install
export OKD4_SNC_PATH=/root/okd4-snc
export OKD_REGISTRY=quay.io/openshift/okd
export OKD_RELEASE=4.4.0-0.okd-2020-05-23-055148-beta5
[root@hp-144 okd4-snc]# cat install-config-snc.yaml
FCOS defined in ~/bin/DeployOkdSnc.sh:
CPU="4"
MEMORY="16384"
DISK="200"
FCOS_VER=31.20200505.2.0
FCOS_STREAM=testing
nginx on host is ok: curl http://okd4-snc-host.snc.test/install/fcos/ignition/bootstrap.ign
The FCOS version is old because recent versions of FCOS broke my install. I'm working on an alternative that works with the live ISO. The replacement of the isolinux.cfg that I'm doing in the deployment script no longer works with more recent versions of FCOS... I don't know why yet.

The error that you reported above is normal while the bootstrap is starting up. It can take a few minutes before it's up and listening on port 2379. How long did you wait?
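Since the port can take a few minutes to come up, a polling loop can make the wait explicit instead of guessing. This is only a minimal sketch: the helper names `port_open` and `wait_for_port`, the 600 second limit, and the 10 second interval are assumptions for illustration, not part of the deployment scripts discussed here.

```shell
#!/usr/bin/env bash
# Sketch: poll a TCP port until it accepts connections or a time limit passes.
# Function names, host, and timings are hypothetical.

port_open() {
  # bash's /dev/tcp pseudo-device; the command succeeds only if the connect does.
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

wait_for_port() {
  local host="$1" port="$2" max_wait="${3:-600}" interval="${4:-10}"
  local waited=0
  until port_open "$host" "$port"; do
    if [ "$waited" -ge "$max_wait" ]; then
      echo "gave up on ${host}:${port} after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "${host}:${port} is accepting connections"
}

# Example, run on the bootstrap node:
#   wait_for_port localhost 2379 600 10
```

A non-zero exit after the limit is a hint to start looking at the bootstrap logs rather than waiting longer.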
Just to add, I have also tried installation with newer FCOS images and OKD 4.5.x, with no success. Specifically, with these FCOS versions:
and this OKD release: #export OKD_RELEASE=4.5.0-0.okd-2020-06-29-110348-beta6
It's possible that during the install of the bootstrap node it is upgrading to FCOS 32, which we are having some issues with. See: okd-project/okd#229 and okd-project/okd#238
I've got a long weekend coming up with the holiday here in the States. I hope to get some work done on this. Recent versions of FCOS seem to have broken it.
Thanks for your feedback, @cgruver.
As long as I am writing this :). About 30 minutes. Still the same...
This is the master-side virsh console:
[root@hp-144 okd4-snc]# virsh console okd4-snc-master
The master is obviously stuck at the Ignition phase.
You are right. SSH login to the bootstrap node:
Yes, what you are seeing is similar to the problem that I am having now building a full cluster. The master nodes cannot pull the Ignition config from the bootstrap node. I think this is related to the issues I listed above. Try:
See if you get a
Track progress here: okd-project/okd#239
Digging deeper, I'm not sure you are seeing the issue that we have with FCOS 32 and OKD 4.5... What is the output of:
Run it several times to make sure that DNS round-robin is working. It should hit your bootstrap node.
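One way to see whether round-robin is in effect is to repeat a lookup and count which address comes back first each time. The sketch below is an assumption about how to do that, not the command asked for above: the function names are hypothetical, and the name and nameserver in the example are taken from the DNS setup posted earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch: tally which A record is returned first across repeated lookups.
# Function names are hypothetical.

tally_first_answers() {
  # Reads one address per line on stdin, prints "count address" per distinct address.
  sort | uniq -c | awk '{print $1, $2}'
}

first_answer() {
  # dig +short prints only the answer data; head keeps the first record.
  dig +short "$1" "@$2" | head -n 1
}

check_round_robin() {
  local name="$1" server="$2" tries="${3:-10}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    first_answer "$name" "$server"
    i=$((i + 1))
  done | tally_first_answers
}

# Example:
#   check_round_robin api-int.okd4-snc.snc.test 192.168.168.10 10
```

If only one address ever shows up first, clients that take the first answer may never reach the bootstrap node, depending on how the nameserver orders its answers.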
curl -v --insecure https://api-int.okd4-snc.snc.test:22623/config/master
There is no service on port 22623?!
Active services on bootstrap:
[root@okd4-snc-bootstrap ~]# netstat -tlnp
I have noticed that. |
Try tearing it down, and running everything again.
Double check your DNS config against the files that I provided. This entry may be incorrect:
I believe that there should not be a Also note that after the bootstrap process completes, you will have to remove the
I just pushed an update that works with FCOS 32 and OKD 4 Beta 6. It is also tested with Beta 5.
@cgruver, great work! Yesterday I finally got a working cluster using this configuration: It is mostly based on your work. The difference is that the Ignition file is loaded via the QEMU firmware option.
I will try this after the current investigation of my first working cluster :) Again, thanks for your work and support.
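For readers unfamiliar with the QEMU firmware route, here is a minimal sketch of what it can look like. The configuration referred to above is not reproduced in this thread, so this is an assumption based on the `opt/com.coreos/config` firmware key that Fedora CoreOS reads; the function name and the file paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: build the QEMU argument that hands an Ignition file to FCOS via fw_cfg.
# Function name and paths are hypothetical.

ignition_fw_cfg_arg() {
  # $1: path to the Ignition file as seen by the hypervisor host
  printf '%s\n' "-fw_cfg name=opt/com.coreos/config,file=$1"
}

# Example with virt-install (VM name and sizes follow the values quoted earlier
# in this thread; the image and Ignition paths are placeholders):
#   virt-install --name okd4-snc-master --vcpus 4 --memory 16384 \
#     --import --disk path=/path/to/fcos.qcow2 \
#     --qemu-commandline="$(ignition_fw_cfg_arg /path/to/master.ign)"
```

With this approach the node reads its config from the hypervisor, so no web server has to serve the Ignition files at first boot.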
For your information, the dot at the end is OK. It is standard in DNS zone configuration and means "this is a fully qualified name, stop here". I have seen similar examples in the OKD documentation where the FQDN ends with a dot.
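The effect of the trailing dot can be shown with a small sketch of how a zone file name is expanded: a name ending in a dot is taken as absolute, while any other name gets the zone origin appended. The function name `expand_name` is hypothetical; the names in the example come from the DNS setup posted earlier.

```shell
#!/usr/bin/env bash
# Sketch: expand a zone-file name the way a DNS server interprets it.
# Function name is hypothetical.

expand_name() {
  local name="$1" origin="${2%.}"   # drop a trailing dot from the origin, if present
  case "$name" in
    *.) printf '%s\n' "$name" ;;                # absolute: used as written
    *)  printf '%s.%s.\n' "$name" "$origin" ;;  # relative: origin is appended
  esac
}

# Examples:
#   expand_name 'etcd-0.okd4-snc.snc.test.' 'snc.test'
#   expand_name 'etcd-0.okd4-snc.snc.test'  'snc.test'
```

The second example shows the classic mistake: without the trailing dot, the full name is treated as relative and the origin is appended a second time.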
Excellent! I will take a look at your config. Eliminating the Nginx server will simplify the deployment for folks.
Hi, my bootstrap node reports this error:
[root@okd4-snc-bootstrap ~]# journalctl -b -f -u bootkube.service