Review installation log
Check for any glaring issues, such as image pull back-off errors, storage issues, network failures, or permissions issues. The cluster admin credentials can also be retrieved from this log. Install logs are on the install node, inside the install directory.
cat <install_dir>/.openshift_install.log
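Reading the whole log by eye is slow, so a filter for the issue types named above can help. This is a minimal sketch: it assumes the log carries `level=...` fields and that strings such as `ImagePullBackOff` appear verbatim. `scan_install_log` is a hypothetical helper name and the pattern is only a starting point.

```shell
# Hypothetical helper: print numbered log lines that look like problems.
# Assumes "level=error" / "level=fatal" fields; adjust the pattern as needed.
scan_install_log() {
  grep -nE 'level=(error|fatal)|ImagePullBackOff|[Pp]ermission denied' "$1"
}

# Usage (commented out so the block is safe to source):
# scan_install_log <install_dir>/.openshift_install.log
```

No output (and a non-zero exit status from grep) means no line matched the pattern. It does not prove the install was clean.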
Verify that image pull source is what is expected
Check that your OpenShift images come from the expected source. Because this is not a disconnected environment, that means quay.io and the internal cluster registry.
oc adm node-logs <node_name> -u crio
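The crio log is long, so it can help to reduce it to a count of pulls per registry. The sketch below assumes CRI-O writes lines containing `Pulling image: <reference>`; check this against your own log output before relying on it. `pull_registries` is a hypothetical helper name.

```shell
# Hypothetical helper: read crio log lines on stdin and print
# "<registry> <pull count>" for each registry seen.
# Assumes log lines of the form: ... Pulling image: <registry>/<path> ...
pull_registries() {
  sed -nE 's#.*Pulling image: ([^/" ]+)/.*#\1#p' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (commented out so the block is safe to source):
# oc adm node-logs <node_name> -u crio | pull_registries
```

Any registry other than quay.io or the internal cluster registry in the output is worth a closer look.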
Verify that cluster version matches what is expected
Check the file in this repo to verify the intended version for this cluster.
oc get clusterversion
Verify that the cluster is on the correct update channel
The current target is the stable update channel for 4.10.
oc get clusterversion -o jsonpath='{.items[0].spec}{"\n"}'
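To make the channel check explicit rather than visual, the channel field can be compared with the expected value. This is a sketch only: `check_channel` is a hypothetical helper, and the channel name `stable-4.10` is assumed from "stable update channel for 4.10".

```shell
# Hypothetical helper: compare the cluster's channel with the expected one.
# Arguments: <expected_channel> <actual_channel>
check_channel() {
  expected="$1"
  actual="$2"
  if [ "$actual" = "$expected" ]; then
    echo "OK: channel is $actual"
  else
    echo "MISMATCH: expected $expected, got ${actual:-<none>}"
    return 1
  fi
}

# Usage (commented out so the block is safe to source):
# check_channel stable-4.10 "$(oc get clusterversion -o jsonpath='{.items[0].spec.channel}')"
```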
Check for any available cluster updates
If a new release became available during the install, check that the cluster is on the latest version.
oc adm upgrade
Verify that expected Operators are available
We need to generate an initial list of the Operators that each cluster requires.
oc get clusteroperators
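Until the required list exists, a useful interim check is to flag any operator that is not Available, is still Progressing, or is Degraded. The sketch below assumes the default table layout (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, ...) with a value in every column. `unhealthy_operators` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get clusteroperators` output on stdin and
# print operators that are not Available=True, Progressing=False, Degraded=False.
# Assumes columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED ...
# (an operator with an empty VERSION column would shift the fields).
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $4 != "False" || $5 != "False") { print $1 }'
}

# Usage (commented out so the block is safe to source):
# oc get clusteroperators | unhealthy_operators
```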
Check that all CSRs required by Operators are approved
Check that all nodes are in Ready status and that all CSRs are approved.
oc get csr
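When the CSR list is long, it helps to show only the pending requests together with who requested them, since the requestor is the main clue to legitimacy. This sketch assumes the default table layout, with REQUESTOR as the fourth column and CONDITION as the last. `pending_csrs` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get csr` output on stdin and print
# "<csr_name> <requestor>" for every CSR whose condition is Pending.
# Assumes REQUESTOR is column 4 and CONDITION is the last column.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1, $4 }'
}

# Usage (commented out so the block is safe to source):
# oc get csr | pending_csrs
```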
Approve any pending CSRs
Verify that all pending CSRs are legitimate. If some are not, approve only the legitimate CSRs, one at a time.
oc adm certificate approve <csr_name>
If all CSRs are legitimate, you can approve all pending CSRs with the command below.
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
List nodes
Ensure that all nodes are in a Ready state and that every node you expect to be available is listed.
oc get nodes
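On larger clusters, filtering the node list down to the nodes that are not Ready makes problems easier to spot. The sketch below assumes the default table layout with STATUS as the second column, and treats `Ready,SchedulingDisabled` as Ready. `not_ready_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get nodes` output on stdin and print
# "<node_name> <status>" for every node whose status does not start with Ready.
# Assumes columns: NAME STATUS ROLES AGE VERSION
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1, $2 }'
}

# Usage (commented out so the block is safe to source):
# oc get nodes | not_ready_nodes
```

Empty output only shows that the listed nodes are Ready. Nodes missing from the list still have to be checked against what you expect.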
Review CPU and memory resources
Ensure that all expected hardware is listed and that the load is within the expected range.
oc adm top nodes
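One way to make "expected range" concrete is to flag nodes whose CPU or memory percentage exceeds a threshold. This is a sketch under assumptions: the threshold is a value you choose (none is given here), the table layout is NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%, and `overloaded_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc adm top nodes` output on stdin and print
# "<node_name> <cpu%> <memory%>" for nodes above the given percentage.
# Argument: threshold percentage (chosen by you, e.g. 80).
# Assumes columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
overloaded_nodes() {
  awk -v max="$1" 'NR > 1 {
    cpu = $3; mem = $5
    sub(/%/, "", cpu); sub(/%/, "", mem)
    if (cpu + 0 > max + 0 || mem + 0 > max + 0) print $1, $3, $5
  }'
}

# Usage (commented out so the block is safe to source):
# oc adm top nodes | overloaded_nodes 80
```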
Ensure kubelet is running on each node
Start a debug container on the node
oc debug node/<node_name>
Set /host as your root directory
# chroot /host
Check the status of kubelet
# systemctl status kubelet