feat: Update cloud-init customization #11
e024eb1 to 80126df
Changes relative to upstream:
* Add explanatory comments
* Do not use stderr output of `preKubeadmCommands` to indicate an error with bootstrapping

Changes relative to our fork:
* Do not enable IPv6
* Do not remove cloud-init logs and seed
* Do not disable VMware customization
* Do not disable network configuration
* Do not truncate cloud-init-output.log
* Do not report status of HTTP proxy configuration
* Do not configure cloud-init to remove SSH keys on first boot
* Remove commands that are already executed as a result of being defined in `preKubeadmCommands`
80126df to 66ecf82
{{ if .ControlPlane }}
- '[ ! -f /run/kubeadm/konvoy-set-kube-proxy-configuration.sh] && sudo reboot'
Were you able to boot the VM and create a cluster after removing this file? I remember that the `preKubeadmCommands` were failing when I removed them. I will test it later to confirm.
Good question!
You're right that any preKubeadmCommand that requires an ordinary file in `/run` will fail after a reboot, because ordinary files in `/run` do not persist across reboots. We recently (in https://github.com/mesosphere/konvoy2/pull/2337) moved all patch scripts from `/run` to `/etc` for this reason.
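To make the persistence rule concrete, here is a minimal sketch that classifies a few paths by whether they live under `/run`. The helper name `is_volatile_path` and the path `/etc/example-patch.sh` are hypothetical, and the check assumes the usual layout where `/run` (and its legacy alias `/var/run`) is cleared at boot; it does not inspect the actual mounted filesystem.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the path is under /run (or the legacy
# /var/run alias), where ordinary files are assumed to be cleared at boot.
is_volatile_path() {
  case "$1" in
    /run/*|/var/run/*) return 0 ;;
    *) return 1 ;;
  esac
}

# /etc/example-patch.sh is an illustrative name, not a file from this PR.
for path in /run/kubeadm/kubeadm.yaml /root/control_plane.sh /etc/example-patch.sh; do
  if is_volatile_path "$path"; then
    echo "$path: lost on reboot"
  else
    echo "$path: kept across reboot"
  fi
done
```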
Your question made me wonder about the two `reboot` calls left in this template:
cluster-api-provider-cloud-director/controllers/cluster_scripts/cloud_init.tmpl
Lines 72 to 80 in 66ecf82
{{ if .ControlPlane }}
- '[ ! -f /root/control_plane.sh ] && sudo reboot'
- '[ ! -f /run/kubeadm/kubeadm.yaml ] && sudo reboot'
- bash /root/control_plane.sh
{{ else }}
- '[ ! -f /root/node.sh ] && sudo reboot'
- '[ ! -f /run/kubeadm/kubeadm-join-config.yaml ] && sudo reboot'
- bash /root/node.sh
{{ end }}
It seems harmless to reboot if the kubeadm config (`/run/kubeadm/kubeadm.yaml` or `/run/kubeadm/kubeadm-join-config.yaml`) is not there (yet?).
But if we reboot because the bootstrap script (`/root/control_plane.sh` or `/root/node.sh`) is missing, and the kubeadm config happens to already be present, we will lose the kubeadm config after the reboot, leading to further reboots, without end.
At this time (66ecf82), I can successfully reboot either a control plane machine or a worker machine.
I think it may be better to remove the remaining `reboot` calls. I will experiment.
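The reboot-loop concern can be illustrated with a small simulation. This is only a sketch: temporary directories stand in for `/run` and `/root`, a "reboot" is modeled as clearing the simulated `/run`, and it assumes nothing re-creates either file between boots. The cap `max_boots` is an arbitrary value added so the loop terminates.

```shell
#!/bin/sh
# Simulation only: temp directories stand in for /run and /root; nothing reboots.
sim=$(mktemp -d)
mkdir -p "$sim/run/kubeadm" "$sim/root"

# Starting state from the scenario above: the kubeadm config is present,
# but the bootstrap script is missing.
touch "$sim/run/kubeadm/kubeadm.yaml"

reboots=0
max_boots=5   # arbitrary cap so the simulation terminates
while [ "$reboots" -lt "$max_boots" ]; do
  # Same checks as the template: reboot if either file is missing.
  if [ ! -f "$sim/root/control_plane.sh" ] || [ ! -f "$sim/run/kubeadm/kubeadm.yaml" ]; then
    # A simulated reboot clears everything under the simulated /run.
    rm -rf "$sim/run/kubeadm"
    mkdir -p "$sim/run/kubeadm"
    reboots=$((reboots + 1))
  else
    break
  fi
done

echo "reboots: $reboots"
rm -rf "$sim"
```

Under these assumptions the loop never reaches the bootstrap step and only stops at the cap, which matches the "without end" outcome described above.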
I've added a comment that explains why the reboot call is necessary. I've also moved these checks out to their own script, and use a separate log file to keep track.
This is what the log looks like:
# cat /var/log/capvcd/replace-userdata-files.log
2023-10-17 22:07:37 Checking for kubeadm configuration file
2023-10-17 22:07:37 kubeadm configuration file not found, cleaning cloud-init cache and rebooting
2023-10-17 22:08:12 Checking for kubeadm configuration file
2023-10-17 22:08:12 kubeadm configuration file found, exiting
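For readers without access to the diff, a script that produces a log like this could look roughly as follows. This is a sketch, not the script from this PR: the function names, the overridable variables, and the use of `cloud-init clean` followed by `reboot` are assumptions. Only the log path and the message texts come from the output above.

```shell
#!/bin/sh
# Sketch only. Defaults can be overridden, for example to do a dry run.
KUBEADM_CONFIG="${KUBEADM_CONFIG:-/run/kubeadm/kubeadm.yaml}"
LOG_FILE="${LOG_FILE:-/var/log/capvcd/replace-userdata-files.log}"
CLEAN_CMD="${CLEAN_CMD:-cloud-init clean}"   # assumed way to clean the cache
REBOOT_CMD="${REBOOT_CMD:-reboot}"

# Append a timestamped line to the log file.
log() {
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}

# Return 0 if the kubeadm config exists; otherwise clean the cloud-init
# cache and reboot so cloud-init runs again as if on first boot.
check_userdata() {
  mkdir -p "$(dirname "$LOG_FILE")"
  log "Checking for kubeadm configuration file"
  if [ -f "$KUBEADM_CONFIG" ]; then
    log "kubeadm configuration file found, exiting"
    return 0
  fi
  log "kubeadm configuration file not found, cleaning cloud-init cache and rebooting"
  $CLEAN_CMD && $REBOOT_CMD
}

# On a real machine the script would end by calling: check_userdata
```

Keeping the clean and reboot commands in variables makes it possible to exercise the logic without rebooting, for example with `CLEAN_CMD=true REBOOT_CMD=true`.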
Pretty clever. Just reiterating the logic: until the user-data is available, the VM will reboot and cloud-init will run as if it's the first boot. Once the user-data is available, bootstrap.sh will run kubeadm init/join.
> until the user-data is available, the VM will reboot and cloud-init will run as if it's the first boot. Once the user-data is available, bootstrap.sh will run kubeadm init/join.
Correct. The upstream cloud-init also did this, but used in-line commands, instead of a script, and the reason behind everything wasn't given.
* Use shell script to clean cloud-init cache and reboot.
* Fix error handling of bootstrap script. Do not interpret stderr output as an indicator of failure. Do not rely on trap and errexit, because it does not work for command lists.
* Include last lines of output for error context.
* Ensure we have an IPv4 address for localhost.
* Remove unnecessary cloud-init configuration to preserve SSH host keys.
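The error-handling points in this commit message can be shown with a short, self-contained sketch. The first part demonstrates that `set -e` (errexit) does not stop a script when a command fails on the left side of `&&`. The second part is a hypothetical wrapper, `run_step`, that decides success from the exit status alone and prints the last lines of output for context; the name and the five-line tail are illustrative choices, not taken from the PR.

```shell
#!/bin/sh
# 1. errexit does not fire when the failing command is on the left of &&.
out=$(sh -c 'set -e; false && echo unreachable; echo "still running"')
echo "$out"

# 2. Hypothetical wrapper: judge a step by its exit status, not by whether
#    it wrote to stderr, and show the last lines of output on failure.
run_step() {
  step_log=$(mktemp)
  "$@" >"$step_log" 2>&1
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "step failed with status $status; last lines of output:"
    tail -n 5 "$step_log"
  fi
  rm -f "$step_log"
  return "$status"
}

# Writes to stderr but exits 0, so it is treated as a success.
run_step sh -c 'echo "warning on stderr" >&2; exit 0' && echo "stderr output alone is not a failure"

# Exits non-zero, so it is reported as a failure with its output.
run_step sh -c 'echo "fatal" >&2; exit 3' || echo "status: $?"
```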
6c7753d to d8316d1
Thank you for testing this out.
Thank you for all the comments and really digging deep into this! Great changes and so much simpler to understand.
* feat: Update cloud-init customization

  Changes relative to upstream:
  * Use shell script to clean cloud-init cache and reboot.
  * Fix error handling of bootstrap script. Do not interpret stderr output as an indicator of failure. Do not rely on trap and errexit, because it does not work for command lists.
  * Include last lines of output for error context.
  * Ensure we have an IPv4 address for localhost.
  * Remove unnecessary cloud-init configuration to preserve SSH host keys.

  Changes relative to our fork:
  * Do not remove cloud-init logs and seed on reboot
  * Do not truncate cloud-init-output.log on reboot
  * Do not report status of HTTP proxy configuration
  * Remove redundant commands (already executed as a result of being defined in `preKubeadmCommands`)
  * Do not disable VMware customization
  * Do not disable network configuration

  Signed-off-by: Daniel Lipovetsky <[email protected]>
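The commit message does not say how the IPv4 address for localhost is ensured. One possible approach, shown here purely as an assumption, is to append a `127.0.0.1 localhost` entry to the hosts file when no such entry exists. The function name `ensure_localhost_ipv4` and the optional file argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: add an IPv4 localhost entry only if none exists.
# The hosts file is a parameter so the function can be tried on a copy.
ensure_localhost_ipv4() {
  hosts_file="${1:-/etc/hosts}"
  # Match a 127.0.0.1 line that lists "localhost" as one of its names.
  if ! grep -Eq '^127\.0\.0\.1[[:space:]]+(.*[[:space:]])?localhost([[:space:]]|$)' "$hosts_file"; then
    printf '127.0.0.1 localhost\n' >> "$hosts_file"
  fi
}

# Example on a scratch copy rather than the real /etc/hosts:
scratch=$(mktemp)
printf '::1 localhost\n' > "$scratch"
ensure_localhost_ipv4 "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Calling the function a second time on the same file leaves it unchanged, so it is safe to run on every boot.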