Managing boot images via the MCO #1496
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,206 @@ | ||
--- | ||
title: manage-boot-images | ||
authors: | ||
- "@djoshy" | ||
reviewers: | ||
- "@yuqi-zhang" | ||
- "@mrunal" | ||
- "@cgwalters, for rhcos context" | ||
- "@joelspeed, for machine-api context" | ||
- "@sdodson, for installer context" | ||
approvers: | ||
- "@yuqi-zhang" | ||
api-approvers: | ||
- None | ||
|
||
creation-date: 2023-10-16 | ||
last-updated: 2023-10-17 | ||
tracking-link: | ||
- https://issues.redhat.com/browse/MCO-589 | ||
see-also: | ||
replaces: | ||
- https://github.com/openshift/enhancements/pull/368 | ||
superseded-by: | ||
- https://github.com/openshift/enhancements/pull/201 | ||
--- | ||
|
||
# Managing boot images via the MCO | ||
|
||
## Summary | ||
|
||
This is a proposal to manage bootimages via the `Machine Config Operator`(MCO), leveraging some of the [pre-work](https://github.com/openshift/installer/pull/4760) done as a result of the discussion in [#201](https://github.com/openshift/enhancements/pull/201). This feature will only target standalone OCP installs. It will also be user opt-in and is planned to be released behind a feature gate. | ||
|
||
For Installer Provisioned Infrastructure(IPI) clusters, the end goal is to create a mechanism that can: | ||
|
||
- update the boot image references in `MachineSets` to the latest in the payload image | ||
|
||
- ensure the stub ignition referenced in each `MachineSet` is in spec 3 format | ||
|
||
|
||
For User Provisioned Infrastructure(UPI) clusters, the end goal is to create a document(KB or otherwise) that a cluster admin would follow to update their boot images. | ||
|
||
|
||
## Motivation | ||
|
||
Currently, bootimage references are [stored](https://github.com/openshift/installer/blob/1ca0848f0f8b2ca9758493afa26bf43ebcd70410/pkg/asset/machines/gcp/machines.go#L204C1-L204C1) in a `MachineSet` by the OpenShift installer during cluster bringup and are thereafter unmanaged. These boot image references are not updated on an upgrade, so any node scaled up using them will boot up with the original “install” bootimage. This version skew has caused a myriad of issues during scale-up, when the nodes attempt the final pivot to the release payload image. Issues linked below: | ||
- Afterburn [[1](https://issues.redhat.com/browse/OCPBUGS-7559)],[[2](https://issues.redhat.com/browse/OCPBUGS-4769)] | ||
- podman [[1](https://issues.redhat.com/browse/OCPBUGS-9969)] | ||
- skopeo [[1](https://issues.redhat.com/browse/OCPBUGS-3621)] | ||
|
||
Additionally, the stub secret [referenced](https://github.com/openshift/installer/blob/1ca0848f0f8b2ca9758493afa26bf43ebcd70410/pkg/asset/machines/gcp/machines.go#L197) in the `MachineSet` is also unmanaged. This stub is used by the ignition binary in firstboot to auth and consume content from the `machine-config-server`(MCS). The content served includes the actual ignition configuration and the final pivot OS image. The ignition binary does first boot provisioning based on this, then hands off to the `machine-config-daemon`(MCD) first boot service to do the final pivot. As 4.6 and up clusters only understand spec 3 ignition, and as the unmanaged ignition stub on clusters installed at 4.5 or lower is still spec 2, this is an incompatibility. It prevents new nodes from joining a cluster that has been upgraded past 4.5 but was originally installed at 4.5 or lower. Issue linked below: | ||
|
||
- SAN [[1](https://issues.redhat.com/browse/OCPBUGS-1817)] | ||
|
||
|
||
### User Stories | ||
|
||
* As an OpenShift engineer, I consider nodes booting up on an unsupported OCP version a security liability. Having nodes boot directly on the release payload image helps me avoid tracking incompatibilities across OCP release versions and pay down technical debt(see issues linked above). | ||
|
||
|
||
* As a cluster administrator, having to keep track of a "boot" vs "live" image for a given cluster is not intuitive or user friendly. In the worst case scenario, I will have to reset a cluster(or do a lot of manual steps with rh-support in recovering the node) simply to be able to scale up nodes after an upgrade. If I'm managing an IPI cluster, once opted in, this feature will be a "switch on and forget" mechanism for me. If I'm managing a UPI cluster, this would provide me with documentation that I could follow after an upgrade to ensure my cluster has the latest bootimages. | ||
|
||
### Goals | ||
|
||
The MCO will take over management of the boot image references and the stub ignition. The installer is still responsible for creating the `MachineSet` at cluster bring-up, but once cluster installation is complete the MCO will ensure that boot images are in sync with the latest payload. From the user standpoint, this should cause fewer compatibility issues, as nodes will no longer need to pivot to a different version of RHCOS during node scaleup. | ||
|
||
|
||
### Non-Goals | ||
|
||
|
||
- The new subcontroller does not provide a solution for UPI as it does not use `MachineSets`. We plan to support a UPI solution via documentation that is based on this workflow. | ||
|
||
- This is meant to be a user opt-in feature, and if the user wishes to keep their boot images static it will let them do so. | ||
|
||
- This does not intend to solve [booting into custom pools](https://issues.redhat.com/browse/MCO-773). | ||
- This does not target Hypershift, as [it does not use machinesets](https://github.com/openshift/hypershift/blob/32309b12ae6c5d4952357f4ad17519cf2424805a/hypershift-operator/controllers/nodepool/nodepool_controller.go#L2168). | ||
|
||
## Proposal | ||
|
||
__Overview__ | ||
|
||
- The `machine-config-controller`(MCC) pod will gain a new sub-controller `machine_set_controller`(MSC) that monitors `MachineSet` changes and the `coreos-bootimages` [ConfigMap](https://github.com/openshift/installer/pull/4760). | ||
- Before processing a MachineSet, the MSC will check for the existence of `io.openshift.mco-managed=true` annotation. If it is not present, the MSC will exit the reconciliation loop. This is how `MachineSets` are opted-in to this mechanism. | ||
|
||
- Based on platform and arch type, the MSC will check if the boot image referenced in the `providerSpec` field of the `MachineSet` is the same as the one in the ConfigMap. Each platform(gcp, aws...and so on) does this differently, so this is a good opportunity to split the work up between platforms and see if the implementation is effective. The ConfigMap is considered to be the golden set of bootimage values, i.e. they will never go out of date. | ||
|
||
- Next, it will check if the stub secret referenced is spec 3. If it is spec 2, the MSC will try to create a new version of this secret by translating it to spec 3. This step is platform/arch agnostic. Failure to translate will cause a degrade and the sub-controller will exit without patching the `MachineSet`. | ||
|
||
- Finally, the MSC will attempt to patch the `MachineSet` if required. Failure to do so will cause a degrade. | ||
|
||
- Any other failures in the above steps will report an error, but degrades will only occur in the specific cases mentioned above. Certain failures may also be the result of an unsupported architecture or an unsupported platform. This is necessary because support for platforms will be phased in(and some platforms may not even desire this support). | ||
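
The steps above can be sketched as a single reconcile function. This is only an illustration of the control flow described in this proposal, not the MCO implementation: the `MachineSet` dataclass, the `translate_stub` and `apply_patch` callables, and the returned action strings are all hypothetical stand-ins. Only the `io.openshift.mco-managed` annotation name comes from the text.

```python
from dataclasses import dataclass, field

# Opt-in annotation named in the proposal.
MANAGED_ANNOTATION = "io.openshift.mco-managed"


@dataclass
class MachineSet:
    """Hypothetical, heavily simplified stand-in for a MachineSet object."""
    name: str
    boot_image: str
    stub_spec_major: int  # major Ignition spec version of the referenced stub secret
    annotations: dict = field(default_factory=dict)


def reconcile(ms, golden_image, translate_stub, apply_patch):
    """Return (action, degraded) for one MachineSet.

    golden_image:   boot image for this platform/arch taken from the ConfigMap.
    translate_stub: hypothetical callable, returns True if the stub could be
                    translated to spec 3.
    apply_patch:    hypothetical callable, returns True if the MachineSet
                    patch was accepted.
    """
    # Step 1: MachineSets without the opt-in annotation are left alone.
    if ms.annotations.get(MANAGED_ANNOTATION) != "true":
        return "skipped", False

    # Step 2: compare the referenced boot image with the golden value.
    needs_patch = ms.boot_image != golden_image

    # Step 3: a spec 2 stub must be translated; failure degrades and
    # exits without patching.
    if ms.stub_spec_major < 3:
        if not translate_stub(ms):
            return "translation-failed", True
        needs_patch = True

    if not needs_patch:
        return "in-sync", False

    # Step 4: patch the MachineSet; failure to patch also degrades.
    if not apply_patch(ms, golden_image):
        return "patch-failed", True
    ms.boot_image = golden_image
    ms.stub_spec_major = 3
    return "patched", False
```

In this sketch only translation and patch failures set the degraded flag, mirroring the two degrade cases listed above; other errors would be reported without degrading.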
|
||
|
||
__Rolling back__ | ||
|
||
The very first time a `MachineSet` is patched, the MSC will also backup the following via annotation to the `MachineSet`: | ||
- `io.openshift.mco-pre-managed-image=` storing the original provider image reference | ||
- `io.openshift.mco-pre-managed-secret=` storing the original stub secret | ||
|
||
A rollback can be done by opting out the `MachineSet`; this will trigger the MSC to restore the `MachineSet` to "factory" values by using the annotations mentioned above. | ||
This is an important mitigation in case things go wrong(invalid bootimage references, incorrect patching... etc). | ||
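
A minimal sketch of the backup-and-restore behaviour described above, using a plain dictionary as a hypothetical stand-in for a `MachineSet`. The two backup annotation names come from this proposal; the dictionary shape, the function names, and the choice to drop the backup annotations after a restore are assumptions made for illustration.

```python
MANAGED = "io.openshift.mco-managed"
PRE_IMAGE = "io.openshift.mco-pre-managed-image"
PRE_SECRET = "io.openshift.mco-pre-managed-secret"


def patch_with_backup(ms, new_image, new_secret):
    """Patch image and stub secret, backing up the originals once.

    `ms` is a hypothetical dict with the keys "annotations", "image"
    and "secret".
    """
    annotations = ms["annotations"]
    # setdefault only writes on the very first patch, so later patches
    # never overwrite the "factory" values.
    annotations.setdefault(PRE_IMAGE, ms["image"])
    annotations.setdefault(PRE_SECRET, ms["secret"])
    ms["image"] = new_image
    ms["secret"] = new_secret


def roll_back(ms):
    """Restore the "factory" values recorded by patch_with_backup."""
    annotations = ms["annotations"]
    if PRE_IMAGE in annotations:
        ms["image"] = annotations.pop(PRE_IMAGE)
    if PRE_SECRET in annotations:
        ms["secret"] = annotations.pop(PRE_SECRET)
```

Because the backup is written only once, patching across several upgrades still rolls back to the values the installer originally wrote.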
|
||
__UPI__ | ||
|
||
For UPI, the proposal is to create platform specific documentation based on our implementation of the above work. If this feature is opted in on a UPI install, it is necessary to warn(degrade or some other way) the cluster admin that this functionality is essentially a no-op in the absence of machinesets. | ||
|
||
|
||
### Workflow Description | ||
|
||
- To enroll a `MachineSet` for boot image updates, the cluster admin should add an annotation `io.openshift.mco-managed=true` to the `MachineSet`. | ||
|
||
- To un-enroll(and effectively rollback) the `MachineSet` from boot image updates, the cluster admin should remove the `io.openshift.mco-managed=true` annotation from the `MachineSet`. | ||
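
As an illustration of the two operations above, enrolling and un-enrolling can each be expressed as a JSON merge patch (RFC 7396) against the `MachineSet` metadata, where a `null` value deletes a key. The helper names below are hypothetical, and `apply_merge_patch` is a small local model of merge-patch semantics included only so the example is self-contained; it is not a client library call.

```python
import json

MANAGED = "io.openshift.mco-managed"


def enroll_patch():
    """Merge patch that adds the opt-in annotation."""
    return {"metadata": {"annotations": {MANAGED: "true"}}}


def unenroll_patch():
    """Merge patch that removes the opt-in annotation (None becomes null)."""
    return {"metadata": {"annotations": {MANAGED: None}}}


def apply_merge_patch(target, patch):
    """Minimal model of RFC 7396 merge-patch semantics."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null deletes the key
        else:
            result[key] = apply_merge_patch(result.get(key), value)
    return result


# Example: serialise the un-enroll patch body.
body = json.dumps(unenroll_patch())
```

Other annotations on the `MachineSet` are untouched by either patch, which is why a merge patch is a reasonable fit for this kind of opt-in toggle.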
|
||
#### Variation and form factor considerations [optional] | ||
|
||
Any form factor using the MCO and `MachineSets` will be impacted by this proposal. So case by case: | ||
- Standalone OpenShift: Yes, this is the main target form factor. | ||
- microshift: No, as it does [not](https://github.com/openshift/microshift/blob/main/docs/contributor/enabled_apis.md) use `MachineSets`. | ||
- Hypershift: No, Hypershift does not have this issue. | ||
|
||
|
||
### API Extensions | ||
|
||
We may have to make some changes to MCO CRDs for the opt-in feature. | ||
|
||
|
||
### Implementation Details/Notes/Constraints [optional] | ||
|
||
![Sub Controller Flow](manage_boot_images_flow.jpg) | ||
|
||
![MachineSet Reconciliation Flow](manage_boot_images_reconcile_loop.jpg) | ||
|
||
The implementation has a GCP specific POC here: | ||
- https://github.com/openshift/machine-config-operator/pull/3980 | ||
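
For orientation, a GCP comparison could look roughly like the sketch below. It is not taken from the POC. It assumes the `coreos-bootimages` ConfigMap carries CoreOS stream metadata as a JSON string with an `architectures.<arch>.images.gcp` entry holding `project` and `name`, and that the `providerSpec` lists disks with `boot` and `image` fields; the function names and all sample values are hypothetical.

```python
import json


def gcp_boot_image(stream_json, arch):
    """Build the GCP image reference for `arch` from stream metadata JSON."""
    stream = json.loads(stream_json)
    # Raises KeyError for an architecture or platform that the stream
    # does not describe, which would map to the "unsupported" case.
    gcp = stream["architectures"][arch]["images"]["gcp"]
    return f"projects/{gcp['project']}/global/images/{gcp['name']}"


def needs_update(provider_spec, stream_json, arch="x86_64"):
    """True if any boot disk references an image other than the golden one."""
    target = gcp_boot_image(stream_json, arch)
    boot_disks = [d for d in provider_spec.get("disks", []) if d.get("boot")]
    return any(d.get("image") != target for d in boot_disks)
```

Other platforms would need their own lookup, for example per-region image IDs, which is the per-platform split mentioned in the overview.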
|
||
Possible constraints: | ||
- Ignition spec 2 to spec 3 translation is not guaranteed to succeed. Some translations are unsupported and as a result not all stub secrets can be managed. In these cases, failure will be reported, and it will cause a cluster degrade. | ||
|
||
- See Open questions below for some more possible constraints. | ||
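
To make the translation constraint concrete, the sketch below handles only the simplest case: a spec 2 pointer stub whose sole content is in the `ignition` section. It renames `config.append` to `config.merge` (the spec 3 name for that field) and refuses anything it does not recognise. This is a deliberately narrow illustration, not the translation logic the MCO would use; the function names and the `3.1.0` target version are assumptions.

```python
import json

TARGET_VERSION = "3.1.0"  # assumed target spec version for this sketch


def ignition_major_version(config):
    """Return the major spec version, e.g. 2 for "2.2.0"."""
    return int(config["ignition"]["version"].split(".")[0])


def translate_stub(stub_json):
    """Translate a minimal spec 2 pointer stub to spec 3.

    Raises ValueError for anything this sketch cannot translate.
    """
    config = json.loads(stub_json)
    major = ignition_major_version(config)
    if major == 3:
        return config  # already spec 3, nothing to do
    if major != 2:
        raise ValueError(f"unsupported Ignition spec major version: {major}")

    # Refuse stubs that carry user content outside the ignition section.
    populated = sorted(k for k, v in config.items() if k != "ignition" and v)
    if populated:
        raise ValueError(f"cannot translate populated sections: {populated}")

    old = config["ignition"]
    new = {"version": TARGET_VERSION}
    old_cfg = old.get("config", {})
    new_cfg = {}
    if "append" in old_cfg:
        new_cfg["merge"] = old_cfg["append"]  # spec 3 renamed append to merge
    if "replace" in old_cfg:
        new_cfg["replace"] = old_cfg["replace"]
    if new_cfg:
        new["config"] = new_cfg
    for key in ("security", "timeouts"):
        if old.get(key):
            new[key] = old[key]
    return {"ignition": new}
```

A `ValueError` here corresponds to the "translation unsupported" case above, where the failure would be reported and the `MachineSet` left unpatched.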
|
||
### Risks and Mitigations | ||
|
||
The biggest risk in this enhancement would be delivering a bad boot image. To mitigate this, we have outlined a rollback option. | ||
|
||
How will security be reviewed and by whom? TBD | ||
This is a solution aimed at reducing usage of outdated artifacts and should not introduce any security concerns that do not currently exist. | ||
|
||
How will UX be reviewed and by whom? TBD | ||
The UX elements involved include the user opt-in and opt-out, which are currently up for debate. | ||
|
||
### Drawbacks | ||
|
||
TBD, based on the open questions below. | ||
|
||
## Design Details | ||
|
||
### Open Questions | ||
|
||
- Should we have a global switch that opts in all `MachineSets` for this mechanism? | ||
Reviewer comment: I'm wondering if there's a way that this can be achieved without even modifying the MachineSets, what if it were done at admission time based on the labels on the Machine (and future CAPI InfrastructureMachine) being created. That would mean you don't have to worry about modifying resources in place and would allow the cluster admin to enforce policy at some level. Need to have a think about this 🤔 |
||
- Somewhat related to the above, would we also want to allow opting out without rolling back? This is for a situation where the customer does not want to update the boot images any longer, but would like to keep the current image instead of reverting to the "factory" image. Not sure if anyone would use this, but thought it was worth considering. | ||
- This proposal relies on the golden configmap having a target value for every platform/arch combination that we use today. I've [noticed](https://issues.redhat.com/browse/MCO-793) some cases like vsphere don't have a reference as it stands today. Why is that? Are there scenarios not requiring boot image updates? | ||
|
||
- Heterogeneous platform(nodes span across infra providers) concerns. Do such clusters exist? If they do, do they use `MachineSets`? The current proposal assumes the same platform across all nodes and uses the infra object to determine the cluster platform. It reports an error if there is a platform mismatch and will exit non-fatally. | ||
|
||
- Heterogeneous architecture concerns. I think these exist, but do they use `MachineSets`? The current proposal maps a `MachineSet` to an architecture, so this should not be a concern, but I am curious overall. | ||
|
||
- The user could have possibly modified the stub ignition used in first boot with sensitive information. While this sub controller could uptranslate them, this is manipulating user data in a certain way which the customer may not be comfortable with. Are we ok with this? | ||
- What platforms do we want to support in GA? GCP was used in the PoC so I've added that, but is there an interest for certain platforms over others for the first release? | ||
|
||
### Test Plan | ||
|
||
In addition to unit tests, the enhancement will also ship with e2e tests, outlined [here](https://issues.redhat.com/browse/MCO-774). | ||
|
||
### Graduation Criteria | ||
|
||
#### Dev Preview -> Tech Preview | ||
|
||
- Support for GCP | ||
- Unit & E2E tests | ||
- Feedback from openshift teams | ||
- [Good CI signal from autoscaling nodes](https://github.com/cgwalters/enhancements/blob/5505d7db7d69ffa1ee838be972c70b572d882891/enhancements/bootimages.md#test-plan) | ||
|
||
|
||
#### Tech Preview -> GA | ||
|
||
- Feedback from interested customers | ||
- UPI documentation based on IPI workflow for select platforms(vSphere + any others TBD) | ||
- User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/) | ||
|
||
In future releases, we can phase in support for remaining platforms as we gain confidence in the functionality. The priority list for this is still TBD. | ||
|
||
#### Removing a deprecated feature | ||
|
||
This does not remove an existing feature. | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
__Upgrade__ | ||
|
||
This mechanism is only active shortly after an upgrade, which is when the ConfigMap containing the bootimages is updated by the CVO manifest. It will also run during machineset edits, but patching will only occur if there is a mismatch in bootimages. | ||
|
||
__Downgrade__ | ||
|
||
- If the cluster is downgrading to a version that supports this feature, the boot images will track the downgraded version. | ||
- If the cluster is downgrading to a version that does not support this feature, the boot images will not track to the downgraded version. So, it may be wise to opt-out of the feature prior to the downgrade if "normal(i.e. older) OCP behavior" is expected. | ||
|
||
### Version Skew Strategy | ||
|
||
N/A | ||
|
||
### Operational Aspects of API Extensions | ||
|
||
TBD, based on how the opt-in feature would work. | ||
|
||
#### Failure Modes | ||
|
||
TBD | ||
|
||
#### Support Procedures | ||
|
||
TBD | ||
|
||
## Implementation History | ||
|
||
TBD | ||
|
||
## Alternatives | ||
|
||
TBD |
@2uasimojo I think someone from Hive should be aware of and review this EP