
MCO-694: revert from layered pool to non-layered pool #4284

Conversation

cheesesashimi
Member

@cheesesashimi cheesesashimi commented Mar 26, 2024

- What I did

This adds code that reverts from a layered MachineConfigPool to a non-layered MachineConfigPool.

Why this was so troublesome (illustrated by the sketch after this list):

  • When a MachineConfig is written to the node, it is placed in the portions of the filesystem that are mutable according to ostree.
  • When a container image containing those MachineConfigs is written onto the node using rpm-ostree, it technically overwrites those preexisting MachineConfigs. In doing so, the container is now claiming (for lack of a better term) ownership of those files.
  • The "factory" OS image does not contain these MachineConfigs.
  • So when we roll back from the customized image to the "factory" image, because the MachineConfig files on disk are now owned by the customized container, they are removed when the factory OS image is rebased.
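
To make the file-ownership problem concrete, here is an illustrative shell session on an ostree-based node. The file name, image references, and output are hypothetical, not taken from this PR:

```console
# A MachineConfig-written file appears as a local addition to the mutable /etc:
$ ostree admin config-diff | grep chrony.conf
A    chrony.conf

# Rebasing onto a custom image that ships the same path transfers ownership
# of the file to that image:
$ rpm-ostree rebase ostree-unverified-registry:registry.example.com/custom-os:latest

# Rebasing back onto the "factory" image, which does not contain the file,
# removes it along with the rest of the departing image's content:
$ rpm-ostree rebase ostree-unverified-registry:registry.example.com/factory-os:latest
```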

If an ad-hoc file is written to a mutable part of the filesystem after the container has been applied, provided that the container does not claim ownership of a file with the same name, the ad-hoc file will persist after a reboot. To take full advantage of this fact, this PR does the following (a sketch of the resulting node-level flow appears after the list):

  1. Introduces a new subpackage called `pkg/daemon/runtimeassets`. The purpose of this package is to house any configs or templates that need to be applied to a node during runtime but should not be part of the cluster's MachineConfigs. There is the potential for this to be used by the certificate writer path in the future.
  2. Introduces a `machine-config-daemon-revert.service` systemd service which is only rendered, written to the node, and enabled whenever a revert operation is in progress.
  3. After these files are written to the node's filesystem, the node reboots.
  4. During bootup, the new service detects the presence of `/etc/mco/machineconfig-revert.json` and runs the MCD in bootstrap mode to rewrite all of the configs to disk. This (unfortunately) requires a second node reboot.
  5. Following the second node reboot, the node should be in the reverted configuration.
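
A minimal sketch of how this flow could be observed on the node. The file and unit names come from the list above, but the commands and their output are illustrative:

```console
# The trigger file and the rendered revert unit exist only during a revert:
$ test -f /etc/mco/machineconfig-revert.json && echo "revert staged"
revert staged
$ systemctl is-enabled machine-config-daemon-revert.service
enabled

# On the next boot, the unit runs the MCD in bootstrap mode, rewrites the
# rendered configs to disk, and triggers the second reboot described above:
$ journalctl -b -u machine-config-daemon-revert.service --no-pager
```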

- How to verify it

  1. Bring up an OpenShift cluster for this PR.
  2. Opt into on-cluster builds. My onclustertesting helper can be used to assist with that; just run `$ onclustertesting setup --enable-feature-gate --pool=layered in-cluster-registry`.
  3. Wait for the image to finish building.
  4. Add a node to the layered MachineConfigPool: `$ oc label node/<nodename> 'node-role.kubernetes.io/layered='`
  5. Wait for the node to deploy the built image.
  6. Remove the label from the layered MachineConfigPool: `$ oc label node/<nodename> 'node-role.kubernetes.io/layered-'`
  7. Wait for the node to revert back to the worker MachineConfigPool (the commands after this list can help observe the revert).
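
For watching the revert from the cluster side, something like the following can help; the annotation key assumes the standard machineconfiguration.openshift.io prefix, and the output shape is illustrative:

```console
# The node should drop out of the layered pool and rejoin worker:
$ oc get mcp worker layered

# desiredConfig/currentConfig should converge on the worker pool's rendered
# config once the revert completes:
$ oc get node/<nodename> \
    -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig}{"\n"}'
```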

- Description for the changelog
Allows reverting from a layered MachineConfigPool to a non-layered MachineConfigPool.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 26, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 26, 2024

@cheesesashimi: This pull request references MCO-694 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.


@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2024
Contributor

openshift-ci bot commented Mar 26, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 26, 2024
@cgwalters
Member

One thing we could investigate is something a bit like #1190 where we avoid mutating the system's /etc and explicitly make a new bootloader entry.

@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 28, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2024
@cheesesashimi cheesesashimi force-pushed the zzlotnik/revert-to-non-layered branch from 2f6ba4a to 979f59f Compare July 10, 2024 20:43
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2024
@cheesesashimi
Member Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 10, 2024
@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

2 similar comments
@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@cheesesashimi cheesesashimi force-pushed the zzlotnik/revert-to-non-layered branch 2 times, most recently from bf0c69d to aaad79e Compare July 15, 2024 21:07
@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

3 similar comments
@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@cheesesashimi
Member Author

/test e2e-gcp-op
/test e2e-gcp-op-techpreview

@cheesesashimi cheesesashimi marked this pull request as ready for review July 22, 2024 16:14
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 22, 2024
@openshift-ci openshift-ci bot requested review from sinnykumari and yuqi-zhang July 22, 2024 16:15
Contributor

@yuqi-zhang yuqi-zhang left a comment


Really liking the new runtimeassets method! Thanks for adapting the PR!

/lgtm
/hold

Holding for:

  1. QE pre-merge approval
  2. removal of test code after MCO-703: Lifecycle Buildah with MCO #4471 merges

```go
// If the new OS image equals the OS image URL value, this means we're in a
// revert-from-layering situation. This also means we can return early after
// taking a different path.
if newImage == newConfig.Spec.OSImageURL {
```
Contributor

I guess I originally thought that a user can also set the image back by hand, but that should be fine here as well.

```diff
@@ -456,6 +474,9 @@ func prepareForTest(t *testing.T, cs *framework.ClientSet, testOpts onClusterBuildTestOpts
 pushSecretName, err := getBuilderPushSecretName(cs)
 require.NoError(t, err)

 // REMOVE AFTER https://github.com/openshift/machine-config-operator/pull/4471 LANDS!
```
Contributor

I think we can land that first since it's mostly ready to go

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 1, 2024
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 1, 2024
@yuqi-zhang
Contributor

Going to remove the hold now that #4471 has landed. @cheesesashimi could you rebase when you get a chance?

/hold cancel

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 28, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 28, 2024
@cheesesashimi cheesesashimi force-pushed the zzlotnik/revert-to-non-layered branch from ce47545 to 07db49e Compare August 29, 2024 20:14
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 29, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 29, 2024
@yuqi-zhang
Contributor

/lgtm
/hold

I just realized the original hold is for QE verification. I'm going to re-add that in case we missed some edge cases. Feel free to unhold if no longer necessary.

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Aug 30, 2024
Contributor

openshift-ci bot commented Aug 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi, yuqi-zhang


Needs approval from an approver in each of these files:
  • OWNERS [cheesesashimi,yuqi-zhang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cheesesashimi
Member Author

The failure in e2e-gcp-op is most likely unrelated to this. Still, I'd like to get a clean run.

@djoshy
Contributor

djoshy commented Sep 10, 2024

I'm not sure if this case is handled, so I wanted to check: what happens to node annotations when reverted? Related context here.

@cheesesashimi
Member Author

When reverted, the `desiredImage` and `currentImage` annotations should be cleared.

@cheesesashimi
Member Author

/retest-required

@djoshy
Contributor

djoshy commented Sep 18, 2024

When reverted, the `desiredImage` and `currentImage` annotations should be cleared.

Just to clarify, do you mean that annotations are set to blank, or that they are completely removed on the node object?

@cheesesashimi
Member Author

Just to clarify, do you mean that annotations are set to blank, or that they are completely removed on the node object?

I mean that they are completely removed.
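
One quick way to confirm that after a revert, assuming the annotations live under the usual machineconfiguration.openshift.io prefix (this check is illustrative, not part of the PR):

```console
# Expect an empty object once desiredImage and currentImage are removed:
$ oc get node/<nodename> -o json \
    | jq '.metadata.annotations | with_entries(select(.key | test("Image")))'
{}
```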

@cheesesashimi
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2024
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 1929823 and 2 for PR HEAD 07db49e in total

@cheesesashimi
Member Author

/retest-required

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 1929823 and 2 for PR HEAD 07db49e in total

Contributor

openshift-ci bot commented Sep 24, 2024

@cheesesashimi: all tests passed!


@openshift-merge-bot openshift-merge-bot bot merged commit 1ac641c into openshift:master Sep 24, 2024
17 checks passed
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-machine-config-operator
This PR has been included in build ose-machine-config-operator-container-v4.18.0-202409250208.p0.g1ac641c.assembly.stream.el9.
All builds following this will include this PR.
