Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCO-585: MCO-569: MCO-563: MCO-586: Introduces BuildController #3731

Conversation

cheesesashimi
Copy link
Member

@cheesesashimi cheesesashimi commented Jun 6, 2023

- What I did
This introduces the concept of the BuildController for handling on-cluster layering OS builds. This provides a state machine for determining whether a given MachineConfigPool has opted in to layered OS builds by use of the label machineconfiguration.openshift.io/layering-enabled. Included with this PR are two interchangeable image builder backends. These provide a common interface to the BuildController to allow building a layered OS image in one of two ways: 1) Makes use of the OpenShift Image Builder API. 2) Makes use of a custom build pod for building and pushing the containers.

The way the build process works is this:

  1. The on-cluster-build-config ConfigMap is used to identify which push / pull secrets to use and where to push the final OS image to. So this ConfigMap must exist; though there is some experimental work going on in [DO NOT MERGE] - integrates BuildController and machine-os-builder stub #3765 to provide some sensible defaults.
  2. Once a pool has opted-in, the BuildController identifies that a build is required.
  3. A Dockerfile containing the base OS image (from the machine-config-osimageurl ConfigMap) is rendered from a template and stored in a ConfigMap specific to this build.
  4. The rendered MachineConfig is pulled from the API server, compressed and base64-encoded, and stored in a ConfigMAp specific to this build.
  5. Depending on whether the a custom build pod or an OpenShift Image Build is used, both of the aforementioned artifacts are injected into the build process.
  6. The Dockerfile is crafted in such a way that the MachineConfig is decoded and extracted, then fed into Ignition. Ignition reads the config portion of the MachineConfig and performs the necessary operations to have the desired effect.
  7. As the custom build pod or OpenShift Image Build begins the build, the MachineConfigPool status is updated in lockstep with the status of the build pod or OpenShift Image Build.
  8. Upon success, the MachineConfigPool gets an annotation machineconfiguration.openshift.io/newestImageEquivalentConfig which contains the new layered OS image pullspec. The aforementioned Dockerfile and MachineConfig ConfigMaps are deleted upon a successful build.

OpenShift Image Builder:

  • This is fairly self-contained. It creates an OpenShift Build object which knows how to create a build pod, inject the ConfigMaps, and knows where to push the container when the build is completed.

Custom Pod Builder:

  • This creates a build pod which uses quay.io/buildah/stable:latest to perform the build / push process.
  • Since we need the SHA256 digest produced by the build process, we use the --digestfile option on Buildah to write that value to the filesystem someplace.
  • Then, we use the oc CLI contained within the base OS image to create a new ConfigMap with this digest.
  • The PodBuildController reads the ConfigMap and extracts the SHA256 from it.

- How to verify it

This PR includes a unit test suite for the new BuildController which also covers the two interchangeable image builders. Running the unit tests will exercise all of the major code paths therein. In this current form, once merged, this PR will not become part of the MCO's execution code path, meaning that this code will not yet be built nor included in the e2e test suites.

- Description for the changelog
Introduces BuildController

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 6, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 6, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 6, 2023
@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from 9f79a61 to dc78b4e Compare June 20, 2023 15:15
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2023
@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from 8a05491 to 2fd353f Compare June 27, 2023 20:37
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2023
@cheesesashimi
Copy link
Member Author

/test unit
/test verify

@cheesesashimi
Copy link
Member Author

/test unit
/test verify

1 similar comment
@cheesesashimi
Copy link
Member Author

/test unit
/test verify

@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from 2fd353f to 7147f45 Compare June 28, 2023 16:39
@cheesesashimi
Copy link
Member Author

/test unit
/test verify

@cheesesashimi cheesesashimi changed the title Introduces BuildController image builder interfaces MCO-585, MCO-565: Introduces BuildController image builder interfaces Jun 28, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 28, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jun 28, 2023

@cheesesashimi: This pull request references MCO-565 which is a valid jira issue.

In response to this:

- What I did
This introduces the concept of a common interface across Image Builder backends.

- How to verify it

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cheesesashimi cheesesashimi changed the title MCO-585, MCO-565: Introduces BuildController image builder interfaces MCO-585, MCO-565: Introduces BuildController Jun 28, 2023
@cheesesashimi cheesesashimi changed the title MCO-585, MCO-565: Introduces BuildController MCO-585: MCO-565: Introduces BuildController Jun 28, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jun 28, 2023

@cheesesashimi: This pull request references MCO-585 which is a valid jira issue.

In response to this:

- What I did
This introduces the concept of a common interface across Image Builder backends.

- How to verify it

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jun 28, 2023

@cheesesashimi: This pull request references MCO-585 which is a valid jira issue.

In response to this:

- What I did
This introduces the concept of the BuildController for handling on-cluster layering OS builds. This provides a state machine for determining whether a given MachineConfigPool has opted in to layered OS builds by use of the label machineconfiguration.openshift.io/layering-enabled. Included with this PR are two interchangeable image builder backends. These provide a common interface to the BuildController to allow building a layered OS image in one of two ways: 1) Makes use of the OpenShift Image Builder API. 2) Makes use of a custom build pod for building and pushing the containers.

- How to verify it

This PR includes a unit test suite for the new BuildController which also covers the two interchangeable image builders. Running the unit tests will exercise all of the major code paths therein. In this current form, once merged, this PR will not become part of the MCO's execution code path, meaning that this code will not yet be built nor included in the e2e test suites.

- Description for the changelog
Introduces BuildController

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cheesesashimi cheesesashimi changed the title MCO-585: MCO-565: Introduces BuildController MCO-585: MCO-565: MCO-569: Introduces BuildController Jun 28, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jun 28, 2023

@cheesesashimi: This pull request references MCO-585 which is a valid jira issue.

In response to this:

- What I did
This introduces the concept of the BuildController for handling on-cluster layering OS builds. This provides a state machine for determining whether a given MachineConfigPool has opted in to layered OS builds by use of the label machineconfiguration.openshift.io/layering-enabled. Included with this PR are two interchangeable image builder backends. These provide a common interface to the BuildController to allow building a layered OS image in one of two ways: 1) Makes use of the OpenShift Image Builder API. 2) Makes use of a custom build pod for building and pushing the containers.

The way the build process works is this:

  1. The on-cluster-build-config ConfigMap is used to identify which push / pull secrets to use and where to push the final OS image to. So this ConfigMap must exist; though there is some experimental work going on in [DO NOT MERGE] - integrates BuildController and machine-os-builder stub #3765 to provide some sensible defaults.
  2. Once a pool has opted-in, the BuildController identifies that a build is required.
  3. A Dockerfile containing the base OS image (from the machine-config-osimageurl ConfigMap) is rendered from a template and stored in a ConfigMap specific to this build.
  4. The rendered MachineConfig is pulled from the API server, compressed and base64-encoded, and stored in a ConfigMAp specific to this build.
  5. Depending on whether the a custom build pod or an OpenShift Image Build is used, both of the aforementioned artifacts are injected into the build process.
  6. The Dockerfile is crafted in such a way that the MachineConfig is decoded and extracted, then fed into Ignition. Ignition reads the config portion of the MachineConfig and performs the necessary operations to have the desired effect.
  7. As the custom build pod or OpenShift Image Build begins the build, the MachineConfigPool status is updated in lockstep with the status of the build pod or OpenShift Image Build.
  8. Upon success, the MachineConfigPool gets an annotation machineconfiguration.openshift.io/newestImageEquivalentConfig which contains the new layered OS image pullspec. The aforementioned Dockerfile and MachineConfig ConfigMaps are deleted upon a successful build.

OpenShift Image Builder:

  • This is fairly self-contained. It creates an OpenShift Build object which knows how to create a build pod, inject the ConfigMaps, and knows where to push the container when the build is completed.

Custom Pod Builder:

  • This creates a build pod which uses quay.io/buildah/stable:latest to perform the build / push process.
  • Upon completion of the build, it uses quay.io/skopeo/stable:latest to get the SHA256 for where the image was pushed to. At the time I wrote this code, I was unaware of the --digestfile option contained within Buildah, so some future work can be done to make this a bit better.

- How to verify it

This PR includes a unit test suite for the new BuildController which also covers the two interchangeable image builders. Running the unit tests will exercise all of the major code paths therein. In this current form, once merged, this PR will not become part of the MCO's execution code path, meaning that this code will not yet be built nor included in the e2e test suites.

- Description for the changelog
Introduces BuildController

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cheesesashimi
Copy link
Member Author

/test e2e-gcp-op
/test okd-scos-e2e-aws-ovn
/test e2e-aws-ovn

@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from bf80665 to 873dc81 Compare July 3, 2023 18:38
@cheesesashimi cheesesashimi changed the title MCO-585: MCO-565: MCO-569: Introduces BuildController MCO-585: MCO-569: MCO-566: MCO-563: MCO-586: Introduces BuildController Jul 3, 2023
@cheesesashimi cheesesashimi changed the title MCO-585: MCO-569: MCO-566: MCO-563: MCO-586: Introduces BuildController MCO-585: MCO-569: MCO-563: MCO-586: Introduces BuildController Jul 3, 2023
@cheesesashimi
Copy link
Member Author

/test e2e-gcp-op
/test okd-scos-e2e-aws-ovn
/test e2e-aws-ovn

@cheesesashimi
Copy link
Member Author

/test unit
/test verify

jkyros and others added 3 commits July 17, 2023 10:53
I don't know that we'll ultimately end up with imagestreams with these
names but for right now, I think these are at least the ones we think
we want that have purposes behind them.
@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from 873dc81 to c68b40d Compare July 17, 2023 14:57
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 17, 2023
@cheesesashimi
Copy link
Member Author

PR is rebased now.

@cheesesashimi
Copy link
Member Author

/test verify
/test unit
/test bootstrap-unit
/test images

@cheesesashimi
Copy link
Member Author

/retest-required

@cheesesashimi cheesesashimi force-pushed the zzlotnik/buildcontroller-interface branch from c68b40d to 48ac23e Compare July 18, 2023 15:16
@cheesesashimi
Copy link
Member Author

Fixed lint issue.

@jkyros
Copy link
Contributor

jkyros commented Jul 19, 2023

Those test failures look like flakes. We've had green runs until this point and the latest changes wouldn't have affected anything. I say we go for it and let the bot have at it.
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 19, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 19, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi, jkyros

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [cheesesashimi,jkyros]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jkyros
Copy link
Contributor

jkyros commented Jul 19, 2023

/retest-required

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD df96614 and 2 for PR HEAD 48ac23e in total

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD 6694a28 and 1 for PR HEAD 48ac23e in total

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD ee08d44 and 0 for PR HEAD 48ac23e in total

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 20, 2023

@cheesesashimi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 48ac23e link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link
Contributor

/hold

Revision 48ac23e was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 20, 2023
@jkyros
Copy link
Contributor

jkyros commented Jul 20, 2023

/test e2e-hypershift

@jkyros
Copy link
Contributor

jkyros commented Jul 20, 2023

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 20, 2023
@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD e91b573 and 2 for PR HEAD 48ac23e in total

@openshift-merge-robot openshift-merge-robot merged commit 9ad6ee9 into openshift:master Jul 20, 2023
@cheesesashimi cheesesashimi deleted the zzlotnik/buildcontroller-interface branch March 21, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants