diff --git a/enhancements/update/upgrade-without-registry.md b/enhancements/update/upgrade-without-registry.md
new file mode 100644
index 00000000000..caf6cb63bcf
--- /dev/null
+++ b/enhancements/update/upgrade-without-registry.md
@@ -0,0 +1,887 @@
---
title: upgrade-without-registry
authors:
- "@jhernand"
reviewers:
- "@avishayt"
- "@danielerez"
- "@mrunalp"
- "@nmagnezi"
- "@oourfali"
approvers:
- "@sdodson"
- "@zaneb"
api-approvers:
- "@sdodson"
- "@zaneb"
creation-date: 2023-06-29
last-updated: 2023-07-26
tracking-link:
- https://issues.redhat.com/browse/RFE-4482
see-also:
- https://issues.redhat.com/browse/OCPBUGS-13219
- https://github.com/openshift/cluster-network-operator/pull/1803
replaces: []
superseded-by: []
---

# Upgrade without registry

## Summary

Provide an automated mechanism to upgrade a cluster without requiring an image
registry server.

## Motivation

All of these stories are in the context of disconnected clusters with limited
resources, both in the cluster itself and in the surrounding environment:

- The cluster is not connected to the Internet, or the bandwidth is very limited.
- It isn't possible to bring up additional machines, even temporarily.
- The resources of the cluster are limited, in particular in terms of CPU and memory.
- The technicians responsible for performing the upgrade have no OpenShift knowledge.

These clusters are usually installed at the customer site by a partner engineer
collaborating with customer technicians.

Eventually, the cluster will need to be upgraded, and then the technicians will
need tools that make the process as simple as possible, ideally requiring no
OpenShift knowledge.

### User Stories

#### Prepare an upgrade bundle

As an engineer working in a partner's factory, I want to be able to assemble an
upgrade bundle that contains all the artifacts (container images and metadata)
needed to upgrade the OpenShift cluster and my own applications. I want to hand
over this upgrade bundle, on suitable media such as a USB stick, together with
the documentation explaining how to use it, to the technicians that will
perform the upgrade.

#### Include custom images in the bundle

As an engineer working in a partner's factory, I want to be able to include in
the upgrade bundle custom images for workloads specific to the customer.

#### Explicitly allow vetted upgrade bundle

As an engineer managing a cluster, I want to be able to explicitly approve the
use of an upgrade bundle, so that only the bundle that I tested and vetted will
be applied to the cluster.

#### Upgrade a single-node cluster

As a technician with little or no OpenShift knowledge, I want to be able to
upgrade a single-node cluster using the upgrade bundle and its documentation. I
can't bring up any additional infrastructure at the cluster site; in particular
I can't bring up an image registry server, neither outside of the cluster nor
inside it. I want to plug the USB stick provided by the engineer into the node
and have the rest of the process performed automatically.

#### Upgrade a multi-node cluster

As a technician with little or no OpenShift knowledge, I want to be able to
upgrade a multi-node cluster as well. This is the same story as for single-node
clusters, but for multi-node clusters.
The difference is that I don't want to plug the USB stick into all the nodes of
the cluster; instead I want to plug it into only one of the nodes (selected
randomly) and have the rest of the process performed automatically, including
the propagation of the bundle contents to all the nodes.

#### Pre-load upgrade images

As an engineer managing a cluster that has a low bandwidth and/or unreliable
connection to an image registry server, I want to pre-load all the images
required for the upgrade so that when I decide to actually perform the upgrade
there will be no need to contact that slow and/or unreliable registry server.

### Goals

Provide an automated and documented mechanism that partner engineers and
customer technicians can use to upgrade a cluster without requiring a registry
server.

### Non-Goals

It is not a goal to remove the registry server requirement for other
operations. For example, installing a new workload will still require a
registry server.

## Proposal

### Workflow Description

1. An engineer working in a partner's factory is asked to prepare a bundle to
upgrade a set of clusters to a specific OpenShift version.

1. The engineer uses the `oc adm upgrade create bundle ...` tool described in
this enhancement to prepare the upgrade bundle containing all the artifacts
(container images and metadata) that are required to perform the upgrade, and
writes it to a USB stick (or any other suitable media) that will be handed over
to the technicians responsible for performing the upgrades, together with
documentation explaining how to use it.

1. The technicians receive copies of the USB stick and the corresponding
documentation.

1. The technician goes to the cluster site and uses the upgrade bundle inside
the USB stick to perform the upgrade. The documentation included in the bundle
will basically ask the technician to plug the USB stick into one of the nodes
of the cluster and then provide simple instructions to verify that the upgrade
has been applied correctly. This step is potentially repeated multiple times by
the same technician for multiple clusters using the same USB stick or copies of
it.

Note that the upgrade bundle should not be specific to a particular cluster,
only to the OpenShift architecture and version. Technicians should be able to
use that bundle for any cluster with that architecture and version.

### API Extensions

There are no new object kinds introduced by this enhancement, but new fields
will be added to the existing `ClusterVersion` and `ContainerRuntimeConfig`
objects. Those fields are described in detail in the implementation details
section below; this is a summary of the new `ClusterVersion` fields:

```yaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
spec:
  desiredUpdate:

    # Indicates if the upgrade should be performed immediately or if it should
    # be held for a later time. The default is `false`, which means that the
    # upgrade will be performed immediately (once the images are pre-loaded, if
    # so requested).
    hold: true|false

    bundle:
      # Indicates if use of a bundle is enabled. The default is `true`. If
      # set to `false` then the `desiredUpdate.image` or `desiredUpdate.version`
      # field will need to be populated. The cluster version operator will then
      # pre-load the release images from the registry servers configured in the
      # cluster and will not use or require a bundle.
      enabled: true|false

      # Indicates if the cluster version operator should monitor devices
      # to automatically detect when an upgrade bundle is available. The
      # default is `false`.
      monitor: true|false

      # Indicates the location of the upgrade bundle when `monitor` is `false`.
      file: /root/my-bundle.tar # Can be a file.
      file: /dev/sdb # Or a device.

      # This is an optional digest of the bundle. If set then the cluster version
      # operator will ensure that the bundle is used only if it matches this
      # digest. This is intended to prevent use of a wrong USB stick, or use of
      # a perfectly fine USB stick in the wrong cluster.
      digest: sha256:....

status:
  desired:
    bundle:
      # Location of the upgrade bundle. This is populated by the monitor, if
      # enabled, or copied from the `spec.desiredUpdate.bundle.file` field if
      # monitoring is disabled.
      file: /root/my-bundle.tar # Can be a file.
      file: /dev/sdb # Or a device.

      # Contains the current status of the bundle processing in each node
      # of the cluster. Keys are the node names.
      nodes:
        node0:
          # Indicates if the upgrade bundle has already been extracted to
          # the `/var/lib/upgrade/4.13.7-x86_64` directory.
          extracted: true|false
          # Indicates if the images from the upgrade bundle have already been
          # copied to the `/var/lib/containers/storage` directory.
          loaded: true|false
          # Contains the metadata of the bundle as seen by this node. This is
          # taken from the `metadata.json` file that is part of the bundle.
          metadata: '{ ... }'
        node1: { ... }
        ...
```

And this is a summary of the new `ContainerRuntimeConfig` fields:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: pin-upgrade
spec:
  containerRuntimeConfig:
    # Additional image store directories. Translates into something like this
    # inside the `/etc/containers/storage.conf` file:
    #
    # additionalimagestores = [
    #   "/var/lib/my-images/...",
    #   "/var/lib/your-images/...",
    #   ...
    # ]
    additionalImageStores:
    - /var/lib/my-images
    - /var/lib/your-images

    # List of pinned images. Translates into a new `pin-upgrade.conf` file inside
    # the `/etc/crio/crio.conf.d` directory with content similar to this:
    #
    # pinned_images = [
    #   "quay.io/openshift-release-dev/ocp-release@sha256:...",
    #   "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
    #   "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
    #   ...
    # ]
    #
    # The CRI-O service will be reloaded when this is changed.
    pinnedImages:
    - quay.io/openshift-release-dev/ocp-release@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
    ...
```

### Implementation Details/Notes/Constraints

The proposed solution is based on copying all the required images to all the
nodes of the cluster before starting the upgrade, and ensuring that no component
requires access to a registry server during the upgrade. For this to work the
following changes are required:

1. No OpenShift component used during the upgrade should use the `Always` pull
policy, as that forces the kubelet and CRI-O to try to contact the registry
server even if the image is already available.

1. No OpenShift component should garbage collect the upgrade images before or
during the upgrade. This is typically triggered by the kubelet instructing
CRI-O to remove images.

1. No OpenShift component should explicitly try to contact the registry server
without a fallback alternative.

1. The engineer working in the partner's factory needs an `oc adm upgrade
create bundle` tool to create the upgrade bundle.

1. The technician needs documentation explaining how to use the upgrade bundle
to perform the upgrade.

1. The machine config operator should understand and support the changes that
will be required to the CRI-O configuration during the upgrade.

1. The cluster version operator needs to orchestrate the upgrade process.

#### Don't use the `Always` pull policy during the upgrade

Some OCP core components currently use the `Always` image pull policy during
the upgrade. As a result, the kubelet and CRI-O will try to contact the
registry server even if the image is already available in the local storage of
the cluster. This blocks the upgrade.

The catalog operator uses the `Always` pull policy to pull catalog images. It
does so in order to refresh catalog images that are specified with a tag. But
it also does it when it pulls catalog images that are specified with a digest.
That should be changed to use the `IfNotPresent` pull policy for catalog images
that are specified by digest.

Most OCP core components have been changed in the past to avoid this. Recently
the OVN pre-puller has also been changed (see this
[bug](https://issues.redhat.com/browse/OCPBUGS-13219) for details). To prevent
bugs like this from happening in the future and make the solution less fragile
we should have a test that gates the OpenShift release and verifies that the
upgrade can be performed without a registry server. One way to ensure this is
to have an admission hook that rejects or warns about any spec that uses
`Always`, and runs in CI to catch it.

It would also be useful to have another test that scans for use of this
`Always` pull policy.

#### Don't garbage collect images required for the upgrade

Starting with version 4.14 of OpenShift, CRI-O will have the capability to pin
certain images (see [this](https://github.com/cri-o/cri-o/pull/6862) pull
request for details). That capability will be used to temporarily pin all the
images required for the upgrade, so that they aren't garbage collected by
kubelet and CRI-O.

Note that pinning images means that kubelet and CRI-O will not remove them,
even if they aren't in use. It is very important to make sure that there is
enough available space for these images, as otherwise the performance of the
node may degrade and it may stop functioning correctly if it runs out of space.
The space should be enough to accommodate two releases (the currently running
one plus the candidate for installation) as well as workload images and some
buffer.

#### Don't try to contact the image registry server explicitly

Some OpenShift components explicitly try to contact the registry server without
a fallback alternative. These need to be changed so that they don't do it, or
so that they have a fallback mechanism when the registry server isn't
available.

For example, in OpenShift 4.13 the machine config operator runs the equivalent
of `skopeo inspect` in order to decide what kind of upgrade is in progress.
That fails if there is no registry server, even if the image has already been
pulled. That needs to be changed so that contacting the registry server is not
required. A possible way to do that is to use the equivalent of `crictl
inspect` instead.
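
As a purely illustrative sketch of that kind of fallback (the image reference
is a placeholder), a component could first look for the image in CRI-O's local
storage and only contact the registry when it is missing:

```bash
# Illustrative sketch only: prefer CRI-O's local image storage and fall back
# to the registry only when the image has not been pulled yet.
IMAGE="quay.io/openshift-release-dev/ocp-release@sha256:..."
if crictl inspecti "${IMAGE}" >/dev/null 2>&1; then
  # The image is already present locally; no registry access is needed.
  crictl inspecti "${IMAGE}"
else
  # Fall back to the registry only when the image is not available locally.
  skopeo inspect "docker://${IMAGE}"
fi
```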

#### MCO should support the configuration changes required for the upgrade

In order to copy the images required for the upgrade to the nodes of the
cluster we will create an additional image store in the `/var/lib/upgrade`
directory of each node of the cluster, and we will pin all those images. This
requires changes in the `/etc/containers/storage.conf` file, something like
this:

```toml
additionalimagestores = [
  "/var/lib/upgrade/4.13.7-x86_64"
]
```

That `/etc/containers/storage.conf` file is tracked by the machine config
operator, and changing it will trigger a reboot that will interfere with the
upgrade process. We will need to change the machine config operator so that it
is aware of these changes and doesn't reboot the node. Ideally this should be
added to the `ContainerRuntimeConfig` object, for example:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  ...
spec:
  containerRuntimeConfig:
    additionalImageStores:
    - /var/lib/upgrade/4.13.7-x86_64
```

When the new `additionalImageStores` field is added or changed the machine
config operator will need to regenerate the `/etc/containers/storage.conf`
file, including the corresponding `additionalimagestores` field, and then
reload it using the equivalent of `systemctl reload crio.service`.

The changes to pin the images will be done in a
`/etc/crio/crio.conf.d/pin-upgrade.conf` file, something like this:

```toml
pinned_images=[
  "quay.io/openshift-release-dev/ocp-release@sha256:...",
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
  ...
]
```

These files aren't tracked by the machine config operator, but the CRI-O
service needs to be reloaded when they are created or changed. To support that
a new field will be added to the `ContainerRuntimeConfig` object, for example:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: pin-upgrade
spec:
  containerRuntimeConfig:
    pinnedImages:
    - quay.io/openshift-release-dev/ocp-release@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
    ...
```

When the new `pinnedImages` field is added or changed the machine config
operator will need to create or update the corresponding
`/etc/crio/crio.conf.d/pin-upgrade.conf` file, and then reload CRI-O using the
equivalent of `systemctl reload crio.service`.

#### Tool to create the upgrade bundle

The upgrade bundle will be created by an engineer in a partner's factory using
a new `oc adm upgrade create bundle` command. This engineer will first
determine the target version number, for example 4.13.7. Note that doing this
will probably require access to the upgrade service available in
api.openshift.com. Finding that upgrade version number is outside of the scope
of this enhancement.
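
As a hedged sketch only, and not part of this enhancement, the engineer could
for example query the public update service graph to list the versions
available in a channel; the channel and architecture values below are
assumptions used for illustration:

```bash
# Hypothetical example: list the newest versions published in a channel of the
# OpenShift update service graph.
curl -s -H 'Accept: application/json' \
  'https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.13&arch=amd64' \
  | jq -r '.nodes[].version' | sort -V | tail -n 5
```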

The engineer will then need internet access and a Linux machine where she can
run `oc adm upgrade create bundle`, for example:

```bash
$ oc adm upgrade create bundle \
--arch=x86_64 \
--version=4.13.7 \
--pull-secret=/my/pull/secret.txt \
--output=/my/bundle/dir
```

The `oc adm upgrade create bundle` command will find the list of image
references that make up the release, doing the equivalent of this:

```bash
$ oc adm release info \
quay.io/openshift-release-dev/ocp-release:4.13.7-x86_64 -o json | \
jq '.references.spec.tags[].from.name'
```

In addition to the release images the tool will also support explicitly adding
custom images. For example:

```bash
$ oc adm upgrade create bundle \
--arch=x86_64 \
--version=4.13.7 \
--pull-secret=/my/pull/secret.txt \
--extra-image=quay.io/my-company/my-workload1 \
--extra-image=quay.io/my-company/my-workload2 \
... \
--output=/my/bundle/dir
```

This is intended for situations where the user wants to use the same upgrade
mechanism for her own images.

The command will then bring up a temporary image registry server, embedded into
the tool, listening on a randomly selected local port and using a self signed
certificate. It will then start to copy the images found in the previous step
to the embedded registry server, using the equivalent of this for each image:

```bash
$ skopeo copy \
--src-authfile=/my/pull/secret.txt \
--dest-cert-dir=/my/certs \
docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:... \
docker://localhost:12345/openshift-release-dev/ocp-v4.0-art-dev@...
```

When all the images have been copied to the temporary image registry server it
will be shut down.

The result will be a directory containing approximately 180 images and
requiring approximately 16 GiB of space. The command will then create an
`upgrade-4.13.7-x86_64.tar` tar file containing that directory and a
`metadata.json` file.

The `metadata.json` file will contain additional information, in particular the
architecture, the version, the size and the list of images:

```json
{
  "version": "4.13.7",
  "arch": "x86_64",
  "size": "16 GiB",
  "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
  "images": [
    "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
    "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
    ...
  ]
}
```

The `metadata.json` file will always be the first entry of the tar file, to
simplify operations that need the metadata but not the rest of the contents of
the tar file.

The command will also write an `upgrade-4.13.7-x86_64.sha256` file containing a
digest of the complete tar file.

The engineer will write the tar file to some kind of media and hand it over to
the technicians, together with the documentation explaining how to use it.

#### Documentation to use the bundle

This documentation shouldn't assume previous OpenShift knowledge; it should be
basic instructions to plug in the USB stick containing the bundle and then
check whether the upgrade succeeded or failed.

#### CVO will need to orchestrate the upgrade

The cluster version operator will be responsible for orchestrating the upgrade.

The technician will receive the USB stick containing the
`upgrade-4.13.7-x86_64.tar` file and will plug it into one of the nodes of the
cluster.
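
As an illustration of why `metadata.json` is stored first, here is a
hypothetical sketch of how a script could verify the digest and read the
metadata without extracting the whole bundle; the exact format of the `.sha256`
file (assumed here to contain just the bare digest) is not defined by this
enhancement:

```bash
# Hypothetical sketch: verify the bundle digest and peek at the metadata.
echo "$(cat upgrade-4.13.7-x86_64.sha256)  upgrade-4.13.7-x86_64.tar" | sha256sum -c -
# metadata.json is the first entry, so only the beginning of the tar is read.
tar -xOf upgrade-4.13.7-x86_64.tar metadata.json | jq '{version, arch, size}'
```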

The cluster version operator will be monitoring device events in all the nodes
of the cluster in order to detect when such a USB stick (or any other kind of
media containing an upgrade bundle) has been plugged in. To control that a new
`spec.desiredUpdate.bundle.monitor` field will be added to the `ClusterVersion`
object. The default value will be `false` so that this monitoring will be
disabled by default. When set to `true` the cluster version operator will
create a new `bundle-monitor` daemon set that will perform the actual
monitoring. When any of the pods in this daemon set detects a device containing
a valid upgrade bundle it will update the status of the `ClusterVersion` object
indicating that the bundle is available, via a new `status.desired.bundle.file`
field.

For situations where it isn't desirable to monitor device events, or it isn't
possible to plug in the USB stick or any other kind of media, we will also add
a new `spec.desiredUpdate.bundle.file` field to explicitly indicate the
location of the bundle. When this is used the value will be directly copied to
the `status.desired.bundle.file` field, without creating the `bundle-monitor`
daemon set.

For example, a user that wants to enable automatic detection of upgrade bundles
will add this to the `ClusterVersion` object:

```yaml
spec:
  desiredUpdate:
    bundle:
      monitor: true
```

A user that wants to disable automatic detection, and wants to manually copy
the bundle file to one of the cluster nodes, will instead do something like
this:

```yaml
spec:
  desiredUpdate:
    bundle:
      monitor: false # Not really needed, this is the default.
      file: /root/upgrade-4.13.7-x86_64.tar
```

A user that wants to disable automatic detection, but wants to use a USB stick
anyhow, will do something like this:

```yaml
spec:
  desiredUpdate:
    bundle:
      monitor: false # Not really needed, this is the default.
      file: /dev/sdb
```

When the `status.desired.bundle.file` field is populated the cluster version
operator will start to replicate the bundle to the rest of the nodes of the
cluster. To do so it will first disable auto-scaling to ensure that no new
nodes are added to the cluster while this process is in progress. Then it will
start a new `bundle-server` daemon set. Each of the pods in this daemon set
will check if the file specified in the `status.desired.bundle.file` field
exists and contains a valid bundle. If it does then it will serve it via HTTP
for the other nodes of the cluster. If it doesn't exist then it will do
nothing.

Simultaneously with the `bundle-server` daemon set the cluster version operator
will also start a new `bundle-extractor` batch job in each node of the cluster.
Each pod in these jobs will try to read the bundle from the location specified
in the `status.desired.bundle.file` field. If that file doesn't exist it will
try to download it from the HTTP server of one of the pods of the
`bundle-server` daemon set; the first that responds with `HTTP 200 Ok`. Once it
has either the file or the body of the HTTP response it will first extract the
contents of the `metadata.json` file (which will always be the first entry of
the tarball) to check that there is space in `/var/lib/upgrade` for the size
indicated in the `size` field. If there is not enough space then it will be
reported as an error condition in the `ClusterVersion` object and the upgrade
process will be aborted.
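
As an illustration only, the space check described above might look roughly
like the following; the device path is a placeholder and the handling of the
human readable `size` value (such as `16 GiB`) is an assumption:

```bash
# Hypothetical sketch of the bundle-extractor space check. It reads only the
# first tar entry (metadata.json) and compares the declared size with the
# space available under /var/lib/upgrade.
BUNDLE=/dev/sdb
NEEDED_GIB=$(tar -xOf "${BUNDLE}" metadata.json | jq -r '.size' | awk '{print int($1)}')
AVAILABLE_GIB=$(df -BG --output=avail /var/lib/upgrade | tail -n 1 | tr -dc '0-9')
if [ "${AVAILABLE_GIB}" -lt "${NEEDED_GIB}" ]; then
  echo "not enough space in /var/lib/upgrade: need ${NEEDED_GIB}G, have ${AVAILABLE_GIB}G" >&2
  exit 1
fi
```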

If there is enough space it will then extract the contents to the
`/var/lib/upgrade/4.13.7-x86_64` directory.

If the `spec.desiredUpdate.bundle.digest` field is set then the
`bundle-extractor` will calculate the digest of the bundle, and if it doesn't
match it will report it as an error condition in the `ClusterVersion` object
and the upgrade process will be aborted.

Once the bundle is completely extracted and the digest has been verified it
will update the status of the `ClusterVersion` object to indicate that the
bundle is extracted in that node. This will be done via the new
`status.desired.bundle.nodes.*.extracted` and
`status.desired.bundle.nodes.*.metadata` fields. For example, when `node0` and
`node2` have completed the extraction of the bundle but `node1` hasn't, the
`ClusterVersion` status will look like this:

```yaml
status:
  desired:
    bundle:
      monitor: true
      file: /dev/sdb
      nodes:
        node0:
          extracted: true
          metadata: |
            {
              "version": "4.13.7",
              "arch": "x86_64",
              "size": "16 GiB",
              "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
              "images": [
                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
                ...
              ]
            }
        node1:
          extracted: false
        node2:
          extracted: true
          metadata: |
            {
              "version": "4.13.7",
              "arch": "x86_64",
              "size": "16 GiB",
              "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
              "images": [
                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
                ...
              ]
            }
```

At that point the `bundle-extractor` job will finish.

The `extracted` field for each node will indicate if the bundle extraction
process has been completed.

The `metadata` field for each node will contain the information from the
`metadata.json` file of the bundle. Note that the list of images may be too
long (approximately 180 images) to store it in the status of the
`ClusterVersion` object; it may be convenient to store them in separate
configmaps, and have only the references to those configmaps in the
`ClusterVersion` status.

When all the nodes have the bundle extracted (the `extracted` field for all
nodes is `true`) the cluster version operator will delete the `bundle-server`
daemon set and verify that the metadata in all nodes (the content of the
`metadata` field) is the same. If there are differences they will be reported
as error conditions in the status of the `ClusterVersion` object and the
upgrade process will be aborted. This is intended to prevent accidents like
having two different USB sticks with different bundles plugged into two
different nodes.

Once the metadata has been validated the cluster version operator will
configure the machine config operator to pin the images specified in the
metadata, so that they aren't removed by the garbage collection mechanisms
while the upgrade is in progress. To do so it will use the new `pinnedImages`
field of the `ContainerRuntimeConfig` object:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: pin-upgrade
spec:
  containerRuntimeConfig:
    pinnedImages:
    - quay.io/openshift-release-dev/ocp-release@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
```

This configuration will be removed once the upgrade has completed successfully.

The cluster version operator will then start a new `bundle-loader` job in each
node of the cluster.
The pods inside these jobs will first check that there is enough space
available in the node to load the images into the
`/var/lib/containers/storage` directory. To do so they will use the `size`
field of the `metadata.json` file. Images in the bundle are compressed, but in
`/var/lib/containers/storage` they are not, so the `bundle-loader` will check
that there is at least twice the space indicated in the `size` field. If there
is not enough space it will be reported as an error condition in the
`ClusterVersion` object and the upgrade process will be aborted.

If there is enough space the `bundle-loader` pod will start a temporary
embedded image registry server, listening on a randomly selected local port and
using a self signed certificate, to serve the images from the
`/var/lib/upgrade/4.13.7-x86_64` directory. It will then configure the node to
trust the self signed certificate and configure CRI-O to use that temporary
image registry server as a mirror for the release images, creating a
`/etc/containers/registries.conf.d/upgrade-mirror.conf` file with content
similar to this:

```toml
[[registry]]
prefix = "quay.io/openshift-release-dev/ocp-release"
location = "quay.io/openshift-release-dev/ocp-release"

[[registry.mirror]]
location = "localhost:12345/openshift-release-dev/ocp-release"

[[registry]]
prefix = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"

[[registry.mirror]]
location = "localhost:12345/openshift-release-dev/ocp-v4.0-art-dev"
```

This configuration will be removed by the `bundle-loader` once the images have
been successfully loaded into the `/var/lib/containers/storage` directory.

Next the `bundle-loader` will start to load the images into the
`/var/lib/containers/storage` directory. To do that it will use the gRPC API of
CRI-O to run the equivalent of `crictl pull` for each of the images. When that
is completed the `bundle-loader` will update the new
`status.desired.bundle.nodes.*.loaded` field. For example, when the images have
been loaded for `node0` the status will look like this:

```yaml
status:
  desired:
    bundle:
      monitor: true
      file: /dev/sdb
      nodes:
        node0:
          extracted: true
          loaded: true
          metadata: '{ ... }'
        node1:
          extracted: true
          loaded: false
          metadata: '{ ... }'
        node2:
          extracted: true
          loaded: false
          metadata: '{ ... }'
```

At that point the `bundle-loader` job will finish.

When all the nodes have the bundle loaded (the `loaded` field of all nodes is
`true`) the cluster version operator will check the new
`spec.desiredUpdate.hold` field of the `ClusterVersion` object. This field will
be used to indicate if the upgrade has to be started immediately, or if it
should wait until the user explicitly sets it to `false`. The default value for
that field will be `false`. This is intended for situations where the user
wishes to prepare everything for the upgrade in advance, but wants to perform
the upgrade later.

When the value of the `spec.desiredUpdate.hold` field is `false` the cluster
version operator will trigger the regular upgrade process, adding something
like this to the `ClusterVersion` object:

```yaml
spec:
  desiredUpdate:
    image: quay.io/openshift-release-dev/ocp-release@...
```

The value of the `image` field will be obtained from the `release` field of the
bundle metadata.
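
As a purely illustrative sketch, an administrator who had prepared the upgrade
with `hold: true` could later release it with something like the following
(the `hold` field is the new field proposed in this enhancement):

```bash
# Illustrative only: release a previously held upgrade so that the cluster
# version operator proceeds with the regular upgrade process.
oc patch clusterversion version --type merge \
  -p '{"spec":{"desiredUpdate":{"hold":false}}}'
```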

When the upgrade has completed successfully the cluster version operator will
delete the `ContainerRuntimeConfig` object that it created to pin the images,
and will start a new `bundle-cleaner` job in each node that will clean up all
the artifacts potentially left around by other pieces of the upgrade. In
particular it will remove the `/var/lib/upgrade/4.13.7-x86_64` directory
created by the `bundle-extractor`.

In order to support the situation where the user only wants to load the images
in advance but doesn't want to use a bundle, the `ClusterVersion` spec will
have a `desiredUpdate.bundle.enabled` field. The default will be `true`, but
when explicitly set to `false` the cluster version operator will skip the
creation of the `bundle-server` and the `bundle-extractor` and will go directly
to creating the `bundle-loader`. The `bundle-loader` will in this case skip the
step to start the embedded registry and configure it as a mirror. Instead it
will ask CRI-O to pull the images as usual.

### Risks and Mitigations

The proposed solution will require space to store the release bundle and all
the release images in all the nodes of the cluster, approximately 48 GiB in the
worst case. To mitigate that risk the components that will consume disk space
will check in advance if the required space is available.

### Drawbacks

This approach requires non-trivial changes to the cluster version operator and,
to a lesser degree, to the machine config operator.

## Design Details

### Open Questions

None.

### Test Plan

We should have at least tests that verify that the upgrade can be performed in
a fully disconnected environment, both for a single-node cluster and a cluster
with multiple nodes. These tests should gate the OCP release.

It is desirable to have another test that scans the OCP components looking for
use of the `Always` pull policy. This should probably run for each pull request
of each OCP component, and prevent merging if it detects that the offending
pull policy is used. We should consider adding an admission check in CI for
this.

### Graduation Criteria

The feature will ideally be introduced as `Dev Preview` in OpenShift 4.14.z,
moved to `Tech Preview` in 4.15 and declared `GA` in 4.16.

#### Dev Preview -> Tech Preview

- Ability to upgrade single-node clusters using the `bundle.enabled: false`
mode (no bundle, just pre-loading of the images).

- Availability of the tests that verify the upgrade of single-node clusters.

- Availability of the tests that verify that no OCP component uses the `Always`
pull policy.

- Obtain positive feedback from at least one customer.

#### Tech Preview -> GA

- Ability to upgrade single-node and multi-node clusters with `bundle.enabled`
set to `true` or `false`: with a bundle or just pre-loading the images.

- Ability to create bundles using the `oc adm upgrade create bundle` command.

- Availability of the tests that verify the upgrade in single-node and
multi-node clusters.

- User facing documentation created in
[openshift-docs](https://github.com/openshift/openshift-docs).

#### Removing a deprecated feature

Not applicable, no feature will be removed.

### Upgrade / Downgrade Strategy

There are no additional considerations for upgrade or downgrade. The same
considerations that apply to the cluster version operator in general will also
apply in this case.

### Version Skew Strategy

This feature will only be usable once the cluster version operator and the
machine config operator have been upgraded to support it. That upgrade will
have to be done by other means.

For subsequent upgrades we will ensure that the cluster version operator can
work with both the old and the new version of the machine config operator.

### Operational Aspects of API Extensions

Not applicable: no new object kinds are introduced, only the new fields
described above.

#### Failure Modes

#### Support Procedures

## Implementation History

There is an initial prototype exploring some of the implementation details
described here in this
[repository](https://github.com/jhernand/upgrade-tool).

## Alternatives

The alternative to this is to make a registry server available, either outside
or inside the cluster.

An external registry server is a well known solution, even for disconnected
environments, but it is not feasible in most of the target environments.

An internal registry server, running in the cluster itself, is a feasible
alternative for clusters with multiple nodes. The registry server supported by
Red Hat is Quay. The disadvantage of Quay is that it requires additional
resources that are often not available in the target environments.

For single-node clusters an internal registry server isn't an alternative
because it would need to be continuously available during and after the
upgrade, and that isn't possible if the registry server runs in the cluster
itself.

## Infrastructure Needed

Infrastructure will be needed to run the tests described in the test plan
above.