From 58ef5d9db830653d2e78a5c031fa5ac9cd431e53 Mon Sep 17 00:00:00 2001 From: Juan Hernandez Date: Thu, 29 Jun 2023 09:56:40 +0200 Subject: [PATCH] OTA-1001: Upgrade without registry enhancement This patch adds an enhancement that describes an automated mechanism to perform cluster upgrades without requiring an image registry server. Related: https://issues.redhat.com/browse/RFE-4482 Related: https://issues.redhat.com/browse/OTA-1001 Related: https://issues.redhat.com/browse/OTA-997 Signed-off-by: Juan Hernandez --- .../update/upgrade-without-registry.md | 893 ++++++++++++++++++ 1 file changed, 893 insertions(+) create mode 100644 enhancements/update/upgrade-without-registry.md diff --git a/enhancements/update/upgrade-without-registry.md b/enhancements/update/upgrade-without-registry.md new file mode 100644 index 0000000000..e322765718 --- /dev/null +++ b/enhancements/update/upgrade-without-registry.md @@ -0,0 +1,893 @@ +--- +title: upgrade-without-registry +authors: +- "@jhernand" +reviewers: +- "@avishayt" +- "@danielerez" +- "@mrunalp" +- "@nmagnezi" +- "@oourfali" +approvers: +- "@sdodson" +- "@zaneb" +- "@LalatenduMohanty" +api-approvers: +- "@sdodson" +- "@zaneb" +- "@deads2k" +- "@JoelSpeed" +creation-date: 2023-06-29 +last-updated: 2023-07-26 +tracking-link: +- https://issues.redhat.com/browse/RFE-4482 +see-also: +- https://github.com/openshift/api/pull/1548 +- https://github.com/openshift/machine-config-operator/pull/3839 +- https://issues.redhat.com/browse/OCPBUGS-13219 +- https://github.com/openshift/cluster-network-operator/pull/1803 +replaces: [] +superseded-by: [] +--- + +# Upgrade without registry + +## Summary + +Provide an automated mechanism to upgrade a cluster without requiring an image +registry server, and without requiring OpenShift knowledge for the technicians +performing the upgrades. + +## Motivation + +All these stories are in the context of disconnected clusters with limited +resources, both in the cluster itself and in the surrounding environment: + +- The cluster is not connected to the Internet, or the bandwidth is very limited. +- It isn't possible to bring up additional machines, even temporarily. +- The resources of the cluster are limited, in particular in terms of CPU and memory. +- The technicians responsible for performing the upgrade have little or no OpenShift knowledge. + +These clusters are usually installed at the customer site by a partner engineer +collaborating with customer technicians. + +Eventually, the cluster will need to be upgraded, and then the technicians will +need tools that make the process as simple as possible, ideally requiring no +OpenShift knowledge. + +When OpenShift knowledge is required the technician performing the upgrade will +have the support of the team of engineers that planned and vetted the upgrade. + +### User Stories + +#### Pre-load and pin upgrade images + +As an engineer managing a cluster that has a low bandwidth and/or unreliable +connection to an image registry server I want to pin and pre-load all the +images required for the upgrade so that when I decide to actually perform the +upgrade there will be no need to contact that slow and/or unreliable registry +server. + +#### Pre-load and pin custom images + +As an engineer managing a cluster I want to be able to pin and pre-load custom +images required to upgrade my own applications. 
+
+#### Prepare an upgrade bundle
+
As an engineer managing a cluster I want to be able to assemble an upgrade
bundle that contains all the artifacts (container images and metadata) needed
to upgrade the OpenShift cluster and my own applications. I want to hand over
this upgrade bundle and the documentation explaining how to use it to the
technicians that will perform the upgrade, on suitable media, for example a
USB stick.
+
+#### Include custom images in the upgrade bundle
+
As an engineer managing a cluster I want to be able to include in the upgrade
bundle the images required to upgrade my own workloads.
+
+#### Explicitly allow vetted upgrade bundle
+
As an engineer managing a cluster I want to be able to explicitly approve the
use of an upgrade bundle, so that only the bundle that I tested and vetted will
be applied to the cluster.
+
+#### Upgrade a single-node or multi-node cluster using a bundle
+
As a technician with little or no OpenShift knowledge I want to be able to
upgrade a single-node or multi-node cluster using the upgrade bundle and its
documentation. I can't bring up any additional infrastructure at the cluster
site; in particular I can't bring up an image registry server, neither outside
of the cluster nor inside. I want to plug the USB stick provided by the
engineer into one of the nodes of the cluster and have the rest of the process
performed automatically.
+
+### Goals
+
Provide an automated and documented mechanism that engineers managing a cluster
can use to pin and pre-load images in order to upgrade a cluster without
requiring a registry server.
+
+### Non-Goals
+
It is not a goal to remove the need for a registry server for other operations.
For example, installing a new workload will still require a registry server.
+
+## Proposal
+
+### Workflow Description
+
For all kinds of clusters, connected or disconnected, with or without an
available registry server:
+
+1. The administrator of a cluster uses the OpenShift API to request an upgrade.
+
+1. The upgrade infrastructure of the cluster ensures that all the images
required for the upgrade are pinned and pre-loaded in all the nodes of the
cluster. These images will be un-pinned during the next upgrade.
+
In addition, for the cases where the cluster is completely disconnected or it
isn't possible to use a registry server:
+
+1. An engineer uses the `oc adm upgrade create bundle ...` tool described in
this enhancement to prepare the upgrade bundle containing all the artifacts
(container images and metadata) that are required to perform the upgrade plus
the images required for the custom workloads, and writes it to a USB stick (or
any other suitable media) that will be handed over to the technicians
responsible for performing the upgrades, together with documentation explaining
how to use it.
+
+1. The technicians receive copies of the USB stick and the corresponding
documentation.
+
+1. The technician goes to the cluster site and uses the upgrade bundle to
perform the upgrade. The documentation will ask the technician to plug the USB
stick into one of the nodes of the cluster and provide simple instructions to
verify that the upgrade has been applied correctly. This step is potentially
repeated multiple times by the same technician for multiple clusters using the
same USB stick or copies of it.
+
Note that the upgrade bundle should not be specific to a particular cluster,
only to the OpenShift architecture and version.
Technicians should be able to
use that bundle for any cluster with that architecture and version.
+
+### API Extensions
+
There are no new object kinds introduced by this enhancement, but new fields
will be added to the existing `ClusterVersion` and `ContainerRuntimeConfig`
objects.
+
The new fields for the `ClusterVersion` object are defined in detail in
https://github.com/openshift/api/pull/1548.
+
The new fields for the `ContainerRuntimeConfig` object are defined in detail in
https://github.com/openshift/machine-config-operator/pull/3839.
+
+### Implementation Details/Notes/Constraints
+
The proposed solution is based on pre-loading and pinning all required images
in all the nodes of the cluster before starting the upgrade, and ensuring that
no component requires access to a registry server during the upgrade. For this
to work the following changes are required:
+
+1. No OpenShift component used during the upgrade should use the `Always` pull
policy, as that forces the kubelet and CRI-O to try to contact the registry
server even if the image is already available.
+
+1. No OpenShift component should garbage collect the images required for the
upgrade. This is typically started by the kubelet instructing CRI-O to remove
images.
+
+1. No OpenShift component should explicitly try to contact the registry server
without a fallback alternative.
+
+1. The machine config operator needs to support image pinning, pre-loading and
reloading of the CRI-O service.
+
+1. The cluster version operator needs to orchestrate the upgrade process.
+
+1. The engineer that prepares the upgrade bundle needs an `oc adm upgrade
create bundle` tool to create it.
+
+1. The technician that performs the upgrade needs documentation explaining how
to use the upgrade bundle.
+
+#### Don't use the `Always` pull policy during the upgrade
+
Some OCP core components currently use the `Always` image pull policy during
the upgrade. As a result, the kubelet and CRI-O will try to contact the
registry server, even if the image is already available in the local storage of
the cluster. This blocks the upgrade.
+
The catalog operator uses the `Always` pull policy to pull catalog images. It
does so in order to refresh catalog images that are specified with a tag. But
it also does it when it pulls catalog images that are specified with a digest.
That should be changed to use the `IfNotPresent` pull policy for catalog images
that are specified by digest.
+
Most OCP core components have been changed in the past to avoid this use of the
`Always` pull policy. Recently the OVN pre-puller has also been changed (see
this [bug](https://issues.redhat.com/browse/OCPBUGS-13219) for details). To
prevent bugs like this from happening in the future and make the solution less
fragile we should have a test that gates the OpenShift release and that
verifies that the upgrade can be performed without a registry server. One way
to ensure this is to run an admission hook in CI that rejects or warns about
any spec that uses the `Always` pull policy.
+
It would also be useful to have another test that scans for use of this
`Always` pull policy in the source code.
+
We can control the image pull policy for the OCP payload, but not for
customer-specific images. The OCP upgrade may succeed, but the overall upgrade
process will still be seen as failed from the customer point of view.
For
example, in the OpenShift appliance use case both the cluster and the customer
workloads are installed using a temporary registry server that is shut down
after the installation is complete. If any of those workloads uses the `Always`
pull policy then the OCP upgrade would succeed, but the customer workloads will
not be able to start after the upgrade. To mitigate that risk, when a bundle is
used the upgrade mechanism will check whether there are pods using the `Always`
pull policy, and will emit a warning if there are any.
+
+#### Don't try to contact the image registry server explicitly
+
Some OpenShift components explicitly try to contact the registry server without
a fallback alternative. These need to be changed so that they don't do it or so
that they have a fallback mechanism when the registry server isn't available.
+
For example, in OpenShift 4.1.13 the machine config operator runs the
equivalent of `skopeo inspect` in order to decide what kind of upgrade is in
progress. That fails if there is no registry server, even if the release image
has already been pulled. That needs to be changed so that contacting the
registry server is not required. A possible way to do that is to use the
equivalent of `crictl inspect` instead.
+
+#### MCO should learn to pre-load and pin images
+
Starting with version 4.14 of OpenShift, CRI-O will have the capability to pin
certain images (see [this](https://github.com/cri-o/cri-o/pull/6862) pull
request for details). That capability will be used to pin all the images
required for the upgrade, so that they aren't garbage collected by kubelet and
CRI-O.
+
Note that pinning images means that kubelet and CRI-O will not remove them,
even if they aren't in use. It is very important to make sure that there is
enough available space for these images, as otherwise the performance of the
node may degrade and it may stop functioning correctly if it runs out of space.
The space should be enough to accommodate two releases (the currently running
one plus the candidate for installation) as well as workload images and some
buffer.
+
The changes to pin the images will be done in a
`/etc/crio/crio.conf.d/pin-upgrade.conf` file, something like this:
+
+```toml
+pinned_images=[
+  "quay.io/openshift-release-dev/ocp-release@sha256:...",
+  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+  ...
+]
+```
+
The images need to be pre-loaded and the CRI-O service needs to be reloaded
when this configuration changes. To support that a new field will be added to
the `ContainerRuntimeConfig` object:
+
+```yaml
+apiVersion: machineconfiguration.openshift.io/v1
+kind: ContainerRuntimeConfig
+metadata:
+  name: pin-upgrade
+spec:
+  containerRuntimeConfig:
+    pinnedImages:
+    - quay.io/openshift-release-dev/ocp-release@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    ...
+```
+
When the new `pinnedImages` field is added or changed the machine config
operator will need to pre-load those images (with the equivalent of `crictl
pull`), create or update the corresponding
`/etc/crio/crio.conf.d/pin-upgrade.conf` file and ask CRI-O to reload its
configuration (with the equivalent of `systemctl reload crio.service`).
+
If the `spec.containerRuntimeConfig.imagesDirectory` field is used then it
should contain a dump of a
[docker/distribution](https://github.com/distribution/distribution) registry
server.
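+
For illustration only, this is roughly what such a dump looks like on disk when
docker/distribution uses its default filesystem storage driver; the directory
name follows the `/var/lib/upgrade/4.13.7-x86_64` example used later in this
document, and the exact paths are indicative, not prescriptive:
+
+```bash
+# Indicative layout of a docker/distribution dump used as imagesDirectory
+# (filesystem storage driver); the repositories depend on the release.
+$ find /var/lib/upgrade/4.13.7-x86_64 -maxdepth 4 -type d
+/var/lib/upgrade/4.13.7-x86_64
+/var/lib/upgrade/4.13.7-x86_64/docker
+/var/lib/upgrade/4.13.7-x86_64/docker/registry
+/var/lib/upgrade/4.13.7-x86_64/docker/registry/v2
+/var/lib/upgrade/4.13.7-x86_64/docker/registry/v2/blobs
+/var/lib/upgrade/4.13.7-x86_64/docker/registry/v2/repositories
+```
+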
The machine config operator will first check if there is enough space
available in the node to copy the images to the `/var/lib/containers/storage`
directory. Images in the dump of a registry server are compressed, but in
`/var/lib/containers/storage` they are not, so the machine config operator will
check that there is at least twice the space used by the registry server dump.
If there is not enough space it will be reported as an error condition in the
`ClusterVersion` object.
+
If there is enough disk space then the machine config operator will start a
temporary embedded image registry server in each node of the cluster, listening
on a randomly selected local port and using a self-signed certificate, to
serve the images from the `/var/lib/upgrade/4.13.7-x86_64` directory. It will
then configure the node to trust the self-signed certificate and configure
CRI-O to use that temporary image registry server as a mirror for the pinned
images, creating a `/etc/containers/registries.conf.d/pin-upgrade.conf` file
with content similar to this:
+
+```toml
+[[registry]]
+prefix = "quay.io/openshift-release-dev/ocp-release"
+location = "quay.io/openshift-release-dev/ocp-release"
+
+[[registry.mirror]]
+location = "localhost:12345/openshift-release-dev/ocp-release"
+
+[[registry]]
+prefix = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
+location = "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
+
+[[registry.mirror]]
+location = "localhost:12345/openshift-release-dev/ocp-v4.0-art-dev"
+```
+
The machine config operator will copy the images to the
`/var/lib/containers/storage` directory. To do that it will use the gRPC API of
CRI-O to run the equivalent of `crictl pull` for each of the images. When that
is completed the machine config operator will update the new
`status.pinnedImages` field of the rendered machine config:
+
+```yaml
+status:
+  pinnedImages:
+  - quay.io/openshift-release-dev/ocp-release@sha256:...
+  - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+  - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+  ...
+```
+
We explored several alternatives for the format of the bundle, and found using
a dump of a registry server to be the best one.
+
A plain file copy is feasible only if the target directory is empty and nothing
is using it. That is why we are suggesting adding full support for the
`additionalimagestores` setting in `storage.conf`. If that were usable (via
`ContainerRuntimeConfig`) then we could create such an additional directory,
copy the files there and reload CRI-O. The size of the (uncompressed) bundle
for this would be approximately 32 GiB, and it would be there forever, at least
until the next upgrade, because those additional directories are read-only.
+
It is also possible to copy the images using the equivalent of `skopeo copy
containers-storage:... containers-storage:...`. That doesn't need the
permanent additional directory, and doesn't need to shut down or reload CRI-O,
but it does need those 32 GiB temporarily, while the copy is in progress.
+
Another possibility is to use the equivalent of `skopeo copy docker://...
containers-storage:...`. That is where the temporary registry comes into play.
The advantage is that the format used by the registry to store the images is
more efficient: it only needs 16 GiB for a release. That reduces the size of
the bundle and the space required in the node.
+
An improvement over that last possibility is to use the CRI-O gRPC API, the
equivalent of `crictl pull ...`.
It doesn't improve performance or reduce the
required size, but it means that there is one less component needed (no need
for skopeo) and it reduces the risk: CRI-O will be writing the images to the
disk itself, so there is no risk of format mismatch.
+
We aren't ruling out any of the above possibilities, and consider them
implementation details, but we think that the last one is the best option
overall.
+
+#### CVO will need to orchestrate the upgrade activities
+
To initiate the upgrade the administrator of the cluster changes the
`ClusterVersion` object like this:
+
+```yaml
+apiVersion: config.openshift.io/v1
+kind: ClusterVersion
+metadata:
+  name: version
+spec:
+  desiredUpdate:
+    version: 4.14.5
+```
+
The cluster version operator will first ask the machine config operator to pin
and pre-load the release image, creating a `ContainerRuntimeConfig` object
similar to this:
+
+```yaml
+apiVersion: machineconfiguration.openshift.io/v1
+kind: ContainerRuntimeConfig
+metadata:
+  name: pin-upgrade
+spec:
+  containerRuntimeConfig:
+    pinnedImages:
+    - quay.io/openshift-release-dev/ocp-release@sha256:...
+```
+
The machine config operator will react to that by pinning and pre-loading the
image in all the nodes of the cluster, as described in the previous section.
Once the image is pinned and pre-loaded the cluster version operator will
inspect it and find out the references to the payload images. It will then also
pin and pre-load those images, updating the `ContainerRuntimeConfig` object:
+
+```yaml
+apiVersion: machineconfiguration.openshift.io/v1
+kind: ContainerRuntimeConfig
+metadata:
+  name: pin-upgrade
+spec:
+  containerRuntimeConfig:
+    pinnedImages:
+    - quay.io/openshift-release-dev/ocp-release@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    ...
+```
+
Once those images are pinned and pre-loaded the upgrade will proceed as usual.
+
Note that this process will by default work using whatever registry servers are
configured in the cluster. When it isn't possible to use a registry server, the
administrator of the cluster will explicitly configure the cluster version
operator to use an upgrade bundle instead, setting the new
`spec.desiredUpdate.imageSource.sourceType` field to `Bundle` (the default will
be `Registry`):
+
+```yaml
+apiVersion: config.openshift.io/v1
+kind: ClusterVersion
+metadata:
+  name: version
+spec:
+  desiredUpdate:
+    imageSource:
+      sourceType: Bundle
+```
+
The technician that performs the upgrade will receive the USB stick containing
the upgrade bundle and will plug it into one of the nodes of the cluster.
+
The cluster version operator will be monitoring device events in all the nodes
of the cluster in order to detect when such a USB stick (or any other kind of
media containing an upgrade bundle) has been plugged in. To control that a new
`spec.desiredUpdate.imageSource.bundle.detectionMechanism` field will be added
to the `ClusterVersion` object. The default value will be `Manual` so that this
monitoring will be disabled by default. When set to `Automatic` the cluster
version operator will create a new `bundle-monitor` daemon set that will
perform the actual monitoring. When any of the pods in this daemon set detects
a device containing a valid upgrade bundle it will update the status of the
`ClusterVersion` object indicating that the bundle is available.
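+
A minimal sketch of the kind of check each `bundle-monitor` pod could perform
is shown below; the device names are hypothetical, and reading `metadata.json`
as the first entry of the tarball follows the bundle format described later in
this document:
+
+```bash
+# Hypothetical detection loop for a bundle-monitor pod (sketch only).
+for dev in /dev/sd[a-z]; do
+  [ -b "$dev" ] || continue
+  # The bundle is a tar file written directly to the device, so the metadata
+  # can be read from its first entry without extracting anything else.
+  if metadata=$(tar -xOf "$dev" metadata.json 2>/dev/null); then
+    version=$(echo "$metadata" | jq -r '.version')
+    echo "found upgrade bundle for version ${version} on ${dev}"
+    # Here the pod would update the ClusterVersion status accordingly.
+  fi
+done
+```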
+
+For example, if the administrator of the cluster wants to enable automatic
detection of upgrade bundles she will add this to the `ClusterVersion` object:
+
+```yaml
+apiVersion: config.openshift.io/v1
+kind: ClusterVersion
+metadata:
+  name: version
+spec:
+  desiredUpdate:
+    imageSource:
+      sourceType: Bundle
+      bundle:
+        detectionMechanism:
+          mechanismType: Automatic
+```
+
For situations where it isn't desirable to monitor device events, or it isn't
possible to plug in the USB stick or any other kind of media, we will also add
a new `spec.desiredUpdate.imageSource.bundle.file` field to explicitly indicate
the location of the bundle. When this is used the value will be directly copied
to `status.desired.bundle.file` without creating the `bundle-monitor` daemon
set.
+
If the administrator wants to disable automatic detection of the upgrade
bundle, and wants to manually copy the file to one of the cluster nodes, she
will do something like this:
+
+```yaml
+spec:
+  desiredUpdate:
+    imageSource:
+      sourceType: Bundle
+      bundle:
+        detectionMechanism: Manual
+        manual:
+          file: /root/upgrade-4.13.7-x86_64.tar
+```
+
If the administrator wants to disable automatic detection, but wants to use a
USB stick anyhow, he will do something like this:
+
+```yaml
+spec:
+  desiredUpdate:
+    imageSource:
+      sourceType: Bundle
+      bundle:
+        detectionMechanism: Manual
+        manual:
+          file: /dev/sdb
+```
+
When the `status.desired.bundle.file` field has been populated the cluster
version operator will start to replicate the bundle to the rest of the nodes of
the cluster. To do so it will first disable auto-scaling to ensure that no new
nodes are added to the cluster while this process is in progress. Then it will
start a new `bundle-server` daemon set. Each of the pods in this daemon set
will check if the bundle file exists and contains a valid bundle. If it does
then it will serve it via HTTP for the other nodes of the cluster. If it
doesn't exist then it will do nothing.
+
Simultaneously with the `bundle-server` daemon set the cluster version operator
will also start a new `bundle-extractor` batch job in each node of the cluster.
Each pod in these jobs will try to read the bundle file from the location
specified in `status.desired.bundle.file`. If that file doesn't exist it will
try to download it from the HTTP server of one of the pods of the
`bundle-server` daemon set; the first that responds with `HTTP 200 OK`. Once it
has either the file or the body of the HTTP response it will extract the
contents of the `metadata.json` file (which will always be the first entry of
the tarball) to check that there is space in `/var/lib/upgrade` for the size
indicated in the `size` field. If there is not enough space then it will be
reported as an error condition in the `ClusterVersion` object and the upgrade
process will be aborted.
+
If the `spec.desiredUpdate.imageSource.bundle.digest` field has been set then
the `bundle-extractor` will calculate the digest of the bundle, and if it
doesn't match it will report it as an error condition in the `ClusterVersion`
object and the upgrade process will be aborted.
+
If there is enough space the `bundle-extractor` will extract the contents of
the bundle to the `/var/lib/upgrade/4.13.7-x86_64` directory.
+
Once the digest has been verified and the bundle has been completely extracted
the `bundle-extractor` will update the status of the `ClusterVersion` object to
indicate that the bundle is extracted in that node.
This will be done via a new
set of conditions for each node, with types `Extracted` and `Loaded`. For
example, when `node0` and `node2` have completed the extraction of the bundle
but `node1` hasn't, the `ClusterVersion` status will look like this:
+
+```yaml
+status:
+  desired:
+    bundle:
+      file: /dev/sdb
+      nodes:
+        node0:
+          conditions:
+          - type: Extracted
+            status: True
+          - type: Loaded
+            status: False
+          metadata: |
+            {
+              "version": "4.13.7",
+              "arch": "x86_64",
+              "size": "16 GiB",
+              "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
+              "images": [
+                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+                ...
+              ]
+            }
+        node1:
+          conditions:
+          - type: Extracted
+            status: False
+          - type: Loaded
+            status: False
+        node2:
+          conditions:
+          - type: Extracted
+            status: True
+          - type: Loaded
+            status: False
+          metadata: |
+            {
+              "version": "4.13.7",
+              "arch": "x86_64",
+              "size": "16 GiB",
+              "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
+              "images": [
+                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+                "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+                ...
+              ]
+            }
+```
+
At that point the `bundle-extractor` job will finish.
+
The `Extracted` condition for each node will indicate if the bundle extraction
process has been completed.
+
The `metadata` field for each node will contain the information from the
`metadata.json` file of the bundle. Note that the list of images may be too
long (approximately 180 images) to store in the status of the `ClusterVersion`
object; it may be convenient to store them in separate configmaps, and have
only the references to those configmaps in the `ClusterVersion` status.
+
When all the nodes have the bundle extracted (the `Extracted` condition for all
nodes is `True`) the cluster version operator will delete the `bundle-server`
daemon set and verify that the metadata in all nodes (the content of the
`metadata` field) is the same. If there are differences they will be reported
as error conditions in the status of the `ClusterVersion` object and the
upgrade process will be aborted. This is intended to prevent accidents like
having two different USB sticks with different bundles plugged into two
different nodes.
+
Once the metadata has been validated the cluster version operator will ask the
machine config operator to pin and pre-load the images specified in the
metadata. To do so it will use the new `pinnedImages` and `imagesDirectory`
fields of the `ContainerRuntimeConfig` object:
+
+```yaml
+apiVersion: machineconfiguration.openshift.io/v1
+kind: ContainerRuntimeConfig
+metadata:
+  name: pin-upgrade
+spec:
+  containerRuntimeConfig:
+    pinnedImages:
+    - quay.io/openshift-release-dev/ocp-release@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...
+    imagesDirectory: /var/lib/upgrade/4.13.7-x86_64
+```
+
The machine config operator will react to that by pinning and pre-loading the
images in all the nodes of the cluster, as described in the previous section.
+
When all the images are pinned and pre-loaded the cluster version operator
will check the `spec.desiredUpdate.schedule.scheduleType` field of the
`ClusterVersion` object. This field will be used to indicate whether the
upgrade should start immediately or wait until the user explicitly sets the
field to `Immediate`. The default value will be `Immediate`.
This is
intended for situations where the user wishes to prepare everything for the
upgrade in advance, but wants to trigger the actual upgrade later.
+
When the value of the `spec.desiredUpdate.schedule.scheduleType` field is
`Immediate` the cluster version operator will trigger the regular upgrade
process.
+
When the upgrade has completed successfully the cluster version operator will
start a new `bundle-cleaner` job in each node that will clean all the
artifacts potentially left around by other pieces of the upgrade. In
particular it will remove the `/var/lib/upgrade/4.13.7-x86_64` directory
created by the `bundle-extractor`.
+
+#### Tool to create the upgrade bundle
+
The upgrade bundle will be created by an engineer using a new `oc adm upgrade
create bundle` command. This engineer will first determine the target version
number, for example 4.13.7. Note that doing this will probably require access
to the upgrade service available at api.openshift.com. Finding that upgrade
version number is outside of the scope of this enhancement.
+
The engineer will then need internet access and a Linux machine where she can
run `oc adm upgrade create bundle`, for example:
+
+```bash
+$ oc adm upgrade create bundle \
+--arch=x86_64 \
+--version=4.13.7 \
+--pull-secret=/my/pull/secret.txt \
+--output=/my/bundle/dir
+```
+
The `oc adm upgrade create bundle` command will find the list of image
references that make up the release, doing the equivalent of this:
+
+```bash
+$ oc adm release info \
+quay.io/openshift-release-dev/ocp-release:4.13.7-x86_64 -o json | \
+jq '.references.spec.tags[].from.name'
+```
+
In addition to the release images the tool will also support explicitly adding
custom images. For example:
+
+```bash
+$ oc adm upgrade create bundle \
+--arch=x86_64 \
+--version=4.13.7 \
+--pull-secret=/my/pull/secret.txt \
+--extra-image=quay.io/my-company/my-workload1 \
+--extra-image=quay.io/my-company/my-workload2 \
+...
+--extra-images-file=/my-company/my-workloads.txt
+...
+--output=/my/bundle/dir
+```
+
This is intended for situations where the user wants to use the same upgrade
mechanism for her own images.
+
The `--extra-image` option will be used to add a single image, and it can be
repeated multiple times.
+
The `--extra-images-file` option will be used to add a collection of images
specified in a text file, one image per line.
+
The command will then bring up a temporary image registry server, embedded into
the tool, listening on a randomly selected local port and using a self-signed
certificate. It will then start to copy the images found in the previous step
to the embedded registry server, using the equivalent of this for each image:
+
+```bash
+$ skopeo copy \
+--src-authfile=/my/pull/secret.txt \
+--dest-cert-dir=/my/certs \
+docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:... \
+docker://localhost:12345/openshift-release-dev/ocp-v4.0-art-dev@...
+```
+
When all the images have been copied to the temporary image registry server it
will be shut down.
+
The result will be a directory containing approximately 180 images and
requiring 16 GiB of space. The command will then create an
`upgrade-4.13.7-x86_64.tar` tar file containing that directory and a
`metadata.json` file.
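+
As an illustration, and assuming the file names and metadata layout described
in this section, the resulting bundle could be inspected like this:
+
+```bash
+# metadata.json is expected to be the first entry of the bundle tar file:
+$ tar -tf upgrade-4.13.7-x86_64.tar | head -1
+metadata.json
+
+# The metadata can be read without extracting the rest of the bundle:
+$ tar -xOf upgrade-4.13.7-x86_64.tar metadata.json | jq -r '.version, .arch'
+4.13.7
+x86_64
+```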
+
+The `metadata.json` file will contain additional information, in particular the
architecture, the version, the size and the list of images:
+
+```json
+{
+  "version": "4.13.7",
+  "arch": "x86_64",
+  "size": "16 GiB",
+  "release": "quay.io/openshift-release-dev/ocp-release@sha256:...",
+  "images": [
+    "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+    "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...",
+    ...
+  ]
+}
+```
+
The `metadata.json` file will always be the first entry of the tar file, to
simplify operations that need the metadata but not the rest of the contents of
the tar file.
+
The command will also write an `upgrade-4.13.7-x86_64.sha256` file containing a
digest of the complete tar file. This digest is intended to protect the
integrity of the bundle. It can be checked with tools like `sha256sum`. It will
also be optionally used by the administrator of the cluster to ensure that the
right bundle is used in the right cluster. To do so the administrator of the
cluster will write it to the `desiredUpdate.imageSource.bundle.digest` field of
the `ClusterVersion` object. The cluster version operator will reject the
bundle if it doesn't match this digest.
+
The engineer will write the tar file to some kind of media and hand it over to
the technicians, together with the documentation explaining how to use it.
+
+#### Documentation to use the bundle
+
This documentation shouldn't assume previous OpenShift knowledge, and should be
basic instructions to plug in the USB stick containing the bundle and then
check whether the upgrade succeeded or failed, using either the cluster console
or the `oc` tool.
+
+### Risks and Mitigations
+
The proposed solution will require space to store the release bundle and all
the release images in all the nodes of the cluster, approximately 48 GiB in the
worst case. To mitigate that risk the components that will consume disk space
will check in advance if the required space is available.
+
+### Drawbacks
+
This approach requires non-trivial changes to the cluster version operator and,
to a lesser degree, to the machine config operator.
+
+## Design Details
+
+### Open Questions
+
None.
+
+### Test Plan
+
We should have at least tests that verify that the upgrade can be performed in
a fully disconnected environment, both for a single-node cluster and a cluster
with multiple nodes. These tests should gate the OCP release.
+
It is desirable to have another test that scans the OCP components looking for
use of the `Always` pull policy. This should probably run for each pull request
of each OCP component, and prevent merging if it detects that the offending
pull policy is used. We should consider adding an admission check in CI for
this.
+
+### Graduation Criteria
+
The feature will ideally be introduced as `Dev Preview` in OpenShift 4.X,
moved to `Tech Preview` in 4.X+1 and declared `GA` in 4.X+2.
+
+#### Dev Preview -> Tech Preview
+
+- Ability to upgrade single-node clusters using the `imageSource.sourceType:
Registry` mode (no bundle, just pinning and pre-loading of the images).
+
+- Availability of the tests that verify the upgrade of single-node clusters.
+
+- Availability of the tests that verify that no OCP component uses the `Always`
pull policy.
+
+- Obtain positive feedback from at least one customer.
+
+#### Tech Preview -> GA
+
+- Ability to manually detect update bundles.
+
+- Ability to upgrade single-node and multi-node clusters using the
`imageSource.sourceType: Bundle` mode (with a bundle, and pinning and
pre-loading the images).
+
+- Ability to create bundles using the `oc adm upgrade create bundle` command.
+
+- Availability of the tests that verify the upgrade in single-node and
multi-node clusters.
+
+- User-facing documentation created in
[openshift-docs](https://github.com/openshift/openshift-docs).
+
+#### After GA
+
+- Ability to automatically detect update bundles.
+
+#### Removing a deprecated feature
+
Not applicable, no feature will be removed.
+
+### Upgrade / Downgrade Strategy
+
There are no additional considerations for upgrade or downgrade. The same
considerations that apply to the cluster version operator in general will also
apply in this case.
+
+### Version Skew Strategy
+
This feature will only be usable once the cluster version operator and the
machine config operator have been upgraded to support it. That upgrade will
have to be done by other means.
+
For subsequent upgrades we will ensure that the cluster version operator can
work with both the old and the new version of the machine config operator.
+
+### Operational Aspects of API Extensions
+
There are no new API object kinds; the only API changes are the new fields
added to the existing `ClusterVersion` and `ContainerRuntimeConfig` objects,
described in the API extensions section above.
+
+#### Failure Modes
+
+#### Support Procedures
+
+## Implementation History
+
There is an initial prototype exploring some of the implementation details
described here in this [repository](https://github.com/jhernand/upgrade-tool).
+
+## Alternatives
+
The alternative to this is to make a registry server available, either outside
or inside the cluster.
+
An external
[registry server](https://cloud.redhat.com/blog/introducing-mirror-registry-for-red-hat-openshift)
is the currently supported solution for upgrades in disconnected
environments. It doesn't require any change, but it is not feasible in most of
the target environments due to the resource limitations described in the
[motivation](#motivation) section of this document.
+
An internal registry server, running in the cluster itself, is a feasible
alternative for clusters with multiple nodes. The registry server supported by
Red Hat is Quay. The disadvantage of Quay is that it requires additional
resources that are often not available in the target environments.
+
For single-node clusters an internal registry server isn't an alternative
because it would need to be continuously available during and after the
upgrade, and that isn't possible if the registry server runs in the cluster
itself.
+
+## Infrastructure Needed
+
Infrastructure will be needed to run the tests described in the test plan above.