---
title: Upgrade without registry
authors:
- "@jhernand"
reviewers:
- "@avishayt"
- "@danielerez"
- "@mrunalp"
- "@nmagnezi"
- "@oourfali"
approvers:
- "@sdodson"
- "@zaneb"
creation-date: 2023-06-29
last-updated: 2023-06-29
tracking-link: []
see-also:
- https://issues.redhat.com/browse/OCPBUGS-13219
- https://github.com/openshift/cluster-network-operator/pull/1803
replaces: []
superseded-by: []
---

# Upgrade without registry

## Summary

Provide a documented and supported mechanism to upgrade a cluster without requiring an image
registry server.

## Motivation

All these stories are in the context of disconnected clusters with limited resources, both in the
cluster itself and in the surrounding environment:

- The cluster is not connected to the Internet.
- It isn't possible to bring up additional machines, even temporarily.
- The resources of the cluster are limited, in particular in terms of CPU and memory.

These clusters are usually installed at the customer site by a partner engineer collaborating with a
customer technician.

Eventually the cluster will need to be upgraded, and then the technician will need a supported and
documented procedure to do it.
### User stories

#### Prepare an upgrade package

As an engineer working in a partner's factory, I want to be able to assemble the upgrade package
that will be delivered to the technicians who will perform the upgrade on-site.

#### Upgrade a single-node cluster

As a technician, I want to be able to upgrade a single-node cluster using the upgrade package,
without requiring or deploying an image registry server, either in the cluster itself or
externally.

#### Upgrade a multi-node cluster

As a technician, I want to be able to upgrade a multi-node cluster using the upgrade package,
without requiring or deploying an image registry server, either on the cluster itself or
externally.

### Goals

Provide a documented and supported mechanism that technicians can use to upgrade a cluster without
requiring a registry server. The upgrade will include the OpenShift core components as well as
additional applications provided by the partner.

### Non-Goals

It is not a goal to remove the registry server requirement for other operations. For example,
installing a new workload will still require a registry server.

It is not a goal to provide a tool that automates the procedures to prepare the upgrade package or
to perform the upgrade.

### Workflow Description

1. An engineer working in a partner's factory is asked to prepare a package to upgrade a cluster to
   a specific OpenShift architecture and version.

1. The engineer uses the documentation and tools to prepare the upgrade package containing all the
   artifacts (mostly release images) that are required to perform the upgrade and writes it to some
   kind of media (a USB stick, for example) that will be delivered to the technicians.

1. The technicians receive copies of the upgrade package.

1. The technician goes to the cluster location and uses the upgrade package, documentation and
   tools to perform the upgrade. This step is potentially repeated multiple times for multiple
   clusters.

Note that the upgrade package should not be specific to a particular cluster, only to the
OpenShift architecture and version. Technicians should be able to use the same package for any
cluster with that architecture and version.

### API Extensions

None.

### Implementation Details/Notes/Constraints

The proposed solution is based on copying all the required images to all the nodes of the cluster
before starting the upgrade, and on ensuring that no component requires access to a registry
server during the upgrade. For this to work the following changes are required:

1. No OpenShift component used during the upgrade should use the `Always` pull policy, as that
   forces the kubelet and CRI-O to try to contact the registry server even if the image is already
   available.

1. No OpenShift component should garbage collect the upgrade images before or during the upgrade.

1. The machine config operator should tolerate the changes that will be made to the CRI-O
   configuration during the upgrade.

1. The engineer working in the partner's factory needs a supported and documented procedure to
   create the upgrade package.

1. The technician needs a documented and supported procedure to verify that the upgrade
   requirements are met (right version, enough space available, etc.) and then to actually perform
   the upgrade.

#### Don't use the `Always` pull policy during the upgrade

Some OCP core components currently use the `Always` image pull policy during the upgrade. As a
result, the kubelet and CRI-O will try to contact the registry server even if the image is already
available in the local storage of the cluster. This blocks the upgrade.

Most OCP core components have been changed in the past to avoid this. Recently the OVN pre-puller
has also been changed (see this [bug](https://issues.redhat.com/browse/OCPBUGS-13219) for details).
To prevent bugs like this from happening in the future and make the solution less fragile we should
have a test that gates the OpenShift release and that verifies that the upgrade can be performed
without a registry server.

It would also be useful to have another test that scans for use of this `Always` pull policy.

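As a rough sketch of what such a scan could do, the snippet below greps rendered manifests for the
offending policy. The directory and manifest content are fabricated stand-ins for this example; a
real gate would run against the manifests of the release payload.

```shell
# Stand-in manifest directory (a real scan would target the rendered
# manifests of the release payload).
mkdir -p /tmp/scan-demo
cat > /tmp/scan-demo/pre-puller.yaml <<'EOF'
containers:
- name: ovn-pre-puller
  imagePullPolicy: Always
EOF

# Report every container that declares the `Always` pull policy.
grep -rn 'imagePullPolicy: *Always' /tmp/scan-demo
```

A CI job could simply fail when this `grep` finds any match.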
#### Don't garbage collect images required for the upgrade

Starting with version 4.14 of OpenShift, CRI-O will have the capability to pin certain images (see
[this](https://github.com/cri-o/cri-o/pull/6862) pull request for details). That capability will be
used to temporarily pin all the images required for the upgrade, so that they aren't garbage
collected by the kubelet and CRI-O.

Note that pinning images means that the kubelet and CRI-O will not remove them, even if they aren't
in use. It is very important to make sure that there is enough available space for these images, as
otherwise the performance of the node may degrade and it may stop functioning correctly if it runs
out of space.

#### MCO should tolerate the changes required for the upgrade

In order to copy the images required for the upgrade to the nodes of the cluster we will create an
additional image store in the `/var/lib/upgrade` directory of each node of the cluster, and we
will pin all those images. This requires changes in the `/etc/containers/storage.conf` file,
something like this:

```toml
[storage.options]
additionalimagestores = [
  "/var/lib/upgrade"
]
```

That `/etc/containers/storage.conf` file is tracked by the machine config operator, and changing
it will trigger a reboot that will interfere with the upgrade process. We will need to change the
machine config operator so that it knows to ignore changes to this file during the upgrade.

The changes to pin the images will be done in a `/etc/crio/crio.conf.d/pin-upgrade.conf` file,
something like this:

```toml
[crio.image]
pinned_images = [
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:*"
]
```

We will also need to ensure that the machine config operator doesn't track this file during the
upgrade.

#### Procedure to prepare the upgrade package

The upgrade package will be prepared by an engineer in a partner's factory. This engineer will
first determine the upgrade version number, for example 4.12.10. Note that doing this will
probably require access to the upgrade service available in api.openshift.com. Finding that
upgrade version number is outside of the scope of this enhancement.

The engineer will then need internet access and a Linux machine where they can run `podman` to
download the release images:

```
# oc adm release info \
  quay.io/openshift-release-dev/ocp-release:4.12.10-x86_64 \
  -o json > release-info.json
# podman pull --root /images $(
  jq -r '.references.spec.tags[].from.name' release-info.json
)
# cd /images
# tar \
  --create \
  --xattrs \
  --xattrs-include="security.*" \
  --gzip \
  --file=../upgrade-4.12.10-x86_64.tar.gz \
  .
```

That will pull approximately 180 images and requires 32 GiB of space for the images and another
16 GiB for the tar file, so approximately 48 GiB in total.

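The image list that feeds the `podman pull` above comes from the `jq` filter shown in the block. As
a self-contained illustration, the sketch below applies the same filter to a tiny stand-in file
with the shape of the `oc adm release info -o json` output (a real release references roughly 180
images, not 2):

```shell
# Two-entry stand-in for the real release-info.json.
cat > /tmp/release-info.json <<'EOF'
{"references": {"spec": {"tags": [
  {"from": {"name": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aaa"}},
  {"from": {"name": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbb"}}
]}}}
EOF

# One image reference per line, ready to feed to `podman pull`.
jq -r '.references.spec.tags[].from.name' /tmp/release-info.json
```

Counting the lines printed by the filter is a quick way to confirm that the release metadata was
downloaded completely before starting the (long) pull.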
The engineer will write this tar file to some kind of media and hand it over to the technicians.

#### Procedure to perform the upgrade

The technician will receive the media containing the `upgrade-4.12.10-x86_64.tar.gz` file.

The technician will verify that each node of the cluster has at least 32 GiB of space available in
the `/var` filesystem, or 48 GiB if the tar file needs to be copied to the node in advance (if it
isn't possible to plug in and mount a USB stick, for example). This is crucial because if the node
runs out of space performance will degrade and the node may stop functioning correctly.

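This check is easy to script. A minimal sketch, assuming the 48 GiB worst case described in this
document and GNU `df` (both the threshold and the checked path are illustrative):

```shell
required_gib=48
# Available space, in GiB, on the filesystem that holds /var.
avail_gib=$(df --output=avail -BG /var | tail -n 1 | tr -dc '0-9')
if [ "$avail_gib" -lt "$required_gib" ]; then
  echo "Not enough space: ${avail_gib} GiB available, ${required_gib} GiB required"
else
  echo "Space check passed: ${avail_gib} GiB available"
fi
```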
The technician will connect to the nodes of the cluster via SSH, create a `/var/lib/upgrade`
directory in each node, and extract the upgrade package there:

```
# ssh core@node
# sudo -i
# mkdir /var/lib/upgrade
# cd /var/lib/upgrade
# tar \
  --extract \
  --xattrs \
  --xattrs-include="security.*" \
  --gzip \
  --file=/media/upgrade-4.12.10-x86_64.tar.gz
```

Note that it is very important to preserve capabilities and SELinux contexts in these files, so
this needs to be done as root.

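To sanity check that the `tar` flags used here really preserve the archive layout, one can
round-trip a small tree with the same flags. Everything below is a disposable stand-in, not the
real image store:

```shell
# Archive and re-extract a small tree with the same flags as the
# procedure above, then verify that nothing was lost.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/images"
echo '{"images": []}' > "$workdir/src/images/images.json"
tar --create --xattrs --xattrs-include="security.*" --gzip \
  --file="$workdir/pkg.tar.gz" -C "$workdir/src" .
mkdir -p "$workdir/dst"
tar --extract --xattrs --xattrs-include="security.*" --gzip \
  --file="$workdir/pkg.tar.gz" -C "$workdir/dst"
# No output from diff means the trees are identical.
diff -r "$workdir/src" "$workdir/dst"
```

Note that verifying the `security.*` xattrs themselves requires running as root, as in the real
procedure.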
The technician will change the configuration of CRI-O to use `/var/lib/upgrade` as an additional
image store and to pin the release images. Inside `/etc/containers/storage.conf`:

```toml
[storage.options]
additionalimagestores = [
  "/var/lib/upgrade/4.12.10/images"
]
```

Inside `/etc/crio/crio.conf.d/pin-upgrade.conf`:

```toml
[crio.image]
pinned_images = [
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:*"
]
```

The technician will then trigger the upgrade, adding the following to the cluster version object:

```
# oc edit clusterversion
```

```yaml
desiredUpdate:
  image: quay.io/openshift-release-dev/ocp-release@sha256:0398462b1c54758fe83619022a98bbe65f6deed71663b6665224d3ba36e43f03
  version: 4.12.10
```

### Risks and Mitigations

The proposed solution will require space to store all the release images in all the nodes of the
cluster, approximately 48 GiB in the worst case. It will be necessary to calculate those space
requirements in advance and to ensure (via documentation or tooling) that the upgrade isn't
attempted if the requirements aren't met.

### Drawbacks

The procedure to perform the upgrade is complicated and needs to be repeated in all the nodes of
the cluster. Should we provide a tool that automates it?

## Design Details

### Open Questions

None.

### Test Plan

We should have at least tests that verify that the upgrade can be performed in a fully
disconnected environment, both for a single-node cluster and for a cluster with multiple nodes.
These tests should gate the OCP release.

It is desirable to have another test that scans the OCP components looking for use of the `Always`
pull policy. This should probably run for each pull request of each OCP component, and prevent
merging if it detects that the offending pull policy is used.

## Implementation History

None.

## Alternatives

The alternative to this is to make a registry server available, either outside or inside the
cluster.

An external registry server is a well-known solution, even for disconnected environments, but it
is not feasible in most of the target environments.

An internal registry server, running in the cluster itself, is a feasible alternative for clusters
with multiple nodes. The registry server supported by Red Hat is Quay. The disadvantage of Quay is
that it requires additional resources that are often not available in the target environments.

For single-node clusters an internal registry server isn't an alternative because it would need to
be continuously available during and after the upgrade, and that isn't possible if the registry
server runs in the cluster itself.

## Infrastructure Needed

Infrastructure will be needed to run the tests described in the test plan above.