---
title: Upgrade without registry
authors:
- "@jhernand"
reviewers:
- "@avishayt"
- "@danielerez"
- "@mrunalp"
- "@nmagnezi"
- "@oourfali"
approvers:
- "@sdodson"
- "@zaneb"
creation-date: 2023-06-29
last-updated: 2023-06-29
tracking-link: []
see-also:
- https://issues.redhat.com/browse/OCPBUGS-13219
- https://github.com/openshift/cluster-network-operator/pull/1803
replaces: []
superseded-by: []
---

# Upgrade without registry

## Summary

Provide a documented and supported mechanism to upgrade a cluster without requiring an image
registry server.

## Motivation

All these stories are in the context of disconnected clusters with limited resources, both in the
cluster itself and in the surrounding environment:

- The cluster is not connected to the Internet.
- It isn't possible to bring up additional machines, even temporarily.
- The resources of the cluster are limited, in particular in terms of CPU and memory.

These clusters are usually installed at the customer site by a partner engineer collaborating with a
customer technician.

Eventually the cluster will need to be upgraded, and then the technician will need a supported and
documented procedure to do it.
### User stories

#### Prepare an upgrade package

As an engineer working in a partner's factory, I want to be able to assemble the upgrade package
that will be delivered to the technicians who will perform the upgrade on-site.

#### Upgrade a single-node cluster

As a technician, I want to be able to upgrade a single-node cluster using the upgrade package,
without requiring or deploying an image registry server, either in the cluster itself or
externally.

#### Upgrade a multi-node cluster

As a technician, I want to be able to upgrade a multi-node cluster using the upgrade package,
without requiring or deploying an image registry server, either on the cluster itself or
externally.

### Goals

Provide a documented and supported mechanism that technicians can use to upgrade a cluster without
requiring a registry server. The upgrade will include the OpenShift core components as well as
additional applications provided by the partner.

### Non-Goals

It is not a goal to remove the registry server requirement for other operations. For example,
installing a new workload will still require a registry server.

It is not a goal to provide a tool that automates the procedures to prepare the upgrade package or
to perform the upgrade.

### Workflow Description

1. An engineer working in a partner's factory is asked to prepare a package to upgrade a cluster to
   a specific OpenShift architecture and version.

1. The engineer uses the documentation and tools to prepare the upgrade package containing all the
   artifacts (mostly release images) that are required to perform the upgrade and writes it to some
   kind of media (a USB stick, for example) that will be delivered to the technicians.

1. The technicians receive copies of the upgrade package.

1. The technician goes to the cluster location and uses the upgrade package, documentation and
   tools to perform the upgrade. This step is potentially repeated multiple times for multiple
   clusters.

Note that the upgrade package should not be specific to a particular cluster, only to the
OpenShift architecture and version. Technicians should be able to use the same package for any
cluster with that architecture and version.

### API Extensions

None.

### Implementation Details/Notes/Constraints

The proposed solution is based on copying all the required images to all the nodes of the cluster
before starting the upgrade, and on ensuring that no component requires access to a registry
server during the upgrade. For this to work the following changes are required:

1. No OpenShift component used during the upgrade should use the `Always` pull policy, as that
   forces the kubelet and CRI-O to try to contact the registry server even if the image is already
   available.

1. No OpenShift component should garbage collect the upgrade images before or during the upgrade.

1. The machine config operator should tolerate the changes that will be made to the CRI-O
   configuration during the upgrade.

1. The engineer working in the partner's factory needs a supported and documented procedure to
   create the upgrade package.

1. The technician needs a documented and supported procedure to verify that the upgrade
   requirements are met (right version, enough space available, etc.) and then to actually perform
   the upgrade.

#### Don't use the `Always` pull policy during the upgrade

Some OCP core components currently use the `Always` image pull policy during the upgrade. As a
result, the kubelet and CRI-O will try to contact the registry server even if the image is already
available in the local storage of the cluster. This blocks the upgrade.

Most OCP core components have been changed in the past to avoid this. Recently the OVN pre-puller
has also been changed (see this [bug](https://issues.redhat.com/browse/OCPBUGS-13219) for details).
To prevent bugs like this from happening in the future and make the solution less fragile we should
have a test that gates the OpenShift release and that verifies that the upgrade can be performed
without a registry server.

It would also be useful to have another test that scans for use of this `Always` pull policy.

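As a rough sketch of what such a scan could do, the snippet below greps rendered manifests for the
offending policy. The directory and manifest content are fabricated stand-ins for this example; a
real gate would run against the manifests of the release payload.

```shell
# Stand-in manifest directory (a real scan would target the rendered
# manifests of the release payload).
mkdir -p /tmp/scan-demo
cat > /tmp/scan-demo/pre-puller.yaml <<'EOF'
containers:
- name: ovn-pre-puller
  imagePullPolicy: Always
EOF

# Report every container that declares the `Always` pull policy.
grep -rn 'imagePullPolicy: *Always' /tmp/scan-demo
```

A CI job could simply fail when this `grep` finds any match.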
#### Don't garbage collect images required for the upgrade

Starting with version 4.14 of OpenShift, CRI-O will have the capability to pin certain images (see
[this](https://github.com/cri-o/cri-o/pull/6862) pull request for details). That capability will be
used to temporarily pin all the images required for the upgrade, so that they aren't garbage
collected by the kubelet and CRI-O.

Note that pinning images means that the kubelet and CRI-O will not remove them, even if they aren't
in use. It is very important to make sure that there is enough available space for these images, as
otherwise the performance of the node may degrade and it may stop functioning correctly if it runs
out of space.

#### MCO should tolerate the changes required for the upgrade

In order to copy the images required for the upgrade to the nodes of the cluster we will create an
additional image store in the `/var/lib/upgrade` directory of each node of the cluster, and we
will pin all those images. This requires changes in the `/etc/containers/storage.conf` file,
something like this:

```toml
[storage.options]
additionalimagestores = [
  "/var/lib/upgrade"
]
```

That `/etc/containers/storage.conf` file is tracked by the machine config operator, and changing
it will trigger a reboot that will interfere with the upgrade process. We will need to change the
machine config operator so that it knows to ignore changes to this file during the upgrade.

The changes to pin the images will be done in a `/etc/crio/crio.conf.d/pin-upgrade.conf` file,
something like this:

```toml
[crio.image]
pinned_images = [
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:*"
]
```

We will also need to ensure that the machine config operator doesn't track this file during the
upgrade.

#### Procedure to prepare the upgrade package

The upgrade package will be prepared by an engineer in a partner's factory. This engineer will
first determine the upgrade version number, for example 4.12.10. Note that doing this will
probably require access to the upgrade service available in api.openshift.com. Finding that
upgrade version number is outside of the scope of this enhancement.

The engineer will then need internet access and a Linux machine where they can run `podman` to
download the release images:

```
# oc adm release info \
  quay.io/openshift-release-dev/ocp-release:4.12.10-x86_64 \
  -o json > release-info.json
# podman pull --root /images $(
  jq -r '.references.spec.tags[].from.name' release-info.json
)
# cd /images
# tar \
  --create \
  --xattrs \
  --xattrs-include="security.*" \
  --gzip \
  --file=../upgrade-4.12.10-x86_64.tar.gz \
  .
```

That will pull approximately 180 images and requires 32 GiB of space for the images and another
16 GiB for the tar file, so approximately 48 GiB in total.

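The image list that feeds the `podman pull` above comes from the `jq` filter shown in the block. As
a self-contained illustration, the sketch below applies the same filter to a tiny stand-in file
with the shape of the `oc adm release info -o json` output (a real release references roughly 180
images, not 2):

```shell
# Two-entry stand-in for the real release-info.json.
cat > /tmp/release-info.json <<'EOF'
{"references": {"spec": {"tags": [
  {"from": {"name": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aaa"}},
  {"from": {"name": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbb"}}
]}}}
EOF

# One image reference per line, ready to feed to `podman pull`.
jq -r '.references.spec.tags[].from.name' /tmp/release-info.json
```

Counting the lines printed by the filter is a quick way to confirm that the release metadata was
downloaded completely before starting the (long) pull.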
The engineer will write this tar file to some kind of media and hand it over to the technicians.

#### Procedure to perform the upgrade

The technician will receive the media containing the `upgrade-4.12.10-x86_64.tar.gz` file.

The technician will verify that each node of the cluster has at least 32 GiB of space available in
the `/var` filesystem, or 48 GiB if the tar file needs to be copied to the node in advance (if it
isn't possible to plug in and mount a USB stick, for example). This is crucial because if the node
runs out of space performance will degrade and the node may stop functioning correctly.

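This check is easy to script. A minimal sketch, assuming the 48 GiB worst case described in this
document and GNU `df` (both the threshold and the checked path are illustrative):

```shell
required_gib=48
# Available space, in GiB, on the filesystem that holds /var.
avail_gib=$(df --output=avail -BG /var | tail -n 1 | tr -dc '0-9')
if [ "$avail_gib" -lt "$required_gib" ]; then
  echo "Not enough space: ${avail_gib} GiB available, ${required_gib} GiB required"
else
  echo "Space check passed: ${avail_gib} GiB available"
fi
```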
The technician will connect to the nodes of the cluster via SSH, create a `/var/lib/upgrade`
directory in each node, and extract the upgrade package there:

```
# ssh core@node
# sudo -i
# mkdir /var/lib/upgrade
# cd /var/lib/upgrade
# tar \
  --extract \
  --xattrs \
  --xattrs-include="security.*" \
  --gzip \
  --file=/media/upgrade-4.12.10-x86_64.tar.gz
```

Note that it is very important to preserve capabilities and SELinux contexts in these files, so
this needs to be done as root.

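To sanity check that the `tar` flags used here really preserve the archive layout, one can
round-trip a small tree with the same flags. Everything below is a disposable stand-in, not the
real image store:

```shell
# Archive and re-extract a small tree with the same flags as the
# procedure above, then verify that nothing was lost.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/images"
echo '{"images": []}' > "$workdir/src/images/images.json"
tar --create --xattrs --xattrs-include="security.*" --gzip \
  --file="$workdir/pkg.tar.gz" -C "$workdir/src" .
mkdir -p "$workdir/dst"
tar --extract --xattrs --xattrs-include="security.*" --gzip \
  --file="$workdir/pkg.tar.gz" -C "$workdir/dst"
# No output from diff means the trees are identical.
diff -r "$workdir/src" "$workdir/dst"
```

Note that verifying the `security.*` xattrs themselves requires running as root, as in the real
procedure.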
The technician will change the configuration of CRI-O to use `/var/lib/upgrade` as an additional
image store and to pin the release images. Inside `/etc/containers/storage.conf`:

```toml
[storage.options]
additionalimagestores = [
  "/var/lib/upgrade/4.12.10/images"
]
```

Inside `/etc/crio/crio.conf.d/pin-upgrade.conf`:

```toml
[crio.image]
pinned_images = [
  "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:*"
]
```

The technician will then trigger the upgrade, adding the following to the cluster version object:

```
# oc edit clusterversion
```

```yaml
desiredUpdate:
  image: quay.io/openshift-release-dev/ocp-release@sha256:0398462b1c54758fe83619022a98bbe65f6deed71663b6665224d3ba36e43f03
  version: 4.12.10
```

### Risks and Mitigations

The proposed solution will require space to store all the release images in all the nodes of the
cluster, approximately 48 GiB in the worst case. It will be necessary to calculate those space
requirements in advance and to ensure (via documentation or tooling) that the upgrade isn't
attempted if the requirements aren't met.

### Drawbacks

The procedure to perform the upgrade is complicated and needs to be repeated in all the nodes of
the cluster. Should we provide a tool that automates it?

## Design Details

### Open Questions

None.

### Test Plan

We should have at least tests that verify that the upgrade can be performed in a fully
disconnected environment, both for a single-node cluster and for a cluster with multiple nodes.
These tests should gate the OCP release.

It is desirable to have another test that scans the OCP components looking for use of the `Always`
pull policy. This should probably run for each pull request of each OCP component, and prevent
merging if it detects that the offending pull policy is used.

## Implementation History

None.

## Alternatives

The alternative to this is to make a registry server available, either outside or inside the
cluster.

An external registry server is a well-known solution, even for disconnected environments, but it
is not feasible in most of the target environments.

An internal registry server, running in the cluster itself, is a feasible alternative for clusters
with multiple nodes. The registry server supported by Red Hat is Quay. The disadvantage of Quay is
that it requires additional resources that are often not available in the target environments.

For single-node clusters an internal registry server isn't an alternative because it would need to
be continuously available during and after the upgrade, and that isn't possible if the registry
server runs in the cluster itself.

## Infrastructure Needed

Infrastructure will be needed to run the tests described in the test plan above.