Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootimages: Downloading and updating bootimages via release image #201

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
168 changes: 168 additions & 0 deletions enhancements/bootimages.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,168 @@
---
title: bootimage-updates
authors:
- "@cgwalters"
reviewers:
- "@coreos-team"
approvers:
- "@coreos-team"
creation-date: 2020-02-04
last-updated: 2020-02-024
status: provisional
---

# Updating bootimages

This proposes a path towards integrating the bootimage into the release image and become managed by the cluster. This aids worker scaleup speed, helps in mirroring OpenShift into disconnected environments, and allows the CoreOS team to avoid supporting in-place updates from OpenShift 4.1 into the forseeable future.

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

OpenShift will support clusters managing their own bootimages (in cloud environments), and come with tooling and documentation to aid operators on e.g. bare metal PXE to keep bootimages up date.

By default in an environment managed by the [machine-api-operator](https://github.com/openshift/machine-api-operator), when a cluster is upgraded (e.g `oc adm upgrade`), the machinesets will be updated to use newer bootimages with the latest `machine-os-content`.

## Motivation

Most discussion of this problem originated in [this issue](https://github.com/openshift/os/issues/381). For more background, see [OSUpgrades](https://github.com/openshift/machine-config-operator/blob/master/docs/OSUpgrades.md) which documents some of the process.

Since the creation of OpenShift 4.1 and continuing until the time of this writing, there is no automated mechanism to update "bootimages" past a cluster installation. We have a mechanism to do [in place updates](https://github.com/openshift/machine-config-operator/blob/09fe53e2e47bc6f8129376dfe389e98fc151ff48/docs/OSUpgrades.md) which has worked quite well, but there is a need to do more.

### Goal 1: Scaling up workers directly into upgraded OS

In an IaaS cloud with [cluster autoscaling](https://docs.openshift.com/container-platform/4.3/machine_management/applying-autoscaling.html) enabled, every worker that comes online will need to pull the latest `machine-os-content` and reboot before user workloads can land on the node. This adds 3-4 minutes (at least) of latency to scaleup, and that time can matter significantly in "burst" scale scenarios, serverless usage, etc.

### Goal 2: Avoid accumulation of technical debt across OS updates

The CoreOS team must today support clusters upgraded in place (e.g. `oc adm upgrade`) from OpenShift 4.1 until the forseeable future. We would like the ability to make potentially breaking changes, relying on the ability for the cluster to re-provision both the control plane and workers in-place from updated bootimages.
cgwalters marked this conversation as resolved.
Show resolved Hide resolved

One example is [bootloader updates](https://github.com/coreos/fedora-coreos-tracker/issues/510).

Another example is all of the things that happen at "firstboot" time around OS upgrades; for example we fixed a bug in rpm-ostree that made kernel argument ordering stable, and that is important in some scenarios.

Another big case here is around the container stack - if we want to take advantage of e.g. a new podman feature for mirroring images that inclues the first OS update.

### Goal 3: Better integrate bootimage management into disconnected install paths

OpenShift 4 comes as a "release image" which is a *container* image - bootimages do not naturally fit into that, and currently the installer has some ad-hoc support for dealing with bootimages.

A goal here is to make bootimage management more of a first class operation, something like:

```
$ oc adm release generate-bootimage quay.io/openshift/release:4.3 vmware
```

Rather than having an administrator [manually download a bootimage](https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/) corresponding to the release version, this command would output a bootimage for the chosen platform/media.

Similarly:

```
$ oc adm release generate-bootimage quay.io/openshift/release:4.3 aws
```

would output data similar to the [aws rhcos.json data](https://github.com/openshift/installer/blob/2055609f95b19322ee6cfdd0bea73399297c4a3e/data/data/rhcos.json#L2) so that AWS UPI installations could use it programatically.

Other output proposals

- `metal`, `openstack`, `qemu`: What's available today for 4.3
- `aws-snapshot` to handle [private AWS regions](https://github.com/openshift/os/blob/411d1f5943ea23f2de7e4f7a04ab0bb185fd9586/FAQ.md#q-how-do-i-get-rhcos-in-a-private-ec2-region).
- `baremetal-iso` generates a "Live" ISO as is used in Fedora CoreOS and points people at how to do [Ignition injection](https://github.com/coreos/coreos-installer/blob/3652e6a767bad593b1b898f239e41bc11a83ab8f/src/iso.rs#L28)
- `baremetal-pxe` outputs the split kernel/initramfs suitable for PXE

### Non-Goals

#### Replacing the default in-place update path

In-place updates as [managed by the MCO](https://github.com/openshift/machine-config-operator/blob/master/docs/OSUpgrades.md) today works fairly seamlessly. We can't require that everyone fully reprovision a machine in order to do in-place updates - that makes updates *much* more expensive. It implies re-downloading all container images, etc.

#### Exposing a general purpose "OS build" tool

This proposal includes shipping a subset of https://github.com/coreos/coreos-assembler/ - but that's not the same as supporting custom "CoreOS-style" builds for non-OpenShift use cases.

## Proposal phase 1: bootimage.json in release image

First, the RHCOS build process is changed to inject the current coreos-assembler `meta.json` output for the build into `machine-os-content`. This aims to move the "source of truth" for cluster bootimages into the release image. Nothing would use this to start - however, as soon as we switch to a machine API provisioned control plane, having that process consume this data would be a natural next step.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you have some details about what this means?

I've read through what I can find and I don't understand how machine-os-content is produced, there's no source location in the release image.

As a user, how would I consume this? How would I find out which is the image URL for say the OpenStack image? Does the meta.json become an annotation on the image or something?

        {
          "name": "machine-os-content",
          "annotations": {
            "io.openshift.build.commit.id": "",
            "io.openshift.build.commit.ref": "",
            "io.openshift.build.source-location": "",
            "io.openshift.build.version-display-names": "machine-os=Red Hat Enterprise Linux CoreOS",
            "io.openshift.build.versions": "machine-os=45.81.202003131108-0"
          }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of this exists yet - it's a proposal.

But this is the same question as #201 (comment) right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how machine-os-content is produced

Via https://github.com/coreos/coreos-assembler and https://gitlab.cee.redhat.com/coreos/redhat-coreos/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it thank you - I was looking for details about how this should be implemented to see if we could already do part of it.

The problem for us is that baremetal (and openstack) IPI both need to know where the RHCOS images live. Today we do awful hacks with openstack-install version and GitHub to fetch the data that don't work in CI. There, openshift-install version reports a sha that doesn't exist due to the rebasing strategy CI uses. We ended up with a temporary workaround and just copy rhcos.json into the installer container.

I'd be interested to see if we could get a better approach to make the data accessible to end users (this --info flag @abhinavdahiya mentioned)


In fact, we could aim to switch to having workers use the `bootimage.json` from the release payload immediately after it lands. A downside is this would open up potential for drift between the bootstrap+controlplane and workers.

## Proposal phase 2: oc adm release generate-bootimage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include details on a option that prints the contents of bootimage.json like mentioned here https://github.com/openshift/enhancements/pull/201/files#diff-9b570624a774881c701bbfbc18b60109R64

a way to get the prebuilt bootimages if the user sees fit as an alternative to generating the whole thing...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets into a fundamental question of whether we do ship pre-built bootimages going forward in general. Clearly in "phase 1" we continue to do so as we can't break the way openshift-install works today.

The thing is though in any disconnected environment it'd require another step for admins to mirror on an ongoing basis.

On one hand though, we absolutely are going to need to make it ergonomic at some point to "boot a single RHCOS node" for testing. There are many valid reasons for this, among them hardware validation before trying a cluster install, experimenting with configuration "live" that one then compiled into Ignition/MachineConfig, etc. And until such time as we have openshift-install run-one-coreos-instance or something...people are going to want to download the bootimages directly and boot them too.

So...I'm conflicted.

Maybe what happens in the end is we do do both - but the cadence of "golden bootimages" would still be once per release (for the bootstrap). There's no point to us uploading e.g. new AMIs if in practice clusters are going to use in-place update mechanisms anyways.

IOW...I think people doing UPI installs would turn to automating oc adm release generate-bootimage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For phase 1, i think it's necessary for migration.

we could add a --info flag with it defaulting to true to begin with. and when we have actual generation capability we move to --info false so users are generating the images.

i think --info will continue to provide value in future, because it will allow the customers to get information like

what's the rhcos version, maybe what's included, if there are any publicly available images they are try let's use them. and as long as the output is versioned users have a way to differentiate and automate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'm thinking about this again. I guess one thing here is what you most strongly objected to was having openshift-install be the frontend for this information (clearly we want to do all the other stuff here though too).

But clearly a "MVP" for this is basically what you're saying with --info. The question is how to get there from the today where RHCOS is pinned in the installer. I really don't want two sources of this information (it's already bad enough docs are updated manually). So...what if we still added a hidden/secret mechanism for oc adm release new to download openshift-install and extract this data from it somehow?

The simplest would perhaps be a hidden version of openshift-install hidden-only-for-oc-output-coreos-build or so.

Or, we could stick it in a special ELF section (would require oc adm to vendor some ELF tools).

Or, we could do a variant of openshift/installer#1422 and basically scrape the binary; the lowest tech approach...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include an object into the release image like we include the machine-os-content.

This object can then be used to provide the information as part of oc adm release info --boot-images

trying to get this data from installer is not correct imo. the installer's embedded data is mostly for bootstrap-host and it currently being used for cluster hosts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are all agreed on the long term goal of getting the data out of the installer. The problem is in the short term, having two sources of truth for this data is going to be very problematic.

Now...hmm, we could get the data into a container in the release payload and change oc adm release new to patch the openshift-install binary, the same way it does for the release image itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of doing this is that anyone building openshift-install from git...well, it just won't work. I guess we could change the installer to fetch "the latest" from a URL if built from git, basically the same way we do with the release image.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For metal IPI CI, we stuff the data into the baremetal-installer container because we have to have it. See: openshift/installer#3330

The way users get this data to use metal IPI is really painful (openshift-install version, and fetching the rhcos.json from GitHub for that sha!). Basically every on-premise platform needs access to the image locations...

I dislike client-side patching of the installer binary. Storing it somehow next to machine-os-content makes sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike client-side patching of the installer binary.

I'm not talking about any client side patching. This is something that would happen in the OpenShift build system (release controller).

Storing it somehow next to machine-os-content makes sense to me.

Yes but the problem with that is that it's the bootstrap node that pulls the release image, so that leaves open the problem of which image to use for the bootstrap.

Now I guess a first transitional step could be changing the installer to use the metadata from the release payload for the cluster, and leave the pinned version only for the bootstrap node.

Copy link
Member

@stbenjam stbenjam May 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike client-side patching of the installer binary.
I'm not talking about any client side patching. This is something that would happen in the OpenShift build system (release controller).

oc adm release extract patches the binary. Sure, the binaries published on mirror.openshift.com already had that run to get them, but users can do that themselves too client side.

It's the default for baremetal since we don't publish ours.

Storing it somehow next to machine-os-content makes sense to me.
Yes but the problem with that is that it's the bootstrap node that pulls the release image, so that leaves open the problem of which image to use for the bootstrap.

A user can just use oc adm release info? That's at least a consistent place for any release-related information.


The implementation of this is basically shipping a subset of [coreos-assembler](https://github.com/coreos/coreos-assembler) as part of the OpenShift release payload, and teaching `oc` how to invoke `podman` to run it.

The `generate-bootimage` implementation would download the `machine-os-bootimage-generator` container image along with the existing `machine-os-content` container image (OSTree repository), and effectively run the [create_disk.sh](https://github.com/coreos/coreos-assembler/blob/30fbac4e176c7936362efbd647c8199d927e593c/src/create_disk.sh) process or [buildextend-installer](https://github.com/coreos/coreos-assembler/blob/30fbac4e176c7936362efbd647c8199d927e593c/src/cmd-buildextend-installer) for live media, etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this at all? If we're already pulling all the binaries to build the image, why not just publish an image that contains a ISO or whatever format?

Copy link
Member Author

@cgwalters cgwalters Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think the basic answer is that if we e.g. ship a container image wrapping each pre-built bootimage, that's at least 700MB per image and there are like 7-8 or more images. That's...a lot. So things like oc adm release mirror would have to learn how to filter them. It'd make the OpenShift release process to push them much slower.

It just seems more tenable to me to make this dynamic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So things like oc adm release mirror would have to learn how to filter them

That seems like a better approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit more on why? Just in terms of avoiding needing a new nontrivial container image? Needing to ship code that was formerly "behind the firewall" to run on premise? Something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the bit level, we'll know the image they are booting is the one we built, we don't have to worry about a bug in the builder producing something incorrect in their environment. It's also one less moving piece we need to worry about in-cluster. My preference is to add nothing to the cluster that doesn't need to 100% need to be there.


This should have three possible runtime choices:

- Run via `--privileged` and use loopback mounts (avoids any virtualization requirements)
- Run via `--device /dev/kvm` (as coreos-assembler is optimized for today)
- Run with `--env COSA_NO_KVM=1` to run in environments without KVM

It would also be theoretically possible to support using e.g. an on-demand provisioned EBS volume, but this would impose a burden on the CoreOS team for another build path.

This `oc` enhancement would currently *not* run directly on platforms that don't have `podman` (i.e. OS X/Windows). But we could likely tweak things so that it could use e.g. the Docker API and Docker Desktop for Windows or equivalent.

## Proposal phase 3: machine-api-bootimage-updater

Add a new component to https://github.com/openshift/machine-api-operator which runs `machine-os-bootimage-generator`, uploads the resulting images to the IaaS layer (AMI, OpenStack Glance, etc.) and updates the machinesets for the cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we moving away from public AMIs?

There was some talk about possibly changing machine-controller to an image reference rather than updating each machineset. Either of these cases has the question of mixed image support. Do we plan on doing that in the future, or do we do it now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openshift/installer#2906 suggests moving away from AMIs, yep.

Copy link
Member

@enxebre enxebre Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This proposal seems to me an extension of the current upgrade workflow driven by the Machine Config Operator.

If I understand correctly this is introducing a new upgrading step to run the machine-os-bootimage-generator to use the just brought machine-os-content to build an upload an artifact during an upgrade. So this belong together with all the upgrading business logic to the Machine Config Operator. And only when this last step is complete the MCO should signal a cluster upgrade as completed. So then any component in any environment e.g UPI have an available artifact to consume.

I can't see why we'd want to couple this with the machine API or the machine API operator.
I'd find it more reasonable for the MCO to make the new artifact ID available via e.g configMap as part of this new upgrading step and let any consumer e.g machineSets to reference it as they see fit e.g spec.Ami: "ocp-cluster-latest".

Watching and updating all the machineSets massively would be again effectively an additional step for an upgrade workflow to be considered complete. And I don't think we should split or couple our upgrading process to something orthogonal like the machine API, this belong to the Machine Config Operator to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this belong together with all the upgrading business logic to the Machine Config Operator. And only when this last step is complete the MCO should signal a cluster upgrade as completed.

I don't think it's that clear cut. Changing the bootimage has no effect on any running system; it's not clearly coupled with the main CVO upgrade logic.

I can't see why we'd want to couple this with the machine API or the machine API operator.

Because there's nothing to do if the machineAPI isn't in use.

I'd find it more reasonable for the MCO to make the new artifact ID available via e.g configMap as part of this new upgrading step and let any consumer e.g machineSets to reference it as they see fit e.g spec.Ami: "ocp-cluster-latest".

Again none of this applies unless machineAPI is in use. Sure, the code could technically live in the MCO but...

Another argument for having it live in machineAPI is that it requires IaaS credentials, usage of those APIs - the MCO isn't set up with that today, but machineAPI is.

Copy link
Member Author

@cgwalters cgwalters Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or to expand on this - in a non-machineAPI (i.e. UPI) scenario, it'd be up to the customer to integrate this tooling into whatever they use, such as updating their AWS CloudFormation template, etc. Similar for the "manual bare metal PXE" scenario.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to note, which is called out in this proposal - this is not replacing the default in-place updates. So even in a machineAPI/IPI scenario, if we edit the machinesets as part of the default CVO flow (whether the MCO edits or machineAPI edits), we would rely on the default machineAPI semantic of not reprovisoning machines when a machineset is changed.

Copy link
Member

@enxebre enxebre Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again none of this applies unless machineAPI is in use. Sure, the code could technically live in the MCO but...

I see only a tangential point here to the machine API -> new machines belonging to a machineSet should come with the latest OS dictated by the MCO. The lifecycle of that OS artifact is transparent for the machine API. The machine API should be a consumer of this, just like any other component/tooling creating a new instance (e.g cloud formation) would be. They'd only need to know about where this is available.

Another argument for having it live in machineAPI is that it requires IaaS credentials, usage of those APIs - the MCO isn't set up with that today, but machineAPI is.

The machine API has actually no particular knowledge about IaaS credentials. It just happen to consume what ever the cloud credentials operator makes available for any component in the cluster.

Or to expand on this - in a non-machineAPI (i.e. UPI) scenario, it'd be up to the customer to integrate this tooling into whatever they use, such as updating their AWS CloudFormation template, etc. Similar for the "manual bare metal PXE" scenario.

I think It's up to the consumer to integrate as they see fit but to consume a consistently formatted "artifact address" given by the component which is the authoritative source of truth for cluster upgrades i.e the MCO. This consumer could be anything: machine API, cloud formation, etc. They are orthogonal building blocks.

Another thing to note, which is called out in this proposal - this is not replacing the default in-place updates. So even in a machineAPI/IPI scenario, if we edit the machinesets as part of the default CVO flow (whether the MCO edits or machineAPI edits), we would rely on the default machineAPI semantic of not reprovisoning machines when a machineset is changed.

I'm not in favour of the MCO editing machineSets to complement a cluster upgrade. That would be effectively scattering our upgrade process into different components and coupling our upgrade process to the machine API.

I think this proposal describes a missing step in our current upgrade workflow driven by the authoritative source of truth for cluster upgrades i.e MCO. Once this would be supported then any component/tooling (e.g machine API, anything) could transparently react as they see fit to consume a consistently formatted source of truth.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the proposal is to

  1. let machine config operator ecosystem generate the artifact that's suitable for upload
    ===
  2. machine-api-operator ecosystem consuming that artifact and creating the cloud specific Image
  3. machine-api-operator ecosystem then ensuring all required machinesets have the latest Image.

I think the separation make sense because machine config + rhcos knows best how to create bootimages.. and machine-api ecosystem knows best how to talk to cloud to create a cloud specific images and interact with Machinesets.


### User Stories

#### Story 1

ACME Corp runs OpenShift 4 in Google Compute Engine (installed via IPI) and are aiming to base a large part of their store frontend on [serverless via KNative](https://docs.openshift.com/container-platform/4.3/serverless/serverless-getting-started.html). They regularly keep on top of the latest OpenShift releases via in-place updates. When access to their widget store increases at peak times, their OpenShift cluster quickly boots latest RHCOS nodes in new instances to handle the work.

At no point did the ACME Corp operations team have to worry about managing or updating the GCP bootimages.

#### Story 2

Jane Smith runs OpenShift 4 on VMware vSphere in an on-premise environment not connected to the public Internet. She has (traditional) RHEL 7 already imported into the environment and already pre-configured and managed by her operations team.

She boots an instance there, logs in via ssh, downloads an `oc` binary. Jane then proceeds to follow the instructions for preparing a [mirror registry](https://docs.openshift.com/container-platform/4.3/installing/install_config/installing-restricted-networks-preparations.html).

Jane then uses `oc adm release generate-bootimage quay.io/openshift/release:4.3 vmware` to generate an OVA that can be imported into the VMWare environment.

From that point, Jane's operations team can use `openshift-install` in UPI mode, referencing that already uploaded bootimage and the internally mirrored OpenShift release image content.

### Risks and Mitigations

In an intermediate state where we have two "sources of truth" for the RHCOS version (pinned in the installer *and* included in the release image), the potential for confusion increases.

Even after the control plane is provisioned via `bootimage.json`, [openshift-install](https://github.com/openshift/installer/) will still need to reference *some* version of RHCOS that will need to be kept up to date.

#### coreos-assembler subset

The CoreOS team has invested a lot in [coreos-assembler](https://github.com/coreos/coreos-assembler) - the proposed `machine-os-bootimage-generator` would be a fairly targeted subset of that, but it still implies suddenly running on-premise what before this had just been run as part of the RHCOS build process. Today, the container uses Fedora as a base image; this proposal raises the question of whether we'd need to target the image at RHEL8 for example.

## Design Details

### Test Plan

If we switch to having workers provisioned via `bootimage.json` from the release payload, then the system will be constantly tested by every CI and periodic run today - plus the existing `machine-os-content` promotion gate.

The new `machine-os-bootimage-generator` container would have its own repository with its own e2e test that runs in at least one IaaS cloud and e.g. a VMWare environment too.

### Graduation Criteria

TBD

### Version Skew Strategy

The whole idea of this is to *reduce* skew overall. However, we do need to ensure that e.g. new bootimages are only replaced in machinesets once a cluster upgrade is fully complete.

## Implementation History


## Drawbacks

Even more moving parts to maintain for the CoreOS/MCO team, and this also requires integration with other components such as machineAPI, the release image, etc.

## Alternatives

We could implement automatic bootimage updates for connected IaaS environments, and perhaps just some documentation for disconnected environments.