MCO-286: Add mode for template controller to write to /usr, spike on building a container #3137

cgwalters · 2022-05-05T19:34:49Z

Today the MCO ships a whole lot of static files. In this spike we introduce something like a Dockerfile that does:

FROM machine-config-operator as builder
RUN machine-config-controller extract-static-templates > /srv/out.json

FROM rhel-coreos
COPY --from=builder /srv/out.json /srv/out.json
RUN ignition-liveapply /srv/out.json && rm -f /srv/out.json

The output of this would be a new openshift-node-base image that ends up in the release image too. And this image would become the "golden image" that is rolled out by the MCO by default.

In this, the template controller will need to put e.g. what is now /etc/systemd/system/kubelet.service at /usr/lib/systemd/system/kubelet.service, and /usr/local/bin/nodeip-config.sh (which is really /var/usrlocal/bin/nodeip-config.sh) at /usr/bin/nodeip-config.sh.
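The relocation described above amounts to a prefix rewrite on each templated file path. A minimal sketch, assuming only the two mappings named in this issue; the names `USR_REMAP` and `remap_to_usr` are hypothetical and not part of the MCO:

```python
from pathlib import PurePosixPath

# Hypothetical prefix mapping built from the two examples in this issue.
USR_REMAP = [
    (PurePosixPath("/etc/systemd/system"), PurePosixPath("/usr/lib/systemd/system")),
    (PurePosixPath("/usr/local/bin"), PurePosixPath("/usr/bin")),
]


def remap_to_usr(path: str) -> str:
    """Return the /usr location for a templated path, or the path unchanged."""
    p = PurePosixPath(path)
    for old, new in USR_REMAP:
        try:
            # relative_to raises ValueError when `p` is not under `old`.
            return str(new / p.relative_to(old))
        except ValueError:
            continue
    return path


for original in (
    "/etc/systemd/system/kubelet.service",
    "/usr/local/bin/nodeip-config.sh",
    "/etc/hostname",
):
    print(original, "->", remap_to_usr(original))
```

Paths outside the mapped prefixes pass through untouched, which is the conservative choice for anything that really is per-node state.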


cgwalters · 2022-05-05T20:36:16Z

To build and extend on this a bit - notice a huge benefit of this transition is that suddenly files that the MCO currently owns move underneath the ostree read-only bind mount. Today, an admin can ssh to a node and vi /etc/systemd/system/kubelet.conf and that will work - the config drift monitoring will hopefully kick in.

Instead in this world, when they try to vi /usr/lib/systemd/system/kubelet.conf, they will get a permission denied, same as for all the OS binaries.

Of course, nothing stops them creating /etc/systemd/system/kubelet.conf which will override per systemd rules, or for that matter using systemctl edit kubelet.
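The override behavior mentioned here follows from systemd's unit search order, where /etc takes priority over /usr/lib. A rough sketch of that lookup, using a simplified subset of the real search path (systemd.unit(5) has the full list) and a hypothetical helper name `effective_unit_path`:

```python
import os
from typing import Optional

# Simplified subset of the system unit search path, highest priority first.
UNIT_DIRS = [
    "/etc/systemd/system",
    "/run/systemd/system",
    "/usr/lib/systemd/system",
]


def effective_unit_path(unit: str, root: str = "/") -> Optional[str]:
    """Return the path of the unit file that would win, or None if absent.

    `root` lets the lookup run against a scratch directory tree.
    """
    for unit_dir in UNIT_DIRS:
        candidate = os.path.join(root, unit_dir.lstrip("/"), unit)
        if os.path.exists(candidate):
            return os.path.join(unit_dir, unit)
    return None
```

With a unit shipped only in /usr/lib/systemd/system the function returns that path; as soon as a file of the same name appears under /etc/systemd/system, the /etc copy is returned instead. This sketch ignores drop-in directories, which is what `systemctl edit` creates.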

But...I do hope that on the ostree side we can move to an opt-in model where people request a "sealed" system in which /etc is really just a symlink to /usr/etc - and then it's all clearly immutable.

Then, following onto this - I think a powerful model will be enabling people to (cryptographically) sign their images with e.g. Linux IMA. Then the protection we have can't be subverted with a simple mount -o remount,rw /usr; this would help provide mitigation in some container breakout/exploit scenarios too.

cgwalters · 2022-06-02T18:03:57Z

Another big thing we can do once this lands is try a spike where we switch from templating things like kubelet.service to dynamic dispatch. For example, switch to a systemd generator which dispatches on ignition.platform.id (this intersects OCP platforms and CoreOS platforms, which has some nontrivial subtleties).
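To make the generator idea concrete: a systemd generator is an executable that systemd runs early and that writes units or drop-ins into the output directories passed as its arguments. The sketch below shows the two pieces such a generator would need. The drop-in name `10-platform.conf` and the `EXAMPLE_PLATFORM` variable are placeholders for illustration, not anything the MCO ships:

```python
import os
from typing import Optional


def platform_id(cmdline: str) -> Optional[str]:
    """Return the first ignition.platform.id value on a kernel command line."""
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "ignition.platform.id" and sep:
            return value
    return None


def write_dropin(normal_dir: str, platform: str) -> str:
    """Write a kubelet.service drop-in for `platform` and return its path.

    `normal_dir` is the first directory argument systemd passes to a generator.
    The drop-in content here is a placeholder.
    """
    dropin_dir = os.path.join(normal_dir, "kubelet.service.d")
    os.makedirs(dropin_dir, exist_ok=True)
    path = os.path.join(dropin_dir, "10-platform.conf")
    with open(path, "w") as f:
        f.write(f"[Service]\nEnvironment=EXAMPLE_PLATFORM={platform}\n")
    return path
```

A real generator would read `/proc/cmdline`, call `platform_id` on it, and pass `sys.argv[1]` as `normal_dir`. The mapping from Ignition platform IDs to OCP platforms is exactly the nontrivial part noted above, and it is not modeled here.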

openshift-bot · 2022-09-01T09:00:24Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

cgwalters · 2022-10-05T12:37:07Z

And doing this then leads to the next domino: removing the Machine Config Server entirely. Suddenly a vast swath of issues goes away (e.g. #784).
(It'd be replaced by a registry on the bootstrap node if we have MachineConfig to apply, or in the "golden" case of zero configuration at all we literally just pull the stock openshift-node-base container image that this issue talks about).

Another way to look at this: how much we use/depend on Ignition in OpenShift shrinks a lot.

And then...you know, it seems quite viable to have a mode for openshift-install where it can output a kickstart file for Anaconda too...and we could support installing RHEL CoreOS via Anaconda (xref https://bugzilla.redhat.com/show_bug.cgi?id=2125655 - the role of the kickstart is basically just to pull our node base image). Irrelevant for cloud, and we've got all this covered really well with Assisted installer etc too...but I'm sure there's a not-small percentage of customers on bare metal for whom that would make RHEL CoreOS actually in practice feel much more like RHEL.

EDIT: Actually another big domino after we drop the MCS is that the need for networking in the initramfs for CoreOS also shrinks - in the cloud case we don't need to do DHCP, just link-local. For hypervisors that give us metadata via a non-IP channel, we go back to not needing networking at all. We do need initramfs networking for Tang of course. And we've invested a lot in initramfs networking; as of RHEL 9 and current Fedora it seems to work well.

cgwalters · 2022-10-07T10:06:16Z

Oooh. I just had another idea related to this...I'm thinking we could support a flow where we use the stock Fedora/CentOS/RHEL cloud images (AMI, GCP etc.) like this:

Stock RHEL AMI comes up, openshift-install has attached cloud-init data to it.

That cloud-init injected code entirely re-paves the system: it fetches and deploys the target oscontainer (a beauty of ostree is that it just drops new files into /ostree), then starts executing from RAM, removes the old running root filesystem, and reboots into the new deployment. Again, everything that existed there before is gone; we would actually wipe and reinitialize the bootloader state (ESP, MBR etc.) too. What we'd be keeping is the provisioned filesystem and that's it.

Done! OK well, almost...

Two important details here. First, assuming we've followed the model where all the secrets are embedded in the user data, we would need to inject the pull secret via cloud-init - and then we have a choice:

- preserve that data across the "re-paving"
- actually embed an Ignition config inside the cloud-init config (in a way that cloud-init will ignore), then on the next boot, we actually run Ignition in the same way we do today!
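One possible shape for the second option is multipart MIME user data, which cloud-init accepts. The sketch below carries the Ignition config as an extra part with Ignition's `application/vnd.coreos.ignition+json` content type. That cloud-init would skip such a part is an assumption that would need verifying, and `build_user_data` and `extract_ignition` are hypothetical helpers, not existing openshift-install code:

```python
import json
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Optional

IGNITION_SUBTYPE = "vnd.coreos.ignition+json"


def build_user_data(cloud_config: str, ignition: dict) -> str:
    """Bundle a cloud-config part and an Ignition part into one MIME document."""
    msg = MIMEMultipart()
    msg.attach(MIMEText(cloud_config, "cloud-config"))
    payload = json.dumps(ignition).encode("utf-8")
    # MIMEApplication base64-encodes the payload by default.
    msg.attach(MIMEApplication(payload, IGNITION_SUBTYPE))
    return msg.as_string()


def extract_ignition(user_data: str) -> Optional[dict]:
    """Recover the embedded Ignition config, e.g. on the boot after re-paving."""
    msg = message_from_string(user_data)
    for part in msg.walk():
        if part.get_content_type() == "application/" + IGNITION_SUBTYPE:
            return json.loads(part.get_payload(decode=True))
    return None
```

The round trip is what matters for the flow described here: the installer calls `build_user_data`, and whatever runs on the next boot calls `extract_ignition` on the same blob and hands the result to Ignition.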

What if we have Ignition that wants to re-partition the disk? Yeah, here's where things would be much better if we had Ignition as an opt-in in these images too, because it actually makes sense to unify the "re-paving" and "re-partitioning". But in the short term, if you have Ignition repartitioning specified, we take another reboot (or, we could try to hack things up so that we run ignition not from the initramfs but from our similar running-from-RAM setup after we've fetched the target OS).

What would be the value in all of this? Well, for one thing, we could stop uploading and managing RHEL CoreOS cloud images...which would be kind of a big deal. It means for customers who want to install OpenShift in e.g. some private cloud and they already have uploaded RHEL guest images, we can just reuse that instead of making them upload, manage and maintain a different one.

openshift-bot · 2023-01-06T01:00:31Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2023-02-05T08:30:22Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2023-03-08T00:00:21Z

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2023-03-08T00:02:43Z

@openshift-bot: Closing this issue.


cgwalters added the layering label May 5, 2022

cgwalters changed the title ~~Add mode for template controller to write to /usr~~ Add mode for template controller to write to /usr, spike on building a container Jun 2, 2022

cgwalters changed the title ~~Add mode for template controller to write to /usr, spike on building a container~~ MCO-286: Add mode for template controller to write to /usr, spike on building a container Jun 2, 2022

cgwalters mentioned this issue Aug 26, 2022

Add kubens.service, drop-ins, and kubensenter prefix to kubelet.service #3274

Merged

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 1, 2022

cgwalters mentioned this issue Sep 8, 2022

layering: pod-like MachineImage idea #3327

Closed

cgwalters removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 9, 2022

cgwalters mentioned this issue Oct 5, 2022

RFE-2962: configure ovs should use node-ip-hint set by nodeip-configuration #3362

Merged

cgwalters mentioned this issue Oct 12, 2022

openshift-os: promote openshift-os-src image openshift/release#31973

Closed

cgwalters mentioned this issue Nov 15, 2022

consider making config changes truly transactional on RHCOS #1190

Closed

cgwalters mentioned this issue Nov 30, 2022

Add install verb containers/bootc#1

Closed

cgwalters mentioned this issue Jan 5, 2023

realtime kernel enablement relies on delta from past machine state which doesn't exist in hypershift #3468

Closed

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2023

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 5, 2023

openshift-ci bot closed this as completed Mar 8, 2023
