Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootupd: call bootupctl with --update-firmware #5508

Merged
merged 2 commits into from
Jun 12, 2024

Conversation

AdamWill
Copy link
Contributor

@AdamWill AdamWill commented Mar 7, 2024

This is required to write an entry to the EFI boot manager, which we ought to do (anaconda does it when installing the bootloader itself). Without this, boot of the installed system will only work if it's configured to try and boot from the hard disk using the fallback path.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

I haven't tested this yet, but wanted to file it as a way to flag the issue. It seems like images that use bootupd (Silverblue and IoT) do not boot after install on UEFI in openQA. I believe this is why: they don't actually write an EFI boot manager entry (I modified one openQA test to run efibootmgr after install was complete, and it showed no Fedora entry in the list). The system will still sometimes boot okay, but only by relying on boot from the hard disk using the EFI fallback path, which is really meant to be an emergency backstop, and does not work in all cases.

I'm running a test build of a Silverblue image with anaconda patched with this patch ATM, and will try and get openQA to test a UEFI install and boot with it once it's done.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

@pcdubs @travier @jmarrero

@pep8speaks
Copy link

pep8speaks commented Mar 7, 2024

Hello @AdamWill! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-06-10 15:07:44 UTC

@travier
Copy link
Contributor

travier commented Mar 7, 2024

CC @cgwalters

Copy link
Contributor

@travier travier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. We indeed need to do this for Atomic Desktops.

Investigation in https://pagure.io/workstation-ostree-config/pull-request/453#comment-199664

@cgwalters
Copy link
Contributor

but only by relying on boot from the hard disk using the EFI fallback path, which is really meant to be an emergency backstop, and does not work in all cases.

Yeah, though this is how all FCOS (and derivatives) installs work by default; coreos/fedora-coreos-tracker#946 has a lot of related bits. This also relates to the general idea of having the operating system work like cloud instances, which always start from a disk image. (This is how booting in AWS/GCP/qemu-with-uefi work etc.)

But yes, it does probably make sense to do this. That said, we could also use Anaconda's code for this. If this works it's probably simplest, but the equivalent bootupd logic here is currently way less battle-tested than Anaconda's, to state the obvious. (We are testing it though in at least AWS instances with UEFI)

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

Yeah, I think it's an environmental thing; for FCOS it's sensible since the target environment is one where we know fallback path boot works and is commonly relied upon. But this isn't really the case for desktop spins, obviously; they get installed into, er, extremely heterogeneous firmware environments ;) Even without real issues in the firmware, take the "install alongside Windows" case - if we don't write a boot manager entry, it won't be particularly obvious that in order to boot Fedora, you have to tell the firmware to do a fallback path boot from the disk you installed it to. IoT I guess is somewhere in the middle.

Yeah, there's definitely some subtlety to this beyond "does it work". It seems like it does work to fix boot of a clean Silverblue install in openQA, at least. But questions I can think of: what does it call the efibootmgr entry it creates? What does it do if one with the same name already exists? How do attempts to dual boot between an anaconda-bootloader-code install and a bootupd-bootloader-code install work out? I'm downloading the image locally so I can look into at least a couple of those.

BTW, on the FCOS topic - it looks like the openQA FCOS tests don't run into this because they're all run on BIOS...maybe I should flip them to UEFI...

@jkonecny12
Copy link
Member

Looks like a good approach to me. However, I wonder if we want to take into account also the bootloader kickstart command.

@jkonecny12
Copy link
Member

If it is environment dependent than it could be resolved by Anaconda configuration files

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

I don't think we really need to not do this in some environments (unless Colin disagrees), it shouldn't cause any harm at all in cloud environments. It's just not really necessary.

For the bootloader kickstart command - I guess you're considering the --leavebootorder option, right? That's documented as telling anaconda to not do this. So, yes, I suppose we should respect that, I'll see if I can add it.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

So, hum. I don't think bootupd actually has the behaviour that argument is meant to produce.

Looking at the way the EFI bootloader classes work, when that argument is set, they run efibootmgr -C. When it's not set, they run efibootmgr -c. -C creates an entry but does not add it to the boot order. -c creates an entry and adds it to the boot order. So either way, you get an entry - the difference is only whether it's added to the boot order.

In bootupd, it looks like if you pass --update-firmware, it runs efibootmgr with --create (which is the same as -c, it creates an entry and adds it to the boot order). If you don't pass --update-firmware, it does not run efibootmgr at all. There's no way to make it create an entry but not add it to the boot order. So we cannot implement the intended behaviour of the option without patching bootupd, I don't think. I can write a patch for bootupd, I guess...

BTW, while researching this, I think the bootupd path is not implementing quite a few other pykickstart bootloader options. I don't think it respects bootloader --disabled, for instance, or bootloader --password. Unless I'm missing something in how this works.

(Honestly, those bootloader args could use a complete rewrite, but of course, nobody has the time...)

@cgwalters
Copy link
Contributor

So...this is a giant topic, but part of the push for bootupd is to "static" grub configuration, and not dynamic, and ideally also not per-machine.

And there's an overall tension between:

  • Configuring my operating system via kickstart
  • Configuring my operating system by deriving a container image

It should actually work for example to drop in grub configuration into /usr/lib/bootupd/grub2-static/configs.d/ as part of a container build, and change it that way.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

I'm not sure you could manage "don't install a bootloader at all" that way, though. (edit: of course...we probably can't manage that at all with an ostree/container-based deployment, which will have some bootloader bits baked into it. I think for old-school installs, bootloader --disabled actually means "do not install any bootloader packages").

Don't get me wrong - when I say "complete rewrite" I mean "we should probably get rid of like 75% of them and tell people to live with it". :D But as long as they all exist, I guess, we should at least consider the interface between those options and the bootupd path; at a minimum we could update the pykickstart docs to flag which options are intentionally not implemented for the bootupd path, and maybe log warnings if they're passed in the kickstart for such an install?

But that seems rather beyond the scope of this PR. Given that we're already not implementing lots of other args, I'm not sure if implementing that arg should block this PR, especially if it needs changes to bootupd first?

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 7, 2024

Filed https://bugzilla.redhat.com/show_bug.cgi?id=2268505 and proposed as a Beta blocker for the IoT case; it may not really be serious enough to make it a blocker, but I figured we could discuss it at least, and should at least make it an FE for the Atomic Desktops case.

@jkonecny12
Copy link
Member

Good point. I agree for leaving bootloader kickstart out of this and document that.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 8, 2024

So looking into how bootupd names the entry, it seems that it calls its own find_efi_vendordir() function and names the entry for whatever that returns. So it is, I think, expecting to find /boot/efi/EFI/fedora and name the entry "fedora". But the find_efi_vendordir() implementation seems a bit naive - it seems to just take the first directory it finds under the EFI directory (/boot/efi/EFI) which isn't called BOOT.

This worries me a bit, because on my system, I have:

[adamw@xps13a bootupd (main)]$ sudo ls /boot/efi/EFI
BOOT  dell  fedora

Note, "dell". Which alphabetically comes before "fedora". so...er, yeah, that seems like it may cause some issues. It kinda looks like that would break creating the EFI boot manager entry (it doesn't just use the vendor dir it finds as the name of the entry, it also looks in that directory for a shim file for the entry to point to), and installing grub-static-efi.cfg in grubconfigs.rs install().

anyhow, aside from that, it seems like at least bootupd's intent is to call the entry "fedora". Which is not quite the same as "Fedora", which is what anaconda calls the entry when it makes it itself (for Fedora, anyway; it actually uses get_product_name().split("-")[0]).

When anaconda is handling the efibootmgr install, it removes existing entries before creating its new one (unless --leavebootorder is set). But it only looks for existing entries with the exact same label, so it will not remove any 'fedora' entries created by bootupd. Not sure how much of a problem we think that is.

Unlike anaconda, bootupd doesn't try to remove existing entries with the same label as the one it's going to create. It finds the BootCurrent entry, which means whatever entry in the EFI boot manager was used to trigger the current boot, and removes that, before creating a new entry. I don't think this will usually result in the removal of either 'fedora' or 'Fedora'-named entries, when bootupd is being run from the installer. I'm not sure we really want this behaviour when booting the installer from, e.g., a USB stick or optical drive; it would delete the boot manager entry for that device, I think. Whether the firmware would then restore it on the next boot is kinda up to the firmware. Well, for USB sticks, you'd typically expect the firmware to treat the entries as entirely transient, as it has no idea whether any given stick will be plugged in on any given boot, so I'd expect they would all just scan for attached sticks and create an entry for any that are found on each boot, so it shouldn't be a problem.

@jkonecny12
Copy link
Member

That looks like an RFE on bootupd project to me.

@travier
Copy link
Contributor

travier commented Mar 8, 2024

Given all of what you wrote, it feels like we should be using a subset of the logic in Anaconda as this is going to be a lot of changes on the bootupd side otherwise. So we might as well share the "efibootmgr" logic here instead of duplicating it in bootupd and having to test it all again.

@AdamWill
Copy link
Contributor Author

AdamWill commented Mar 8, 2024

Yeah, so far as efibootmgr invocations go that kinda sounds sensible to me. But then should we try and move that code out somewhere for anaconda and bootupd to share? Well, we'd have to, since they're not even the same language. It might be a bit difficult to do in practice as I think anaconda's logic is quite...embedded into the whole anaconda architecture...

@cgwalters
Copy link
Contributor

I can't imagine it'd be a problem for the anaconda environment to depend on bootupd? Or dunno, we could split out a distinct helper binary. But anyways, this is in some distant future because as you say there's some not-small risk from touching the Anaconda code here I suspect.

@cgwalters
Copy link
Contributor

So we might as well share the "efibootmgr" logic here instead of duplicating it in bootupd and having to test it all again.

To be clear, the rationale for bootupd having this is around supporting bootc install - where a container can act like a self-extracting zip file and support installing itself to a disk (for basic scenarios) without any external infrastructure at all required. More in https://containers.github.io/bootc/bootc-install.html#using-bootc-install-to-filesystem---replacealongside

(It'd also make sense to support running anaconda as a container for these types of flows, installing the target OS container...or I guess via dnf install anaconda too.)

What I would like to do is get #5285 merged, then we can flesh out using bootc as part of anaconda more, which will help drive some deduplication too hopefully.

@travier
Copy link
Contributor

travier commented Mar 11, 2024

I agree that ideally we would have a shared tool doing that setup for us but in the meantime, in oder to make this ready for F40 (beta ideally), could we re-add some of the logic from Anaconda around EFI boot entry setup (likely things that have been skipped in #5342)? @jkonecny12

@jkonecny12
Copy link
Member

Updated:

  • I renamed this PR (removed the DO NOT MERGE part)
  • Rebased

@jkonecny12
Copy link
Member

/kickstart-test --testtype smoke

@travier
Copy link
Contributor

travier commented Jun 10, 2024

I think that ideally we would have Anaconda support the following for the bootupd case:

  • bootloader --leavebootorder: Calls to bootupd without the --update-firmware option from this PR.

From https://pykickstart.readthedocs.io/en/latest/kickstart-docs.html#bootloader

I don't know how/where kargs are parsed/interpreted in Anaconda.

Oh, we have https://anaconda-installer.readthedocs.io/en/latest/boot-options.html#boot-loader-options and https://anaconda-installer.readthedocs.io/en/latest/boot-options.html#inst-leavebootorder

@travier
Copy link
Contributor

travier commented Jun 10, 2024

The main issue that we have right now is that as soon as we add bootupd to the images, this enables the bootupd code path in Anaconda and there is no way to use the other path.

If we could have a variable that force one or the other that would be helpful as well.

@cgwalters
Copy link
Contributor

bootloader --leavebootorder: Calls to bootupd without the --update-firmware option from this PR.

Agreed!

@jkonecny12
Copy link
Member

I've added the requested code. It needs to be tested though.

@jkonecny12
Copy link
Member

/kickstart-test --testtype bootc

@jkonecny12
Copy link
Member

/kickstart-test --testtype smoke

Copy link
Contributor

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sane to me!

If leavebootorder was specified through kickstart or kernel boot
arguments we won't pass `--update-firmware` to bootupdctl. That will
avoid creation of the UEFI entry for the bootloader and give people
possibility for additional tweaking or debugging.

This was requested by bootloader developers.

Suggested-by: Timothée Ravier <[email protected]>
@jkonecny12
Copy link
Member

/kickstart-test --testtype smoke

@jkonecny12
Copy link
Member

/kickstart-test --testtype bootc

@jkonecny12
Copy link
Member

@cgwalters @travier I wasn't able to make it into today's build.

Are you fine to wait for next build Tuesday or do you want me to do extra build for this change during the week?

@jkonecny12
Copy link
Member

@jikortus @cgwalters do we want to backport this to RHEL branches too?

@cgwalters
Copy link
Contributor

Are you fine to wait for next build Tuesday or do you want me to do extra build for this change during the week?

If there was anything that couldn't wait a week we'd be sure to flag that I think.

@jikortus @cgwalters do we want to backport this to RHEL branches too?

Should definitely go into c10s relatively quickly, and let's do verification of things there, then target c9s probably too.

@jkonecny12 jkonecny12 added port to RHEL9 The pull request needs to be ported to RHEL 9. port to RHEL10 labels Jun 12, 2024
@jikortus
Copy link

jikortus commented Jun 12, 2024

@jkonecny12 ACK for backporting this feature to RHEL-10, it looks useful for non-cloud use-cases. Just note that, to my best knowledge, we'll be only able to test the installation with a CS10 bootc image at this point.

@jkonecny12
Copy link
Member

For testing upstream I used fedora:eln container which we are using in KS tests.

@jkonecny12 jkonecny12 merged commit 0520cfa into rhinstaller:master Jun 12, 2024
20 checks passed
@travier
Copy link
Contributor

travier commented Jun 19, 2024

Thanks for the code @jkonecny12 ! Could we get this in Rawhide? I'll push an updated bootupd & re-enable it for the Atomic Desktops in Rawhide as well.

@jkonecny12
Copy link
Member

Yes, sorry we missed yesterday release because there is a fix we would like to get out. We will try to get this out today.

evan-goode pushed a commit to evan-goode/workstation-ostree-config that referenced this pull request Jul 24, 2024
With the following issues now fixed:
- coreos/bootupd#630
- coreos/bootupd#658
- coreos/bootupd#551

And corresponding support in Anaconda:
- rhinstaller/anaconda#5508

We can now (re-)enable bootupd for the bootable containers.

After a bit of testing, we will enable it for the classic ostree ones.

See: https://gitlab.com/fedora/ostree/sig/-/issues/1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
f41 port to RHEL9 The pull request needs to be ported to RHEL 9. port to RHEL10
Development

Successfully merging this pull request may close these issues.

7 participants