
grub-update does not identify pool correctly: root=ZFS=<missing>/ROOT/debian #18

Open
azeemism opened this issue Jan 28, 2015 · 10 comments


@azeemism

Hi,

Regarding the dailies version of ZFS: 0.6.3-222d9d57jessie

I am running update-grub under chroot during a native-ZFS-root-filesystem installation. The root=ZFS= parameter is missing the pool name; this does not happen with version 0.6.3-766_gfde0d6d.

The /boot/grub/grub.cfg file is updated with

root=/ROOT/debian

instead of

root=ZFS=spool/ROOT/debian

This occurs whether or not the following parameters are set in /etc/default/grub:

boot=zfs rpool=spool bootfs=spool/ROOT/debian
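
For reference, a sketch of how those parameters would sit in /etc/default/grub (assuming they go on the kernel command line via GRUB_CMDLINE_LINUX; the exact variable used is an assumption):

    # /etc/default/grub (sketch; the variable choice is an assumption)
    GRUB_CMDLINE_LINUX="boot=zfs rpool=spool bootfs=spool/ROOT/debian"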

This issue can be duplicated as follows:

VirtualBox installation of Debian Jessie (tested in RAID 10):

sda1 - UEFI
sd[a-d]2 - unformatted, for ZFS
sd[a-d]3 - swap
sd[a-d]4 - /boot
sd[a-d]5 - /

Install the ZFS dailies, then:

create the pool:
zpool create -m none spool mirror /dev/disk/by-id /dev/disk/by-id mirror /dev/disk/by-id /dev/disk/by-id
create the datasets for spool/ROOT/debian
add /etc/udev/rules.d/70-zfs-grub-fix.rules (from #5)
zpool export spool
zpool import -o altroot=/sysroot spool
zfs set mountpoint=/ spool/ROOT/debian
rsync -axv / /sysroot/
rsync -axv /dev/ /sysroot/dev/
chroot /sysroot /bin/bash
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount /boot
nano /etc/fstab
  comment out the line for /
  optionally add a line: spool/ROOT/debian / zfs default,noatime 0 0
update-initramfs -c -k all
update-grub
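
The bug can then be confirmed by inspecting the generated config (a quick check, assuming the setup above):

    # Inspect the root= parameter that update-grub generated:
    grep -o 'root=[^ ]*' /boot/grub/grub.cfg | sort -u
    # expected: root=ZFS=spool/ROOT/debian
    # observed: the pool name is missing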

Thanks for all that you do,

Azeem

@azeemism
Author

I should also point out that if grub-mkconfig is run instead of (or prior to) running update-grub, then the whole root=ZFS=spool/ROOT/debian parameter is missing from /boot/grub/grub.cfg, leaving the original boot parameter.

This is strange to me, because grub-mkconfig is supposed to do everything update-grub does, and more, according to the following source:

http://members.iinet.net/~herman546/p20/GRUB2%20Bash%20Commands.html#grub-mkconfig

I'm not sure if this should be a separate issue; even though they are two different commands, shouldn't they operate in basically the same way?

@FransUrbo
Contributor

> I should also point out that if grub-mkconfig is run instead of (or prior to) running update-grub, then the whole root=ZFS=spool/ROOT/debian parameter is missing from /boot/grub/grub.cfg
>
> This is strange to me because grub-mkconfig is supposed to do everything update-grub does, and more, according to the following source:

Some grub versions don't understand ZFS. I can't remember if the one in Jessie/Jessie-daily did. I have a new version in wheezy-daily which hasn't been rebuilt for jessie yet.

> Not sure if this should be a separate issue; even though they are two different commands, shouldn't they operate in basically the same way?

Not really. update-grub calls grub-mkconfig, so if grub-mkconfig fails, so will the former...
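
For reference, on Debian-family systems update-grub is essentially a one-line wrapper around grub-mkconfig:

    #!/bin/sh
    # /usr/sbin/update-grub on Debian, in its entirety:
    set -e
    exec grub-mkconfig -o /boot/grub/grub.cfg "$@"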

But whether or not grub-mkconfig works is really not the biggest problem. Yes, it's a nuisance not getting a good/correct grub.cfg, but the primary failure is that it won't boot even if you specify a correct one!

Once we get that figured out, we can deal with a better/correct grub...

@rugubara

I'm still hitting this on one of my systems running 0.8.1; the others work just fine. How could I contribute to troubleshooting this issue?

@lucidBrot

The same thing just happened to me with 0.8.4 on Ubuntu 18.04, and I'm not sure why. I'm replying in the hope that it is helpful, but I don't require immediate assistance (yet).

It used to work just fine with the same partition and included the pool in root=ZFS=aquarium/ds1/u18. I then created a second pool with the exact same contents, except that it has a different ashift and encryption turned on. I booted into it by temporarily editing the menuentry to read root=ZFS=tank/ds1/u18 instead, and it booted fine.

I then ran sudo update-grub, and it removed the aquarium from those boot parameters. It also failed to create an entry for the new system, but perhaps that is due to the encryption and to be expected (?)

Selecting the broken menuentry in grub boots to the initramfs, which tells me to import the pool myself.

I booted again into the original system by manually specifying the pool and ran update-grub. That restored the missing aquarium (but it also got rid of the custom entry I had created for my encrypted system's tank/ds1/u18 in /etc/grub.d/40_custom).
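
For context, such an entry in /etc/grub.d/40_custom might look roughly like this (a sketch; the menu title, kernel version, and UUID are placeholders, not values from this thread):

    #!/bin/sh
    exec tail -n +3 $0
    # Hypothetical custom entry for the encrypted pool; the kernel
    # version and boot-partition UUID below are placeholders:
    menuentry "Ubuntu 18.04 (tank/ds1/u18)" {
        search --no-floppy --fs-uuid --set=root <uuid-of-boot-partition>
        linux /vmlinuz-<version> root=ZFS=tank/ds1/u18 ro
        initrd /initrd.img-<version>
    }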

@lucidBrot

lucidBrot commented May 21, 2020

Okay, so this seems to be reproducible. Whenever I run sudo update-grub on my unencrypted ZFS partition with Ubuntu 18.04, it finds the other Ubuntu installations and (as long as I have not added GRUB_SKIP_OS_PROBER=true in /etc/grub.d/30_os-prober) also the Windows installation, but not the encrypted installation.

All the installations are on one single NVMe disk.

When I run sudo update-grub on my encrypted ZFS partition with Ubuntu 18.04, it finds the other installations too, but it omits the pool from root=ZFS=pool/dataset/systemrootdataset for the unencrypted ZFS partition. It also does not find itself, but that seems to be a tangential issue, and I have asked an (unanswered) question about it on Ask Ubuntu.

When I run sudo update-grub on the unencrypted ZFS partition, it restores the broken menuentry for itself.

I have made sure that I actually have the boot partition and the EFI partition mounted at /boot and /boot/efi before running update-grub.
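
(A quick way to verify those mounts before running update-grub, for anyone reproducing this:)

    # Both should show an active mount:
    findmnt /boot
    findmnt /boot/efi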

It seems to me that the reason is that in /etc/grub.d/10_linux the variable $rpool is unset, because of the line

rpool=`${grub_probe} --device ${GRUB_DEVICE} --target=fs_label 2>/dev/null || true`

which exits successfully even when grub-probe fails (the || true swallows the error). Having a look at the redirected error output reveals

/usr/sbin/grub-probe: error: unknown filesystem

which is the result of the command

/usr/sbin/grub-probe --device /dev/nvme0n1p10 --target=fs_label 2>/dev/null || true

It is unclear to me why it is probing /dev/nvme0n1p10 there. That is the partition where my encrypted ZFS resides; it should at least also be probing /dev/nvme0n1p9, where the unencrypted ZFS 0.8.4 Ubuntu 18.04 resides.

$GRUB_DEVICE is set to the partition the update-grub command is run from. I believe this is the responsible line. It doesn't make sense to me why you'd set all the ZFS roots like that.
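
As a workaround sketch (my own idea, not something 10_linux does): the pool name could instead be derived from the dataset backing /, since for a ZFS root the filesystem source is pool/dataset:

    # Take the dataset mounted at /, e.g. "tank/ds1/u18",
    # and keep everything before the first slash as the pool name.
    rootds=$(df --output=source / | tail -n 1)
    rpool=${rootds%%/*}
    echo "$rpool"   # e.g. "tank"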

@tterpelle

I'm seeing the same issue on Debian Buster: my /boot is on a regular ext4 partition, but / and everything else is ZFS.

update-grub only adds root=ZFS=/ROOT/linux, without the pool name. I "solved" it by adding the correct value to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="acpi_backlight=vendor root=ZFS=rpool/ROOT/linux"

Now I have root=ZFS twice on the kernel command line:

BOOT_IMAGE=/vmlinuz-5.5.0-0.bpo.2-amd64 root=ZFS=/ROOT/linux ro acpi_backlight=vendor root=ZFS=rpool/ROOT/linux

but it gets the job done (presumably because the initramfs honours the last root= value it sees).

On Ubuntu 19.10 with a similar setup I don't have this issue. Ubuntu ships /etc/grub.d/10_linux_zfs in its grub-common 2.04-1ubuntu12.2 package.

@lucidBrot

Oh, nice! That's probably a better solution than using a custom entry, because with the custom entry I have to update the kernel version manually.

I guess I'll try looking at that /etc/grub.d/10_linux_zfs sometime next week as well. Perhaps copying it over to Ubuntu 18.04 would suffice to make this work. Do you know where I could get the contents without installing Ubuntu 19.10?

@tterpelle

> Do you know where I could get the contents without installing Ubuntu 19.10?

@lucidBrot: Download grub-common_2.04-1ubuntu12_amd64.deb from https://packages.ubuntu.com/eoan/amd64/grub-common/download and extract it with ar x grub-common_2.04-1ubuntu12_amd64.deb.

@lucidBrot

lucidBrot commented Jun 22, 2020

Okay, I will experiment more when I get the time, but this already creates a working entry:

  1. Download grub-common_2.04-1ubuntu12_amd64.deb from https://packages.ubuntu.com/eoan/amd64/grub-common/download and extract it with ar x grub-common_2.04-1ubuntu12_amd64.deb.

  2. Get the etc/grub.d/10_linux_zfs file out of the resulting data.tar.xz.
    I didn't know the correct way, so I just did vim data.tar.xz, opened the file, then did :w 10_linux_zfs. (A tidier extraction is sketched after this list.)

  3. Move it to /etc/grub.d.

  4. Make it executable:
    sudo chmod +x /etc/grub.d/10_linux_zfs

  5. sudo update-grub
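
A tidier way to do step 2, assuming the deb's data.tar.xz stores the script under ./etc/grub.d/:

    # Extract just the one script from the data archive:
    tar -xJf data.tar.xz ./etc/grub.d/10_linux_zfs
    # It lands in ./etc/grub.d/10_linux_zfs relative to the current directory.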


There are several warnings, though. update-grub found initrd and kernel images in both my tanks (each has a distinct /boot due to an oversight on my part; that may be why, or maybe I'm misremembering and it just finds the exact same kernel files twice) and also in each snapshot of the other pool. (In fact, my oversight was that the other pool contains an actual /boot, while the current pool uses a boot partition that is not under ZFS. That explains the following warnings.)

For every snapshot in the old pool's root dataset (i.e. the one with mountpoint /) I get one warning:

Warning: Ignoring tank/ds1/u18@2005081557
Warning: Failed to find a valid directory 'etc' for dataset 'tank/ds1/u18@2005081926'. Ignoring

Edit:
I think I may have confused something above regarding those warnings. The current behaviour for me is that update-grub finds all the kernel images within snapshots of the older zfs-as-root partition, which is not encrypted, but displays warnings for all the snapshots of the currently mounted zfs-as-root partition, which is encrypted. I don't really care, because for the newer, encrypted partition my boot partition is not on ZFS. But the verbosity of those warnings is annoying.

If one really wanted to, one could modify 10_linux_zfs to skip those checks.
/Edit

However, it does find what it needs, and the generated grub menuentry boots into my encrypted ZFS-root Ubuntu 18.04!

Now I shall figure out how to get rid of the wrongly created entries. Perhaps it's as simple as also replacing 10_linux.

These lines, added in the new 10_linux, imply that this might be a good idea:

       # We have a more specialized ZFS handler, with multiple system in 10_linux_zfs.
       if [ -e "`dirname $(readlink -f $0)`/10_linux_zfs" ]; then
         exit 0
       fi

@lucidBrot

lucidBrot commented Jul 9, 2020

> Now I shall figure out how to get rid of the wrongly created entries. Perhaps it's as simple as also replacing 10_linux.

Indeed. Simply replacing /etc/grub.d/10_linux_zfs and /etc/grub.d/10_linux suffices to get correct entries autogenerated by update-grub.
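
A minimal sketch of that replacement, assuming both scripts were extracted from the eoan deb into ./etc/grub.d/ as described above:

    # Keep a non-executable backup outside /etc/grub.d (anything
    # executable in that directory gets run by update-grub):
    sudo cp /etc/grub.d/10_linux ~/10_linux.orig
    sudo cp etc/grub.d/10_linux etc/grub.d/10_linux_zfs /etc/grub.d/
    sudo chmod +x /etc/grub.d/10_linux /etc/grub.d/10_linux_zfs
    sudo update-grub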

CC @tterpelle
