LUKS on 512e disk breaks ignition-ostree-growfs.service #1384

Closed
freedge opened this issue Jan 15, 2023 · 6 comments · Fixed by coreos/fedora-coreos-config#3033

freedge commented Jan 15, 2023

Describe the bug

Cannot deploy a VM with LUKS on Azure.

Reproduction steps

  1. have a block device with physical_block_size > logical_block_size:
:/root# cat /sys/block/sda/queue/physical_block_size 
4096
:/root# cat /sys/block/sda/queue/logical_block_size 
512
  2. try to deploy with LUKS
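
The first reproduction step can be scripted. The following is a minimal sketch, assuming the same sysfs attributes shown above; the function name `is_512e` and its optional `sysfs_root` argument are hypothetical and exist only so the check can be pointed at a fake directory tree.

```shell
#!/bin/bash
# Succeeds when a block device advertises 512-byte logical sectors on top
# of larger physical sectors (the layout from the reproduction steps).
# sysfs_root is a hypothetical parameter for testing; it defaults to /sys.
is_512e() {
    local dev=$1 sysfs_root=${2:-/sys}
    local queue="${sysfs_root}/block/${dev}/queue"
    local phys log
    phys=$(cat "${queue}/physical_block_size") || return 2
    log=$(cat "${queue}/logical_block_size") || return 2
    [ "${log}" -eq 512 ] && [ "${phys}" -gt "${log}" ]
}

# Usage: is_512e sda && echo "sda is a 512e disk"
```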

Expected behavior

LUKS deployment should work

Actual behavior

Jan 15 09:16:32 localhost.localdomain ignition-ostree-growfs[3511]: CHANGED: partition=4 start=1050624 old: size=3803136 end=4853760 new: size=254801887 end=255852511
Jan 15 09:16:36 localhost.localdomain ignition-ostree-growfs[3555]: Device size is not aligned to requested sector size.
Jan 15 09:16:36 localhost.localdomain systemd[1]: ignition-ostree-growfs.service: Main process exited, code=exited, status=1/FAILURE
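
The sizes in the CHANGED line are in 512-byte sectors, so the misalignment can be checked with plain shell arithmetic. This is only an illustrative sketch; `misalignment_bytes` is a hypothetical helper, not part of growpart or cryptsetup.

```shell
#!/bin/bash
# Bytes left over when a size given in 512-byte sectors is split into
# whole sectors of sector_size bytes. 0 means the size is aligned.
misalignment_bytes() {
    local size_in_512b_sectors=$1 sector_size=$2
    echo $(( (size_in_512b_sectors * 512) % sector_size ))
}

misalignment_bytes 3803136 4096     # old size from the log: prints 0
misalignment_bytes 254801887 4096   # new size from the log: prints 3584
misalignment_bytes 254801887 512    # same size, 512B sectors: prints 0
```

The grown partition is a whole number of 512-byte sectors but is 3584 bytes (7 sectors) past a 4096-byte boundary, which matches the "not aligned to requested sector size" error.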

System details

  • fedora-coreos-37.20221225.1.1-azure.x86_64.vhd
  • on Azure, deploying with a Microsoft.Compute/galleries/images having
    features: [
      {
        name: 'SecurityType'
        value: 'TrustedLaunch'
      }
    ]

and VM having

    securityProfile: {
      securityType: 'TrustedLaunch'
      uefiSettings: {
        secureBootEnabled: true
        vTpmEnabled: true
      }
    }

Ignition config

failing config:

variant: fcos
version: 1.4.0

boot_device:
  layout: x86_64
  luks:
    tpm2: true

working config:

variant: fcos
version: 1.4.0


storage:

  filesystems:
    - device: /dev/mapper/root
      format: xfs
      label: root
      wipe_filesystem: true

  luks:
    - clevis:
        tpm2: true
      device: /dev/disk/by-partlabel/root
      label: luks-root
      wipe_volume: true
      name: root
      options:
      - "--sector-size" 
      - "512"

Additional information

in failing case, when cryptsetup deduces a 4096B sector size:

:/root# lsblk --bytes -o NAME,PHY-SEC,LOG-SEC,START,SIZE
NAME     PHY-SEC LOG-SEC   START         SIZE
sda         4096     512         130996502528
|-sda1      4096     512    2048      1048576
|-sda2      4096     512    4096    133169152
|-sda3      4096     512  264192    402653184
`-sda4      4096     512 1050624 130458566144
  `-root    4096    4096           1930428416
sdb         4096     512          30064771072
`-sdb1      4096     512    2048  30062673920

in working case, when we enforce 512B sector size

[core@core ~]$ lsblk --bytes -o NAME,PHY-SEC,LOG-SEC,START,SIZE
NAME     PHY-SEC LOG-SEC   START         SIZE
sda         4096     512         130996502528
├─sda1      4096     512    2048      1048576
├─sda2      4096     512    4096    133169152
├─sda3      4096     512  264192    402653184
└─sda4      4096     512 1050624 130458566144
  └─root    4096     512         130441788928
sdb         4096     512          30064771072
└─sdb1      4096     512    2048  30062673920


I think it's a combination of:

  • https://gitlab.com/cryptsetup/cryptsetup/-/issues/585 making this error fatal,
  • https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/135 trying to improve the default sector size,
  • https://github.com/coreos/fedora-coreos-config/blob/next/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-growfs.sh#L94 resizing the disk and letting cryptsetup choose the size, and
  • Azure (and also my bare-metal servers, though I did not try to reproduce there) reporting different values for the physical and logical block sizes.

I think it should work out of the box, but just a documentation update would be great.

aospan commented Sep 19, 2023

Observed the same issue on my bare-metal device:

:/root# journalctl --no-pager -xu ignition-ostree-growfs.service
Sep 19 11:04:43 localhost systemd[1]: Starting ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem...
Sep 19 11:04:43 localhost ignition-ostree-growfs[4241]: Expected /dev/disk/by-label/root to point to /dev/dm-0, but points to /dev/disk/by-label/root; triggering udev
Sep 19 11:04:46 localhost ignition-ostree-growfs[4300]: CHANGED: partition=4 start=1050624 old: size=6922240 end=7972864 new: size=467811471 end=468862095
Sep 19 11:04:47 localhost ignition-ostree-growfs[4359]: Device size is not aligned to requested sector size.
Sep 19 11:04:47 localhost systemd[1]: ignition-ostree-growfs.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 11:04:47 localhost systemd[1]: ignition-ostree-growfs.service: Failed with result 'exit-code'.
Sep 19 11:04:47 localhost systemd[1]: Failed to start ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem.
:/root# cat /sys/block/sda/queue/physical_block_size
4096
:/root# cat /sys/block/sda/queue/logical_block_size
512

jlebon changed the title from "cryptsetup resize fails when PHY-SEC > LOG-SEC" to "LUKS on 512e disk breaks ignition-ostree-growfs.service" Sep 19, 2023

jlebon commented Sep 19, 2023

Hmm, this seems like a bug in cryptsetup. It should know to resize to a new size aligned to the LUKS sector size and not just the maximum.

I guess we could also work around this by manually providing the size.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jun 20, 2024
On 512e disks, `sfdisk` (which is used by `growpart`) will end up
growing the rootfs to a size not aligned to a 4K boundary. This is
mostly fine because, well, the drive claims to be 512b-compatible.

Issues arise however if one wants to also put LUKS on top: cryptsetup,
trying to optimize performance, wants to set the sector size of the LUKS
device to that of the physical value, which is 4K. But if the partition
range itself isn't 4K-aligned, it will choke.

Ideally, this should be fixed in sfdisk:
util-linux/util-linux#2140

(Though cryptsetup could also learn to align the mapped area itself).

Anyway, for now work around this by manually checking if the size of
the partition is a multiple of 4k. If not, and the physical sector size
is 4k, then trim off the edge of the partition to make it so. Note the
partition start is always going to be aligned (they're 1M-aligned).

Closes: coreos/fedora-coreos-tracker#1384
Closes: https://issues.redhat.com/browse/OCPBUGS-35410

See also: https://gitlab.com/cryptsetup/cryptsetup/-/issues/585

jlebon commented Jun 20, 2024

I was debugging a similar RHCOS bug and got to the bottom of this. The report there was that LUKS on 512e disks worked in OCP 4.15 but no longer worked in 4.16. The debugging that ensued ended up being interesting enough that I feel it's worth sharing all the gory details.

When cryptsetup initially formats the partition, it tries to auto-detect the optimal sector size to use. It does this by querying the physical sector size of the device, which is 4096 on a 512e disk. Then it verifies whether the partition boundaries are aligned to that sector size (in this case, to 4096). If they are, it chooses 4096. If they aren't, it gracefully falls back to 512.
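
As a rough illustration of the selection described above, here is a simplified model in shell. It is not cryptsetup's code; `choose_luks_sector_size` is a hypothetical name, and the model assumes that "aligned" means the partition start and size in bytes are both multiples of the physical sector size.

```shell
#!/bin/bash
# Simplified model, NOT cryptsetup's implementation: pick the physical
# sector size when the partition start and size (both in bytes) are
# multiples of it, otherwise fall back to 512.
choose_luks_sector_size() {
    local phys=$1 start_bytes=$2 size_bytes=$3
    if [ $(( start_bytes % phys )) -eq 0 ] && [ $(( size_bytes % phys )) -eq 0 ]; then
        echo "${phys}"
    else
        echo 512
    fi
}

# Partition 4 from the first report, before and after growpart
# (start and sizes are the 512-byte sector counts from the log).
choose_luks_sector_size 4096 $(( 1050624 * 512 )) $(( 3803136 * 512 ))     # prints 4096
choose_luks_sector_size 4096 $(( 1050624 * 512 )) $(( 254801887 * 512 ))   # prints 512
```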

In 4.15 and earlier, the RHCOS metal image was partitioned using sgdisk. In 4.16 and later, we moved to osbuild, which uses sfdisk underneath for partitioning. A peculiar difference between the two is that sgdisk will size the rootfs to just take up the maximum amount of space available, whereas sfdisk will round it down to the closest 1M boundary.

Putting it together: when configuring LUKS on a 512e disk in 4.15, cryptsetup would still use a sector size of 512 during formatting, because the rootfs partition size wasn't aligned to a 4k boundary. OTOH, in 4.16 it would successfully choose a sector size of 4096, because the partition size is aligned.

Now, ignition-ostree-growfs comes along and wants to grow the partition and the LUKS device on top. First, the partition. It calls out to growpart, which underneath uses sfdisk to do the growing. There is a bug in sfdisk which makes it so that growing a partition will not align the partition end to a 1M boundary like it does when creating a partition. Instead it just goes as far as it can, which in practice means until it hits the secondary GPT header, which is guaranteed to not be 4k-aligned on a 512e disk since it's sized 33*512.
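
The arithmetic can be checked against the first report. The sketch below assumes the 33-sector backup GPT mentioned above (32 sectors of partition entries plus 1 header sector at the end of the disk); `max_grown_size` is a hypothetical helper, and the disk size comes from the lsblk output for sda (130996502528 bytes, i.e. 255852544 sectors of 512 bytes).

```shell
#!/bin/bash
# Largest partition size (in 512-byte sectors) when a partition starting at
# start_sector is grown right up to the backup GPT, assumed here to occupy
# the last 33 sectors of the disk.
max_grown_size() {
    local disk_sectors=$1 start_sector=$2
    echo $(( disk_sectors - 33 - start_sector ))
}

size=$(max_grown_size 255852544 1050624)
echo "${size} sectors, remainder mod 8: $(( size % 8 ))"
# prints: 254801887 sectors, remainder mod 8: 7
```

The result equals the "new: size=254801887" value in the growpart log, and the remainder of 7 sectors is why the grown partition is no longer a multiple of 4096 bytes.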

When ignition-ostree-growfs goes to resize the LUKS device (using cryptsetup resize), cryptsetup verifies that the new partition size is aligned to the sector size used in the original LUKS formatting. On 4.15 that's 512, so the check passes. On 4.16 it's 4096, so it fails.

There are many possible ways to fix this, but I think the cleanest would be to fix the inconsistency in sfdisk. Meanwhile, we can work around this in ignition-ostree-growfs by manually rounding down the partition end to the nearest 4k boundary, which is what coreos/fedora-coreos-config#3033 does.
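
The rounding itself is simple integer arithmetic. The following sketch is not the code from coreos/fedora-coreos-config#3033; `align_size_down` is a hypothetical helper that only shows the calculation, and how the trimmed size is then applied to the partition table is left out.

```shell
#!/bin/bash
# Round a partition size (in 512-byte sectors) down to a multiple of the
# physical sector size. Sizes that are already aligned come back unchanged.
align_size_down() {
    local size_sectors=$1 phys_bytes=$2
    local per_phys=$(( phys_bytes / 512 ))
    echo $(( size_sectors / per_phys * per_phys ))
}

align_size_down 254801887 4096   # prints 254801880 (7 sectors trimmed)
align_size_down 3803136 4096     # prints 3803136 (already aligned)
```

Because the partition start is already 1M-aligned, trimming the size is enough to make the end land on a 4k boundary as well.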

Obviously related here are the discussions in #1646; this wouldn't happen in a partition+filesystem-level install (i.e. not a disk image-based install) because the partition would initially be created at its maximum size with the right alignment.

jlebon added a commit to coreos/fedora-coreos-config that referenced this issue Jun 21, 2024
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jun 21, 2024
(cherry picked from commit 067e1f7)
travier added the status/pending-testing-release label (Fixed upstream. Waiting on a testing release.) Jun 24, 2024
jbtrystram pushed a commit to coreos/fedora-coreos-config that referenced this issue Jun 25, 2024
(cherry picked from commit 067e1f7)

jlebon commented Jun 25, 2024

The short-term workaround for this until the longer-term workaround in coreos/fedora-coreos-config#3033 lands everywhere is to make Ignition resize the rootfs. E.g.:

variant: fcos
version: 1.5.0
boot_device:
  luks:
    tpm2: true
storage:
  disks:
    - device: /dev/disk/by-id/coreos-boot-disk
      partitions:
        - number: 4
          size_mib: 0
          resize: true

marmijo commented Jul 8, 2024

The fix for this went into testing stream release 40.20240701.1.0. Please try out the new release and report issues.

marmijo added the status/pending-stable-release label (Fixed upstream and in testing. Waiting on stable release.) and removed the status/pending-testing-release label (Fixed upstream. Waiting on a testing release.) Jul 8, 2024

marmijo commented Jul 19, 2024

The fix for this went into stable stream release 40.20240701.3.0.

marmijo removed the status/pending-stable-release label (Fixed upstream and in testing. Waiting on stable release.) Jul 19, 2024