-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Different partition layout after re-provisioning between version 39 and 40 #1745
Comments
In the docs section you point to (Data persistence), there is a link to Using persistent state section where it states that you can safely set So your butane configuration could be like this: variant: fcos
version: 1.5.0
storage:
disks:
- # The link to the block device the OS was booted from.
device: /dev/disk/by-id/coreos-boot-disk
# We do not want to wipe the partition table since this is the primary
# device.
wipe_table: false
partitions:
- number: 4
label: root
# Allocate at least 8 GiB to the rootfs. See NOTE above about this.
size_mib: 8192
resize: true
- size_mib: 0
# Check if existing partition matches the specified one,
# delete existing partition and create specified partition
# if it does not match.
wipe_partition_entry: true
# We assign a descriptive label to the partition. This is important
# for referring to it in a device-agnostic way in other parts of the
# configuration.
label: var
filesystems:
- path: /var
device: /dev/disk/by-partlabel/var
# We can select the filesystem we'd like.
format: ext4
# Ask Butane to generate a mount unit for us so that this filesystem
# gets mounted in the real root.
with_mount_unit: true |
I tried a few things, what works is to set It looks like the change is that From the Qemu diff:
while:
So probably to fix my setup I would resize the partition to a MiB boundary, then set |
Not sure what is causing the issue but it is reproducible. Setting |
Nice catch! This is, similarly to #1384, fallout from our switch from sgdisk to sfdisk to format the disk. One difference between the two is how they set the first usable and last usable LBA of the disk (these are fields in the GPT header that determine the valid range of the disk which may contain partitions). sgdisk sets the first usable LBA to be 34, which is the minimum required by the spec, and the last usable LBA to N-34 (i.e. the last sector of the disk-34), which leaves just enough for the backup GPT. sfdisk sets the first usable LBA to 2048, and the last usable LBA to N-2048. Both sgdisk and sfdisk align partitions to 1M and will refuse to create partitions below sector 2048, but it seems like sfdisk tries to reflect this in the first LBA field itself. Even though Ignition uses sgdisk to create new partitions, sgdisk still respects the existing last usable LBA field from the GPT header (initially written by sfdisk during disk creation) and so sets the last sector of the This explains the difference you're seeing. Using your numbers for example, let's compare the sizes: 24115167 − 24113153 = 2014 So we lost 2014 sectors, which corresponds to 2048-34. Now thankfully, this shouldn't result in data loss (assuming Ideally we would've caught this as part of the f40 rebase and included information about it in our communications. I don't think we currently have data persistence tests, but we should. We could still send something to coreos-status. Though AFAIK this is the first report of this. But it may simply be that people using data persistence haven't reprovisioned from newer images yet. |
For anyone hitting this, you can restore the last partition to its original size using the following sfdisk script: dev=/dev/vda # replace with target device
diskinfo=$(sfdisk --json $dev)
partnum=$(jq '.partitiontable.partitions | length' <<< "$diskinfo")
start_lba=$(jq '.partitiontable.partitions[-1].start' <<< "$diskinfo")
eval $(lsblk -Pbdyo SIZE,LOG-SEC $dev)
size_in_lba=$((SIZE/LOG_SEC))
last_lba=$((size_in_lba-34))
echo -e "last-lba:$last_lba\n$start_lba $((last_lba-start_lba+1))" | sfdisk $dev -N $partnum --no-reread
partx --update --nr $partnum $dev That should allow mounting the filesystem and recovering your data. For filesystems that support it, it would be best to then resize it down so that it can fit in the smaller partition size on the next reprovision. For filesystems that don't (e.g. XFS), you would have to reformat (but temporarily, you can run the script above on every reprovision |
Is there a way to somehow reuse the partitions as-is without recreating the partition table when reprovisioning? I have an XFS partition so a bit annoying to reformat it. I suppose the script is an option. |
Someone asked OOB:
i think i could imagine data corruption in some scenarios, but in the "obvious" case, no. where i could see data corruption though is if in one of your reprovisioning, you want to tack on another partition after the one you've been reusing if you e.g. moved to a larger disk and the lost blocks cross over a 1M boundary. then the new partition would take ownership of those last few blocks, and an mkfs in that partition would corrupt things. but that would also require changing your ignition config to go from "take all the space remaining" to "be sized exactly this much", which is already a pretty high-wire act. |
Not exactly for this purpose. The install is image-based and the image includes the GPT table. |
Describe the bug
Re-provisioning a bare metal machine using the example storage from the doc, first installed with
39.20231101.3.0
, then reinstalled with40.20240519.3.0
, leads to different partition layout, and thus the reprovisioning fails.Reproduction steps
I've reproduced this in qemu here: https://github.com/iwanb/coreos-reprovision-baremetal
I haven't tried to narrow down the version where the change was introduced.
Expected behavior
Partitions stay the same when re-provisioning.
From this doc I would expect it to be the case: https://docs.fedoraproject.org/en-US/fedora-coreos/bare-metal/#_data_persistence
I could imagine that it can be hard to provide full backwards and forward compatibility for this, so an alternative would be to document how to specify the partitions more precisely, and warn that re-provisioning could otherwise break.
Actual behavior
Partition difference, the var partition is smaller in the new version:
System details
Bare metal (saw it there first) and QEMU (reproduced).
Butane or Ignition config
From https://docs.fedoraproject.org/en-US/fedora-coreos/storage/#_setting_up_separate_var_mounts
The text was updated successfully, but these errors were encountered: