Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apply: Don't use staged deployments when /boot is automounted #301

Merged
merged 2 commits into from
Mar 10, 2022

Conversation

dbnicholson
Copy link
Member

The ostree staged deployment process works by waiting until shutdown to
swap the /boot symlinks to make the new deployment the default.
However, when /boot is the EFI System Partition and there's no fstab
entry, systemd-gpt-auto-generator sets up an automount so that the
VFAT filesystem is only exposed when needed.

Unfortunately, there are 2 bugs that make this process very fragile:

  • Once a systemd automount unit is scheduled to be stopped, it ignores
    notifications from autofs that the target filesystem should be
    mounted. Therefore, if /boot isn't mounted when shutdown begins,
    ostree admin finalize-staged will fail. See
    ExecStop failures with Automount units systemd/systemd#22528.

  • autofs is not mount namespace aware, so it will begin the expiration
    timer for a mount unit unless a process in the root namespace is
    keeping it active. Since ostree admin finalize-staged is run from a
    mount namespace (either via systemd or its own to ensure /sysroot
    and /boot are mounted read-write), the automount daemon (systemd)
    will try to unmount the filesystem if it expires during this process.
    See https://bugzilla.redhat.com/show_bug.cgi?id=2056090.

Therefore, if /boot is an autofs filesystem, use a full deployment
instead of a staged deployment. Since systems with an automounted
/boot are not common, we want to retain the benefit of staged
deployments for more normal systems. See
ostreedev/ostree#2543 for potential future
fixes in ostree.

https://phabricator.endlessm.com/T33136

@dbnicholson dbnicholson requested a review from jprvita March 3, 2022 23:01
@dbnicholson dbnicholson marked this pull request as draft March 3, 2022 23:02
@dbnicholson
Copy link
Member Author

I haven't tested this yet but wanted to get it out there. I have another version that instead checks if /boot is VFAT, but I think this is better since automounting is the real problem, not VFAT.

Copy link
Contributor

@jprvita jprvita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with libmount's API, but reading through these changes and the API docs, this PR makes sense and looks correct.

I left a small suggestion to improve the comment in the code, feel free to take it or not.

eos-updater/apply.c Outdated Show resolved Hide resolved
@dbnicholson dbnicholson force-pushed the T33136-no-staged-deploy-autofs branch from b6fc706 to 14c6350 Compare March 9, 2022 22:18
@dbnicholson dbnicholson marked this pull request as ready for review March 9, 2022 22:18
@dbnicholson dbnicholson marked this pull request as draft March 9, 2022 22:19
On some hosts, systemd sets up `/boot` as an autofs automount.
Unfortunately, this configuration has some bugs that make usage of
`/boot` less than robust.

Add a method to see if `/boot` is an automount. This uses libmount to
parse `/proc/self/mountinfo` and check if the first mountpoint for the
sysroot's boot directory has the `autofs` filesystem type. libmount
version 2.24 was chosen as a minimum because that's when `libmnt_table`
and `libmnt_cache` reference counting was added.

https://phabricator.endlessm.com/T33136
The ostree staged deployment process works by waiting until shutdown to
swap the `/boot` symlinks to make the new deployment the default.
However, when `/boot` is the EFI System Partition and there's no `fstab`
entry, `systemd-gpt-auto-generator` sets up an automount so that the
VFAT filesystem is only exposed when needed.

Unfortunately, there are 2 bugs that make this process very fragile:

* Once a systemd automount unit is scheduled to be stopped, it ignores
  notifications from autofs that the target filesystem should be
  mounted. Therefore, if `/boot` isn't mounted when shutdown begins,
  `ostree admin finalize-staged` will fail. See
  systemd/systemd#22528.

* autofs is not mount namespace aware, so it will begin the expiration
  timer for a mount unit unless a process in the root namespace is
  keeping it active. Since `ostree admin finalize-staged` is run from a
  mount namespace (either via systemd or its own to ensure `/sysroot`
  and `/boot` are mounted read-write), the automount daemon (systemd)
  will try to unmount the filesystem if it expires during this process.
  See https://bugzilla.redhat.com/show_bug.cgi?id=2056090.

Therefore, if `/boot` is an autofs filesystem, use a full deployment
instead of a staged deployment. Since systems with an automounted
`/boot` are not common, we want to retain the benefit of staged
deployments for more normal systems. See
ostreedev/ostree#2543 for potential future
fixes in ostree.

https://phabricator.endlessm.com/T33136
@dbnicholson dbnicholson force-pushed the T33136-no-staged-deploy-autofs branch from 14c6350 to a19821a Compare March 9, 2022 22:23
@dbnicholson
Copy link
Member Author

I tested this on a PAYG system and could see that it did a finalized deployment instead of a staged deployment. I ended up splitting out the method to the util library so I could unit test it.

@dbnicholson dbnicholson marked this pull request as ready for review March 9, 2022 22:24
@jprvita
Copy link
Contributor

jprvita commented Mar 9, 2022

I tested this on a PAYG system and could see that it did a finalized deployment instead of a staged deployment. I ended up splitting out the method to the util library so I could unit test it.

The new changes look good to me -- not sure if you want an extra pair of eyes on this, since this is critical code. In any case, master PAYG systems are currently failing to boot, so we won't be able to test it on ostrees until that is fixed.

@dbnicholson
Copy link
Member Author

I think it's good, but if @pwithnall has a moment to review the util side and tests that would be nice. I'll wait a day and merge if he doesn't. It's still useful master to check that normal Endless systems still get staged deployments.

@dbnicholson
Copy link
Member Author

I think @pwithnall is too deep in gnome-software land to take a look, so I'm going to go ahead and merge.

@dbnicholson dbnicholson merged commit 2b745c8 into master Mar 10, 2022
@dbnicholson dbnicholson deleted the T33136-no-staged-deploy-autofs branch March 10, 2022 20:46
Copy link
Contributor

@pwithnall pwithnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just took a quick look and can’t see anything wrong with this, but I haven’t thought deeply about the consequences of the changes. The tests are very pleasing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants