-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support overlayfs whiteouts on checkout #2717
Conversation
Hi @mangelajo. Thanks for your PR. I'm waiting for a ostreedev member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also add ostree commit --retarget-whiteouts
or so which would automatically do the inverse side. Now AIUI today osbuild uses rpm-ostree compose commit
which could also use a similar API.
But neither of these are blockers.
src/libostree/ostree-repo-checkout.c
Outdated
@@ -35,6 +35,8 @@ | |||
#define WHITEOUT_PREFIX ".wh." | |||
#define OPAQUE_WHITEOUT_NAME ".wh..wh..opq" | |||
|
|||
#define OVERLAYFS_WHITEOUT_PREFIX ".wh-ostree." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
#define OVERLAYFS_WHITEOUT_PREFIX ".wh-ostree." | |
#define OSTREE_TARGET_WHITEOUT_PREFIX ".ostree-wh." |
Here "ostree_" is first because it's a thing conceptually namespaced/scoped to ostree, not to overlayfs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/libostree/ostree-repo-checkout.c
Outdated
AT_SYMLINK_NOFOLLOW) < 0)) | ||
return glnx_throw_errno_prefix (error, "fchownat"); | ||
} | ||
need_copy = FALSE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could probably just return TRUE;
here too - what we're skipping over is the caching stuff which isn't relevant for this.
src/libostree/ostree-repo-checkout.c
Outdated
|
||
g_assert (name[0] != '/'); /* Sanity */ | ||
|
||
if (mknodat(destination_dfd, name, (source_mode & ~S_IFMT) | S_IFCHR, (dev_t)0) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to verify here is whether we also need to do a fchmodat
- I think the mode we set is modified by umask
, so in most cases we always unconditionally chmod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point I will verify this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, looking at mknodat documentation:
The file mode is modified by the process's umask in the usual way: in the absence of a default ACL, the permissions of the created node are (mode & ~umask).
So we also need fchmodat
src/libostree/ostree-repo-checkout.c
Outdated
@@ -643,6 +647,32 @@ checkout_one_file_at (OstreeRepo *repo, | |||
|
|||
need_copy = FALSE; | |||
} | |||
else if (is_overlayfs_whiteout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we may want to make this behavior potentially opt-in? So far ostree has no "magical filenames" - this would be the first.
A concern I have here is that e.g. flatpak would suddenly gain exposure to this. And as we discussed it's probably fine...the worst I can think of is that someone using flatpak on top of overlayfs wouldn't be able to install an app that included a file with this name.
But still, by requiring an opt-in flag we're at least reducing the blast radius.
I think we can then hange the deploy path to pass this flag by default. This feature would clearly be useful for everyone using ostree-for-bootable-systems.
Though I think I'd also be OK having it opt-out (i.e. we add a flag to disable this) instead; then if we discover other use cases don't want it they could turn it off. But opt-in seems safer and only requires a little bit more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that opt-in as you said,may require changes in the deployment path when we install or upgrade the system. Let me talk to the RHEL4Edge team and try figure out how easy/hard it's to include the flag, probably it's easy or even under the control of the image creator so not a big deal, but I'm not familiar with that.
By looking at our composer installer image I see:
ostreesetup --nogpg --osname=rhel --remote=edge --url=file:///run/install/repo/ostree/repo --ref=rhel/8/x86_64/edge
Do you know where ostreesetup lives?
Note: It seems like any regular user with no special privileges can create char 0:0 devices , and that device always returns error.
As you said the issue would be if trying to checkout inside an overlayfs system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my proposal here, the ostree admin deploy
code would by default pass this option down when it does the checkout. This means no modifications would be necessary for anything higher level than this project.
Conceptually ostree has two separate parts; there's a git-like core represented by OstreeRepo
, and the sysroot/deploy bits in OstreeSysroot
. The split isn't 100% clean (the magical ostree/1/0/1
type refs are part of that) but that's the idea.
Your modifications here are to OstreeRepo
, which is the core of ostree that's used in flatpak and other places, whereas OstreeSysroot
is scoped only to bootable host systems - and that's where we want to handle this.
Hmm actually looking at this we of course already have
typedef struct {
...
gboolean process_whiteouts;
...
} OstreeRepoCheckoutAtOptions;
And...confusingly that's for a different model of whiteouts where we want ostree to handle them, not the container engine. Anyways so...basically I'm saying we'd add gboolean process_passthrough_whiteouts;
there, and then in checkout_deployment_tree
set that boolean or so. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an opt-out for now, but happy to change into opt-in if we see it's doable in rhel4edge or for any other reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an opt-out for now, but happy to change into opt-in if we see it's doable in rhel4edge or for any other reason.
oh sorry I didn't see you had written comments in the meantime, reading ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds like a good plan to me
Questions to make sure I understood right:
- so, RHEL uses ostree admin underneath ?
- I guess I also must verify that switch/upgrade will use the right flag too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ostree admin
is just a small CLI for the core libostree
C shared library, which is also used by rpm-ostree (e.g. here), which is what's used by most things in Fedora/RHEL today. So modifying the code here will affect everything we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I also must verify that switch/upgrade will use the right flag too
Basically by design, every single ostree operation ends up reconstructing a new filesystem via hardlinks - it never tries to do an "in place" update. checkout_deployment_tree()
is the function that does that, and everything else calls it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, understood
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ostree_sysroot_deploy_tree -> ostree_sysroot_deploy_tree_with_options -> sysroot_initialize_deployment -> checkout_deploytment_tree -> ostree_repo_checkout_at
Have you been able to prove out that this approach seems viable via a post-processing script that renames the whiteouts in this way? I'm OK landing a minimal patch that only does the checkout side. |
I have manually tested the creation of whiteout files and checkout, but I will verify it with the whole lifecycle I also checked how overlayfs works for directory deletion and I guess we are covered there: https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt says: whiteouts and opaque directoriesIn order to support rm and rmdir without changing the lower A whiteout is created as a character device with 0/0 device number. A directory is made opaque by setting the xattr "trusted.overlay.opaque" |
Thanks Colin |
src/ostree/ot-builtin-checkout.c
Outdated
@@ -193,6 +195,7 @@ process_one_checkout (OstreeRepo *repo, | |||
checkout_options.force_copy = opt_force_copy; | |||
checkout_options.force_copy_zerosized = opt_force_copy_zerosized; | |||
checkout_options.bareuseronly_dirs = opt_bareuseronly_dirs; | |||
checkout_options.process_passthrough_whiteouts = opt_process_passthrough_whiteouts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm checking why this option is not getting down to the callee down the line in my local testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ ./ostree checkout --process-passthrough-whiteouts -v --repo=./testrepo-bare test/stable/x86_64 ./rootfs-out
.ostree-wh.whiteout is_overlayfs_whiteout: 1, enabled:0
not-a-whiteout-just-empty is_overlayfs_whiteout: 0, enabled:0
.ostree-wh.whiteout is_overlayfs_whiteout: 1, enabled:0
.ostree-wh.whiteout2 is_overlayfs_whiteout: 1, enabled:0
+ ls -la rootfs-out
total 8
drwxr-xr-x. 1 majopela majopela 102 Jan 1 1970 .
drwxr-xr-x. 1 majopela majopela 2058 Sep 20 14:24 ..
drwxr-xr-x. 1 majopela majopela 78 Jan 1 1970 another
-rw-r--r--. 1 majopela majopela 0 Sep 20 14:24 not-a-whiteout-just-empty
-rw-r--r--. 1 majopela majopela 0 Sep 20 14:24 .ostree-wh.whiteout
+ ls -la rootfs-out/another
total 0
drwxr-xr-x. 1 majopela majopela 78 Jan 1 1970 .
drwxr-xr-x. 1 majopela majopela 102 Jan 1 1970 ..
-rw-r--r--. 1 majopela majopela 0 Sep 20 14:24 .ostree-wh.whiteout
-rw-r--r--. 1 majopela majopela 0 Sep 20 14:24 .ostree-wh.whiteout2
+ getfattr -d -m '.*' rootfs-out/whiteout
getfattr: rootfs-out/whiteout: No such file or directory
I'm missing something here... ':D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I separated those for debugging:
const gboolean is_overlayfs_whiteout = (!is_symlink && g_str_has_prefix (destination_name, OVERLAYFS_WHITEOUT_PREFIX));
printf("%s is_overlayfs_whiteout: %d, enabled:%d\n", destination_name, is_overlayfs_whiteout, options->process_passthrough_whiteouts);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait... :)
/* This strange code structure is to preserve testing
* coverage of both `ostree_repo_checkout_tree` and
* `ostree_repo_checkout_at` until such time as we have a more
* convenient infrastructure for testing C APIs with data.
*/
if (opt_disable_cache || opt_whiteouts || opt_require_hardlinks ||
opt_union_add || opt_force_copy || opt_force_copy_zerosized ||
opt_bareuseronly_dirs || opt_union_identical ||
opt_skiplist_file || opt_selinux_policy || opt_selinux_prefix ||
opt_process_passthrough_whiteouts)
{
cd9fe0a
to
a14ea93
Compare
I used this as a small functional test:
and it looks good, I´m going to feed it into crio as additional storage now and see if the whiteouts work. |
ok, running the image from an additionalImageStore extracted with ostree checkout and the new flag works correctly. I guess remaining parts for this patch would be:
and as additional PRs: a) on this repository, adding the ability to encode whiteouts as .ostree-wh. files automatically |
Yeah that's fine
I'm 97.3% sure no changes to rpm-ostree are necessary for this. |
🤣 |
Ok, this should be now feature complete, writing tests. Later down I can squash or reorganize the commits as necessary |
df13954
to
e23d202
Compare
1285328
to
3dba744
Compare
GError **error) | ||
{ | ||
guint32 file_mode = g_file_info_get_attribute_uint32 (file_info, "unix::mode"); | ||
if (mknodat(destination_dfd, destination_name, (file_mode & ~S_IFMT) | S_IFCHR, (dev_t)0) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we store the file type in the ostree mode too? Curious why you're explicitly forcing on S_IFCHR
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ostree it's stored as a regular file, otherwise in bare mode we would have to create character devices to get those bits preserved. Then it would look more like my original approach here #2713 which was more complex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or are we extracting the S_IFCHR before this call? checking,..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, no, we just detect the .ostree-wh. prefix for a regular empty file, which removes the prefix and triggers this call
Hm, looks like all the tests are failing with |
3521d03
to
4de29a0
Compare
it seemed to be that not all glibs support the NO_FOLLOW_SYMLINK flag which we don't use in other parts of the code. |
src/libostree/ostree-repo-checkout.c
Outdated
if (_checkout_overlayfs_whiteout_at_no_overwrite(options, destination_dfd, destination_name, | ||
file_info, xattrs, cancellable, error)) | ||
return TRUE; | ||
|
||
if (!error || !g_error_matches(*error, G_IO_ERROR, G_IO_ERROR_EXISTS)) | ||
return FALSE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't allowed by GError conventions. It's allowed to pass NULL
to ignore errors, in the caller, though this isn't used much in the ostree stack.
This topic heavily relates to e.g. https://users.rust-lang.org/t/why-i-use-anyhow-error-even-in-libraries/68592 where I argue that we want to separate out the "expected/likely error" case from the "disk is failing" cases.
Specifically here, we need to pass back the "found an extant file" as a separate boolean. There's precedent for this in APIs like glnx_fstatat_allow_noent()
and ot_openat_ignore_enoent
etc.
Something a bit more like:
gboolean found_extant_file = FALSE;
if (!_checkout_overlayfs_whiteout_at_no_overwrite(options, destination_dfd, destination_name,
file_info, xattrs, &found_extant_file, cancellable, error))
return FALSE;
if (!found_extant_file)
return TRUE;
/* fallthrough */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, handling it
src/libostree/ostree-repo-checkout.c
Outdated
case OSTREE_REPO_CHECKOUT_OVERWRITE_NONE: | ||
return FALSE; | ||
case OSTREE_REPO_CHECKOUT_OVERWRITE_UNION_FILES: | ||
g_clear_error(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if we did the above, then we wouldn't need this.
Introduces an intermediate format for overlayfs storage, where .wh-ostree. prefixed files will be converted into char 0:0 whiteout devices used by overlayfs to mark deletions across layers. The CI scripts now uses a volume for the scratch directories previously in /var/tmp otherwise we cannot create whiteout devices into an overlayfs mounted filesystem. Related-Issue: ostreedev#2712
4de29a0
to
e234b63
Compare
@cgwalters I changed the approach to error handling as recommended, let me know how it looks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Introduces an intermediate format for overlayfs storage, where
.wh-ostree. prefixed files will be converted into char 0:0
whiteout devices used by overlayfs to mark deletions across layers.
Related-Issue: #2712