Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try zstd:chunked in local cache registry #388

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

edsantiago
Copy link
Member

There's a strong push to use zstd:chunked by default, but
very little testing of it. This forces zstd:chunked on
images in the local mirror cache.

Also, add new testimage:20241011 (podman#24238)

Signed-off-by: Ed Santiago [email protected]

Copy link

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base debian do-not-use
base fedora do-not-use
base fedora-aws do-not-use
base fedora-aws-arm64 do-not-use
base image-builder do-not-use
base prior-fedora do-not-use
cache build-push c20241015t143000z-f40f39d13
cache debian c20241015t143000z-f40f39d13
cache fedora c20241015t143000z-f40f39d13
cache fedora-aws c20241015t143000z-f40f39d13
cache fedora-netavark c20241015t143000z-f40f39d13
cache fedora-netavark-aws-arm64 c20241015t143000z-f40f39d13
cache fedora-podman-aws-arm64 c20241015t143000z-f40f39d13
cache fedora-podman-py c20241015t143000z-f40f39d13
cache prior-fedora c20241015t143000z-f40f39d13
cache rawhide c20241015t143000z-f40f39d13
cache win-server-wsl c20241015t143000z-f40f39d13

@edsantiago
Copy link
Member Author

debian prior-fedora fedora fedora-aws rawhide
podman 5.2.3+ds1-4 4.9.4-1 5.2.4-1 5.2.3-1 5.2.4-1
5.2.3+ds1-3 ⇑

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I need to get popcorn ready to see if this passes podman CI?

@@ -82,6 +82,7 @@ declare -a IMAGELIST=(
testimage:00000004
testimage:20221018
testimage:20240123
testimage:20241011
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need some form of process to delete the old ones? Otherwise we just keep adding new ones. I guess for testimage not a big concern as it is small enough for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a find-unused-images script but I don't quite trust it yet. Am planning to fine-tune it then submit it as part of the repo one day. It's just a oneliner:

for i in $(sed -ne '/IMAGELIST=/,/^[^ ]/p' <cache_images/local-cache-registry | sed -ne 's/^  *//p');do grep -q -R $i ../libpod/test ../buildah/tests || echo "unused $i";done

...and its current output is:

unused fedora/python-311:latest
unused podman_python:latest
unused testimage:20240123       <<< yay nice catch
unused testimage:multiimage     <<< wrong and hard to test for and maybe I can find a way to not need it
unused podman/stable:v4.3.1     <<< obviously wrong, but hard to check for
unused podman/stable:v4.8.0     <<< ibid

The python ones I think are used in libpod compose tests, but have never taken the time to investigate why

@edsantiago edsantiago marked this pull request as draft October 15, 2024 17:29
@edsantiago
Copy link
Member Author

@Luap99 low-priority question: should I add nv-rust:1.76 to the cache list? On the assumption that some future day netavark will be able to bump VMs?

@Luap99
Copy link
Member

Luap99 commented Oct 16, 2024

@Luap99 low-priority question: should I add nv-rust:1.76 to the cache list? On the assumption that some future day netavark will be able to bump VMs?

No we do not use this image in our CI images at all, AFAIK all the cirrus container tasks run in their community cluster so it will be pulled there automatically

Copy link

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base debian do-not-use
base fedora do-not-use
base fedora-aws do-not-use
base fedora-aws-arm64 do-not-use
base image-builder do-not-use
base prior-fedora do-not-use
cache build-push c20241017t115000z-f40f39d13
cache debian c20241017t115000z-f40f39d13
cache fedora c20241017t115000z-f40f39d13
cache fedora-aws c20241017t115000z-f40f39d13
cache fedora-netavark c20241017t115000z-f40f39d13
cache fedora-netavark-aws-arm64 c20241017t115000z-f40f39d13
cache fedora-podman-aws-arm64 c20241017t115000z-f40f39d13
cache fedora-podman-py c20241017t115000z-f40f39d13
cache prior-fedora c20241017t115000z-f40f39d13
cache rawhide c20241017t115000z-f40f39d13
cache win-server-wsl c20241017t115000z-f40f39d13

@edsantiago
Copy link
Member Author

debian prior-fedora fedora fedora-aws rawhide
containers-common ? 1-99 0.60.4-2 0.60.4-1 0.60.4-3
0.60.4-1 ⇑ 0.60.4-1 ⇑

edsantiago added a commit to edsantiago/libpod that referenced this pull request Oct 17, 2024
The local cache registry now has images pushed with zstd:chunked
compression.

Built in: containers/automation_images#388

Signed-off-by: Ed Santiago <[email protected]>
@@ -219,6 +220,7 @@ function cache_image() {
for retry in 1 2 3 0;do
skopeo --registries-conf /dev/null \
copy --all --dest-tls-verify=false \
--dest-compress-format zstd:chunked \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes image digests, which are hard-coded in tests (e.g. ALPINELISTDIGEST).

That’s fine in principle — and I’ll update the test digests in the integration PR — I’m just noting that it’s expected that some tests will start failing with this.


Outside of the current zstd:chunked testing effort, consider using --preserve-digests in the command to make any accidental digest changes (e.g. if an unusual image weren’t supported by the cache registry and triggered a format conversion) an error, to make sure this creates a bit-for-bit-identical copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. The new images built just now use --preserve-digests. Would you care to try with c20241022t120000z-f40f39d13 ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m sorry, with --preserve-digest + --dest-compress-format, --dest-compress-format is, AFAICS, ignored (because that requires a digest change). I’ll inspect the images to confirm.

Arguably it’s a bug that this option combination is not rejected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, thanks, I misunderstood.

So, back to your initial comment: "This changes image digests". Are these changes reliably consistent? When we make new VM images next week/year, different version of skopeo, different kernel/filesystem/tools, different phase of the moon, will the image digests be the same?

If so, then it might be okay to change the digests in the tests, but it's important to remember that e2e tests need to be able to run using local cache and also run using quay. Tests can determine the environment via UsingCacheRegistry() (returns bool). This is ugly but it's a potential way to get tests to pass.

If digests are not consistently reproducible, all bets are off and the tests will need to be completely reimagined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes reliably consistent?

No; there is no guarantee that compression produces consistent results even with one version, and the output can (and, from time to time, does) change when the implementation changes.

For the purpose of the current experiment, I’ll hard-code the digests as they exist in the local cache. But, other than that, we’ll indeed need to specifically maintain zstd:chunked images on quay.io — or, alternatively, not hard-code the expected digests in tests, and obtain them at runtime, or maybe at the cache build time, using something like Skopeo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the digest always depends on the compression, so as long as the compression is reproducible it should produce the same digest?

In any case is there any need to hard code the digests? Skopeo inspect can give us that

skopeo inspect --no-tags  --format {{.Digest}} docker://quay.io/libpod/testimage:20241011

(The command seems quite slow for some reason, so it isn't something I would like to run in every test but only once and cache it)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compression isn’t reproducible.

Using skopeo inspect is definitely an option — noting that it decreases the test coverage a bit: if c/image were broken, we’d see consistent incorrect data from both Skopeo and the Podman binary being tested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using skopeo inspect is definitely an option — noting that it decreases the test coverage a bit: if c/image were broken, we’d see consistent incorrect data from both Skopeo and the Podman binary being tested.

Fair but one thing to consider is we would be using skopeo from the rpm build and podman from main so if there is a breaking change only podman on main should be different until we have a new skopeo + c/image build in our VMs so it seem less likely that we hit a corner case there?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure — and the risk is not that high.


Maybe, first, we need some thought about the final test plan: do we actually want to add a gzip / zstd:chunked dimension to the testing matrix, to make this a variable? Maybe the necessary testing coverage can be achieved by a few dedicated tests/images. That’s very unclear to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. This is one of the questions on my radar, that I don't yet know enough to answer. My hope is that we can do an integration, mixed set of images, but as of this writing my recollection is that some images have worked (with zstd:chunked) and some have not, and that makes me uncomfortable. It's possible that the reason for this is that some tests use different images and that's what triggers the failures, but I haven't spent the time to investigate. Thank you for bringing this up.

There's a strong push to use zstd:chunked by default, but
very little testing of it. This forces zstd:chunked on
images in the local mirror cache.

Also, add testimage:20241010 which was actually pushed
to quay with zstd.

Signed-off-by: Ed Santiago <[email protected]>
Copy link

github-actions bot commented Oct 22, 2024

DO NOT USE THIS BUILD

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base debian do-not-use
base fedora do-not-use
base fedora-aws do-not-use
base fedora-aws-arm64 do-not-use
base image-builder do-not-use
base prior-fedora do-not-use
cache build-push c20241022t120000z-f40f39d13
cache debian c20241022t120000z-f40f39d13
cache fedora c20241022t120000z-f40f39d13
cache fedora-aws c20241022t120000z-f40f39d13
cache fedora-netavark c20241022t120000z-f40f39d13
cache fedora-netavark-aws-arm64 c20241022t120000z-f40f39d13
cache fedora-podman-aws-arm64 c20241022t120000z-f40f39d13
cache fedora-podman-py c20241022t120000z-f40f39d13
cache prior-fedora c20241022t120000z-f40f39d13
cache rawhide c20241022t120000z-f40f39d13
cache win-server-wsl c20241022t120000z-f40f39d13

@edsantiago
Copy link
Member Author

edsantiago commented Oct 22, 2024

DO NOT USE THIS BUILD

debian prior-fedora fedora fedora-aws rawhide
kernel 6.11.4-1 6.5.6-300 6.8.5-301 6.8.5-301 6.8.5-301
6.11.2-1 ⇑
grub2-common 2.12-5 2.06-121 2.06-123 2.06-123 2.12-10
2.12-9 ⇑
buildah 1.37.5+ds1-1 1.37.1-1 1.37.3-1 1.37.3-1 1.37.5-1
1.37.4+ds1-1 ⇑ 1.37.4-1 ⇑ 1.37.4-1 ⇑
containers-common ? 1-99 0.60.4-2 0.60.4-2 0.60.4-5
0.60.4-1 ⇑ 0.60.4-1 ⇑ 0.60.4-1 ⇑
criu 3.17.1-3 4.0-1 4.0-1 4.0-1 4.0-2
4.0-1 ⇑
podman 5.2.4+ds1-1+b1 4.9.4-1 5.2.3-1 5.2.3-1 5.2.5-1
5.2.3+ds1-4 ⇑ 5.2.4-1 ⇑ 5.2.4-1 ⇑
runc 1.1.12+ds1-5.1+b1 1.1.12-1 1.1.12-3 1.1.12-3 1.1.12-4
1.1.12+ds1-5+b1 ⇑
skopeo 1.16.1+ds1-1+b1 1.16.1-1 1.16.1-1 1.16.1-1 1.16.1-2
1.16.1+ds1-1 ⇑
systemd 256.7-2 254.18-1 255.13-1 255.13-1 256.7-1
256.7-1 ⇑

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants