Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP HACK: Do not reuse zstd:chunked blobs #2185

Merged
merged 2 commits into from
Nov 15, 2023

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Nov 13, 2023

Absolutely untested.

For @giuseppe to experiment with.

@giuseppe
Copy link
Member

thanks for the PR, I've tested it but we still get:

[+0501s]   Unexpected warnings seen on stderr: "time=\"2023-11-13T14:15:47-06:00\" level=warning msg=\"Compressor for blob with digest sha256:00860748a1d5b2cd6213163fbadc4b3d565ab35105446c87e213cb04b115a0f1 previously recorded as zstd, now zstd:chunked\" time=\"2023-11-13T14:15:47-06:00\" level=warning msg=\"Compressor for blob with digest sha256:b782752f99f21e03efbed6d88b22d694c1d02b1b759093e2506bd94da7658f34 previously recorded as zstd, now zstd:chunked\" time=\"2023-11-13T14:15:49-06:00\" level=warning msg=\"Compressor for blob with digest sha256:151fd43fc6fdc2a6b9fb64eadef0a046c290f238d2f5830626aba8a7d1f94de4 previously recorded as zstd, now zstd:chunked\""
[+0501s]   In [It] at: /var/tmp/go/src/github.com/containers/podman/test/e2e/pull_test.go:433 @ 11/13/23 14:15:49.935

Should we also skip writing in the cache when dealing with zstd:chunked and expect it is always repushed for now?

@giuseppe
Copy link
Member

@mtrmac Podman tests pass if I also add (4b733c9):

diff --git a/copy/compression.go b/copy/compression.go
index 6ba70f0b..b8a5a361 100644
--- a/copy/compression.go
+++ b/copy/compression.go
@@ -261,6 +261,11 @@ func (d *bpCompressionStepData) updateCompressionEdits(operation *types.LayerCom
 // This must ONLY be called if all data has been validated by OUR code, and is not coming from third parties.
 func (d *bpCompressionStepData) recordValidatedDigestData(c *copier, uploadedInfo types.BlobInfo, srcInfo types.BlobInfo,
 	encryptionStep *bpEncryptionStepData, decryptionStep *bpDecryptionStepData) error {
+
+	if d.uploadedCompressorName == compressiontypes.ZstdChunkedAlgorithmName {
+		// HACK: Never record zstd:chunked in the blob info cache, because we don’t know how to handle it yet.
+		return nil
+	}
 	// Don’t record any associations that involve encrypted data. This is a bit crude,
 	// some blob substitutions (replacing pulls of encrypted data with local reuse of known decryption outcomes)
 	// might be safe, but it’s not trivially obvious, so let’s be conservative for now.

Are you fine to cherry-pick the patch in this PR and moving it forward?

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 14, 2023

Doing that in updateCompressionEdits makes sense; it’s a bit unclean because in principle there could be other callers of RecordDigestCompressorName… but in practice there aren’t any, and it is internal-only, so, good enough.

I don’t think that should completely skip recording the digest pairs; at least the uncompressedDigest := options.Cache.UncompressedDigest path in storageImageDestination could still benefit, correctly. If we don’t record the algorithm, CandidateLocations2 will never return such blobs, so I don’t think that would cause any unwanted reuses of zstd:chunked blobs.

We still need the BlobMatchesRequiredCompression part, at least for OriginalBlobMatchesRequiredCompression callers (which don’t require any BlobInfoCache data at all).

@mtrmac mtrmac force-pushed the zstd-chunked-never-reuse branch from 741b5b3 to 4692256 Compare November 14, 2023 19:40
@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 14, 2023

Updated per the above. I think the net result is that “zstd:chunked” pushes will look for, and often reuse, gzip-compressed layers (unless things like copy.Options.ForceCompressionFormat or copy.Options.EnsureCompressionVariantsExist enforce a strict match); that is “fine” but it might be surprising.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 14, 2023

#2189 updated to note these two workarounds.

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mtrmac
Copy link
Collaborator Author

mtrmac commented Nov 15, 2023

Compressor for blob with digest … previously recorded as zstd, now zstd:chunked

Late thought, this is actually a deeper issue: If a zstd:chunked-blob without the required annotations “is a legitimate zstd blob” (and it seems to me that it makes sense to treat it that way (?)), we could really end up trying to record both algorithms in the BIC for “legitimate reasons”. So I guess the BIC will also need to learn about the “is a compatible superset of” relationship.

@mtrmac mtrmac force-pushed the zstd-chunked-never-reuse branch from 4692256 to 3ebd43c Compare November 15, 2023 17:05
@mtrmac mtrmac marked this pull request as ready for review November 15, 2023 17:05
@mtrmac mtrmac merged commit 199d256 into containers:main Nov 15, 2023
9 checks passed
@mtrmac mtrmac deleted the zstd-chunked-never-reuse branch November 15, 2023 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants