-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
copy: do not fail if digest mismatches #1980
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrestricted like this, the change would completely break the security of digest references, and signatures.
It’s reasonable to contemplate that in some cases, we need a non-digest layer identifier of some kind; currently, in some transports we need to compute a digest just to have a digest in a manifest and create a consumable ImageSource
. But the assumption that layers are content-addressable and always have a digest is deeply and widely spread throughout the codebase, and we would need to both find a replacement for the current image integrity assumptions (or to decide that one isn’t necessary), and to carefully migrate code away from the existing content-addressability assumption.
6033259
to
8ea3ae3
Compare
I've changed the implementation to use this behavior only for layers that were retrieved using the partial pull feature. It needs this change in c/storage: containers/storage#1627 in particular the patch "store: store if the layer was created with a partial pull" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most importantly, I’m much more worried about the pull path and layer deduplication. Let’s figure out how to do that first, and that will clarify the digests situation. Making the push path work should only come afterwards.
the pull should not be a problem (just a little bit more expensive) since we will end up requesting the TOC for the layer but then all files are already present locally, so it won't be a problem. If you refer to having the |
9a87d2e
to
851f840
Compare
87f1a5d
to
ee10af7
Compare
@mtrmac what do you think of this approach? |
3cd6552
to
efc51d3
Compare
if there are no big objections can we move forward with this? We can rework it again if necessary, but without this fix a partial layer is pulled every time we do a pull |
@mtrmac @rhatdan @vrothberg PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM but I want @mtrmac to chime in. He's partially unavailable this week.
e503165
to
d3e4c0b
Compare
d3e4c0b
to
c498b8f
Compare
c498b8f
to
d9b3d48
Compare
rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Nit: the last commit message still points to the warning that has been dropped
e600ada
to
cbea9be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As partially a note to self, and to be clear there is progress:
Looking only at storage_src.go
: overall approach and security LGTM.
Signed-off-by: Giuseppe Scrivano <[email protected]>
Signed-off-by: Giuseppe Scrivano <[email protected]>
add a new field to TryReusingBlobOptions to store the digest of the TOC for the layer. It will be used by later commits to enable looking up a partial layer. Signed-off-by: Giuseppe Scrivano <[email protected]>
and generalize its naming to cover also the TOC digest, that is added by a later commit. Signed-off-by: Giuseppe Scrivano <[email protected]>
this is a preparation for the next commit to clarify that the map can contain either the uncompressed digest for the layer (DiffID), or the digest for the TOC of the layer. Signed-off-by: Giuseppe Scrivano <[email protected]>
Signed-off-by: Giuseppe Scrivano <[email protected]>
when an image using partial layers is pulled, generate a different imageID that combines the digest of the TOC for the partial layers since the partial pull didn't validate the diffID. Signed-off-by: Giuseppe Scrivano <[email protected]>
0bd2e0c
to
1032469
Compare
when copying a partial image, store the expected diffID so that it can be later used to validate the obtained layer stream. Signed-off-by: Giuseppe Scrivano <[email protected]>
1032469
to
b15a850
Compare
if err != nil { | ||
return nil, fmt.Errorf("parsing expected diffID %q for layer %q: %w", expectedDigest, layerID, err) | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking nit: Returning early on this failure, and leaving the non-failure case less indented, would be a tiny bit more readable.
// if the layer is stored by its TOC, report the expected diffID as the layer Digest | ||
// but store the TOC digest so we can later retrieve it from the storage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is no longer accurate now that we record digestToLayerID
.
// If the digest was overriden by LayerInfosForCopy, then we need to use the TOC digest | ||
// to retrieve it from the storage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment needs updating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation of storage_src.go
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another partial note, going to read storage_dest.go
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for containers/storage#1719 ; with that, I think there is a clean path for storageImageDestination
to be unambiguous, see below about primarily keeping the state in a map[int]addedLayerInfo
.
Admittedly it’s all a little vague/tentative at this point, do let me know if I’m missing something essential or if the attempt to sketch the idea is not clear enough to be understood.
As I understand it during the meeting last week, we came to the agreement that merging this PR plus fixups from @giuseppe going forward, is the easiest path forward. I will merge, and then we can continue working on small fixups. |
…1114) This PR contains the following updates: Package Change Age Adoption Passing Confidence github.com/containers/image/v5 v5.29.2 -> v5.30.0 WarningSome dependencies could not be looked up. Check the warning logs for more information. Release Notes containers/image (github.com/containers/image/v5) v5.30.0 Compare Source What's Changed A fair number of improvements when working with zstd and zstd:chunked-compressed images. Note that make install now installs policy.json and registries.d/default.yaml. Refuse compression to zstd when using schema1 by @mtrmac in containers/image#2196 Don't expose local account details in oci-archive tar files by @mtrmac in containers/image#2202 Trigger a conversion to OCI when compressing to Zstd by @mtrmac in containers/image#2204 Add buildtags to avoid fulcio and rekor dependencies by @siretart in containers/image#2180 copy: do not fail if digest mismatches by @giuseppe in containers/image#1980 Moving policy.json and default.yaml from containers/skopeo by @rahilarious in containers/image#2215 Embrace codespell: config, workflow (to alert when new typos added) and get typos fixed by @yarikoptic in containers/image#2214 Fix raspberry pi zero cpu variant recognition by @lstolcman in containers/image#2086 storage: validate images converted to zstd:chunked by @giuseppe in containers/image#2243 Make blob reuse choices manifest-format-sensitive, and allow conversions when writing to format-agnostic transports by @mtrmac in containers/image#2213 Edit the manifest when pushing uncompressed data from c/storage by @mtrmac in containers/image#2273 Random storage-related cleanups by @mtrmac in containers/image#2287 Improve storage transport documentation, primarily about locking by @mtrmac in containers/image#2291 Fix c/storage destination with partial pulls by @mtrmac in containers/image#2288 Fix manifest updates when we match a layer by TOC digest by @mtrmac in containers/image#2294 Cleanly fail when trying to obtain a DiffID of a non-OCI image by @mtrmac in containers/image#2295 Beautify TOC-related parts of storageImageSource by @mtrmac in containers/image#2296 storage: use the new ApplyStagedLayer interface by @giuseppe in containers/image#2301 Also annotate image instances using zstd:chunked as using zstd by @mtrmac in containers/image#2302 Support editing ArtifactType, preserve it in lists by @nalind in containers/image#2304 Provide data to correctly report throughput on partial pulls by @mtrmac in containers/image#2308 Add validation error to digesting reader by @saschagrunert in containers/image#2312 Fix handling of errors when fetching layers by URLs by @mtrmac in containers/image#2310 Improve handling of zstd vs. zstd:chunked matching by @mtrmac in containers/image#2317 New Contributors @rahilarious made their first contribution in containers/image#2215 @yarikoptic made their first contribution in containers/image#2214 @lstolcman made their first contribution in containers/image#2086 @bainsy88 made their first contribution in containers/image#2260 Full Changelog: containers/[email protected] Configuration 📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied. ♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 Ignore: Close this PR and you won't be reminded about this update again. If you want to rebase/retry this PR, check this box This PR has been generated by Mend Renovate. View repository job log here.
if the computed digest mismatches the expected one for a partial image. This is expected since partial images are stored by their TOC digest, not the diffID for the layer.