-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ObservedObjectCollection
API type
#217
Conversation
45010af
to
ad70d18
Compare
} | ||
|
||
// Remove objects that either do not exist anymore or are no match | ||
oldRefs := sets.New[v1alpha1.ObservedObjectReference](collection.Status.Objects...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, we have two distinct sources (status.objects
and objects owned by collection on etcd), and they could diverge for some reason.
I believe it would be more reliable if we lean on whatever on the API instead of using references in the status (also relying on status has its downsides). I mean, we could add labels to the created observe-only object resources (e.g. kubernetes.crossplane.io/owned-by-collection: jobs
) and list them with labels to see what a collection owns and clean them if they no longer needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we introduce such label, then we implicitly limit collection names to 63 chars. Let me know if this is an important trade to be made. Also, ObservedObjectCollection
instances are gonna be created mostly from within a composition, where their name is built from composite name and a prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it would be more reliable if we lean on whatever on the API instead of using references in the status
Having refs in the status might be helpful for some, similar to how we have refs in the composite object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agree having refs in the status makes sense for visibility. But we should also add labels so that functions can avoid having to first get the collection and then requesting objects if they already know the collection, they should be able to request resources by label directly instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they already know the collection, they should be able to request resources by label directly instead.
Users can do that already by adding an arbitrary number of labels through .spec.template.metadata.labels
on ObserverdObjectCollection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we introduce such label, then we implicitly limit collection names to 63 chars.
Also, ObservedObjectCollection instances are gonna be created mostly from within a composition, where their name is built from composite name and a prefix.
This is a good point. However, the same limitation holds for composites, doesn't it? They put crossplane.io/composite: <name-of-the-composite>
to all resources they compose and they could be created from within other compositions as well.
I believe the reason why we maintain references there is that the composed resources could be of any apiVersion/kind and it is not possible to list all composed objects with a label filter. The situation is different here, we know all owned resources are Objects
and we can list/filter them with labels.
I am totally fine with keeping them in status for visibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have introduced kubernetes.crossplane.io/owned-by-collection
label that contains the name of the collection, if the name length is less than 63 chars. If greater, md5 hash of the name is used instead.
The label and its value is published under status.membershipLabel
so that clients can know what label to use to obtain all items belonging to the given collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If greater, md5 hash of the name is used instead.
FWIW there's a potential collision here:
- One collection has a long enough name that its objects refer to it as
some-md5-hash
- Another collection is actually named
some-md5-hash
This probably doesn't matter - it would never happen outside of a collection actively trying to cause a collision.
Is it so bad to just limit collection names to 63 characters? I prefer predictable and human-readable labels to labels that are sometimes human readable, and other times an md5 hash. I think this would also allow us to avoid having to write a label to the status of the collection just to know what label to query for (e.g. hash or name).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the same limitation holds for composites, doesn't it?
The limitation also holds for:
- All packages - we label package revisions with package names (see here)
- All compositions - we label composition revisions with composition names (see here)
- All claims - we label XRs with claim names (see here)
Given that we have this 63 character limitation in so many places already, and that AFAIK it hasn't ever been a problem in practice, I don't think we should do anything different here.
If we did want to address this potential problem in a way that's backward compatible and consistent with the existing places where we label children with the parent's name we could:
- Always write the parent's name to one label, and truncate it if longer than 63 characters
- Write a hash1 of the parent's name that will always be 63 characters or less to another label
This way you get a human-readable label that's easily traced back to the parent resource 99% of the time, supported by a machine-readable label that can be used to list child resources 100% of the time.
Footnotes
-
I think we should use sha256 unless there's a specific reason to prefer md5. It's what we've used to generate unique short names in other places, e.g. package revision and composition revision names. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with @negz and have a similar comment here: #217 (comment)
@turkenh I have addressed the comments, ptal. |
cb5296c
to
9e018a2
Compare
return collection.GetName(), nil | ||
} | ||
// Otherwise, compute md5 hash of it, to reduce it length. | ||
h := md5.New() //#nosec G401 -- used only for unique name generation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checked how Crossplane doing it with crossplane.io/composite
label: https://github.com/crossplane/crossplane/blob/d35ecef8b3d9094261dbda64e72498cda1520797/internal/controller/apiextensions/composite/api.go#L401
Looks like we don't have special handling for long names rather try to add it as label, it fails and we return the error to the user so that they can adjust their naming. I remember I observed this while applying some composites (of platform-refs) and used shorter names with my compositions.
I would agree that this is not the best experience, but still would follow upstream implementation here as well. To keep the logic here simple and not have cryptic owner labels when they are long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine if we do not want to backup solution for longer names. Thus, I have limited the collection names to 63 chars max and added CRD validation rule for that.
FWIW user should not expect any particular semantic behind the value of the label that is used for selecting all collection members. It is just important that it is unique per collection, and discoverable by users (published at the moment under status.memebershipLabel
Objects in the collection are defined by: * GVK * optional namespace * label selector The objects are fetched using the specified provider config and for the matched objects the provider creates counterpart observe-only objects in the local cluster. The created objects are owned by the collection resource and reconciled as usual by the provider. They are also referenced in collection's `.status.objects` field. Signed-off-by: Predrag Knezevic <[email protected]>
Signed-off-by: Predrag Knezevic <[email protected]>
Signed-off-by: Predrag Knezevic <[email protected]>
9e018a2
to
78bf7cb
Compare
* limited collection name to 63 chars max, allowing us to that name as the memebership label, without need to handle names longer than 63 chars * remove `.status.objectRefs` fields * addressed nits Signed-off-by: Predrag Knezevic <[email protected]>
The names are computed as follows: `COLLECTION_NAME+FIRST_56_BITS_SHA256_HASH_OF(OBJECT_GVK,NAMESPACE,NAME)` Signed-off-by: Predrag Knezevic <[email protected]>
25d1fe9
to
0fd806c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pedjak 🙌
// MembershipLabel is the label set on each member of this collection | ||
// and can be used for fetching them. | ||
// +optional | ||
MembershipLabel map[string]string `json:"membershipLabel,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this field in the status? AFAIU, it is always deterministic and we don't have to store this in status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #217 (comment)
IMHO, instead of relying on a convention, we can make the label discoverable by users.
Found this discussion.. |
* Add `ObservedObjectCollection` API type Objects in the collection are defined by: * GVK * optional namespace * label selector The objects are fetched using the specified provider config and for the matched objects the provider creates counterpart observe-only objects in the local cluster. The created objects are owned by the collection resource and reconciled as usual by the provider. They are labeled with a common label, so that they can be fetched easily. The label is discoverable by reading `.status.membershipLabel` field of `ObservedObjectCollection`. Signed-off-by: Predrag Knezevic <[email protected]> Signed-off-by: Alexander Brandstedt <[email protected]>
Description of your changes
Objects in the collection are defined by:
The provider reconcile the collection by fetching objects matching the selector and creates counterpart "observe-only" Objects in the local cluster. Additional labels and annotations can be added to them by specifying
objectTemplate.metadata
on the collection resource.kubernetes.crossplane.io/owned-by-collection
label uniquely identifying the collection object, so that clients can use label to obtain all members of the given collectionstatus.membershipLabel
NOTE: The new controller is implemented as a standard controller-runtime controller, which is the depart from managed.Reconciler pattern used across provider ecosystem. Given that the new API does not create any external resources, we were able to produce simpler code. The approach was discussed and approved by @negz and @turkenh.
Fixes #209
I have:
make reviewable test
to ensure this PR is ready for review.How has this code been tested
kubectl -f apply examples/collection/collection.yaml