
fix(k8sprocessor): Pod Service cache invalidation #1425

Merged: 1 commit into main on Jan 15, 2024

Conversation

swiatekm

Fixes #1414 by separately handling update events for EndpointSlices. The assumption behind the fix is that every Pod appears exactly once across the EndpointSlices associated with a Service, and that we can never receive the updates in the wrong order if a Pod is re-added to the Service after having been removed from it.

A more robust solution, where we periodically recompute the whole Pod to Service map based on EndpointSlice data, is possible, but more expensive and more complex.

I've also replaced some homegrown slice manipulation code with the slices package, which is part of the standard library as of Go 1.21.

@swiatekm swiatekm marked this pull request as ready for review January 11, 2024 18:14
@swiatekm swiatekm requested a review from a team as a code owner January 11, 2024 18:14
@swiatekm swiatekm force-pushed the fix/k8sprocessor/endpoint-memory-leak branch from 7d736c9 to 8c7502f Compare January 12, 2024 13:26
require.NoError(t, err)
assert.Eventually(t, func() bool {
services := op.GetServices(pod.Name)
if len(services) != 2 {
@rnishtala-sumo (Contributor) commented on Jan 12, 2024:
Why are we checking if the length is not 2 here?

@swiatekm (Author) replied:

It should be 2; we continue waiting while it isn't. And it should be 2 because we updated back to the original state, where the EndpointSlice contains the Pod.

podName := endpoint.TargetRef.Name
if slices.Index(newPodNames, podName) == -1 {
// not a deferred delete, as this is a dynamic property which can change often
op.deleteServiceFromPod(podName, serviceName)
@rnishtala-sumo (Contributor) commented:

Is a call to deleteServiceFromPod going to ensure that our cache doesn't have any stale entries? My understanding is that whenever any EndpointSlice for a Service is updated, this method gets called with the params described here: https://pkg.go.dev/k8s.io/[email protected]/tools/cache#ResourceEventHandlerFuncs.OnUpdate

@swiatekm (Author) replied:

Yes, that's the idea. An update to an EndpointSlice can mean that some Pods were removed from the Service, and this is what we do here.

@swiatekm swiatekm force-pushed the fix/k8sprocessor/endpoint-memory-leak branch from 8c7502f to bb038b5 Compare January 15, 2024 14:52
@swiatekm swiatekm force-pushed the fix/k8sprocessor/endpoint-memory-leak branch from bb038b5 to 2ef9816 Compare January 15, 2024 14:55
@swiatekm swiatekm merged commit 3891b3c into main Jan 15, 2024
28 checks passed
@swiatekm swiatekm deleted the fix/k8sprocessor/endpoint-memory-leak branch January 15, 2024 16:35
Development

Successfully merging this pull request may close these issues.

k8s_tagger: Service metadata is not removed from cache after Pods are deleted
2 participants