Set some default resource requests on the workspace pod #707
Conversation
```go
// SecurityProfileBaselineDefaultImage is the default image used when the security profile is 'baseline'.
SecurityProfileBaselineDefaultImage = "pulumi/pulumi:latest"
// SecurityProfileRestrictedDefaultImage is the default image used when the security profile is 'restricted'.
SecurityProfileRestrictedDefaultImage = "pulumi/pulumi:latest-nonroot"
```
Rationale: moving these constants and the associated 'defaulting' logic to webhook/auto/v1alpha1, where, in the future, a true webhook would apply the defaults eagerly. This PR does NOT implement a webhook; it simply applies the defaults during reconciliation, for simplicity.
See the latest in webhook scaffolding: kubernetes-sigs/kubebuilder#4150
Let me say, the benefit of applying defaults eagerly (with a webhook) rather than lazily (during reconciliation) is stability; one may change the default later without affecting existing workloads. The implicit becomes explicit.
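To make the eager-vs-lazy distinction concrete, here is a minimal sketch of lazy defaulting during reconciliation. The type and function names (`WorkspaceSpec`, `applyDefaults`) are hypothetical stand-ins, not the PR's actual code; only the profile names and default images come from the diff above.

```go
package main

import "fmt"

// Constants mirroring the defaults in the diff above.
const (
	securityProfileBaseline   = "baseline"
	securityProfileRestricted = "restricted"
	baselineDefaultImage      = "pulumi/pulumi:latest"
	restrictedDefaultImage    = "pulumi/pulumi:latest-nonroot"
)

// WorkspaceSpec is a hypothetical, trimmed-down spec for illustration.
type WorkspaceSpec struct {
	SecurityProfile string
	Image           string
}

// applyDefaults fills in defaults on every reconcile. Because the stored
// object never records the defaulted values, changing a default later
// silently changes existing workloads -- the stability trade-off of the
// lazy approach. A mutating webhook would instead persist the defaults
// at admission time, making the implicit explicit.
func applyDefaults(spec *WorkspaceSpec) {
	if spec.SecurityProfile == "" {
		spec.SecurityProfile = securityProfileBaseline
	}
	if spec.Image == "" {
		switch spec.SecurityProfile {
		case securityProfileRestricted:
			spec.Image = restrictedDefaultImage
		default:
			spec.Image = baselineDefaultImage
		}
	}
}

func main() {
	spec := &WorkspaceSpec{SecurityProfile: securityProfileRestricted}
	applyDefaults(spec)
	fmt.Println(spec.Image)
}
```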
This comment was marked as outdated.
Love that we killed the zombies!
My gut says the SetMemoryLimit call feels premature -- after all, if we OOM it'll probably be due to npm and not our little agent. But it probably doesn't hurt. We should definitely pull some profiles if we observe a leak.
```go
// // SetupWorkspaceWebhookWithManager registers the webhook for Workspace in the manager.
// func SetupWorkspaceWebhookWithManager(mgr ctrl.Manager) error {
// 	return ctrl.NewWebhookManagedBy(mgr).For(&autov1alpha1.Workspace{}).
// 		WithDefaulter(&WorkspaceCustomDefaulter{}).
// 		Complete()
// }
```
Unused boilerplate or something to enable later?
To be used later when we implement a webhook, which I felt was overkill for this PR.
@blampe here's the article that convinced me to give the system a hint that we're trying to stay within the 'requests'.
Right, rephrasing my earlier comment: I don't think the agent falls into this high-memory category. It can run under 100MiB and handles one request at a time -- its heap should be pretty quiet :) Child processes will eat most of our memory, which is why it felt premature to me, but again it doesn't really matter.
Proposed changes
Implements good defaults for the workspace resource, using a "burstable" approach.
Since a workspace pod's utilization is bursty -- low resource usage while idle and high resource usage during deployment ops -- the pod requests a small amount of resources (64Mi of memory, 100m of CPU) to be able to idle. A deployment op may use much more memory, up to all available memory on the host.
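A minimal sketch of the resulting "burstable" shape on the workspace pod's container (the request values come from this PR's description; the container name is hypothetical):

```yaml
# Burstable QoS: requests are set but limits are not, so the pod can
# idle cheaply and a deployment op can burst up to the node's capacity.
containers:
  - name: pulumi-workspace   # hypothetical name
    resources:
      requests:
        memory: 64Mi
        cpu: 100m
      # no limits: deployment ops may use all available host memory
```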
Users may customize the resources (e.g. to apply different requests and/or limits). For large/complex Pulumi apps, it might make sense to reserve more memory and/or use #694.
The agent takes some pains to stay within the requested amount, using a programmatic form of the GOMEMLIMIT environment variable. The agent detects the requested amount via the Downward API. We don't use the GOMEMLIMIT environment variable itself, both to avoid propagating it to sub-processes and because the format is a Kubernetes 'quantity'.

It was also observed that zombie processes weren't being cleaned up, and this was leading to resource exhaustion. Fixed by using tini as the entrypoint process (PID 1).
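To illustrate the "programmatic GOMEMLIMIT" idea, here is a hedged sketch, not the PR's actual code: it reads a memory request projected by the Downward API (the file path is hypothetical), parses a small subset of the Kubernetes quantity syntax (hand-rolled here rather than using the real `resource.Quantity` parser), and applies it via `debug.SetMemoryLimit`, which affects only this Go process and is not inherited by sub-processes.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseQuantity handles a small subset of Kubernetes quantity syntax
// (binary suffixes and plain bytes only) -- a sketch, not the real
// k8s.io/apimachinery resource.Quantity parser.
func parseQuantity(s string) (int64, error) {
	suffixes := map[string]int64{"Ki": 1 << 10, "Mi": 1 << 20, "Gi": 1 << 30}
	for suf, mult := range suffixes {
		if strings.HasSuffix(s, suf) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, suf), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * mult, nil
		}
	}
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	// The Downward API can project the container's memory request into a
	// file via resourceFieldRef: requests.memory. This path is hypothetical.
	data, err := os.ReadFile("/etc/podinfo/mem-request")
	if err == nil {
		if limit, perr := parseQuantity(strings.TrimSpace(string(data))); perr == nil {
			// Set the runtime's soft memory limit in-process, without
			// exporting GOMEMLIMIT to child processes.
			debug.SetMemoryLimit(limit)
		}
	}
	// A negative argument queries the current limit without changing it.
	fmt.Println(debug.SetMemoryLimit(-1))
}
```

Note that `debug.SetMemoryLimit` is a soft limit: the garbage collector works harder as the heap approaches it, but it does not hard-cap allocations the way a cgroup memory limit would.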
Related issues (optional)
Closes #698