✨ Configurable QoS resources for cluster-manager and klusterlet operators and their managed containers #400

Merged

Conversation

dongbeiqing91 (Contributor)

Summary

Configurable QoS resources for cluster-manager and klusterlet operators and their managed pods

Related issue(s)

Related PRs

open-cluster-management-io/ocm#351
open-cluster-management-io/api#316

Fixes #

openshift-ci bot requested review from itdove and qiujian16 · January 30, 2024 06:04
@@ -54,6 +54,17 @@ func NewCmd(clusteradmFlags *genericclioptionsclusteradm.ClusteradmFlags, stream
"If set, the command will initialize the OCM control plan in foreground.")
cmd.Flags().StringVarP(&o.output, "output", "o", "text", "output foramt, should be json or text")
cmd.Flags().BoolVar(&o.singleton, "singleton", false, "If true, deploy singleton controlplane instead of cluster-manager. This is an alpha stage flag.")
cmd.Flags().StringVar(&o.operatorResourceQosClass, "operator-resource-qos-class", "Default", "the resource QoS class of the cluster manager operator. Can be one of Default or BestEffort")
Member

I think we could reference this command: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_set/kubectl_set_resources/.

something like:

clusteradm init --resource-requests=cpu=xxx,memory=xxx --resource-limits=cpu=xxx,memory=xxx

It is not necessary to specify the type; we can infer the type when resource-requests is set.

Contributor Author

I updated the flag value format as you suggested.
The type is still necessary, I suppose, because we still need to know whether it is Default or BestEffort when limits and requests are not set.
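For illustration only, a minimal sketch of that decision logic; the function name and exact rules below are hypothetical, not the PR's actual code:

package main

import "fmt"

// resolveQosClass illustrates why the class flag is still needed: when no requests or
// limits are given, the flag decides between Default and BestEffort; any explicit
// values imply a concrete ResourceRequirement. (Hypothetical helper, not the PR's code.)
func resolveQosClass(class string, requests, limits map[string]string) (string, error) {
	if len(requests) == 0 && len(limits) == 0 {
		switch class {
		case "Default", "BestEffort":
			return class, nil
		default:
			return "", fmt.Errorf("unsupported resource QoS class: %s", class)
		}
	}
	return "ResourceRequirement", nil
}

func main() {
	fmt.Println(resolveQosClass("BestEffort", nil, nil))                           // BestEffort <nil>
	fmt.Println(resolveQosClass("Default", map[string]string{"cpu": "500m"}, nil)) // ResourceRequirement <nil>
}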

@@ -54,6 +54,13 @@ func NewCmd(clusteradmFlags *genericclioptionsclusteradm.ClusteradmFlags, stream
"If set, the command will initialize the OCM control plan in foreground.")
cmd.Flags().StringVarP(&o.output, "output", "o", "text", "output foramt, should be json or text")
cmd.Flags().BoolVar(&o.singleton, "singleton", false, "If true, deploy singleton controlplane instead of cluster-manager. This is an alpha stage flag.")
cmd.Flags().StringVar(&o.operatorResourceQosClass, "operator-resource-qos-class", "Default", "the resource QoS class of the cluster manager operator. Can be one of Default, BestEffort or ResourceRequirement.")
cmd.Flags().StringToStringVar(&o.operatorResourceLimits, "operator-resource-limits", nil, "the resource limits of the cluster manager operator, for example: cpu=800m,memory=800Mi")
Member

Could we also update the flag for the operator? Actually, why do we need to set the operator separately? Can we just use the same resource-limits flag for the operator as well?

I am thinking we can start by using the same flag for both the operator and all other containers. We could add a flag specifically for the operator later if there is a real requirement. WDYT?

Contributor Author

I am OK with that.
Updated.
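For illustration, a minimal sketch of how a cpu=800m,memory=800Mi flag map could be turned into a corev1.ResourceList (the function name is hypothetical; the PR's own parsing appears to live in pkg/helpers/resourcerequirement):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// toResourceList converts a flag map such as cpu=800m,memory=800Mi into a
// corev1.ResourceList. (Illustrative sketch, not the PR's exact helper.)
func toResourceList(in map[string]string) (corev1.ResourceList, error) {
	out := corev1.ResourceList{}
	for name, value := range in {
		quantity, err := resource.ParseQuantity(value)
		if err != nil {
			return nil, fmt.Errorf("invalid quantity %q for resource %q: %w", value, name, err)
		}
		out[corev1.ResourceName(name)] = quantity
	}
	return out, nil
}

func main() {
	limits, err := toResourceList(map[string]string{"cpu": "800m", "memory": "800Mi"})
	if err != nil {
		panic(err)
	}
	fmt.Println(limits.Cpu(), limits.Memory()) // 800m 800Mi
}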

pkg/helpers/resourcerequirement/resourcerequirement.go (review thread outdated, resolved)
pkg/helpers/resourcerequirement/fake_deployment.yaml (review thread outdated, resolved)
dongbeiqing91 force-pushed the dev/qos branch 3 times, most recently from 59aea94 to e0c9aa2 · January 30, 2024 09:48
@qiujian16 (Member) left a comment

/approve

cc @zhujian7


openshift-ci bot commented Jan 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: promid, qiujian16

The full list of commands accepted by this bot can be found here.

The pull request process is described here


Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pkg/cmd/init/cmd.go (review thread outdated, resolved)
pkg/cmd/join/cmd.go (review thread outdated, resolved)
return nil, err
}
rr.Requests[corev1.ResourceName(rsc)] = quantity
}
Member

Not sure if it is necessary, but should we add a check to ensure that requests <= limits?

Contributor Author

sure, updated
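For illustration, a minimal sketch of such a requests-vs-limits check using resource.Quantity's Cmp; the function name is hypothetical and not necessarily how the PR implements it:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// validateRequestsVsLimits checks that, for every resource with both a request and a
// limit, the request does not exceed the limit. (Hypothetical check, not the PR's code.)
func validateRequestsVsLimits(rr corev1.ResourceRequirements) error {
	for name, request := range rr.Requests {
		limit, ok := rr.Limits[name]
		if !ok {
			continue // no limit set for this resource, nothing to compare
		}
		if request.Cmp(limit) > 0 {
			return fmt.Errorf("request %s for %s exceeds limit %s",
				request.String(), name, limit.String())
		}
	}
	return nil
}

func main() {
	rr := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("1")},
		Limits:   corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("800m")},
	}
	fmt.Println(validateRequestsVsLimits(rr)) // request 1 for cpu exceeds limit 800m
}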

@zhujian7 (Member)

@promid, thanks for your PR, this is great! Generally LGTM; I left some nit comments.

…rs and their managed pods

Signed-off-by: Dong Beiqing <[email protected]>
@dongbeiqing91 (Contributor Author)

@qiujian16 @zhujian7 Can you PTAL at the e2e CI?

@qiujian16 (Member)

The e2e fails; I think you need to run the test locally to check. I guess the API change introduces a compatibility issue with the released version.

@dongbeiqing91 (Contributor Author)

local e2e passed @qiujian16

clusteradm-local-e2e.log

[screenshot: local e2e test results]

@qiujian16 (Member)

/lgtm

openshift-ci bot added the lgtm label · Feb 8, 2024
openshift-merge-bot merged commit c384319 into open-cluster-management-io:main · Feb 8, 2024
8 checks passed