Adding felix service metric port #3534
Conversation
There is a PrometheusMetricsPort in FelixConfig that should be used instead of always using a built-in default value.
There is also a migration in migration/core.go that sets installation.spec.NodeMetricsPort. Should we perhaps:
Was felixConfiguration.Spec.PrometheusMetricsPort a new field that has been added? I'm wondering if that field was needed if we already had NodeMetricsPort in the operator. If both NodeMetricsPort and PrometheusMetricsPort are set and are different, should the operator report that as a problem?
PrometheusMetricsPort is not a new field; see operator/api/v1/installation_types.go, line 119 at ea0f4fb.
So I guess one thing I'm unclear on: should installation.NodeMetricsPort and felixConfig.PrometheusMetricsPort be configuring the same thing, or are they different?
@tmjd The changes I have made are specific to enabling felix metrics for the "Bring your own Prometheus" use case (https://docs.tigera.io/calico-enterprise/latest/operations/monitor/prometheus/byo-prometheus#scrape-metrics-from-specific-components).
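For readers following the thread, here is a minimal sketch of the precedence question being debated. The helper name, the stand-in types, and the conflict check are illustrative assumptions, not the PR's actual code; 9091 is felix's documented default metrics port.

package sketch

import "fmt"

// Minimal stand-ins for the relevant spec fields; the real types live in the
// Installation and FelixConfiguration CRDs.
type InstallationSpec struct {
	NodeMetricsPort *int32 // operator-level calico/node metrics port
}

type FelixConfigurationSpec struct {
	PrometheusMetricsPort *int // felix's own prometheus metrics port
}

const defaultFelixMetricsPort = 9091 // felix's documented default

// felixMetricsPort resolves which port to expose for felix metrics. The
// conflict check mirrors the open question above about reporting a problem
// when both fields are set and disagree; it is not settled behavior.
func felixMetricsPort(inst InstallationSpec, felix FelixConfigurationSpec) (int, error) {
	if inst.NodeMetricsPort != nil && felix.PrometheusMetricsPort != nil &&
		int(*inst.NodeMetricsPort) != *felix.PrometheusMetricsPort {
		return 0, fmt.Errorf("NodeMetricsPort (%d) and PrometheusMetricsPort (%d) conflict",
			*inst.NodeMetricsPort, *felix.PrometheusMetricsPort)
	}
	if felix.PrometheusMetricsPort != nil {
		return *felix.PrometheusMetricsPort, nil
	}
	return defaultFelixMetricsPort, nil
}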
pkg/render/node.go (outdated)
if c.cfg.Installation.NodeMetricsPort != nil {
	return *c.cfg.Installation.NodeMetricsPort
}
Since the two metrics ports are unrelated, other than both providing some metrics information, I don't think there is any reason we should use NodeMetricsPort here.
If this is needed, then I'd suggest this logic be moved to the core_controller.
Updated to only use PrometheusMetricsPort.
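In other words, the resolution looks roughly like the sketch below: NodeMetricsPort is no longer consulted at all. The function name and the 9091 fallback are illustrative assumptions, reusing the stand-in types from the sketch above.

// Post-review shape: only FelixConfiguration's PrometheusMetricsPort matters
// here; Installation.NodeMetricsPort is ignored for the felix port.
func felixPrometheusMetricsPort(felix FelixConfigurationSpec) int {
	if felix.PrometheusMetricsPort != nil {
		return *felix.PrometheusMetricsPort
	}
	return defaultFelixMetricsPort // 9091 when unset
}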
LGTM as far as the operator API and operator code quality. Was there a review of whether this is appropriate for Calico/Enterprise? For example, should the felix metrics port be exposed on the same service as the node metrics, or should there be a separate service?
These changes add the felix metrics port to the calico-node service, removing the manual step the client currently needs to perform to enable felix metrics for BYO Prometheus.
Rene and I discussed adding this to the same service, since we are using "calico-node" in the selector. We also discussed that the ServiceMonitor needs updating, and I have updated the code for that.
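A sketch of the shape of that change, using the upstream Kubernetes core types. Appending a port to the existing calico-node Service follows the discussion above, but the exact port name "felix-metrics-port" is an assumption.

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// felixMetricsServicePort builds the extra port appended to the existing
// calico-node Service (which already selects the calico-node pods), rather
// than creating a second Service.
func felixMetricsServicePort(port int32) corev1.ServicePort {
	return corev1.ServicePort{
		Name:       "felix-metrics-port", // the ServiceMonitor endpoint must reference this name
		Port:       port,
		TargetPort: intstr.FromInt(int(port)),
		Protocol:   corev1.ProtocolTCP,
	}
}

The ServiceMonitor side of the change then only needs an additional endpoint whose port field matches this name, which is what the felix-metrics-service-monitor.yaml update mentioned in the description covers.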
@rene-dekker did you want to review this then, to make sure it follows what you discussed with @vikastigera?
@@ -373,18 +379,24 @@ func (r *ReconcileMonitor) Reconcile(ctx context.Context, request reconcile.Requ
		return reconcile.Result{}, err
	}

	felixConfiguration, err := utils.GetFelixConfiguration(ctx, r.client)
	if err != nil {
		log.Error(err, "Error retrieving Felix configuration")
Is there a reason that we log and move on as opposed to degrading and returning? Or would that create some sort of deadlock with the core controller?
Good question; I wouldn't expect it to be problematic.
Updated
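For completeness, here is a self-contained sketch of the "degrade and return" shape being discussed, with stand-in types so it compiles on its own. The real code uses the operator's status manager and controller-runtime's reconcile package, whose exact signatures may differ.

package sketch

import (
	"context"
	"fmt"
)

// Stand-ins for the real dependencies (assumptions, not the operator's API).
type Result struct{ Requeue bool }

type statusManager interface {
	// Loosely modeled on the operator's status manager; the real
	// SetDegraded signature may differ.
	SetDegraded(reason, msg string, err error)
}

type felixConfig struct{ PrometheusMetricsPort *int }

type reconciler struct {
	status   statusManager
	getFelix func(ctx context.Context) (*felixConfig, error)
}

func (r *reconciler) reconcileFelixPort(ctx context.Context) (Result, error) {
	felix, err := r.getFelix(ctx)
	if err != nil {
		// Rather than logging and moving on, surface the failure via the
		// status manager and return, so the reconcile is retried.
		r.status.SetDegraded("ResourceReadError", "Error retrieving Felix configuration", err)
		return Result{}, fmt.Errorf("retrieving FelixConfiguration: %w", err)
	}
	_ = felix // port resolution would continue here as in the sketches above
	return Result{}, nil
}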
I think the vision we have around scraping is that we prefer users to get metrics by scraping our Prometheus, and have our Prometheus scrape node/felix. Users are struggling with the mTLS configuration, and leveraging the ExternalPrometheus option could simplify that. With that in mind, I don't see any drawback to this, unless you can think of a reason.
I wasn't trying to suggest a problem with it; I just wanted to make sure someone had reviewed this change from the perspective of whether it is the right solution for Enterprise and addresses the problem it was targeting.
lgtm
Description
These changes add the felix service metrics port to the calico-node service. This removes the manual step required by the client to enable felix metrics for BYO Prometheus.
https://tigera.atlassian.net/browse/EV-5305
*** Additional changes were required to update felix-metrics-service-monitor.yaml to this:
Testing
For PR author
make gen-files
make gen-versions
For PR reviewers
A note for code reviewers - all pull requests must have the following:
kind/bug if this is a bugfix.
kind/enhancement if this is a new feature.
enterprise if this PR applies to Calico Enterprise only.