Latency metrics are missing in Cassandra 4.1 #500

Open
c3-clement opened this issue Jun 12, 2024 · 6 comments
Comments

@c3-clement

c3-clement commented Jun 12, 2024

Hello,

Our Grafana latency dashboards are not showing any data with Management API 4.1.5-v0.1.79, while they work fine on 4.0.

The following metrics are missing from Prometheus:

  • mcac_client_request_latency_bucket
  • mcac_table_range_latency_bucket
  • mcac_table_read_latency_bucket
  • mcac_table_write_latency_bucket
  • mcac_table_coordinator_read_latency_bucket
  • mcac_table_coordinator_scan_latency_bucket

In the system logs I'm seeing this error message that could be related:

INFO  [insights-8-1] 2024-06-12 14:24:03,177 NoSpamLogger.java:105 - Not able to get buckets for org.apache.cassandra.metrics.dropped_message.internal_dropped_latency.finalize_propose_msg 128 type org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot

I queried the MCAC metrics endpoint on port 9103. In 4.1.5 there is not a single entry starting with collectd_mcac_micros_bucket, while I do see such entries in 4.0.X.

I'm using this telemetry configuration on the K8ssandraCluster:

      telemetry:
        mcac:
          enabled: true
          metricFilters:
            - allow:org.apache.cassandra.metrics.Table
            - allow:org.apache.cassandra.metrics.table
            - allow:org.apache.cassandra.metrics.client_request
        prometheus:
          enabled: true

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: MAPI-4

@c3-clement
Author

@adejanovski @burmanm I've seen the closed issue #444.

However, it seems that the issue is still happening.

@burmanm
Contributor

burmanm commented Jun 19, 2024

#444 should have fixed the missing metrics, and in our testing it did, assuming you use the newer metrics endpoints. The metric names are a bit different, to align with the naming inside Cassandra. Only the older endpoint returns mcac* metrics; that endpoint is deprecated and will not receive any further changes.

@c3-clement
Author

c3-clement commented Jun 19, 2024

Thanks for the feedback @burmanm .

> assuming you use the newer metrics endpoints. The names of the metrics are a bit different

Is there any documentation about those new metrics endpoints and those new metrics names?

We are using k8ssandra-operator, which creates a Prometheus ServiceMonitor to scrape Cassandra metrics, so I assume it should hit the correct endpoint automatically when Cassandra 4.1 is deployed.

However, if the metric names changed, we will probably have to update our Grafana dashboards.

@burmanm
Contributor

burmanm commented Jun 19, 2024

Port 9103 is the old "MCAC" port. The new /metrics endpoint listens on port 9000. The k8ssandra-operator will create ServiceMonitors for the new endpoints if MCAC is no longer enabled:

    telemetry:
      mcac:
        enabled: false

But yes, you would need new dashboards to support the new naming. See here for how to install our example dashboards: https://docs.k8ssandra.io/tasks/monitor/prometheus-grafana/#install-the-grafana-dashboards
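
Combined with the telemetry block from the original report, the change might look like the sketch below. It assumes Prometheus telemetry stays enabled so that the operator keeps creating ServiceMonitors, and it leaves out the `metricFilters` entries on the assumption that they apply only to MCAC. Neither point is confirmed in this thread, so check both against the k8ssandra-operator documentation.

```yaml
# Sketch only: same position in the K8ssandraCluster as the telemetry block
# from the original report.
telemetry:
  mcac:
    enabled: false   # stop using the deprecated MCAC endpoint (port 9103)
  prometheus:
    enabled: true    # assumption: kept from the original config so ServiceMonitors are still created
```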

@burmanm
Contributor

burmanm commented Jun 19, 2024

If you don't wish to disable MCAC yet, you can also simply create a new ServiceMonitor for the new endpoint. The endpoints would look like this in the ServiceMonitor spec:

spec:
  endpoints:
  - port: metrics
    interval: 15s
    path: /metrics
    scheme: http
    scrapeTimeout: 15s

The rest can be copied from the old one.
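
For orientation, a complete manifest built around that endpoint could look roughly like the sketch below. Only the `endpoints` entry comes from this thread. The name, namespace, and selector values are hypothetical placeholders and should be replaced with the values from the ServiceMonitor the operator already created.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cassandra-metrics-new          # hypothetical name
  namespace: my-cassandra-namespace    # hypothetical; use the namespace of the existing ServiceMonitor
spec:
  # Placeholder selectors: copy the real ones from the existing ServiceMonitor.
  selector:
    matchLabels:
      example.com/placeholder-label: copy-from-existing
  namespaceSelector:
    matchNames:
      - my-cassandra-namespace
  endpoints:
    - port: metrics        # refers to a named port on the selected Service
      interval: 15s
      path: /metrics
      scheme: http
      scrapeTimeout: 15s
```

If the Prometheus instance discovers ServiceMonitors through a label selector, the matching labels on the existing ServiceMonitor's metadata need to be copied over as well.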

@c3-clement
Author

c3-clement commented Jun 19, 2024

Thanks a lot @burmanm! We will try this shortly.
