feat(ct-metrics): add initial conntrack metrics for Prometheus #1057

Draft
wants to merge 1 commit into
base: main

SRodi
Member

@SRodi SRodi commented Nov 22, 2024

Description

Create initial metrics for conntrack:

  • packets_count_per_connection [gauge] see example below
  • bytes_count_per_connection [gauge]
  • connections_count [count]

Related Issue

#806

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Screenshots (if applicable) or Testing Completed

This example, networkobservability_packets_count_per_connection{src_ip="10.244.0.94"}, involves the following IPs: 10.244.0.94 = prometheus-kube-state-metrics, 10.244.0.98 = prometheus-prometheus-kube-prometheus-prometheus, and 10.96.0.1 = svc/kubernetes.

In this example, direction="TRAFFIC_DIRECTION_UNKNOWN" because Prometheus was deployed before Retina. Conntrack tracks each connection by source IP and destination IP, so each time a connection is closed, packets_count_per_connection drops to 0 and then climbs again until the next TCP connection is closed.
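The reset-on-close behaviour described above can be sketched with a plain Go map standing in for a labeled gauge. This is only an illustration, not the code in this PR: `connKey`, `packetGauge`, and its methods are hypothetical names, and the IPs in `main` are reused from the example purely as sample values.

```go
package main

import "fmt"

// connKey identifies a series by the two labels discussed in this PR.
// Hypothetical type, for illustration only.
type connKey struct {
	SrcIP string
	DstIP string
}

// packetGauge is a hypothetical stand-in for a labeled gauge.
type packetGauge struct {
	values map[connKey]uint64
}

func newPacketGauge() *packetGauge {
	return &packetGauge{values: make(map[connKey]uint64)}
}

// Observe adds packets seen on an open connection.
func (g *packetGauge) Observe(k connKey, packets uint64) {
	g.values[k] += packets
}

// Close models the connection being closed: the series drops to 0.
func (g *packetGauge) Close(k connKey) {
	g.values[k] = 0
}

// Value returns the current gauge value for the series (0 if unseen).
func (g *packetGauge) Value(k connKey) uint64 {
	return g.values[k]
}

func main() {
	g := newPacketGauge()
	k := connKey{SrcIP: "10.244.0.94", DstIP: "10.96.0.1"}

	g.Observe(k, 5)
	g.Observe(k, 3)
	fmt.Println(g.Value(k)) // 8

	g.Close(k)
	fmt.Println(g.Value(k)) // 0

	g.Observe(k, 2) // a new connection between the same pair of IPs
	fmt.Println(g.Value(k)) // 2
}
```

Because the key is only (src_ip, dst_ip), a new connection between the same pair reuses the same series, which is why the graph shows a sawtooth rather than separate lines.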

Screenshot 2024-11-22 085548

This example shows a long-lived TCP connection between client src_ip="10.244.0.35" and server 10.244.0.99 via svc 10.96.188.34.

image

This example shows bytes_count_per_connection where dst_ip=172.18.0.2, which is the pod IP for kube-proxy.

image

Additional Notes

I pushed this PR to create a discussion based on the above examples. Points to discuss:

  1. Using gauge for packets_count_per_connection and bytes_count_per_connection
  2. Reduce cardinality by removing labels?
  3. Increase cardinality by adding labels like TCP port?
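To make the cardinality trade-off in points 2 and 3 concrete, here is a rough sketch that counts how many distinct series a set of flows would produce under different label choices. The `flow` type, the labeling functions, and the sample flows (including the ports) are made up for illustration and are not taken from this PR.

```go
package main

import "fmt"

// flow is a hypothetical observed connection; field names are illustrative.
type flow struct {
	SrcIP   string
	DstIP   string
	DstPort uint16
}

// seriesCount returns how many distinct label sets (time series) the
// flows produce under the given labeling function.
func seriesCount(flows []flow, labels func(flow) string) int {
	seen := make(map[string]struct{})
	for _, f := range flows {
		seen[labels(f)] = struct{}{}
	}
	return len(seen)
}

// Three candidate label sets, from lowest to highest cardinality.
func bySrc(f flow) string    { return f.SrcIP }
func bySrcDst(f flow) string { return f.SrcIP + "|" + f.DstIP }
func bySrcDstPort(f flow) string {
	return fmt.Sprintf("%s|%s|%d", f.SrcIP, f.DstIP, f.DstPort)
}

func main() {
	// Sample data only; ports are invented.
	flows := []flow{
		{SrcIP: "10.244.0.94", DstIP: "10.96.0.1", DstPort: 443},
		{SrcIP: "10.244.0.98", DstIP: "10.96.0.1", DstPort: 443},
		{SrcIP: "10.244.0.98", DstIP: "10.244.0.94", DstPort: 8080},
		{SrcIP: "10.244.0.98", DstIP: "10.244.0.94", DstPort: 8081},
	}

	fmt.Println(seriesCount(flows, bySrc))        // 2
	fmt.Println(seriesCount(flows, bySrcDst))     // 3
	fmt.Println(seriesCount(flows, bySrcDstPort)) // 4
}
```

Every added label can only keep the series count the same or raise it, so the port label is worth adding only if per-port breakdowns are actually needed.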

Known issues

  1. GC for conntrack_metrics map
  2. conntrack_metrics map update logic to be reviewed
  3. direction="TRAFFIC_DIRECTION_UNKNOWN" for new connections to be reviewed
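For the first known issue, one possible shape for the GC is an idle-timeout sweep. The sketch below assumes a plain Go map with a last-seen timestamp per entry; it is not how conntrack_metrics is implemented in this PR, and `entryKey`, `metricsEntry`, `sweep`, and the timeout value are all hypothetical.

```go
package main

import "fmt"

// entryKey and metricsEntry are hypothetical stand-ins for the key and
// value of the conntrack_metrics map.
type entryKey struct {
	SrcIP string
	DstIP string
}

type metricsEntry struct {
	Packets      uint64
	LastSeenUnix int64 // seconds; assumed to be refreshed on every update
}

// sweep deletes entries that have been idle longer than maxIdleSec and
// returns how many were removed.
func sweep(m map[entryKey]metricsEntry, nowUnix, maxIdleSec int64) int {
	removed := 0
	for k, e := range m {
		if nowUnix-e.LastSeenUnix > maxIdleSec {
			delete(m, k) // deleting while ranging over a map is safe in Go
			removed++
		}
	}
	return removed
}

func main() {
	now := int64(1_000_000) // arbitrary "current time" for the example
	m := map[entryKey]metricsEntry{
		{SrcIP: "10.244.0.35", DstIP: "10.96.188.34"}: {Packets: 120, LastSeenUnix: now - 10},
		{SrcIP: "10.244.0.94", DstIP: "10.96.0.1"}:    {Packets: 7, LastSeenUnix: now - 300},
	}

	removed := sweep(m, now, 120) // drop anything idle for more than 2 minutes
	fmt.Println(removed, len(m))  // 1 1
}
```

A sweep like this would also need to remove the matching exported series, otherwise stale label sets keep being reported.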

Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.

@SRodi SRodi self-assigned this Nov 22, 2024
@SRodi SRodi requested a review from a team as a code owner November 22, 2024 10:20
@SRodi SRodi marked this pull request as draft November 22, 2024 10:21
@MikeZappa87

Did we think through scalability, performance, and stability when adding these metrics? Do we have any concerns? Do we have numbers to back this up?

2 participants