Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose waiters queue depth metrics #56

Merged
merged 6 commits into from
Oct 25, 2024

Conversation

yliang412
Copy link
Contributor

@yliang412 yliang412 commented Oct 16, 2024

Problem

We want to make slots waiters queue depth observable, so it is easier to tune the submission queue size (also equal to the number of inflight operation as implemented in the system).

Summary of changes

  • Introduced PerSystemMetrics trait that record slots submission queue depth. In the future, per-system metrics can be added through extending this trait.
  • tokio_epoll_uring::System now takes in the per-system metrics observer Arc<M> where M : PerSystemMetrics

Alternative design

See neondatabase/neon#9482 for details.

@yliang412 yliang412 force-pushed the yuchen/expose-waiters-queue-depth-metrics branch from d5fccc5 to ff31e19 Compare October 16, 2024 17:14
@yliang412 yliang412 marked this pull request as ready for review October 17, 2024 13:23
@yliang412 yliang412 self-assigned this Oct 17, 2024
Copy link

@VladLazar VladLazar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sensible.


Generally however, I'm not sure this is the right approach for per-system metrics.
To my mind, each system tracks it's own metrics separately. When you wish to recover the metrics, you go through each thread local system, query it and return the aggregated result. This removes the need for locking.

I looked into this a little bit and the std lib doesn't provide nice functionality for this (maybe there's some tokio API you can use to iterate threads). The [thread_local](https://docs.rs/thread_local/latest/thread_local/) crate however lets you grab a nice iterator.

tokio-epoll-uring/src/metrics.rs Outdated Show resolved Hide resolved
tokio-epoll-uring/src/metrics.rs Outdated Show resolved Hide resolved
tokio-epoll-uring/src/metrics.rs Outdated Show resolved Hide resolved
@yliang412
Copy link
Contributor Author

I looked into this a little bit and the std lib doesn't provide nice functionality for this (maybe there's some tokio API you can use to iterate threads). The thread_local crate however lets you grab a nice iterator.

If we could use this in pageserver, then I can make each system in tokio-epoll-uring tracks it's own metrics separately, and use the iterator to collect results when collecting metrics.

@VladLazar
Copy link

I looked into this a little bit and the std lib doesn't provide nice functionality for this (maybe there's some tokio API you can use to iterate threads). The thread_local crate however lets you grab a nice iterator.

If we could use this in pageserver, then I can make each system in tokio-epoll-uring tracks it's own metrics separately, and use the iterator to collect results when collecting metrics.

Right. Might be worth investigating, but don't wish to side-track you too much. I think what you have here is fine in the short term. I'd add a comment to the metrics saying that: "if you wanna add new per system metrics, do it properly and link to an issue".

@yliang412
Copy link
Contributor Author

In 132aea5, I wrote comments on adding new per-system metrics, and linked to neondatabase/neon#9480.

Copy link
Collaborator

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, independent of the thread_local discussion, the way we add to the metrics here is that we only add observations if we don't get a slot.
We'll never know how close we are to 100% utilization until we hit 100%.
We want to know before.

How to solve this?
The RING_SIZE - unused_indices.len() should give us the number of in-use slots.

If we get a slot right away, observe that number.
If we don't get a slot and have to enqueue to wait, observe above number PLUS waiters.len().

As to the kind of metric: you put in the equivalent of a gauge here.
Gauges aren't useful at our sampling frequency.
The way I have seen queue depth metrics done in other places is by using histograms over observed queue depth.
E.g., when we get a slot, we get a number as per the rules in the previous paragrah.
We record this into the histogram by finding the bucket for that number.
E.g. we get 12 waiters and we have buckets [0,10)[10,20)[20,30)...
Then we increment bucket [10,20).
(Prometheus histograms are le histograms, so, it would also increment all later buckets, but that's an implementation detail).


As per thread_local, I'm all for using it so we don't have that RwLock. That RwLock is a scalability bottleneck because RwLock::read requires exclusive access to a cache line for recording the new reader. What I'd give for read-mostly locks in useland Rust :(

@yliang412 yliang412 force-pushed the yuchen/expose-waiters-queue-depth-metrics branch from 132aea5 to 54a668c Compare October 24, 2024 02:30
Signed-off-by: Yuchen Liang <[email protected]>
@yliang412 yliang412 requested a review from problame October 25, 2024 15:19
@yliang412 yliang412 merged commit cb2dcea into main Oct 25, 2024
4 of 6 checks passed
@yliang412 yliang412 deleted the yuchen/expose-waiters-queue-depth-metrics branch October 25, 2024 18:58
yliang412 added a commit to neondatabase/neon that referenced this pull request Oct 25, 2024
…9482)

In complement to
neondatabase/tokio-epoll-uring#56.

## Problem

We want to make tokio-epoll-uring slots waiters queue depth observable
via Prometheus.

## Summary of changes

- Add `pageserver_tokio_epoll_uring_slots_submission_queue_depth`
metrics as a `Histogram`.
- Each thread-local tokio-epoll-uring system is given a `LocalHistogram`
to observe the metrics.
- Keep a list of `Arc<ThreadLocalMetrics>` used on-demand to flush data
to the shared histogram.
- Extend `Collector::collect` to report
`pageserver_tokio_epoll_uring_slots_submission_queue_depth`.

Signed-off-by: Yuchen Liang <[email protected]>
Co-authored-by: Christian Schwarz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants