expose waiters queue depth metrics #56
Conversation
Looks sensible.
Generally, however, I'm not sure this is the right approach for per-system metrics.
To my mind, each system tracks its own metrics separately. When you wish to recover the metrics, you go through each thread-local system, query it, and return the aggregated result. This removes the need for locking.
I looked into this a little bit and the std lib doesn't provide nice functionality for this (maybe there's some tokio API you can use to iterate threads). The [thread_local](https://docs.rs/thread_local/latest/thread_local/) crate, however, lets you grab a nice iterator.
If we could use this in pageserver, then I can make each system in …
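A minimal sketch of the lock-free aggregation approach described above, assuming the [thread_local](https://docs.rs/thread_local/latest/thread_local/) crate; the `PerSystemCounters`/`Metrics` names and the counter itself are illustrative, not actual tokio-epoll-uring or pageserver types:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use thread_local::ThreadLocal;

// Illustrative per-system metrics; each thread-local system owns one instance.
struct PerSystemCounters {
    queue_depth_observations: AtomicU64,
}

// Shared handle; recording never takes a lock, aggregation iterates all threads.
struct Metrics {
    per_thread: ThreadLocal<PerSystemCounters>,
}

impl Metrics {
    fn new() -> Self {
        Metrics { per_thread: ThreadLocal::new() }
    }

    fn record(&self) {
        // Each thread lazily creates and then reuses its own counters.
        let counters = self.per_thread.get_or(|| PerSystemCounters {
            queue_depth_observations: AtomicU64::new(0),
        });
        counters.queue_depth_observations.fetch_add(1, Ordering::Relaxed);
    }

    fn aggregate(&self) -> u64 {
        // The scrape path walks every thread's counters and sums them.
        self.per_thread
            .iter()
            .map(|c| c.queue_depth_observations.load(Ordering::Relaxed))
            .sum()
    }
}
```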
Right. Might be worth investigating, but I don't wish to side-track you too much. I think what you have here is fine in the short term. I'd add a comment to the metrics saying: "if you wanna add new per-system metrics, do it properly and link to an issue".
In 132aea5, I wrote comments on adding new per-system metrics, and linked to neondatabase/neon#9480.
Hm, independent of the `thread_local` discussion: the way we add to the metrics here, we only add observations if we don't get a slot. We'll never know how close we are to 100% utilization until we hit 100%. We want to know before.
How to solve this? `RING_SIZE - unused_indices.len()` should give us the number of in-use slots. If we get a slot right away, observe that number. If we don't get a slot and have to enqueue to wait, observe the above number plus `waiters.len()`.
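A minimal sketch of the observation rule suggested above; `RING_SIZE`, `unused_indices`, `waiters`, and `observe_queue_depth` are stand-ins for the real slots state and metric hook, not the actual tokio-epoll-uring code:

```rust
use std::collections::VecDeque;

const RING_SIZE: usize = 128; // assumed ring size, for illustration only

struct Slots {
    unused_indices: Vec<usize>, // free slot indices
    waiters: VecDeque<()>,      // tasks queued up waiting for a slot
}

impl Slots {
    fn try_get_slot(&mut self, observe_queue_depth: impl Fn(usize)) -> Option<usize> {
        // Number of slots currently in use.
        let in_use = RING_SIZE - self.unused_indices.len();
        if let Some(idx) = self.unused_indices.pop() {
            // Got a slot right away: observe the number of in-use slots.
            observe_queue_depth(in_use);
            Some(idx)
        } else {
            // Must enqueue and wait: observe in-use slots plus existing waiters.
            observe_queue_depth(in_use + self.waiters.len());
            self.waiters.push_back(());
            None
        }
    }
}
```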
As to the kind of metric: you put in the equivalent of a gauge here. Gauges aren't useful at our sampling frequency. The way I have seen queue depth metrics done in other places is by using histograms over observed queue depth.
E.g., when we get a slot, we get a number as per the rules in the previous paragraph. We record this into the histogram by finding the bucket for that number. E.g., we get 12 waiters and we have buckets [0,10) [10,20) [20,30) ...; then we increment bucket [10,20). (Prometheus histograms are `le` histograms, so it would also increment all later buckets, but that's an implementation detail.)
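A minimal sketch of such a queue-depth histogram, assuming the `prometheus` crate; the metric name and bucket layout are illustrative:

```rust
use prometheus::{Histogram, HistogramOpts};

fn main() {
    let hist = Histogram::with_opts(
        HistogramOpts::new(
            "tokio_epoll_uring_slots_queue_depth", // hypothetical metric name
            "Observed slots queue depth when acquiring a slot",
        )
        // Buckets [0,10), [10,20), [20,30), ... expressed as `le` upper bounds.
        .buckets(vec![10.0, 20.0, 30.0, 40.0, 50.0]),
    )
    .unwrap();

    // Observing 12 lands in the `le = 20` bucket (and, per Prometheus `le`
    // semantics, every later bucket as well).
    hist.observe(12.0);
}
```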
As for `thread_local`, I'm all for using it so we don't have that `RwLock`. That `RwLock` is a scalability bottleneck because `RwLock::read` requires exclusive access to a cache line for recording the new reader. What I'd give for read-mostly locks in userland Rust :(
…9482) In complement to neondatabase/tokio-epoll-uring#56.

## Problem

We want to make tokio-epoll-uring slots waiters queue depth observable via Prometheus.

## Summary of changes

- Add `pageserver_tokio_epoll_uring_slots_submission_queue_depth` metrics as a `Histogram`.
- Each thread-local tokio-epoll-uring system is given a `LocalHistogram` to observe the metrics.
- Keep a list of `Arc<ThreadLocalMetrics>` used on-demand to flush data to the shared histogram.
- Extend `Collector::collect` to report `pageserver_tokio_epoll_uring_slots_submission_queue_depth`.

Signed-off-by: Yuchen Liang <[email protected]>
Co-authored-by: Christian Schwarz <[email protected]>
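A minimal sketch of the `LocalHistogram` → shared `Histogram` flush mechanic the summary above relies on, assuming the `prometheus` crate; the metric name is illustrative, and the actual `Arc<ThreadLocalMetrics>` wiring in pageserver is not shown:

```rust
use prometheus::{Histogram, HistogramOpts};

fn main() {
    // Shared histogram that the Prometheus collector ultimately exports.
    let shared = Histogram::with_opts(HistogramOpts::new(
        "slots_submission_queue_depth", // hypothetical metric name
        "Observed slots submission queue depth",
    ))
    .unwrap();

    // Per-thread local view: observations are buffered without synchronization.
    let local = shared.local();
    local.observe(3.0);
    local.observe(12.0);

    // On demand (e.g. when Collector::collect runs), flush the buffered
    // samples into the shared histogram before reporting it.
    local.flush();
    assert_eq!(shared.get_sample_count(), 2);
}
```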
Problem
We want to make slots waiters queue depth observable, so it is easier to tune the submission queue size (also equal to the number of inflight operations, as implemented in the system).
Summary of changes
- A `PerSystemMetrics` trait that records slots submission queue depth. In the future, per-system metrics can be added by extending this trait (see the sketch after this list).
- `tokio_epoll_uring::System` now takes in the per-system metrics observer `Arc<M>` where `M: PerSystemMetrics`.
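A minimal sketch of what such a trait and observer hand-off could look like; the method name and the generic `System<M>` shape here are assumptions for illustration, not the exact tokio-epoll-uring API:

```rust
use std::sync::Arc;

// Assumed shape of the per-system metrics observer trait.
pub trait PerSystemMetrics: Send + Sync + 'static {
    /// Record the slots submission queue depth observed while acquiring a slot.
    fn observe_slots_submission_queue_depth(&self, queue_depth: u64);
}

// A no-op observer, e.g. for callers that don't wire up metrics.
pub struct NoMetrics;

impl PerSystemMetrics for NoMetrics {
    fn observe_slots_submission_queue_depth(&self, _queue_depth: u64) {}
}

// Hypothetical system shape mirroring the `Arc<M> where M: PerSystemMetrics`
// hand-off mentioned above.
pub struct System<M: PerSystemMetrics> {
    metrics: Arc<M>,
}

impl<M: PerSystemMetrics> System<M> {
    pub fn launch(metrics: Arc<M>) -> Self {
        System { metrics }
    }

    fn on_slot_acquired(&self, queue_depth: u64) {
        self.metrics.observe_slots_submission_queue_depth(queue_depth);
    }
}
```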
Alternative design
See neondatabase/neon#9482 for details.