-
-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds new statistics for tracking connections created and closed #202
Conversation
We add several fiels for tracking the following: - connections_created: How many connections have been created - connections_broken_closed: How many connections were closed due to a broken state - connections_invalid_closed: How many connections were closed due to be considered invalid. - connections_max_lifetime_closed: How many connections were closed due to reaching the max lifetime. - connections_max_idle_timeout_closed: How many connections were closed due to reaching the idle timeout. We have added also a new statistic called `get_waited_time_micro` which provides the accumulated time in micros that all calls to the `get` method had to wait for a new connection.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #202 +/- ##
==========================================
+ Coverage 76.05% 79.15% +3.10%
==========================================
Files 6 6
Lines 543 595 +52
==========================================
+ Hits 413 471 +58
+ Misses 130 124 -6 ☔ View full report in Codecov by Sentry. |
bb8/src/inner.rs
Outdated
self.inner | ||
.statistics | ||
.connections_invalid_closed | ||
.fetch_add(1, Ordering::SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just make this another StatsKind
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jaja, good point. Ill try to do that!
bb8/src/inner.rs
Outdated
self.inner | ||
.statistics | ||
.get_waited_time_micro | ||
.fetch_add(wait_time.as_micros() as u64, Ordering::SeqCst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why micros vs millis or nanos?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that micros would provide enough detail (nanos might be just noise) while we would keep the also the u64 large enough for storing many many many micros.
In any case since we move this to a Duration we will be storing behind the scenes IIRC micros using u128.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oks now I remembered, everything I wanted to use an Atomic Type. I guess that we can try to use an u128
internally and return a Duration (which IIRC is just a u128). If this works we could just forget about the discussion between nanos and micros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally I went for using micros with u64 for converting later to Duration. If you want to use nanos, i can do that. But got the feeling that nanosecond resolution for this specific use case would not provide a lot of value (indeed micro most likely will not neither)
Ive just kept the type to u64
just for having all atomics using the same type and helping me to keep the record
interface simple.
If you want to to go for nanos and/or u128
let me know.
bb8/src/internals.rs
Outdated
@@ -139,22 +147,34 @@ where | |||
ApprovalIter { num: num as usize } | |||
} | |||
|
|||
pub(crate) fn reap(&mut self, config: &Builder<M>) -> ApprovalIter { | |||
pub(crate) fn reap(&mut self, config: &Builder<M>) -> (ApprovalIter, u64, u64) { | |||
let mut max_lifetime_closed: u64 = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid type annotations where possible.
pub(crate) connections_created: AtomicU64, | ||
pub(crate) connections_broken_closed: AtomicU64, | ||
pub(crate) connections_invalid_closed: AtomicU64, | ||
pub(crate) connections_max_lifetime_closed: AtomicU64, | ||
pub(crate) connections_max_idle_timeout_closed: AtomicU64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's skip the connections_
prefixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of them? at least at first sight the connections_created
, connections_broken_closed
, connections_broken_closed
and connections_invalid_closed
without the connections_
prefix will seem a bit off.
Also it follows the same practice for the get_
Kind, where we prefixed it and IMO helps to the user to understand the scope of them, same would happen for connections_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's keep connections_
but I'd like closed
moved directly after connections_
and rename max_idle_timeout
to idle_timeout
.
bb8/src/internals.rs
Outdated
} | ||
|
||
impl AtomicStatistics { | ||
pub(crate) fn record(&self, kind: StatsKind) { | ||
pub(crate) fn record_get(&self, kind: StatsKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we can use it for incrementing other (non-get
) counters, prefer to revert this.
bb8/src/api.rs
Outdated
@@ -99,6 +99,20 @@ pub struct Statistics { | |||
pub get_waited: u64, | |||
/// Total gets performed that timed out while waiting for a connection. | |||
pub get_timed_out: u64, | |||
/// Total time accumulated waiting for a connection. | |||
pub get_waited_time_micro: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this a std::time::Duration
, and lose the _micro
suffix.
@djc this one is ready for a second review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I want a clean commit history with separate logical changes, so this should end up being in a different commit. I usually handle this by using interactive rebases but feel free to use separate PRs if that's simpler for you. I think we need the following changes:
- Rename
StatsKind
toStatsGet
- Rename
record()
torecord_get()
- Track
wait_time
: addwait_time
variable inget()
and changerecord_get()
to accept theOption<Duration>
, should end up inget_wait_time_micros
inAtomicStatistics
and inget_wait_time
inStatistics
- Track
closed_broken
andclosed_created
: recreateAtomicStatistics::record()
andStatsKind
for these - Track
closed_max_lifetime
andclosed_idle_timeout
: create a separate methodAtomicStatistics::reaped()
for these
@@ -85,7 +85,8 @@ where | |||
} | |||
|
|||
pub(crate) async fn get(&self) -> Result<PooledConnection<'_, M>, RunError<M::Error>> { | |||
let mut kind = StatsKind::Direct; | |||
let mut get_kind = StatsKind::GetDirect; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't rename this get_kind
, please.
self.inner.statistics.record(get_kind, 1); | ||
|
||
if let Some(wait_time_start) = wait_time_start { | ||
let wait_time = Instant::now() - wait_time_start; | ||
self.inner | ||
.statistics | ||
.record(StatsKind::GetWaitedTime, wait_time.as_micros() as u64); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wrap this up in a method on AtomicStatistics
called record_get()
.
self.statistics | ||
.record(StatsKind::ConnectionsMaxLifeTimeClosed, max_lifetime_closed); | ||
self.statistics.record( | ||
StatsKind::ConnectionsMaxIdleTimeoutClosed, | ||
max_idle_timeout_closed, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wrap this in a separate method on AtomicStatistics
, too, and keep all of the reaping-related changes in a separate commit or PR.
let mut max_lifetime_closed = 0; | ||
let mut max_idle_timeout_closed = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call these closed_max_lifetime
and closed_idle_timeout
.
pub(crate) connections_created: AtomicU64, | ||
pub(crate) connections_broken_closed: AtomicU64, | ||
pub(crate) connections_invalid_closed: AtomicU64, | ||
pub(crate) connections_max_lifetime_closed: AtomicU64, | ||
pub(crate) connections_max_idle_timeout_closed: AtomicU64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's keep connections_
but I'd like closed
moved directly after connections_
and rename max_idle_timeout
to idle_timeout
.
Direct, | ||
Waited, | ||
TimedOut, | ||
GetDirect, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll want to split this into StatsGet { Waited, Direct, TimedOut }
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@djc just for having an agreement on the new kinds and record operations that we would have within AtomicStatistics
, we can go for this
pub(crate) enum StatsGet {
Waited,
Direct,
TimedOut,
}
pub(crate) enum StatsConnectionClosed {
Broken,
Invalid,
}
Note: we should not need more stats kind since the used record_*
methods for stats like the ones coming from the reap function will just call an ad-hoc function like record_connections_reaped
And then we would have the following record_*
methods within the AtomicStatistics
:
record_get
(takes theStatsGet
parameter)record_connection_closed
(takes theStatsConnectionClosed
parameter)record_connections_reaped
record_get_wait_time
record_connection_created
Let me know if this works for you and i can create a new PR with separated commits for each group, starting with the rename of the record
to record_get
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, as suggested in my previous message we should reuse a simple record()
with StatsKind
for both connection_closed
and connection_created
, and record_get()
should also handle get_wait_time
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oks, let me provide a new draft - sorry for the back and forth here :
pub(crate) enum StatsGet {
Waited,
Direct,
TimedOut,
WaitTime
}
pub(crate) enum StatsConnectionClosed {
Broken,
Invalid,
}
And following record
methods
record
which would take aStatsConnectionClosed
kind typerecord_get
for recording any stats related to get including wait timerecord_connections_reaped
for recording the reaped connections either idle timeout or max lifetime
How does it sound? is this aligned with your expectations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Let's do one PR at a time, please implement the first chunk and I'll give feedback. Please closely read my feedback from this PR because you've still missed some of it.
We add several new statistics fields for tracking the following:
connections_created
: How many connections have been createdconnections_broken_closed
: How many connections were closed due to a broken stateconnections_invalid_closed
: How many connections were closed due to be considered invalid.connections_max_lifetime_closed
: How many connections were closed due to reaching the max lifetime.connections_max_idle_timeout_closed
: How many connections were closed due to reaching the idle timeout.We have added also a new statistic called
get_waited_time_micro
which provides the accumulated time in micros that all calls to theget
method had to wait for a new connection.