Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide statistics for get operations #198

Merged
merged 1 commit into from
Jun 5, 2024
Merged

Conversation

pfreixes
Copy link
Contributor

@pfreixes pfreixes commented Apr 18, 2024

Exposing three new attributes as part of a new Statistics type

We will be exposing the three new attributes as part of a new Statistics type:

  • get_direct: Total number of get operations performed for retrieving a new connection that did no wait for getting a connection.
  • get_waited: Total number of get operations that had to wait for having a connection available.
  • get_timed_out: Total number of get operations that did time out while waiting for a connection.

See following section for understanding how these two values can be used

Why adding these three new attributes ?

Tuning the connection pool, specifically the max size and the min idle, its something that needs to be done by gathering data that helps you to understand how well configure are for example these to configuration parameters, which should lead to an ideal situation where the contention of the pool is affordable.

By having visibility of the gets that managed to retrieve a connection without waiting and having too the visibility of the ones that had to wait, the user would have available enough data for understanding if the pool is having contention.

When contention can happen

There are multiple scenarios where contention at connection pool level can happen, which can be described as the scenario where there are no enough connections free in the pool and the calls for getting a connection need to wait

  • When the max size of the pool was underestimated, and the operations for getting a new connection need to constantly wait for having a connection free.
  • When the max idle connection is too low, and the operations for getting a new connection have multiple chances to have to pay the cost of creating a new connection.

By adding the get_direct, get_waited and get_timed_out we provide a way to know deterministically how much contention the connection pool is suffering. By using a simple operation like (get_waited+get_timed_out)/(get_direct + get_waited+get_timed_out) the user can understand the percentage of operations that suffered from connection pooling contention.

A low number - i.e < 1% - might indicate that the two variables are properly configured, while bigger numbers - i.e > 5% - might indicate that there is a none negligible amount of operations that are suffering an extra latency due to some level of contention when getting a connection.

How and when to use it?

There is an expectation that the State will be polled every X times, and the user will take the responsibility of calculating the delta from the previous values, something like:

let state_before = pool.statistics()
let state_after = pool.statistics()

println!("Gets performed within period X: {}", (state_after.get_direct + state_after.get_waited + state_after.get_timed_out) - (state_before.get_direct + state_after.get_waited + state_after.get_timed_out))
println!("Gets performed within period X that wait for a connection: {}", state_after.get_waited - state_before.get_waited)
println!("Gets performed within period X that wait for a connection and timed out: {}", state_after.get_timed_out - state_before.get_timed_out)

Deltas can be then sent to your favourite time series database/service - i.e prometheus, DD, etc.

@djc
Copy link
Owner

djc commented Apr 18, 2024

I'm okay with adding some statistics to bb8, but this PR is pretty far from something I'd like to merge. Please concisely describe your use cases -- which information you need and why -- and then be prepared to invest time to very carefully follow any guidance I provide on how to implement this.

@pfreixes
Copy link
Contributor Author

Please concisely describe your use cases -- which information you need and why --

Tried to update the PR with a much better explanation, please leat me know if you believe that needs still some work

and then be prepared to invest time to very carefully follow any guidance I provide on how to implement this.

That would be awesome 🙇

@pfreixes pfreixes changed the title Provide statistics of gets and contention Provide statistics of gets and gets with contention Apr 18, 2024
@djc
Copy link
Owner

djc commented Apr 23, 2024

Please concisely describe your use cases -- which information you need and why --

Tried to update the PR with a much better explanation, please leat me know if you believe that needs still some work

Can you give some (semi-) formal definitions for conceptually what values you want to expose, and explain why these are the minimal (in terms of complexity) measurements you need to solve for your use case?

bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/internals.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
@pfreixes
Copy link
Contributor Author

pfreixes commented May 2, 2024

@djc PR should be ready for another review

I would like to work on extending the statistics for adding too the following ones:

  • GetWaitDuration Total time waited for a connection during a get
  • MaxIdleClosedConnections Connections closed due to max idle
  • MaxIdleTimeClosedConnections Connections closed due to max idle time
  • MaxLifetimeClosedConnections Connections closed due to max time
  • BrokenClosedConnections Connections closed due to considered broken by is_broken
  • InvalidClosedConnections Connections closed due to be considered broken by is_valid
  • OpenedConnections Total Connections opened

I can do this on top of this PR or in a new one, any objections? any preference? If we keep the statistics as public interface for gathering all fields, Im happy to go in either way, since we would be able to make internal breaking changes if required.

@pfreixes
Copy link
Contributor Author

pfreixes commented May 2, 2024

Regarding my latest comment, just finished the implementation of other statistics so up to you how you want to handle this second batch of changes.

Just let me know.

@pfreixes
Copy link
Contributor Author

pfreixes commented May 9, 2024

Any new feedback on this one? 🙇

@pfreixes
Copy link
Contributor Author

Hi @djc hope that everything is fine, if you have time I would love to have some feedback with regards the outstanding questions for this PR 🙇

@djc
Copy link
Owner

djc commented May 15, 2024

Yes, will try to follow up soon (was on vacation for two weeks).

bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
@pfreixes pfreixes changed the title Provide statistics of gets and gets with contention Provide statistics for get operations May 29, 2024
@pfreixes
Copy link
Contributor Author

@djc I believe that Ive addressed all of your comments PTAL (thanks for the review, right now the changes seem to be more aligned with the current base code!)

Copy link
Owner

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better, getting pretty close!

bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated Show resolved Hide resolved
bb8/src/inner.rs Outdated
#[derive(Debug)]
#[non_exhaustive]
pub struct Statistics {
/// Information about gets.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this line? Doesn't seem worth it.

bb8/src/api.rs Outdated
@@ -45,6 +46,11 @@ impl<M: ManageConnection> Pool<M> {
Builder::new()
}

/// Returns statistics about the historical usage of the pool.
pub fn statistics(&self) -> Statistics {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, actually I think we should make statistics a field in State. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that for having a full representation of all metrics, the ones coming from the State would end us gauge metrics, the user will need to gather both objects, so Im ok to make this easier to the user by only requiring to make a single api call for gathering both.

Ill do that.

@pfreixes
Copy link
Contributor Author

Hi @djc i did the changes related to moving the verbose logic for recording the gets in this commit bc9fba4 PTAL

I did not go for now on adding Statistics as part of the State object, I wanted to first gather your thoughts, if we do that movement we might be forced to move the State object as part of the inner module and create a new internal like StatePoolnternal that would replace the former State internals, once this is done we would be doing the composition of the object into the PoolInner::state() method. The other option would be to move the AtomicStatistics under the PoolInternals but considering that this is an object guarded by a Mutex and with a very specific semantics, I would not go in that way.

The third option is just keep the current implementation with two segregate interfaces.

Thoughts?

@djc
Copy link
Owner

djc commented May 31, 2024

I did not go for now on adding Statistics as part of the State object, I wanted to first gather your thoughts, if we do that movement we might be forced to move the State object as part of the inner module and create a new internal like StatePoolnternal that would replace the former State internals, once this is done we would be doing the composition of the object into the PoolInner::state() method. The other option would be to move the AtomicStatistics under the PoolInternals but considering that this is an object guarded by a Mutex and with a very specific semantics, I would not go in that way.

I think the changes in #200 would enable doing it the way I suggested, do you agree?

@pfreixes
Copy link
Contributor Author

I think the changes in #200 would enable doing it the way I suggested, do you agree

Yeps Ill piggyback on your changes for making mine changes, thanks for taking care of it!

bb8/src/inner.rs Outdated Show resolved Hide resolved
@djc djc force-pushed the add_get_stats branch from 9f89588 to 8c1450c Compare June 5, 2024 09:22
Copy link
Contributor Author

@pfreixes pfreixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! let me know if you need anything else from me for this PR.

Once merged - if you do that ofc - Ill make a further PR for adding more statistics.

@djc djc force-pushed the add_get_stats branch from 8c1450c to d1cfe51 Compare June 5, 2024 14:11
Copy link

codecov bot commented Jun 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.05%. Comparing base (3190c75) to head (d1cfe51).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #198      +/-   ##
==========================================
+ Coverage   74.95%   76.05%   +1.10%     
==========================================
  Files           6        6              
  Lines         519      543      +24     
==========================================
+ Hits          389      413      +24     
  Misses        130      130              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@djc djc merged commit 276235a into djc:main Jun 5, 2024
9 checks passed
@djc
Copy link
Owner

djc commented Jun 5, 2024

Thanks for your patience, happy to see more statistics in your next PR!

@pfreixes
Copy link
Contributor Author

pfreixes commented Jun 5, 2024

Thanks for the review process, helped me a lot with my journey for learning Rust.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants