Implement timestamp generators #1128
Do other drivers (besides cpp-driver) have timestamp generators? If so, how do they implement them?
I'm asking because a cpp-compatible implementation does not have to reside in the Rust Driver: cpp-rust-driver can implement it in its own codebase. We should think about what implementation(s) we want to provide to our users, not what cpp-rust-driver needs here.
Did you maybe test how long a call to MonotonicTimestampGenerator::next_timestamp() takes? The clock has a microsecond resolution, and a microsecond is a lot of time. It may be possible that if we call the generator often (like in your unit test), it will always choose the last + 1 branch, because we are still in the same microsecond as the previous call. In that case, after multiple calls the returned value may be far in the future compared to the system clock. If there are multiple clients, it may cause issues.
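The concern above can be reproduced with a minimal sketch. It assumes a generator that returns the clock reading when the clock has advanced and `last + 1` otherwise; `SketchGenerator` and `drift_after` are hypothetical names, and the clock is injected as a closure so it can be frozen. This is not the code from this PR, and it is synchronous for brevity.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical stand-in for the generator under review; not the PR's code.
/// The clock is injected so the drift can be reproduced deterministically.
pub struct SketchGenerator<C: Fn() -> i64> {
    last: AtomicI64,
    clock_us: C,
}

impl<C: Fn() -> i64> SketchGenerator<C> {
    pub fn new(clock_us: C) -> Self {
        Self {
            last: AtomicI64::new(0),
            clock_us,
        }
    }

    /// Returns the clock reading if it moved past `last`, otherwise `last + 1`.
    pub fn next_timestamp(&self) -> i64 {
        loop {
            let last = self.last.load(Ordering::SeqCst);
            let now = (self.clock_us)();
            let next = if now > last { now } else { last + 1 };
            if self
                .last
                .compare_exchange(last, next, Ordering::SeqCst, Ordering::SeqCst)
                .is_ok()
            {
                return next;
            }
        }
    }
}

/// Calls the generator `calls` times while the clock is frozen at `frozen_us`
/// and reports how far the last returned timestamp is ahead of the clock.
pub fn drift_after(calls: u32, frozen_us: i64) -> i64 {
    let generator = SketchGenerator::new(move || frozen_us);
    let mut last = 0;
    for _ in 0..calls {
        last = generator.next_timestamp();
    }
    last - frozen_us
}
```

With a frozen clock every call after the first takes the `last + 1` branch, so the drift grows by one microsecond per call. Whether a real workload calls the generator often enough to stay within one microsecond is exactly what needs measuring.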
I've also checked Java and Python drivers, they implement it in the exact same way, taking the microsecond time since epoch. Unfortunately, |
- I don't think the connection is the right layer to use the generator; IMO the session layer is a better fit. Look into the session.rs file, into functions like execute or run_query (cc: @wprzytula).
- Nit: the commit messages have first lines that are too long, which makes GitHub render them incorrectly. Avoid having any lines longer than 70 characters, especially the first line (which should ideally have at most 50 characters).
- The MonotonicTimestampGenerator struct could use more explanation in its doc comment. Please describe what it guarantees and how it behaves (errors, drifting, etc.).
- On that front, the documentation book should also be updated with info about timestamp generators. It should be either a new file in the queries folder, or a new folder. @wprzytula @muzarski WDYT is the better place?
scylla/src/transport/connection.rs
Outdated
let mut timestamp = None;
if query.get_timestamp().is_none() {
    if let Some(x) = self.config.timestamp_generator.clone() {
        timestamp = Some(x.next_timestamp().await);
    }
}
No need to clone here, a reference should be enough to generate a timestamp.
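A minimal sketch of the borrow-based version, assuming the generator is stored as an `Option<Arc<dyn TimestampGenerator>>`. The trait, `Fixed`, `ConnectionConfig`, and `resolve_timestamp` are simplified, hypothetical stand-ins for this illustration (the trait is synchronous here, unlike the snippet above).

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the PR's trait (synchronous here for brevity).
pub trait TimestampGenerator: Send + Sync {
    fn next_timestamp(&self) -> i64;
}

/// Trivial generator used only to exercise the sketch.
pub struct Fixed(pub i64);

impl TimestampGenerator for Fixed {
    fn next_timestamp(&self) -> i64 {
        self.0
    }
}

/// Hypothetical subset of the connection configuration.
pub struct ConnectionConfig {
    pub timestamp_generator: Option<Arc<dyn TimestampGenerator>>,
}

/// Keeps a user-provided timestamp; otherwise asks the generator, if any.
/// `as_ref()` borrows the `Arc`, so the reference count is not touched.
pub fn resolve_timestamp(config: &ConnectionConfig, user_timestamp: Option<i64>) -> Option<i64> {
    user_timestamp.or_else(|| {
        config
            .timestamp_generator
            .as_ref()
            .map(|generator| generator.next_timestamp())
    })
}
```

Cloning an `Arc` is cheap but still an atomic increment and decrement per query; borrowing avoids both and reads the same.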
    warning_threshold_us: i64,
    warning_interval_ms: i64,
}
Currently you use raw integers for those durations, and you treat "0" as a special value.
The less error-prone and more idiomatic Rust way is to use std::time::Duration to store thresholds / intervals, and to use Option<std::time::Duration> instead of a special value.
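As a rough sketch of that suggestion, assuming the two fields control when a drift warning is emitted: `DriftWarningConfig` and `should_warn` are hypothetical names chosen for this example, not the PR's API.

```rust
use std::time::Duration;

/// Hypothetical settings struct; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct DriftWarningConfig {
    /// `None` disables drift warnings (replaces the special value 0).
    pub warning_threshold: Option<Duration>,
    /// Minimum time between two consecutive warnings.
    pub warning_interval: Duration,
}

impl DriftWarningConfig {
    /// Decides whether a warning is due for the given drift, given how long
    /// ago the previous warning was emitted.
    pub fn should_warn(&self, drift: Duration, since_last_warning: Duration) -> bool {
        match self.warning_threshold {
            Some(threshold) => {
                drift > threshold && since_last_warning >= self.warning_interval
            }
            None => false,
        }
    }
}
```

With `Duration` the unit is carried by the type, so a microsecond threshold and a millisecond interval can no longer be mixed up, and "disabled" is spelled `None` rather than encoded in a magic number.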
Ideally, we could have a new directory with a file for each generator. However, if it turns out in multiple very short |
My first thought was to put the timestamp generation in the session layer, however the functions I've changed in |
For normal queries modifying |
async fn next_timestamp(&self) -> i64 {
    loop {
        let last = self.last.load(Ordering::SeqCst);
        let cur = self.compute_next(last).await;
        if self
            .last
            .compare_exchange(last, cur, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
        {
            return cur;
        }
    }
}
🤔 When the client is under high load, won't this approach be a problem?
We should benchmark this, @muzarski could you help @smoczy123 with that?
The alternative approach would be to just put last under a mutex, avoiding the retries. The added benefit is that you can store an Instant under the Mutex, which would simplify the code in compute_next.
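A minimal sketch of the mutex-based alternative, assuming the `Instant` kept under the lock records when the last drift warning was emitted. All names here (`State`, `MutexTimestampGenerator`, the two `Duration` parameters) are hypothetical, and the method is synchronous; this is not the code from this PR.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// State guarded by a single lock (hypothetical layout, not the PR's code).
struct State {
    last_us: i64,
    last_warning: Option<Instant>,
}

pub struct MutexTimestampGenerator {
    state: Mutex<State>,
    warning_threshold: Duration,
    warning_interval: Duration,
}

impl MutexTimestampGenerator {
    pub fn new(warning_threshold: Duration, warning_interval: Duration) -> Self {
        Self {
            state: Mutex::new(State { last_us: 0, last_warning: None }),
            warning_threshold,
            warning_interval,
        }
    }

    /// Returns a strictly increasing timestamp in microseconds since the epoch.
    pub fn next_timestamp(&self) -> i64 {
        // Falls back to 0 if the system clock is set before the Unix epoch.
        let now_us = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_micros() as i64)
            .unwrap_or(0);

        // Panics if another thread panicked while holding the lock.
        let mut guard = self.state.lock().unwrap();
        let state = &mut *guard;

        state.last_us = if now_us > state.last_us { now_us } else { state.last_us + 1 };

        let drift_us = state.last_us - now_us;
        let threshold_us = i64::try_from(self.warning_threshold.as_micros()).unwrap_or(i64::MAX);
        if drift_us > threshold_us {
            let due = state
                .last_warning
                .map_or(true, |at| at.elapsed() >= self.warning_interval);
            if due {
                // A real implementation would log a warning here.
                state.last_warning = Some(Instant::now());
            }
        }
        state.last_us
    }
}
```

The critical section is a handful of integer operations, so there is no retry loop and the timestamp and warning bookkeeping are updated together. Whether this beats the compare-exchange loop under contention is what the benchmark would have to show.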
I prepared a branch that uses rust-driver version from this PR: https://github.com/muzarski/cql-stress/tree/timestamp-gen.
@smoczy123, you can set the timestamp generator for the c-s frontend in src/bin/cql-stress-cassandra-stress/main.rs, in the prepare_run function. The Session object is created in this function.
Commands for some simple workloads:
cql-stress-cassandra-stress write n=1000000 -pop seq=1..1000000 -rate threads=20 -node <ip addresses>
cql-stress-cassandra-stress read n=1000000 -pop seq=1..1000000 -rate threads=20 -node <ip addresses>
The first one inserts 1M rows into the database, while the latter reads the rows and validates them. You can play around with the run parameters and options. You can also try running multiple loads simultaneously to simulate a multi-client scenario.
To run Scylla locally, see https://hub.docker.com/r/scylladb/scylla/. Then replace <ip addresses> in the commands above with your Scylla nodes' IPs (a comma-delimited list).
If you stumble upon any problems, feel free to ping me.
I've run write and read workloads with different numbers of threads in both single- and multi-client scenarios, and I've seen no measurable difference in speed between using a monotonic timestamp generator and not using one. It seems that this does not cause latency issues.
Added TimestampGenerator trait and MonotonicTimestampGenerator based on c++ driver's implementation
Also added an ability to set it through Session Builder
The timestamp generator in ConnectionConfig is set in Session::Connect()
Generated timestamp is only set if user did not provide one
@smoczy123 I see you requested a re-review. If you addressed my comments in the new version of the code, please mark them as resolved, so that I know which ones to expect to be fixed.
All of your comments should be addressed now. I've left two conversations open, as I'm not sure about those two.
To achieve parity with cpp-driver we need to implement client-side timestamp generators.
This pull request adds a TimestampGenerator trait and a MonotonicTimestampGenerator that implements it,
together with an extension to SessionBuilder that provides an ability to set a TimestampGenerator in Session
and use it to generate timestamps.
Fixes #1032
Pre-review checklist
./docs/source/
.Fixes:
annotations to PR description.