Add rediscluster support #573
Conversation
This looks great! Thank you! I've left a few small comments in there, but overall it looks like it'll do the trick nicely. A few blockers:
- splitting up the files so we don't cause import issues for existing users
- adding a page to the docs here: https://github.com/reddit/baseplate.py/tree/develop/docs/api/baseplate/clients (you can run `make docs` in dev to see the output)
- adding an appropriate "extra" to setup.py to specify supported version ranges: https://github.com/reddit/baseplate.py/blob/develop/setup.py#L14
- some lint failures (you can run `make lint` and `make fmt` in dev; see https://github.com/reddit/baseplate.py/blob/develop/CONTRIBUTING.md)
baseplate/clients/redis.py
Outdated
@@ -4,6 +4,7 @@
from typing import Optional

import redis
import rediscluster
This would make anyone using the current redis client suddenly have to install redis-py-cluster as well just to continue using the other client. Can you move the new cluster stuff into its own module in the clients folder? Like `baseplate/clients/redis_cluster.py`, perhaps?
baseplate/clients/redis.py
Outdated
"""

# TODO: Add all args below
?
docker-compose.yml
Outdated
@@ -25,3 +28,29 @@ services:
    image: "redis:4.0.9"
  zookeeper:
    image: "zookeeper:3.4.10"
  redis-cluster-node-0:
wow, this is a lot of stuff. do we actually need a three node cluster to test this out? or can we mock it more simply?
One idea would be to use `grokzen/redis-cluster`, which gives us a Redis cluster with a single Docker dependency. It's not recommended for production use, but it's perfect for testing environments. I will say, Redis cluster has idiosyncrasies that make it worth running tests against a real instance, but that depends entirely on the tests.
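To make that concrete, here is a minimal sketch of what such a docker-compose service entry might look like. The service name and port mapping are illustrative assumptions, not part of this PR:

```yaml
# Hypothetical service entry: grokzen/redis-cluster runs an entire
# multi-node Redis cluster inside a single container.
redis-cluster:
  image: "grokzen/redis-cluster"
  ports:
    - "7000-7005:7000-7005"  # the image exposes its cluster nodes on 7000-7005 by default
```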
we're not actually implementing a full redis cluster client here, thankfully, but just wrapping an existing one. do you think we need to have behavioral tests here in baseplate beyond ensuring that we generate spans?
There were a few cases where I was thinking more testing would be useful, but ultimately any test I could think of would just end up testing the underlying `redis-py-cluster` library, so I didn't add anything beyond what the redis client already has. (I did end up using the `redis-cluster` container for the integration tests, though; it was also pretty handy for manual testing.)
tests/integration/redis_tests.py
Outdated
@@ -6,9 +6,13 @@
except ImportError:
    raise unittest.SkipTest("redis-py is not installed")

try:
likewise on splitting up the redis/redis_cluster modules here please
Co-authored-by: Neil Williams <[email protected]>
I think I've addressed all the comments. I've split redis into two clients and added a page to the docs.
looks great! only blocker is the question about our default causing the infinite loop as well. otherwise, just some stylistic questions. thank you!
baseplate/clients/redis_cluster.py
Outdated
from typing import Any
from typing import Dict

import rediscluster  # type: ignore
is this because `rediscluster` isn't typed? if so, you can get rid of the pragmas here in code and pop something like this into setup.cfg: https://github.com/reddit/baseplate.py/blob/develop/setup.cfg#L72-L73
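For reference, the linked setup.cfg lines use mypy's per-module override mechanism; the equivalent entry for this dependency would look something like:

```ini
# Tell mypy that rediscluster ships no type stubs, so the inline
# "# type: ignore" pragmas in the client code can be dropped.
[mypy-rediscluster.*]
ignore_missing_imports = true
```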
oh that's great, that's much cleaner
baseplate/clients/redis_cluster.py
Outdated
nodes_in_slot = self.nodes.slots[slot]
if read_command:
    random_index = random.randrange(1, len(nodes_in_slot))
    return nodes_in_slot[random_index]

return nodes_in_slot[0]
Am I understanding right that the first node is always the primary and the rest are replicas? If so, would something like this make that a little more self-explanatory?
Suggested change:
-    nodes_in_slot = self.nodes.slots[slot]
-    if read_command:
-        random_index = random.randrange(1, len(nodes_in_slot))
-        return nodes_in_slot[random_index]
-    return nodes_in_slot[0]
+    primary, *replicas = self.nodes.slots[slot]
+    if read_command:
+        return random.choice(replicas)
+    return primary
Edge case check: is there ever a situation where we'd have zero replicas?
actually it's definitely possible, yes, let me rework it so it can handle all cases
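A minimal sketch of how the reworked selection might handle the zero-replica case. The function name is a hypothetical stand-in, not the PR's actual code, and it assumes the primary is listed first in the slot's node list:

```python
import random


def node_for_slot(nodes_in_slot, read_command):
    # Assumed layout: the primary is the first node, replicas follow.
    primary, *replicas = nodes_in_slot
    if read_command and replicas:
        # Reads may be spread across replicas when any exist.
        return random.choice(replicas)
    # Writes, and reads on slots with no replicas, go to the primary.
    return primary
```

With only a primary in the slot, the original `random.randrange(1, 1)` would raise `ValueError: empty range`, which is why the explicit `replicas` check matters.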
tests/integration/redis_tests.py
Outdated
@@ -8,7 +8,6 @@

from baseplate.clients.redis import RedisClient
from baseplate import Baseplate
gasp!
Co-authored-by: Neil Williams <[email protected]>
LGTM!
baseplate/clients/redis_cluster.py
Outdated
    # Either this isn't a read command or there aren't any replicas
    return primary
# This isn't a read command, so return the primary
return self.nodes.slots[slot]
does this not return all of them rather than just the primary?
ah yes, and I'm very annoyed at myself for not having added a test for that method. I'll fix it and look at adding a test.
For rediscluster we can't easily calculate the size of the pool, since counting the number of items in the queue doesn't work (the queue is always full, just full of `None`s when it's "empty"). We ignore that metric for now and report only the ones that are easily available to us.
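A standalone illustration of the problem described in that commit (not baseplate code): a pool queue padded with `None` placeholders always reports itself as full.

```python
import queue

# Simulate a connection-pool queue padded with None placeholders:
# "empty" slots hold None instead of a real connection object.
pool = queue.LifoQueue(maxsize=3)
for _ in range(3):
    pool.put(None)

# qsize() reports 3 even though zero real connections exist, so it
# can't be used to derive an "open connections" metric.
print(pool.qsize())  # → 3
```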
This will allow us to take advantage of `read_from_replicas` support on pipelines when Grokzen/redis-py-cluster#450 is merged
Previous versions of redis-py-cluster don't accept the `read_from_replicas` argument to pipelines
Add a feature to the cluster redis client that can enable hot key tracking based on a configurable sample rate. When read and/or write commands access keys in Redis we will sample those commands and store the keys within a sorted set in Redis, which we can use to check relative frequency of accessing some keys compared to others.
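A hedged sketch of the sampling approach that commit describes. The helper name, tracker-key name, and 1/rate weighting are illustrative assumptions, not the PR's actual implementation:

```python
import random

# Hypothetical name for the sorted set that accumulates hot-key counts.
TRACKER_KEY = "baseplate-hot-key-tracker"


def maybe_track_key(redis_client, key, sample_rate):
    """Sample a fraction of commands and bump the key's score in a sorted set."""
    if sample_rate > 0 and random.random() < sample_rate:
        # Weight by 1/sample_rate so scores approximate true access counts;
        # the sorted set then orders keys by relative access frequency.
        redis_client.zincrby(TRACKER_KEY, 1.0 / sample_rate, key)
```

Relative hotness could then be read back with something like `ZREVRANGE baseplate-hot-key-tracker 0 10 WITHSCORES`.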
… into rediscluster-support
@spladug based on our testing I added a requested feature to the client: the ability to track hot keys in the cluster. Another change that's pending is that I need to bump the version of
looks great! just a couple notes on the docs. once that's done, we can merge whenever that new release is cut upstream.
How many connections have been established and are currently checked out and being used.

.. versionadded:: 2.1
can this go up at the top, like right before the `Example` heading? I've only put the versionadded stuff down here on some other clients where the runtime metrics were the thing that was new.
@@ -0,0 +1,92 @@
``baseplate.clients.redis_cluster``
should the hot key tracking stuff be mentioned somewhere in this page?
Add a `ClusterRedisClient` next to the existing `RedisClient`, allowing us to connect to Redis clusters. We also provide a `MonitoredClusterRedisConnection` and `MonitoredClusterRedisPipeline` as analogues for `MonitoredRedisConnection` and `MonitoredRedisPipeline`.