Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rediscluster support #573

Merged
merged 28 commits into from
May 19, 2021
Merged

Conversation

FranGM
Copy link
Contributor

@FranGM FranGM commented Mar 8, 2021

Add a ClusterRedisClient next to the existing RedisClient, allowing us to connect to Redis clusters.

We also provide a MonitoredClusterRedisConnection and MonitoredClusterRedisPipeline as analogues for MonitoredRedisConnection and MonitoredRedisPipeline.

@FranGM FranGM requested a review from a team as a code owner March 8, 2021 18:05
@spladug spladug self-requested a review March 8, 2021 18:12
Copy link
Contributor

@spladug spladug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you! I've left a few small comments in there, but overall it looks like it'll do the trick nicely. A few blockers:

@@ -4,6 +4,7 @@
from typing import Optional

import redis
import rediscluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would make anyone using the current redis client suddenly have to install redis-py-cluster as well just to continue using the other client. Can you move the new cluster stuff into its own module in the clients folder? Like baseplate/clients/redis_cluster.py perhaps?

baseplate/clients/redis.py Outdated Show resolved Hide resolved
baseplate/clients/redis.py Outdated Show resolved Hide resolved

"""

# TODO: Add all args below
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@@ -25,3 +28,29 @@ services:
image: "redis:4.0.9"
zookeeper:
image: "zookeeper:3.4.10"
redis-cluster-node-0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, this is a lot of stuff. do we actually need a three node cluster to test this out? or can we mock it more simply?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea would be to use

grokzen/redis-cluster

which gives us a Redis cluster with a single Docker dependency. It is not recommended for production use but perfect for testing environments. I will say, Redis cluster has idiosyncrasies that are worth having tests run against a real instance but that depends entirely on the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're not actually implementing a full redis cluster client here, thankfully, but just wrapping an existing one. do you think we need to have behavioral tests here in baseplate beyond ensuring that we generate spans?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a few cases where I was thinking more testing would be useful but ultimately any test I could think of would just end up testing the underlying redis-py-cluster library, so didn't add anything beyond what the redis client already has (I did end up using the redis-cluster container for the integration tests though, it was also pretty handy for manual testing)

@@ -6,9 +6,13 @@
except ImportError:
raise unittest.SkipTest("redis-py is not installed")

try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise on splitting up the redis/redis_cluster modules here please

@FranGM
Copy link
Contributor Author

FranGM commented Mar 10, 2021

I think I've addressed all the comments. I've split redis into two clients and added a page to the docs.
After more testing I ended up needing to work around a quirk or two of the upstream library but nothing too major.

@FranGM FranGM requested a review from spladug March 10, 2021 21:08
Copy link
Contributor

@spladug spladug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! only blocker is the question about our default causing the infinite loop as well. otherwise, just some stylistic questions. thank you!

from typing import Any
from typing import Dict

import rediscluster # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this because rediscluster isn't typed? if so, you can get rid of the pragmas here in code and pop something like this into setup.cfg: https://github.com/reddit/baseplate.py/blob/develop/setup.cfg#L72-L73

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that's great , that's much cleaner

Comment on lines 25 to 30
nodes_in_slot = self.nodes.slots[slot]
if read_command:
random_index = random.randrange(1, len(nodes_in_slot))
return nodes_in_slot[random_index]

return nodes_in_slot[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding right that the first node is always the primary and the rest are replicas? If so, would something like this make that a little more self-explanatory?

Suggested change
nodes_in_slot = self.nodes.slots[slot]
if read_command:
random_index = random.randrange(1, len(nodes_in_slot))
return nodes_in_slot[random_index]
return nodes_in_slot[0]
primary, *replicas = self.nodes.slots[slot]
if read_command:
return random.choice(replicas)
return primary

Edge case check: is there ever a situation where we'd have zero replicas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it's definitely possible, yes, let me rework it so it can handle all cases

baseplate/clients/redis_cluster.py Outdated Show resolved Hide resolved
baseplate/clients/redis_cluster.py Show resolved Hide resolved
@@ -8,7 +8,6 @@

from baseplate.clients.redis import RedisClient
from baseplate import Baseplate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gasp!

@FranGM FranGM requested a review from spladug March 11, 2021 00:13
Copy link
Contributor

@spladug spladug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

# Either this isn't a read command or there aren't any replicas
return primary
# This isn't a read command, so return the primary
return self.nodes.slots[slot]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this not return all of them rather than just the primary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, and I'm very annoyed at myself for not having added a test for that method. I'll fix it and look at adding a test.

@FranGM FranGM requested a review from spladug March 30, 2021 23:32
Fran Garcia and others added 2 commits April 6, 2021 18:57
For rediscluster we can't easily calculate the size of the pool since
counting the number of items in the queue doesn't work (the queue is
always full, just full of Nones if the queue is "empty").

We just decide to ignore that metric for now and just report the ones
that are easily available to us.
This will allow us to take advantage of `read_from_replicas` support on pipelines when Grokzen/redis-py-cluster#450 is merged
FranGM and others added 3 commits April 20, 2021 14:36
Previous versions of redis-py-cluster don't accept the `read_from_replicas` argument to pipelines
Add a feature to the cluster redis client that can enable hot key
tracking based on a configurable sample rate.

When read and/or write commands access keys in Redis we will sample
those commands and store the keys within a sorted set in Redis, which we
can use to check relative frequency of accessing some keys compared to
others.
@FranGM
Copy link
Contributor Author

FranGM commented Apr 29, 2021

@spladug based on our testing I added a requested feature to the client which is the ability to track hot keys in the cluster.

Another change that's pending is that I need to bump the version of redis-py-cluster to 2.1.3 but need to wait until that is released first. We'll need that version since it will include Grokzen/redis-py-cluster#450

Copy link
Contributor

@spladug spladug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! just a couple notes on the docs. once that's done, we can merge whenever that new release is cut upstream.

How many connections have been established and are currently checked out and
being used.

.. versionadded:: 2.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this go up at the top, like right before the Example heading? I've only put the versionadded stuff down here on some other clients where the runtime metrics were the thing that was new.

@@ -0,0 +1,92 @@
``baseplate.clients.redis_cluster``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the hot key tracking stuff be mentioned somewhere in this page?

docs/api/baseplate/clients/redis_cluster.rst Outdated Show resolved Hide resolved
@spladug spladug merged commit fa3487b into reddit:develop May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants