Kafka Assigner
Working with partition assignments in Kafka clusters can be a pain. The standard admin CLI tools are quite basic, and they do not make it easy to perform common tasks such as "remove a broker from the cluster". Over time, LinkedIn SRE developed a number of tools for working with partitions in clusters, and they've now been consolidated into a single script that performs most common functions.
In order to run kafka-assigner, you will need to have the following Python modules installed:
- Paramiko
- Kazoo
In addition, you will need to run it on a host that has the following:
- A copy of the Kafka admin tools (including kafka-reassign-partitions.sh)
- Access to the ZooKeeper ensemble for the cluster
- SSH access to the Kafka brokers (with credentials preferably loaded into ssh-agent)
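
The locally checkable parts of these requirements can be verified before a first run. The following is a minimal sketch, not part of kafka-assigner: the function name `missing_prerequisites` is hypothetical, and it only checks that the Python modules are importable and that the reassignment script is on `PATH`. It does not test ZooKeeper or SSH connectivity.

```python
import importlib.util
import shutil


def missing_prerequisites(modules=("paramiko", "kazoo"),
                          tools=("kafka-reassign-partitions.sh",)):
    """Return a list describing every prerequisite that could not be found.

    Hypothetical helper for illustration; kafka-assigner does not ship it.
    """
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(f"python module: {name}")
    for tool in tools:
        # shutil.which searches the directories listed in PATH
        if shutil.which(tool) is None:
            missing.append(f"executable: {tool}")
    return missing
```

An empty list means the local checks passed. If the Kafka admin tools are not on `PATH`, the `--tools-path` option described below is the documented way to point at them.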
At a high level, kafka-assigner is run as follows:
kafka-assigner.py -z <zkhost:port/path> [OPTIONS] <module name> [MODULE OPTIONS]
The argument to the `-z` command line option is the full ZooKeeper connect string for your Kafka cluster. So if your ZooKeeper host is zook.example.com, running on port 2181, and the Kafka cluster uses a chroot path of /kafka/clustername, then the argument is `zook.example.com:2181/kafka/clustername`.
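
To make the shape of the connect string concrete, here is a small sketch that splits one into its host list and chroot path. The function `split_connect_string` is hypothetical and is not how kafka-assigner itself handles the string. It assumes plain `host:port` entries (no IPv6 literals) and falls back to 2181, ZooKeeper's conventional client port, when a port is omitted.

```python
def split_connect_string(connect):
    """Split 'host1:port,host2:port/chroot' into ([(host, port), ...], chroot).

    Illustrative only; assumes hostnames or IPv4 addresses.
    """
    # Everything from the first "/" onward is the chroot path
    hosts_part, slash, path = connect.partition("/")
    chroot = slash + path if slash else ""
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().partition(":")
        # Fall back to the conventional ZooKeeper client port
        servers.append((host, int(port) if port else 2181))
    return servers, chroot


servers, chroot = split_connect_string("zook.example.com:2181/kafka/clustername")
print(servers, chroot)
```

For an ensemble with several hosts, the entries are comma-separated and the chroot path appears once, at the end of the string.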
The following command line options can be used as `[OPTIONS]` and are all optional:
Option | Argument | Default | Description |
---|---|---|---|
--leadership | none | | Show the cluster leadership balance before and after module processing |
--generate | none | | Generate the partition reassignment file(s) and print them out |
--execute | none | | Execute the partition reassignment (if omitted, dry run only) |
--moves | integer | 10 | The number of partition moves to execute in a single step |
--ple-size | integer | 900000 | Max size in bytes for a preferred leader election string |
--ple-wait | integer | 300 | Time in seconds to wait between preferred leader elections |
--tools-path | path | none | Path to Kafka admin utilities, overriding the PATH env var |
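
When kafka-assigner is driven from another script, it can help to assemble the command line programmatically so that a dry run stays the default. The sketch below is an illustration under assumptions: `build_command` and its keyword names are hypothetical, and only options from the table above are emitted. Module options are passed through untouched via `module_args`, since they are documented on the individual module pages, so the resulting command may be incomplete for a given module.

```python
import shlex


def build_command(zookeeper, module, module_args=(), execute=False,
                  moves=None, tools_path=None, leadership=False):
    """Assemble an argv list for kafka-assigner.py (hypothetical helper)."""
    argv = ["kafka-assigner.py", "-z", zookeeper]
    if leadership:
        argv.append("--leadership")
    if execute:
        # Without --execute the tool only performs a dry run
        argv.append("--execute")
    if moves is not None:
        argv += ["--moves", str(moves)]
    if tools_path is not None:
        argv += ["--tools-path", tools_path]
    # The module name and its own options follow the common options
    argv.append(module)
    argv += list(module_args)
    return argv


cmd = build_command("zook.example.com:2181/kafka/clustername", "balance",
                    moves=5, leadership=True)
# shlex.join quotes each argument so the line can be pasted into a shell
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run` as-is. Keeping `execute=False` as the default mirrors the tool's own behaviour of doing nothing destructive unless `--execute` is given.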
The following modules are currently available and can be specified as the module name. Click through for specifics on using each module.
Module | Description |
---|---|
[[clone | module-clone]] |
[[trim | module-trim]] |
[[remove | module-remove]] |
[[elect | module-elect]] |
[[set-replication-factor | module-set-replication-factor]] |
[[reorder | module-reorder]] |
[[balance | module-balance]] |

Things that could be improved:
- Positional arguments suck. This is the way argparse works for common arguments, however, so we're stuck with it for now. If someone has a better way I'd love to see it.
- There should be helper functions for changing the partition replica lists that maintain the partition and the brokers at the same time.
- The reorder module is very heavy-handed in the way it operates, as it doesn't consider the initial state of the cluster. It should be changed to perform a minimal number of moves.
- The batching of partition moves is pretty simple. It could be optimized to move similarly sized partitions in a batch, and to allow only one move per broker tuple in a batch.
- A rack-aware balancing module would be awesome.
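
As a starting point for the batching idea in the list above, here is one possible greedy approach. It is a sketch under stated assumptions, not the current kafka-assigner behaviour: the `Move` record and `batch_moves` function are hypothetical, partition sizes are assumed to be known up front, and a "broker tuple" is taken to mean a (source, target) pair.

```python
from collections import namedtuple

# Hypothetical record for one replica move; not a type from kafka-assigner
Move = namedtuple("Move", ["partition", "size", "source", "target"])


def batch_moves(moves, batch_size=10):
    """Greedily group moves into batches with unique (source, target) pairs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Sorting by size puts similarly sized partitions next to each other
    pending = sorted(moves, key=lambda m: m.size, reverse=True)
    batches = []
    while pending:
        batch, used_pairs, deferred = [], set(), []
        for move in pending:
            pair = (move.source, move.target)
            if len(batch) < batch_size and pair not in used_pairs:
                batch.append(move)
                used_pairs.add(pair)
            else:
                # Keep for a later batch, preserving the size ordering
                deferred.append(move)
        batches.append(batch)
        pending = deferred
    return batches
```

The default `batch_size` of 10 matches the documented default for `--moves`. The two goals pull against each other: enforcing one move per broker pair can pull a smaller partition into a batch of larger ones, so a real implementation would need to decide which constraint wins.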