Replies: 3 comments 18 replies
-
I'm not sure what you mean by that. The rolling update is always needed when upgrading, because the operator needs to understand what version of the software it is running.
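If it helps, here is a minimal sketch (the cluster name is illustrative and most spec fields are omitted) of keeping spec.kafka.version pinned explicitly, so that upgrading the operator and upgrading Kafka itself stay two separate, deliberate steps:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster        # illustrative name
spec:
  kafka:
    version: 3.6.1        # pinned explicitly; bump this in a separate change when you want a Kafka upgrade
    replicas: 3
    # remaining fields (listeners, storage, ...) omitted in this sketch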
-
Hello everyone, I have a Kafka cluster managed by Strimzi with the following configuration:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
...
spec:
  kafka:
    version: 3.6.1
    replicas: 5
    config:
      default.replication.factor: 3
      min.insync.replicas: 2

When I update the Kafka version and deploy the change, I observe that the older pods are terminated before the new pods are fully up (they might not be ready in the Kafka sense; I understand that Drain Cleaner might help with this, but that's a different conversation). Is there something configuration-wise that I might be doing wrong for this to happen? Is this behavior expected for updates that require broker restarts? Are there any steps I can take to avoid or mitigate this issue? Thank you in advance for your help.
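For reference, this is a sketch of the change I'm applying (cluster name and target version are illustrative, remaining fields omitted). As far as I understand, the operator rolls the existing pods in place, one at a time, waiting for each restarted broker to become ready before moving on, rather than starting replacement pods alongside the old ones:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster                  # illustrative name
spec:
  kafka:
    version: 3.7.0                  # hypothetical target version for the upgrade
    replicas: 5
    config:
      default.replication.factor: 3
      min.insync.replicas: 2
    # remaining fields (listeners, storage, ...) unchanged and omitted here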
-
@scholzj I noticed that while I was doing a rolling upgrade on the Kafka cluster, when the client (in this case I was using the Go SDK) tried to create a new partition [1], I got the error shown below.

[1] Publishing to a Kafka topic that doesn't exist yet while relying on the cluster config.

It seems like it might be related to this thread. Do you have a recommendation on what to do in this case? Perhaps tweaking some setting? I'm not sure what the best course of action here is.

Specifically, the error looks like:

Jul 9 19:53:27.481 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=1 sleep=105ms
Jul 9 19:53:27.705 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=2 sleep=598ms
Jul 9 19:53:28.427 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=3 sleep=154ms
Jul 9 19:53:28.690 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=4 sleep=990ms
Jul 9 19:53:29.802 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=5 sleep=1.41s
Jul 9 19:53:31.328 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=6 sleep=1.034s
Jul 9 19:53:32.474 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=7 sleep=1.809s
Jul 9 19:53:34.413 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=8 sleep=1.385s
Jul 9 19:53:35.921 INF Failed to publish to kafka err="[38] Invalid Replication Factor: the replication-factor is invalid" attempts=9 sleep=3.457s
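For what it's worth, one mitigation I'm considering (just a sketch, assuming the Topic Operator is deployed; the topic name, cluster name, and partition count are made up) is pre-creating the topic as a KafkaTopic resource so the client never depends on broker-side topic auto-creation while brokers are restarting:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic                      # illustrative topic name
  labels:
    strimzi.io/cluster: my-cluster    # must match the name of the Kafka resource (illustrative here)
spec:
  partitions: 6                       # illustrative
  replicas: 3                         # matches the cluster's default.replication.factor
  config:
    min.insync.replicas: 2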
-
Hi,
I have noticed that the Strimzi Operator upgrade process triggers a rolling upgrade of the brokers, which causes DOWNTIME.
Although I have 3 brokers and min.insync.replicas=2, the producer has to wait for some time (from 30s to 2m).
I think this might happen because of new leader elections (new elections for the partition leaders hosted on the broker that goes down).
I'm wondering if there's a way to prevent this downtime, because while I'm upgrading the operator, this downtime occurs for ALL the Kafka clusters that the operator manages.
I thought about manually changing spec.kafka.version to a fake one so that the operator won't do a rolling update, but then it won't be supported (which is problematic with the StrimziPodSet).
Does anyone have an idea how to avoid this situation, or maybe a proposal for a feature change?
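One workaround I've been experimenting with (just a sketch; I'm not sure it's the recommended approach) is pausing reconciliation on the Kafka resources before upgrading the operator and then removing the annotation one cluster at a time afterwards, so the rolling updates don't hit all clusters at once:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster                            # illustrative name
  annotations:
    strimzi.io/pause-reconciliation: "true"   # the operator skips this resource until the annotation is removed
spec:
  kafka:
    version: 3.6.1
    replicas: 3
    # remaining fields unchanged and omitted in this sketch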