fix error when delete mgh then recreate it #1288

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

Contributor

@ldpliu ldpliu commented Dec 18, 2024

Summary

We should delete the Kafka PVCs when the MGH is deleted. If we do not, the Kafka pods cannot start when the MGH is recreated:

[root@ip-172-31-15-97 multicluster-global-hub]# oc get po
NAME                                                READY   STATUS             RESTARTS      AGE
kafka-kafka-0                                       0/1     CrashLoopBackOff   2 (9s ago)    41s
kafka-kafka-1                                       0/1     CrashLoopBackOff   2 (11s ago)   41s
kafka-kafka-2                                       0/1     CrashLoopBackOff   2 (9s ago)    40s
multicluster-global-hub-grafana-cc8b57d74-v6fh2     2/2     Running            0             89s
multicluster-global-hub-grafana-cc8b57d74-vvc9c     2/2     Running            0             89s
multicluster-global-hub-manager-698594c9-6tpcp      1/1     Running            2 (90s ago)   92s
multicluster-global-hub-manager-698594c9-jbvjv      1/1     Running            2 (86s ago)   92s
multicluster-global-hub-operator-776678446c-7gksz   1/1     Running            0             6m8s
multicluster-global-hub-postgresql-0                2/2     Running            0             92s
strimzi-cluster-operator-v0.43.0-76f57fb5b7-kgs4d   1/1     Running            0             80s

(org.apache.kafka.server.log.remote.storage.RemoteLogManagerConfig) [main]
Exception in thread "main" java.lang.RuntimeException: Invalid cluster.id in: /var/lib/kafka/data-0/kafka-log0/meta.properties. Expected PPAVF8rBTuq25YVTs2LT_Q, but read NCsZMgUBSwuPQDozJaC8zg
    at org.apache.kafka.metadata.properties.MetaPropertiesEnsemble.verify(MetaPropertiesEnsemble.java:509)
    at kafka.tools.StorageTool$.formatCommand(StorageTool.scala:531)
    at kafka.tools.StorageTool$.runFormatCommand(StorageTool.scala:140)
    at kafka.tools.StorageTool$.execute(StorageTool.scala:80)
    at kafka.tools.StorageTool$.main(StorageTool.scala:53)
    at kafka.tools.StorageTool.main(StorageTool.scala)
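
For readers unfamiliar with this failure, the check behind the exception can be approximated as follows. This is a minimal Python sketch, not Kafka's actual implementation: it assumes `meta.properties` is a plain `key=value` file, and the function names `read_cluster_id` and `verify_cluster_id` are hypothetical. The two IDs are the ones from the log above.

```python
from typing import Optional


def read_cluster_id(meta_properties_text: str) -> Optional[str]:
    """Return the cluster.id value from the text of a meta.properties file.

    Assumes a simple key=value layout with optional '#' comment lines.
    """
    for raw_line in meta_properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cluster.id":
            return value.strip()
    return None


def verify_cluster_id(meta_properties_text: str, expected: str) -> None:
    """Raise if the stored cluster.id differs from the expected one."""
    found = read_cluster_id(meta_properties_text)
    if found is not None and found != expected:
        raise RuntimeError(
            f"Invalid cluster.id. Expected {expected}, but read {found}"
        )


if __name__ == "__main__":
    # Illustrative file content: the old cluster's ID survives on the reused PVC.
    stale = "version=1\ncluster.id=NCsZMgUBSwuPQDozJaC8zg\nnode.id=0\n"
    try:
        verify_cluster_id(stale, "PPAVF8rBTuq25YVTs2LT_Q")
    except RuntimeError as err:
        print(err)
```

The recreated Kafka cluster gets a new cluster ID, while the volume kept from the previous installation still carries the old one, so the comparison fails on every restart.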

Related issue(s)

https://issues.redhat.com/browse/ACM-16507
Fixes #

Tests

  • Unit/function tests have been added and incorporated into make unit-tests.
  • Integration tests have been added and incorporated into make integration-test.
  • E2E tests have been added and incorporated into make e2e-test-all.
  • List other manual tests you have done.

@clyang82
Contributor

/hold
As we discussed offline, please investigate the real root cause instead of deleting the PV.

@ldpliu
Contributor Author

ldpliu commented Dec 19, 2024

It looks like we must delete the data, or the cluster ID stored in the data, when recreating the Kafka cluster.
Please see:
https://github.com/orgs/strimzi/discussions/10082
https://issues.apache.org/jira/browse/KAFKA-7335

Contributor

@clyang82 clyang82 left a comment

LGTM

openshift-ci bot commented Dec 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: clyang82, ldpliu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@clyang82
Contributor

/unhold
I am OK with deleting the PV because:

  1. We do not back up the Kafka PV.
  2. There is no solution from the Strimzi community. The only workaround is to set the cluster ID in .status, and that workaround does not fit our case.
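
The cleanup agreed on here amounts to finding the PVCs that belong to the Kafka cluster and deleting them together with the MGH. The selection step could be sketched as below. This is not the code merged in this PR: it assumes the PVCs carry a `strimzi.io/cluster` label with the Kafka cluster name, and the helper `select_kafka_pvcs`, the cluster name `kafka`, and the PVC names are hypothetical.

```python
from typing import Dict, List

# Assumption: Strimzi labels the PVCs it creates with the Kafka cluster name.
CLUSTER_LABEL = "strimzi.io/cluster"


def select_kafka_pvcs(pvcs: List[Dict], cluster_name: str) -> List[str]:
    """Return the names of PVCs labelled as belonging to the given Kafka cluster."""
    return [
        pvc["name"]
        for pvc in pvcs
        if pvc.get("labels", {}).get(CLUSTER_LABEL) == cluster_name
    ]


if __name__ == "__main__":
    # Hypothetical PVC listing, reduced to name and labels.
    pvcs = [
        {"name": "data-0-kafka-kafka-0", "labels": {CLUSTER_LABEL: "kafka"}},
        {"name": "data-0-kafka-kafka-1", "labels": {CLUSTER_LABEL: "kafka"}},
        {"name": "postgresdb-multicluster-global-hub-postgresql-0", "labels": {}},
    ]
    for name in select_kafka_pvcs(pvcs, "kafka"):
        print("would delete PVC:", name)
```

Selecting by label rather than by name prefix keeps unrelated volumes, such as the PostgreSQL one, out of the deletion.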

@ldpliu
Contributor Author

ldpliu commented Dec 20, 2024

/retest

@openshift-merge-bot openshift-merge-bot bot merged commit 9571226 into stolostron:main Dec 20, 2024
14 checks passed
2 participants