From a5873c13b85e68b516ae039e1296702bd1349ca7 Mon Sep 17 00:00:00 2001
From: Enrico Deusebio
Date: Mon, 4 Sep 2023 18:41:20 +0200
Subject: [PATCH] [DPE-2446] How-to Upgrade Documentation

---
 docs/how-to/h-upgrade.md | 115 +++++++++++++++++++++++++++++++++++++++
 docs/index.md            |  41 +++++++-------
 2 files changed, 136 insertions(+), 20 deletions(-)
 create mode 100644 docs/how-to/h-upgrade.md

diff --git a/docs/how-to/h-upgrade.md b/docs/how-to/h-upgrade.md
new file mode 100644
index 00000000..6d09b4a5
--- /dev/null
+++ b/docs/how-to/h-upgrade.md
@@ -0,0 +1,115 @@
+# How to upgrade between minor versions
+
+> **:warning: WARNING:** This feature is currently available only in the `edge` or `beta` risk channels. We do NOT recommend using features in production that are not in the `stable` channel! Also, contact the [Canonical Data Platform team](https://chat.charmhub.io/charmhub/channels/data-platform) if you are interested in the topic.
+
+Charm upgrades allow admins to upgrade the operator code (e.g. the revision used by the charm), the workload version, or both. Because the charm code pins a particular version of the workload, a charm upgrade may or may not also involve a workload version upgrade. This guide applies only to in-place upgrades that involve at most a minor version upgrade of the Kafka workload, e.g. from Kafka 3.4.x to 3.5.x. Major workload upgrades are generally **NOT SUPPORTED** and should be carried out using out-of-place migrations. Please refer to the how-to guide about cluster migration [here](/t/charmed-kafka-how-to-cluster-migration/10951) for more information on how this can be achieved.
+
+We strongly recommend that you do **NOT** perform any other extraordinary operations on the Kafka cluster while upgrading. Examples include (but are not limited to) the following:
+1. Adding or removing units
+2. Creating or destroying relations
+3. Changing the workload configuration
+4. Upgrading other connected applications (e.g.
Zookeeper)
+
+Running other operations concurrently with an upgrade is not supported and can lead the cluster into inconsistent states.
+
+## Minor upgrade process overview
+
+An in-place upgrade is composed of the following high-level steps:
+
+1. **Collect** all the pre-upgrade information necessary for a rollback (if ever needed)
+2. **Prepare** the charm for the in-place upgrade, by running some preparatory tasks
+3. **Upgrade** the charm and/or the workload. Once started, all units in the cluster will refresh the charm code and undergo a workload restart/update. The upgrade will be aborted if a unit upgrade fails, requiring the admin user to roll back.
+4. **Post-upgrade checks** to make sure all units are in the proper state and the cluster is healthy.
+
+## Step 1: Collect
+
+The first step is to record the revisions of the running applications, as a safety measure in case a rollback is needed. To accomplish this, simply run the `juju status` command and look for the revisions of the deployed Kafka and Zookeeper applications. You can also retrieve them with the following commands:
+
+```shell
+KAFKA_CHARM_REVISION=$(juju status --format json | yq .applications..charm-rev)
+ZOOKEEPER_CHARM_REVISION=$(juju status --format json | yq .applications..charm-rev)
+```
+
+Please fill in the `` and `}` placeholders appropriately, e.g. `kafka` and `zookeeper`.
+
+## Step 2: Prepare
+
+Before upgrading, the charm needs to perform some preparatory tasks to define the upgrade plan.
+To do so, run the `pre-upgrade-check` action against the leader unit:
+
+```shell
+juju run-action kafka/leader pre-upgrade-check --wait
+```
+
+Make sure that the action completes successfully, e.g. the output should read:
+
+```shell
+unit-kafka-0:
+  ...
+  results: {}
+  status: completed
+  ...
+```
+
+Note that you won't be able to upgrade successfully unless this action completes successfully.
+The action will also configure the charm to minimize the amount of primary switchover, among other preparations for a safe upgrade process. After successful execution, the charm is ready to be upgraded.
+
+## Step 3: Upgrade
+
+Use the [`juju refresh`](https://juju.is/docs/juju/juju-refresh) command to trigger the charm upgrade process.
+Note that the upgrade can be performed against:
+
+* a selected channel/track, thereby upgrading to the latest revision published on that track
+```shell
+juju refresh kafka --channel 3/edge
+```
+* a selected revision
+```shell
+juju refresh kafka --revision=
+```
+* a local charm file
+```shell
+juju refresh kafka --path ./kafka_ubuntu-22.04-amd64.charm
+```
+
+When you issue the command, all units will refresh (i.e. receive the new charm content), and the upgrade charm event will be fired. The charm will take care of executing an update (if required) and a restart of the workload one unit at a time, so as not to lose high availability.
+
+The upgrade process can be monitored using the `juju status` command, where the unit messages show which units have already been upgraded, which unit is currently upgrading, and which units are waiting for the upgrade to be triggered, as shown below:
+
+```shell
+...
+
+App    Version  Status  Scale  Charm  Channel  Rev  Exposed  Message
+kafka           active      3  kafka  3/edge   135  no
+
+Unit      Workload  Agent  Machine  Public address  Ports  Message
+...
+kafka/0   active    idle   3        10.193.41.131          Upgrade completed
+kafka/1*  active    idle   4        10.193.41.109          Upgrading...
+kafka/2   active    idle   5        10.193.41.221          Other units upgrading first...
+...
+
+```
+
+### Failing upgrade
+
+Before upgrading a unit, the charm will check whether the upgrade can be performed. For example, this may mean:
+1. Checking that the upgrade from the previous charm revision and Kafka version is allowed
+2. Checking that other external applications that Kafka depends on (e.g.
Zookeeper) are running the correct version
+
+Note that these checks are only possible after a refresh of the charm code, and therefore cannot be done upfront (e.g. during the `pre-upgrade-check` action).
+If some of these checks fail, the upgrade will be aborted. When this happens, the workload may still be operating (as only the operator may have failed), but we recommend rolling back the upgrade as soon as possible.
+
+To roll back the upgrade, re-run steps 2 and 3, using the revision recorded in step 1, i.e.
+
+```shell
+juju run-action kafka/leader pre-upgrade-check --wait
+
+juju refresh kafka --revision=${KAFKA_CHARM_REVISION}
+```
+
+We also strongly recommend retrieving the full set of logs with `juju debug-log` to gain insight into why the upgrade failed.
+
+## Kafka and Zookeeper combined upgrades
+
+Although this guide focuses on upgrading Kafka, the same process can also be applied to Zookeeper, should you need to upgrade that component as well. If both the Kafka and Zookeeper charms need to be upgraded, we recommend starting with the Zookeeper cluster. As outlined above, the two upgrades should **NEVER** be done concurrently.
diff --git a/docs/index.md b/docs/index.md
index 1ef9f6a9..1c775a86 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -42,26 +42,27 @@ The Charmed Kafka Operator is free software, distributed under the Apache Softwa
 
 # Navigation
 
-| Level | Path | Navlink |
-|-------|------------------------|-----------------------------------------------------------------------------------------------------------------|
-| 1 | tutorial | [Tutorial]() |
-| 2 | t-overview | [1. Introduction](/t/charmed-kafka-tutorial-overview/10571) |
-| 2 | t-setup-environment | [2. Set up the environment](/t/charmed-kafka-tutorial-setup-environment/10575) |
-| 2 | t-deploy-kafka | [3. Deploy Kafka](/t/charmed-kafka-tutorial-deploy-kafka/10567) |
-| 2 | t-manage-passwords | [4.
Manage passwords](/t/charmed-kafka-tutorial-manage-passwords/10569) |
-| 2 | t-relate-kafka | [5. Relate Kafka to other applications](/t/charmed-kafka-tutorial-relate-kafka/10573) |
-| 2 | t-cleanup-environment | [6. Cleanup your environment](/t/charmed-kafka-tutorial-cleanup-environment/10565) |
-| 1 | how-to | [How To]() |
-| 2 | h-manage-units | [Manage units](/t/charmed-kafka-how-to-manage-units/10287) |
-| 2 | h-enable-encryption | [Enable encryption](/t/charmed-kafka-how-to-enable-encryption/10281) |
-| 2 | h-manage-app | [Manage applications](/t/charmed-kafka-how-to-manage-app/10285) |
-| 2 | h-enable-monitoring | [Enable Monitoring](/t/charmed-kafka-how-to-enable-monitoring/10283) |
-| 1 | reference | [Reference]() |
-| 2 | r-actions | [Actions](https://charmhub.io/kafka/actions?channel=3/stable) |
-| 2 | r-configurations | [Configurations](https://charmhub.io/kafka/configure?channel=3/stable) |
-| 2 | r-libraries | [Libraries](https://charmhub.io/kafka/libraries/kafka_libs?channel=3/stable) |
-| 2 | r-requirements | [Requirements](/t/charmed-kafka-reference-requirements/10563) |
-| 2 | r-performance-tuning | [Performance Tuning](/t/charmed-kafka-reference-performace-tuning/10561) |
+| Level | Path | Navlink |
+|-------|-----------------------|---------------------------------------------------------------------------------------|
+| 1 | tutorial | [Tutorial]() |
+| 2 | t-overview | [1. Introduction](/t/charmed-kafka-tutorial-overview/10571) |
+| 2 | t-setup-environment | [2. Set up the environment](/t/charmed-kafka-tutorial-setup-environment/10575) |
+| 2 | t-deploy-kafka | [3. Deploy Kafka](/t/charmed-kafka-tutorial-deploy-kafka/10567) |
+| 2 | t-manage-passwords | [4. Manage passwords](/t/charmed-kafka-tutorial-manage-passwords/10569) |
+| 2 | t-relate-kafka | [5. Relate Kafka to other applications](/t/charmed-kafka-tutorial-relate-kafka/10573) |
+| 2 | t-cleanup-environment | [6.
Cleanup your environment](/t/charmed-kafka-tutorial-cleanup-environment/10565) |
+| 1 | how-to | [How To]() |
+| 2 | h-manage-units | [Manage units](/t/charmed-kafka-how-to-manage-units/10287) |
+| 2 | h-enable-encryption | [Enable encryption](/t/charmed-kafka-how-to-enable-encryption/10281) |
+| 2 | h-manage-app | [Manage applications](/t/charmed-kafka-how-to-manage-app/10285) |
+| 2 | h-enable-monitoring | [Enable Monitoring](/t/charmed-kafka-how-to-enable-monitoring/10283) |
+| 2 | h-upgrade | [Upgrade](/t/TODO) |
+| 1 | reference | [Reference]() |
+| 2 | r-actions | [Actions](https://charmhub.io/kafka/actions?channel=3/stable) |
+| 2 | r-configurations | [Configurations](https://charmhub.io/kafka/configure?channel=3/stable) |
+| 2 | r-libraries | [Libraries](https://charmhub.io/kafka/libraries/kafka_libs?channel=3/stable) |
+| 2 | r-requirements | [Requirements](/t/charmed-kafka-reference-requirements/10563) |
+| 2 | r-performance-tuning | [Performance Tuning](/t/charmed-kafka-reference-performace-tuning/10561) |
 
 # Redirects
 