RabbitMQ: explain how to properly avoid an OrderedReady-induced deployment deadlock (take 1) #25873

bitnami/rabbitmq/README.md (50 changes: 21 additions & 29 deletions)
@@ -263,43 +263,35 @@ extraConfiguration: |
log.console.formatter = json
```

## How to Avoid Deadlocked Deployments After a Cluster-Wide Restart

RabbitMQ nodes assume their peers come back online within five minutes (by default). When the `OrderedReady` pod management policy is used with a readiness probe that implicitly requires a fully booted node, the deployment can deadlock (a configuration sketch follows the list below):

- Kubernetes will expect the first node to pass a readiness probe
- The readiness probe may require a fully booted node
- The node will fully boot after it detects that its peers have come online
- Kubernetes will not start any more pods until the first one boots
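
To make the failure mode concrete, here is a minimal sketch of a value combination that can deadlock this way. The value names (`readinessProbe.enabled`, `customReadinessProbe`) and the probe command are illustrative assumptions, not chart defaults; the key point is that `rabbitmq-diagnostics check_running` only succeeds once a node has fully booted.

```yaml
# Hypothetical values.yaml sketch of a deadlock-prone combination.
# Value names are assumptions; check the values your chart version exposes.
replicaCount: 3
podManagementPolicy: "OrderedReady"  # pods start strictly one at a time
readinessProbe:
  enabled: false                     # the default probe is replaced below
customReadinessProbe:
  exec:
    command:
      - /bin/sh
      - -ec
      # check_running fails until the node has FULLY booted, and a booting
      # node waits (up to five minutes by default) for peers that Kubernetes
      # will not schedule until this very probe passes.
      - rabbitmq-diagnostics -q check_running
  initialDelaySeconds: 10
  periodSeconds: 30
```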

Using the [RabbitMQ Cluster Operator](https://www.rabbitmq.com/kubernetes/operator/operator-overview) is the easiest solution.
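
For instance, once the Operator is installed, a three-node cluster can be declared with a single custom resource, and the Operator manages the underlying StatefulSet, probes, and restart behavior for you (a minimal sketch; the name `my-rabbitmq` is illustrative):

```yaml
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: my-rabbitmq
spec:
  replicas: 3   # the Operator handles restart ordering and recovery
```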

Alternatively, the following combination of deployment settings avoids the problem, as the sketch after this list shows:

- Use `podManagementPolicy: "Parallel"` to boot multiple cluster nodes in parallel
- Use `rabbitmq-diagnostics ping` for the readiness probe
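
A minimal values sketch of that combination, again assuming `readinessProbe.enabled` and `customReadinessProbe` are the knobs your chart version exposes:

```yaml
podManagementPolicy: "Parallel"  # boot all cluster nodes at once
readinessProbe:
  enabled: false                 # disable the default probe in favor of the one below
customReadinessProbe:
  exec:
    command:
      - /bin/sh
      - -ec
      # `ping` passes as soon as the node's runtime responds, even while the
      # RabbitMQ application is still waiting for its peers, so pods become
      # ready and the peers can discover each other.
      - rabbitmq-diagnostics -q ping
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 20
```

With `Parallel`, all pods are scheduled at once, so peers come back online well within the default five-minute window.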

Note that forcing nodes to boot is **not a solution** and doing so **can be dangerous**. Forced booting is a last-resort mechanism
in RabbitMQ that helps the remaining cluster nodes recover and rejoin each other after a permanent loss of some of their former
peers. In other words, forcing a node to boot is an emergency recovery procedure.

To learn more, see:

- [RabbitMQ Clustering guide: Node Restarts](https://www.rabbitmq.com/docs/clustering#restarting)
- [RabbitMQ Clustering guide: Restarts and Readiness Probes](https://www.rabbitmq.com/docs/clustering#restarting-readiness-probes)
- [Recommendations](https://www.rabbitmq.com/docs/cluster-formation#peer-discovery-k8s) for [Operator](https://www.rabbitmq.com/kubernetes/operator/operator-overview)-less (DIY) deployments to Kubernetes
- [DIY RabbitMQ deployments on Kubernetes: What's Involved?](https://www.rabbitmq.com/blog/2020/08/10/deploying-rabbitmq-to-kubernetes-whats-involved)

## Known issues

- Changing the password through RabbitMQ's UI can make the pod fail due to the default liveness probes. If you do so, remember to make the chart aware of the new password. Updating the default secret with the password you set through RabbitMQ's UI will automatically recreate the pods. If you are using your own secret, you may have to manually recreate the pods.
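
If you stay with the default secret, syncing it with the password set through the UI could look like the following sketch. The secret name and key are assumptions for illustration; inspect the release's secrets for the real ones.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-release-rabbitmq   # hypothetical; typically derived from the release name
type: Opaque
stringData:
  rabbitmq-password: "THE_PASSWORD_SET_VIA_THE_UI"  # assumed key name
```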

@@ -876,4 +868,4 @@
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.