
New version of zdm-proxy helm chart using StatefulSets #89

Open
wants to merge 25 commits into main
Conversation

weideng1 (Collaborator) commented Dec 28, 2022

@jimdickinson @bradfordcp when you're back from the holidays, could you please take a look at the helm chart in this PR? Compared to the original version, the main improvement is that a StatefulSet (STS) now manages all proxy pods, while the services still route traffic to individual proxy pods via their pod names. This enables a regular rolling restart using kubectl -n zdmproxy rollout restart sts/zdm-proxy, with the appropriate wait time between pods. Changes to values.yaml or to --set values trigger a rolling restart of all proxy pods, so we can easily scale out and scale in.
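A minimal sketch of that workflow (the release name, chart path, and the value key used for scaling are illustrative, not necessarily the chart's actual names):

    # rolling restart of all proxy pods, one at a time
    kubectl -n zdmproxy rollout restart sts/zdm-proxy
    kubectl -n zdmproxy rollout status sts/zdm-proxy

    # scale out/in by changing a value; the re-rendered pod spec rolls all proxy pods
    helm upgrade zdm-proxy ./zdm -n zdmproxy --set proxy.count=5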

I also removed configmap.yaml, as all of the values in there can be retrieved directly from the helm chart's values.yaml. This way, changes to these values are detected by the StatefulSet and automatically trigger a rolling restart of all proxy pods without causing any downtime.
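To illustrate the idea, here is a rough sketch of the StatefulSet template rendering values directly into the container environment (the values.yaml keys shown are illustrative, not necessarily the chart's actual keys):

      containers:
        - name: zdm-proxy
          env:
            # rendered straight from values.yaml; any change to these values changes
            # the pod template, so the StatefulSet rolls all proxy pods automatically
            - name: ZDM_ORIGIN_CONTACT_POINTS
              value: {{ .Values.origin.contactPoints | quote }}
            - name: ZDM_TARGET_CONTACT_POINTS
              value: {{ .Values.target.contactPoints | quote }}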

@lorantfecske-bud

I was going through the PR, as we are planning to use the helm chart to deploy the zdm-proxy to our clusters.

I have a few questions about this PR. It seems to me that you have added a new StatefulSet and pointed the services at the pods in the StatefulSet. However, the Deployment is still there, so if I understand correctly, alongside the new StatefulSet the zdm-proxy will also be deployed as a Deployment, but no service will point to the Deployment(s) anymore.

I'm also not 100% clear on why the CDM deployment has been turned into a simple Pod. We are using preemptible instances and they are recycled frequently. Based on the Kubernetes docs, if the node goes away, the Pod is gone:
"The Pod remains on that node until the Pod finishes execution, the Pod object is deleted, the Pod is evicted for lack of resources, or the node fails." (Source: https://kubernetes.io/docs/concepts/workloads/pods/)

weideng1 (Collaborator, Author)

I have a few questions about this PR. It seems to me that you have added a new StatefulSet and pointed the services at the pods in the StatefulSet. However, the Deployment is still there, so if I understand correctly, alongside the new StatefulSet the zdm-proxy will also be deployed as a Deployment, but no service will point to the Deployment(s) anymore.

@lorantfecske-bud The latest commit on the k8s-helm branch has removed the Deployment completely, so zdm-proxy will use a StatefulSet instead of a Deployment.

weideng1 (Collaborator, Author)

I'm also not 100% clear why the CDM deployment has been turned into a simple pod.

@lorantfecske-bud We're planning to take CDM (Cassandra Data Migrator) out of the zdm-proxy helm chart, as it is a completely separate open-source project (https://github.com/datastax/cassandra-data-migrator) and needs its own deployment and lifecycle-management approach. For now, you can use the following command to run CDM in your k8s cluster without the helm chart:

kubectl run cdm --image=datastax/cassandra-data-migrator:latest

weideng1 requested a review from jsanda on January 18, 2023
@@ -21,6 +21,6 @@ spec:
       name: cql
   selector:
     {{- $zdm_selectorLabels | nindent 4 }}
-    app: {{ $zdm_fullname }}-{{ $index }}
+    statefulset.kubernetes.io/pod-name: {{ $zdm_fullname }}-{{ $index }}
jsanda

Why are you trying to match a per-pod label?

weideng1 (Collaborator, Author)

@jsanda The reason we create individual services and match them 1:1 at the pod level is that, unlike C*, zdm-proxy processes don't have a gossip protocol to detect and propagate topology changes among themselves; whatever set of IP addresses and their respective ordinal indexes (see here) is specified at the start of the proxy process stays fixed for the rest of its lifecycle. To allow the pods' IP addresses to change dynamically due to reschedule/cordon/crash, we decided to create N services for N proxies and map them 1-to-1. Given that pods managed by a k8s Deployment don't have static pod names, we switched to StatefulSets, where each pod has a unique, stable pod name that the corresponding service can select. This also allows an orderly rolling restart of the zdm-proxy pods.
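To make the pattern concrete, a simplified sketch of one of the N per-pod Services (resource names and port are illustrative; the chart templates one such Service per replica):

    apiVersion: v1
    kind: Service
    metadata:
      name: zdm-proxy-0              # one Service per proxy ordinal
    spec:
      ports:
        - name: cql
          port: 9042                 # assumed CQL listen port
      selector:
        # matches the stable pod name the StatefulSet assigns to ordinal 0, so the
        # Service keeps routing to the same logical proxy across pod IP changes
        statefulset.kubernetes.io/pod-name: zdm-proxy-0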

@lorantfecske-bud

@weideng1 thanks for the answers

bradfordcp (Member) left a comment

Popping some comments in, but no glaring blockers from me. I expect @jsanda will handle proper approval.

  selector:
    matchLabels:
      {{- $zdm_selectorLabels | nindent 6 }}
-      app: {{ $zdm_fullname }}-{{ $index }}
+      app: {{ $zdm_fullname }}
bradfordcp (Member)

Consider these recommended labels from the k8s docs.

https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
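For reference, the recommended label set from that page looks roughly like this (the values shown are illustrative for this chart):

    metadata:
      labels:
        app.kubernetes.io/name: zdm-proxy
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
        app.kubernetes.io/component: proxy
        app.kubernetes.io/part-of: zdm
        app.kubernetes.io/managed-by: {{ .Release.Service }}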

Comment on lines +38 to +40
- sh
- "-c"
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main"
bradfordcp (Member)

It's surprising that you're specifying a command here vs just passing in arguments to the command specified in the Docker image (or relying on arguments there).

Comment on lines +38 to +40
- sh
- "-c"
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main"
bradfordcp (Member)

You should be able to populate an environment variable with a fieldRef from metadata.name. See https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
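A sketch of that mechanism from the linked docs:

          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name   # exposes the pod's own name to the container

As the following comments note, this only exposes the full pod name rather than the ordinal index, so the numeric suffix would still have to be stripped (which is what the ${HOSTNAME##*-} expansion above does).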

bradfordcp (Member)

Hmm, it looks like you're extracting the ordinal index which is not available to the pod directly. It's frustrating that this isn't available. (I see the link to kubernetes/kubernetes#40651 as well)

weideng1 (Collaborator, Author)

@bradfordcp Yes, exactly. That was the route I explored, and I ended up using the workaround you pointed out. Should we resolve this comment?
