
New version of zdm-proxy helm chart using StatefulSets #89

Open
wants to merge 25 commits into main
Conversation

weideng1 (Collaborator) commented Dec 28, 2022

@jimdickinson @bradfordcp when you're back from the holidays, could you please take a look at the helm chart in this PR? Compared to the original version, the main improvement is that a StatefulSet (STS) now manages all proxy pods, while the services still route traffic to individual proxy pods via their pod names. This enables a regular rolling restart using kubectl -n zdmproxy rollout restart sts/zdm-proxy, with the appropriate wait time between pods. Changes to values.yaml or to --set values trigger a rolling restart of all proxy pods, so we can easily scale out and scale in.
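A minimal sketch of that workflow (the release name, chart path, and the value key used for scaling are illustrative, not necessarily the chart's actual names):

    # rolling restart of all proxy pods, one at a time
    kubectl -n zdmproxy rollout restart sts/zdm-proxy
    kubectl -n zdmproxy rollout status sts/zdm-proxy

    # scale out/in by changing a value; the re-rendered pod spec rolls all proxy pods
    helm upgrade zdm-proxy ./zdm -n zdmproxy --set proxy.count=5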

I also removed configmap.yaml, as all of the values in there can be retrieved directly from the helm chart's values.yaml. This way, changes to these values are detected by the StatefulSet and automatically trigger a rolling restart of all proxy pods without causing any downtime.
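To illustrate the idea, here is a rough sketch of the StatefulSet template rendering values directly into the container environment (the values.yaml keys shown are illustrative, not necessarily the chart's actual keys):

      containers:
        - name: zdm-proxy
          env:
            # rendered straight from values.yaml; any change to these values changes
            # the pod template, so the StatefulSet rolls all proxy pods automatically
            - name: ZDM_ORIGIN_CONTACT_POINTS
              value: {{ .Values.origin.contactPoints | quote }}
            - name: ZDM_TARGET_CONTACT_POINTS
              value: {{ .Values.target.contactPoints | quote }}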

@lorantfecske-bud

I was going through the PR, as we are planning to use the helm chart to deploy the zdm-proxy to our clusters.

I have a few questions about this PR. It seems to me that you have added a new StatefulSet and pointed the services at the pods in the StatefulSet. However, the Deployment is still there, so if I understand correctly, alongside the new StatefulSet the zdm-proxy will also be deployed as a Deployment, but no service will point to the Deployment(s) anymore.

I'm also not 100% clear on why the CDM deployment has been turned into a simple Pod. We are using preemptible instances and they are recycled frequently. Based on the Kubernetes docs, if the node goes away, the Pod is gone:
"The Pod remains on that node until the Pod finishes execution, the Pod object is deleted, the Pod is evicted for lack of resources, or the node fails." (Source: https://kubernetes.io/docs/concepts/workloads/pods/)

weideng1 (Collaborator, Author)

I have a few questions about this PR. It seems to me that you have added a new StatefulSet and pointed the services at the pods in the StatefulSet. However, the Deployment is still there, so if I understand correctly, alongside the new StatefulSet the zdm-proxy will also be deployed as a Deployment, but no service will point to the Deployment(s) anymore.

@lorantfecske-bud The latest commit on the k8s-helm branch has removed the Deployment completely, so zdm-proxy will use a StatefulSet instead of a Deployment.

weideng1 (Collaborator, Author)

I'm also not 100% clear why the CDM deployment has been turned into a simple pod.

@lorantfecske-bud We're planning to take CDM (Cassandra Data Migrator) out of the zdm-proxy helm chart, as it is a completely separate open-source project (https://github.com/datastax/cassandra-data-migrator) and needs its own deployment and lifecycle-management approach. For now, you can use the following command to run CDM in your k8s cluster without the helm chart:

kubectl run cdm --image=datastax/cassandra-data-migrator:latest

weideng1 requested a review from jsanda on January 18, 2023
@@ -21,6 +21,6 @@ spec:
       name: cql
   selector:
     {{- $zdm_selectorLabels | nindent 4 }}
-    app: {{ $zdm_fullname }}-{{ $index }}
+    statefulset.kubernetes.io/pod-name: {{ $zdm_fullname }}-{{ $index }}
jsanda

Why are you trying to match a per-pod label?

weideng1 (Collaborator, Author)

@jsanda The reason we create individual services and match them 1:1 at the pod level is that, unlike C*, zdm-proxy processes don't have a gossip protocol to detect and propagate topology changes among themselves; whatever set of IP addresses and their respective ordinal indexes (see here) is specified at the start of the proxy process stays fixed for the rest of its lifecycle. To allow the pods' IP addresses to change dynamically due to reschedule/cordon/crash, we decided to create N services for N proxies and map them 1-to-1. Given that pods managed by a k8s Deployment don't have static pod names, we switched to StatefulSets, where each pod has a unique, stable pod name that the corresponding service can select. This also allows an orderly rolling restart of the zdm-proxy pods.
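To make the pattern concrete, a simplified sketch of one of the N per-pod Services (resource names and port are illustrative; the chart templates one such Service per replica):

    apiVersion: v1
    kind: Service
    metadata:
      name: zdm-proxy-0              # one Service per proxy ordinal
    spec:
      ports:
        - name: cql
          port: 9042                 # assumed CQL listen port
      selector:
        # matches the stable pod name the StatefulSet assigns to ordinal 0, so the
        # Service keeps routing to the same logical proxy across pod IP changes
        statefulset.kubernetes.io/pod-name: zdm-proxy-0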

@lorantfecske-bud

@weideng1 thanks for the answers

bradfordcp (Member) left a comment

Popping some comments in, but no glaring blockers from me. I expect @jsanda will handle proper approval.

  selector:
    matchLabels:
      {{- $zdm_selectorLabels | nindent 6 }}
-      app: {{ $zdm_fullname }}-{{ $index }}
+      app: {{ $zdm_fullname }}
bradfordcp (Member)

Consider these recommended labels from the k8s docs.

https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
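For reference, the recommended label set from that page looks roughly like this (the values shown are illustrative for this chart):

    metadata:
      labels:
        app.kubernetes.io/name: zdm-proxy
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
        app.kubernetes.io/component: proxy
        app.kubernetes.io/part-of: zdm
        app.kubernetes.io/managed-by: {{ .Release.Service }}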

Comment on lines +38 to +40
- sh
- "-c"
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main"
bradfordcp (Member)

It's surprising that you're specifying a command here vs just passing in arguments to the command specified in the Docker image (or relying on arguments there).

Comment on lines +38 to +40
- sh
- "-c"
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main"
bradfordcp (Member)

You should be able to populate an environment variable with a fieldRef from metadata.name. See https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
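A sketch of that mechanism from the linked docs:

          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name   # exposes the pod's own name to the container

As the following comments note, this only exposes the full pod name rather than the ordinal index, so the numeric suffix would still have to be stripped (which is what the ${HOSTNAME##*-} expansion above does).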

bradfordcp (Member)

Hmm, it looks like you're extracting the ordinal index which is not available to the pod directly. It's frustrating that this isn't available. (I see the link to kubernetes/kubernetes#40651 as well)

weideng1 (Collaborator, Author)

@bradfordcp Yes, exactly. That was the route I explored, and I ended up using the workaround you pointed out. Should we resolve this comment?
