
feat: Add jobs pod when using high availability deployment #115

Merged
merged 5 commits into main on Apr 17, 2024
Conversation

@jsirianni (Member) commented Apr 16, 2024

Description of Changes

Copied bindplane.yaml to bindplane-jobs.yaml. The new jobs deployment will deploy when the deployment type is "Deployment" (High availability).

Currently, the jobs deployment has a configuration similar to the main bindplane deployment. Notable differences:

  • BINDPLANE_MODE is all
  • Component label has value jobs
    • This will keep the jobs pod out of ingress
  • Has its own resource request and limit configuration
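Taken together, the changes above amount to something like the following fragment of the new jobs Deployment (a sketch, not the exact chart source; the `bindplane.fullname` helper name is an assumption):

```yaml
# Sketch of the relevant parts of bindplane-jobs.yaml (illustrative only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "bindplane.fullname" . }}-jobs   # helper name assumed
  labels:
    app.kubernetes.io/component: jobs   # keeps the pod out of the service/ingress selector
spec:
  replicas: 1                           # always a single jobs pod
  template:
    spec:
      containers:
        - name: jobs
          env:
            - name: BINDPLANE_MODE
              value: all                # jobs pod always runs in "all" mode
          {{- with .Values.jobs.resources }}
          resources:
            {{- toYaml . | nindent 12 }}
          {{- end }}
```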

TODO

I think in the future we can refactor both deployments so they share more configuration. Specifically, environment variables. I would like to get more robust testing in place before making such a large change.

Testing

The jobs pod should only deploy when using postgres. When postgres is configured, the deployment type will be Deployment. When using bbolt, deployment type will be StatefulSet.
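The selection is driven by a template helper. A minimal sketch of what `bindplane.deployment_type` might look like, assuming a `.Values.backend.postgres.enabled` style flag (the real helper may key off different values):

```yaml
{{/* Sketch only: returns the workload kind based on the storage backend.
     The actual chart helper may inspect different values. */}}
{{- define "bindplane.deployment_type" -}}
{{- if .Values.backend.postgres.enabled -}}
Deployment
{{- else -}}
StatefulSet
{{- end -}}
{{- end -}}
```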

Bbolt

When testing with bbolt, the jobs pod is not deployed and the statefulset has mode: all.

NAME                                                      READY   STATUS    RESTARTS  
release-name-bindplane-0                                  1/1     Running   0        
release-name-bindplane-prometheus-0                       1/1     Running   0       
release-name-bindplane-transform-agent-59d7657d8f-k4jjv   1/1     Running   0        
$ kubectl exec sts/release-name-bindplane -- env | grep BINDPLANE_MODE

BINDPLANE_MODE=all

Postgres

When using Postgres, BindPlane is deployed as a multi-pod Deployment along with a single-pod "jobs" Deployment. The main Deployment has mode: node and the single jobs pod has mode: all. Also note that the jobs pod is not present in the main bindplane clusterIP service, meaning agents and the UI will not connect to it. Its only purpose is to handle periodic jobs.

I created the license secret with:

kubectl create secret generic bindplane \
  --from-literal=license=$BINDPLANE_LICENSE

I deployed the pubsub and postgres helpers and used the values file at test/cases/pubsub.

NAME                                                      READY   STATUS    RESTARTS      AGE
release-name-bindplane-dd8cd779c-krg95                    1/1     Running   2 (55s ago)  
release-name-bindplane-dd8cd779c-plkw6                    1/1     Running   2 (56s ago)  
release-name-bindplane-dd8cd779c-v8b4j                    1/1     Running   2 (55s ago)  
release-name-bindplane-jobs-56d9b85479-8988g              1/1     Running   0             
release-name-bindplane-prometheus-0                       1/1     Running   0            
release-name-bindplane-transform-agent-75769c65d8-ssl7g   1/1     Running   0            

The bindplane pods restarted a couple of times while waiting for the jobs pod to finish configuring the database for the initial deployment.

Once deployed, the main deployment has mode: node.

$ kubectl exec deploy/release-name-bindplane -- env | grep BINDPLANE_MODE

BINDPLANE_MODE=node

The jobs deployment has mode: all.

$ kubectl exec deploy/release-name-bindplane-jobs -- env | grep BINDPLANE_MODE

BINDPLANE_MODE=all

The clusterIP service does not match the jobs pod's IP address, meaning it will never route incoming requests to it. This is desired to keep the jobs pod's resource consumption low.

NAME                                                      IP
release-name-bindplane-dd8cd779c-krg95                    10.244.0.18
release-name-bindplane-dd8cd779c-plkw6                    10.244.0.15
release-name-bindplane-dd8cd779c-v8b4j                    10.244.0.16
release-name-bindplane-jobs-56d9b85479-8988g              10.244.0.17
release-name-bindplane-prometheus-0                       10.244.0.14
$ kubectl describe svc release-name-bindplane | grep Endpoints

Endpoints:         10.244.0.15:3001,10.244.0.16:3001,10.244.0.18:3001

The UI can be accessed by port forwarding to the service.

kubectl port-forward service/release-name-bindplane 3011:3001

http://localhost:3011

Please check that the PR fulfills these requirements

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • CI passes

@@ -0,0 +1,447 @@
{{- if eq (include "bindplane.deployment_type" .) "Deployment" }}
@jsirianni (Member Author) commented Apr 17, 2024:

This deployment is based on bindplane.yaml, with anything related to the statefulset removed, such as volumes or the Prometheus sidecar.

All logic is identical to bindplane.yaml unless otherwise noted.

Comment on lines +60 to +65
- name: BINDPLANE_MODE
{{- if eq (include "bindplane.deployment_type" .) "StatefulSet" }}
value: all
{{- else }}
value: node
{{- end }}
jsirianni (Member Author):

The StatefulSet is deployed as a single pod and should always have mode: all.

Comment on lines +330 to +333
{{- with .Values.jobs.resources }}
resources:
{{- toYaml . | nindent 12 }}
{{- end }}
jsirianni (Member Author):

Jobs-specific resource configuration. In a highly scaled environment, the main pods may be given low CPU and memory values, so we want the jobs pod to have its own configuration.
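For example, a user could size the jobs pod independently in their values file (field names inferred from the template hunk above; the resource values are placeholders):

```yaml
# values.yaml (illustrative)
jobs:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
```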

app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
spec:
replicas: 1
jsirianni (Member Author):

Unlike bindplane.yaml, we always set replicas to 1 to ensure there is a single jobs pod.

During a rolling update, it is okay if multiple jobs pods exist for the duration of the rollout.
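Because brief overlap is acceptable, the Kubernetes default RollingUpdate strategy is sufficient and no Recreate strategy is required. As a sketch (the chart may simply rely on the default):

```yaml
spec:
  replicas: 1            # always a single jobs pod, regardless of replica settings elsewhere
  strategy:
    type: RollingUpdate  # brief overlap of old and new jobs pods during rollout is acceptable
```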

labels:
app.kubernetes.io/name: {{ include "bindplane.name" . }}
app.kubernetes.io/stack: bindplane
app.kubernetes.io/component: jobs
jsirianni (Member Author):

The component label is jobs instead of server. The clusterIP service will not match this pod due to the labels.
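A sketch of how the clusterIP Service selects only the server pods (selector labels assumed to mirror the pod labels above; the `bindplane.fullname` helper name is an assumption):

```yaml
# Sketch of the clusterIP Service selector (illustrative only)
apiVersion: v1
kind: Service
metadata:
  name: {{ include "bindplane.fullname" . }}   # helper name assumed
spec:
  selector:
    app.kubernetes.io/name: {{ include "bindplane.name" . }}
    app.kubernetes.io/component: server  # jobs pods carry component=jobs and never match
  ports:
    - port: 3001
      targetPort: 3001
```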

@jsirianni jsirianni marked this pull request as ready for review April 17, 2024 14:26
@jsirianni jsirianni requested a review from a team as a code owner April 17, 2024 14:26
@jsirianni jsirianni requested review from Mrod1598, dpaasman00 and antonblock and removed request for Mrod1598 and dpaasman00 April 17, 2024 14:26
antonblock
antonblock previously approved these changes Apr 17, 2024
@jsirianni jsirianni merged commit 59a9da1 into main Apr 17, 2024
16 checks passed
@jsirianni jsirianni deleted the jobs branch April 17, 2024 19:09