This example project runs a PySpark job on Kubernetes as a Spark driver deployment. More information here.
This example contains two Spark deployment modes:
- local mode (helm-values/k8s-spark-local-example)
- client mode (helm-values/k8s-spark-client-example)
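The difference between the two modes comes down to a handful of Spark properties. The sketch below is illustrative only and is not taken from this project's Helm values: the function name `spark_conf`, the default namespace, the image name and the executor count are all hypothetical. The property keys themselves are standard Spark settings.

```python
def spark_conf(mode, namespace="default", image="example/spark-py:latest", executors=2):
    """Return Spark properties for a deployment mode (hypothetical helper)."""
    if mode == "local":
        # Local mode: driver and executors run as threads in one JVM,
        # all inside the single driver pod.
        return {"spark.master": "local[*]"}
    if mode == "client":
        # Client mode: the driver pod asks the Kubernetes API server
        # to start separate executor pods.
        return {
            "spark.master": "k8s://https://kubernetes.default.svc",
            "spark.submit.deployMode": "client",
            "spark.kubernetes.namespace": namespace,        # assumed value
            "spark.kubernetes.container.image": image,      # assumed value
            "spark.executor.instances": str(executors),     # assumed value
        }
    raise ValueError(f"unsupported mode: {mode!r}")


if __name__ == "__main__":
    for key, value in spark_conf("client", namespace="data-flux-dev").items():
        print(f"{key}={value}")
```

Each key/value pair can be passed to `SparkSession.builder.config(key, value)` in the PySpark job. In client mode the executors also need to reach the driver pod over the network, which usually requires setting `spark.driver.host`; that is left out here.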
Install a local Kubernetes cluster first, for example with minikube.
Install the task build tool.
Run locally:
task run.local
Undeploy locally:
task spark.helm.undeploy
Edit the .gitlab-ci.yml file so that it deploys to your own k8s namespace. See the platform team documentation for details about the GitLab runner and Kubernetes.
Creating a branch automatically deploys it to the ew1d2 cluster, in the data-flux-dev namespace. Merging the branch deploys the code to data-flux-stg and then to the data-flux namespace on the ew1p3 cluster.