docs: Add a guide for running airflow tasks on custom nodes #871

Merged
merged 1 commit into from
Oct 16, 2023
68 changes: 68 additions & 0 deletions airflow/plural/docs/running-on-custom-nodes.md
@@ -0,0 +1,68 @@
## Running Airflow Tasks on Custom Node Group

To point your tasks at a custom node group, you will need to use the `KubernetesExecutor`.

You may want to run your Airflow tasks on a specific node size for large workloads, or on
[spot instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html) to reduce
cost.

> Disclaimer: if you run your Airflow workloads on spot instances, it is highly recommended to [set retries](https://docs.astronomer.io/learn/rerunning-dags)
> for your tasks, as they may lose their underlying compute at any time.
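
Because a spot node can be reclaimed mid-task, retries are usually set through `default_args` so that every task in a DAG inherits them. The following is a minimal sketch, assuming Airflow 2.4 or later in the 2.x line. The retry count, the delay, the DAG id `spot_friendly_dag`, and the task are illustrative assumptions, not values recommended by this guide:

```python
from datetime import timedelta

# Illustrative values (assumptions): tune the count and delay to your workload.
default_args = {
    "retries": 3,                         # re-run a task up to 3 times after a failure
    "retry_delay": timedelta(minutes=5),  # wait between attempts
}


def build_dag():
    """Build a DAG whose tasks inherit the retry settings above.

    Airflow is imported inside the function so this module still loads
    in environments where Airflow is not installed.
    """
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="spot_friendly_dag",  # hypothetical name
        start_date=datetime(2023, 1, 1),
        schedule=None,
        catchup=False,
        default_args=default_args,
    ) as dag:
        # Hypothetical task; it picks up retries and retry_delay from default_args.
        BashOperator(task_id="say_hello", bash_command="echo hello")
    return dag
```

In a real DAG file you would call `build_dag()` at module level (for example `dag = build_dag()`) so that the scheduler can discover the DAG.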

### create custom node group

Review comment (Member): We should mention that this step is optional. It is only needed if you want to ensure at least 3 spot nodes are present at any time for latency reasons.

To run your Airflow tasks on custom-configured nodes, first follow [these docs](https://docs.plural.sh/operations/cluster-configuration#modifying-node-types)
to create your desired nodes. For example, if you are on AWS and want to use spot instances, you would add something
like this to your `bootstrap/terraform/main.tf` file:

```hcl
multi_az_node_groups = {
  medium_burst_spot = {
    name             = "medium-burst-spot"
    min_capacity     = 3
    desired_capacity = 3
    instance_types   = ["t3.xlarge", "t3a.xlarge"]
    capacity_type    = "SPOT"
    k8s_labels = {
      "plural.sh/capacityType"    = "SPOT"
      "plural.sh/performanceType" = "BURST"
      "plural.sh/scalingGroup"    = "medium-burst-spot"
    }
    k8s_taints = [{
      key    = "plural.sh/capacityType"
      value  = "SPOT"
      effect = "NO_SCHEDULE"
    }]
  }
}
```

Then run `plural deploy --commit "add more spot nodes"` to update your cluster.

> ! If you get an error like `InvalidParameterException: Minimum capacity 3 can't be greater than desired size 0`, you
> may have to apply the change manually with your cloud CLI or console and then run the deploy again.

### update airflow to use node group

After creating your custom node group, you can configure Airflow to use it by adding the following to your
`./airflow/helm/values.yaml` (this can also be done in the Plural application console):

```yaml
airflow:
  airflow:
    airflow:
      config:
        kubernetesPodTemplate:
          nodeSelector:
            plural.sh/capacityType: SPOT
          tolerations:
            - effect: NoSchedule
              key: plural.sh/capacityType
              operator: Equal
              value: SPOT
```
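
The toleration above has to agree with the node group taint on key, value, and effect, otherwise task pods will not be scheduled onto the spot nodes. Note that the node group uses the EKS spelling `NO_SCHEDULE`, while the pod template uses the Kubernetes spelling `NoSchedule`. The sketch below is a simplified re-implementation of the Kubernetes matching rule, useful for checking a config by hand. The names `EKS_TO_K8S_EFFECT` and `tolerates` are hypothetical and are not part of any Plural, Airflow, or Kubernetes API:

```python
# Map the EKS node-group spelling of a taint effect to the Kubernetes spelling.
EKS_TO_K8S_EFFECT = {
    "NO_SCHEDULE": "NoSchedule",
    "NO_EXECUTE": "NoExecute",
    "PREFER_NO_SCHEDULE": "PreferNoSchedule",
}


def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint (simplified Kubernetes rule)."""
    effect = EKS_TO_K8S_EFFECT.get(taint["effect"], taint["effect"])
    # An empty toleration effect matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != effect:
        return False
    operator = toleration.get("operator", "Equal")
    # An empty key is only valid with Exists and then matches every taint key.
    if not toleration.get("key"):
        return operator == "Exists"
    if toleration["key"] != taint["key"]:
        return False
    # Exists ignores the value; anything else is treated as Equal here.
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


# Taint as written in the Terraform node group, toleration as written in values.yaml.
taint = {"key": "plural.sh/capacityType", "value": "SPOT", "effect": "NO_SCHEDULE"}
toleration = {
    "key": "plural.sh/capacityType",
    "operator": "Equal",
    "value": "SPOT",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))
```

The `nodeSelector` is a separate check: its label has to match one of the `k8s_labels` on the node group, here `plural.sh/capacityType: SPOT`.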

### redeploy

From there, run `plural build --only airflow && plural deploy --commit "run on spot instances"` so that
your tasks execute on the custom node group.