Configure Airflow tasks using dbt model meta #1339
It would be great to add documentation for the features introduced in this PR. However, looking at the current project, it seems there aren’t any markdown files for documentation apart from the CONTRIBUTING file. Where do you think would be the best place to add this documentation?
We use rst. You can find docs at https://github.com/astronomer/astronomer-cosmos/tree/main/docs
Could you please rebase this PR?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1339      +/-   ##
==========================================
- Coverage   96.02%   95.98%   -0.04%
==========================================
  Files          67       67
  Lines        4025     4036      +11
==========================================
+ Hits         3865     3874       +9
- Misses        160      162       +2
Related issue?
Hi, @wornjs,
The per-node configuration has been a long-standing request (example: #881 (comment)), and your PR solves this. This is an exciting feature; thanks for contributing to Cosmos!
A question and two requests:
- What are the advantages/disadvantages of having these properties in meta as opposed to having them in config:
  meta:
    cosmos:
      pool: abcd
  versus
  config:
    cosmos:
      pool: abcd
- Please, could you add tests to cover this feature?
- We'll also need docs.
If you can address these before 20 December, we can ship Cosmos 1.8. I'm tentatively adding this to that milestone.
I'm going to add tests by the end of this week.
Hi, responding here instead of in #1325 as it seems the discussion has migrated here.

First, this PR is related to #881. This issue was for supporting arbitrary kwargs, not just a single kwarg. I was a big fan of where we ended up with the API of that proposal, and there are a few differences.

The first difference, as @tatiana brings up, is that it uses

The second difference is, as mentioned, it supports arbitrary kwargs, not just

The third difference is that the namespace was

So my final proposal would be this:

version: 2
models:
  - name: model_a
    config:
      alias: model_a
      cosmos:
        operator_args:
          pool: my-pool-here

(or alternatively, if using

version: 2
models:
  - name: model_a
    config:
      alias: model_a
      meta:
        cosmos:
          operator_args:
            pool: my-pool-here

^ So the key difference is there is one extra dict, i.e.

But really, I think we should just

version: 2
models:
  - name: model_a
    config:
      alias: model_a
      cosmos:
        operator_args:
          pool: my-pool-here
          retries: 4
          conn_id: special_conn_id
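To make the proposed shape concrete, here is a minimal sketch of how a lookup for the extra `operator_args` dict could work under either namespace. The function name `extract_operator_args` and the plain-dict representation of a model's config are assumptions for illustration; this is not the code in this PR or an existing Cosmos API.

```python
from typing import Any, Dict


def extract_operator_args(model_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return per-model operator args from a model's config dict.

    Hypothetical helper: looks under config.cosmos.operator_args first,
    then falls back to config.meta.cosmos.operator_args.
    """
    for namespace in (model_config, model_config.get("meta") or {}):
        cosmos = namespace.get("cosmos") or {}
        if "operator_args" in cosmos:
            # Copy so callers cannot mutate the parsed config by accident.
            return dict(cosmos["operator_args"])
    return {}


# The two layouts discussed above, written as already-parsed YAML.
config_variant = {
    "alias": "model_a",
    "cosmos": {
        "operator_args": {
            "pool": "my-pool-here",
            "retries": 4,
            "conn_id": "special_conn_id",
        }
    },
}
meta_variant = {
    "alias": "model_a",
    "meta": {"cosmos": {"operator_args": {"pool": "my-pool-here"}}},
}

args_from_config = extract_operator_args(config_variant)
args_from_meta = extract_operator_args(meta_variant)
args_missing = extract_operator_args({"alias": "model_a"})
```

Keeping everything under one `operator_args` key means any kwarg can be passed through without Cosmos having to know about each one individually.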
I've confirmed both
(The only way to support merging this way would be to use top-level attributes, but I think that's tedious and limiting.) There is an issue open relating to this in dbt-core where someone suggests recursive merging for
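For anyone following the merging question, here is a small sketch of the difference between a shallow merge and a recursive merge of nested dicts. It is only an illustration of the two strategies; it does not reproduce dbt's actual merge logic, and the function names and sample values are made up.

```python
import copy
from typing import Any, Dict


def shallow_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    # Top-level keys from `override` replace those in `base` wholesale.
    return {**base, **override}


def recursive_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    # Nested dicts are merged key by key instead of being replaced.
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = recursive_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical project-wide and model-specific settings.
project_level = {"cosmos": {"operator_args": {"pool": "general_pool", "retries": 2}}}
model_level = {"cosmos": {"operator_args": {"pool": "custom_pool"}}}

shallow_result = shallow_merge(project_level, model_level)
recursive_result = recursive_merge(project_level, model_level)
```

With the shallow merge, the model-level `cosmos` dict replaces the project-level one entirely, so `retries` is lost. With the recursive merge, only `pool` is overridden and `retries` is kept.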
Description
dbt models differ in their characteristics, and some require custom pools, queues, or other specific configuration. To support such cases, this PR adds the ability to declare that information in the meta section of the dbt model.yaml. The metadata is then passed as kwargs to the corresponding Airflow tasks, which allows each model's task to be configured individually.
Here is a sample:
DbtTaskGroup: default_args for all dbt models
Result:
General pool
Custom pool
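The behavior described above can be sketched roughly as follows: shared defaults apply to every model, and values found under `meta.cosmos` override them for a single model. This is a simplified illustration using plain dicts, not the actual implementation in this PR; `build_task_kwargs` and the pool names are hypothetical.

```python
from typing import Any, Dict


def build_task_kwargs(node_config: Dict[str, Any], default_args: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay per-model settings from meta.cosmos on top of shared defaults."""
    # Assumed layout: the model's metadata is available as node_config["meta"]["cosmos"].
    overrides = (node_config.get("meta") or {}).get("cosmos") or {}
    # Later keys win, so model-level values replace the defaults.
    return {**default_args, **overrides}


# Defaults that would apply to all dbt models in the task group.
default_args = {"pool": "general_pool", "retries": 1}

# One model declares a custom pool in its meta section, the other declares nothing.
model_with_meta = {"meta": {"cosmos": {"pool": "custom_pool"}}}
model_without_meta = {"meta": {}}

custom_kwargs = build_task_kwargs(model_with_meta, default_args)
general_kwargs = build_task_kwargs(model_without_meta, default_args)
```

Here the first model's task would run in `custom_pool`, while the second keeps `general_pool` from the defaults.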
Related Issue(s)
#1325