Add support to `TestBehavior.BUILD` (#1377)
By default, Cosmos uses `TestBehavior.AFTER_EACH`, creating an Airflow TaskGroup that contains two tasks:
* one to run the model, seed or snapshot
* another to run the tests related to that dbt resource

While many users desire and expect this behaviour, it can also mean additional overhead, especially in dbt projects with more than 500 models. Each time the `dbt` command is executed, there is an overhead, even when using optimisations such as partial parsing and `dbtRunner`. There is also an overhead in splitting a task across multiple Airflow workers.

To illustrate, here are some numbers shared by an Astronomer customer for the dbt command execution (measured between the log lines "running dbt with arguments" and "Done."):
* Running `dbt build` for a particular model + its tests: 46s
* Running `dbt run` + `dbt test` individually: 2min15s

This PR introduces a new behaviour, `TestBehavior.BUILD`, where Cosmos can run both the model/seed/snapshot and the associated tests using a single command (`dbt build`).

For documentation on `dbt build`, check https://docs.getdbt.com/reference/commands/build.

This is an example of how the DAG renders with this test behaviour when running:
```
airflow dags test example_cosmos_dbt_build
```
<img width="1624" alt="Screenshot 2024-12-10 at 15 08 45" src="https://github.com/user-attachments/assets/d74d7688-5cbf-4f18-83ad-c9847e34252e">

And this is an example of the output, showing that both the model and its tests are run using the build command:
```
[2024-12-10 15:19:23,667] {local.py:405} INFO - Trying to run dbtRunner with: ['build', '--models', 'customers', '--full-refresh', '--project-dir', '/var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpghz8naek', '--profiles-dir', '/tmp/profile/ac4e9cde9bc05d574c157e795dcbcc6b60246a73ca1d92d4fc669e90a1e494e0', '--profile', 'default', '--target', 'dev'] in /var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpghz8naek
[2024-12-10T15:19:23.667+0000] {local.py:405} INFO - Trying to run dbtRunner with: ['build', '--models', 'customers', '--full-refresh', '--project-dir', '/var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpghz8naek', '--profiles-dir', '/tmp/profile/ac4e9cde9bc05d574c157e795dcbcc6b60246a73ca1d92d4fc669e90a1e494e0', '--profile', 'default', '--target', 'dev'] in /var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpghz8naek
15:19:23 Running with dbt=1.8.0
15:19:23 Registered adapter: postgres=1.8.0
15:19:23 Found 5 models, 3 seeds, 20 data tests, 528 macros
15:19:23
15:19:23 Concurrency: 1 threads (target='dev')
15:19:23
15:19:23 1 of 4 START sql table model public.customers .................................. [RUN]
15:19:23 1 of 4 OK created sql table model public.customers ............................. [SELECT 100 in 0.04s]
15:19:23 2 of 4 START test not_null_customers_customer_id ............................... [RUN]
15:19:23 2 of 4 PASS not_null_customers_customer_id ..................................... [PASS in 0.02s]
15:19:23 3 of 4 START test relationships_orders_customer_id__customer_id__ref_customers_ [RUN]
15:19:23 3 of 4 PASS relationships_orders_customer_id__customer_id__ref_customers_ ...... [PASS in 0.02s]
15:19:23 4 of 4 START test unique_customers_customer_id ................................. [RUN]
15:19:23 4 of 4 PASS unique_customers_customer_id ....................................... [PASS in 0.02s]
15:19:23
15:19:23 Finished running 1 table model, 3 data tests in 0 hours 0 minutes and 0.19 seconds (0.19s).
15:19:24
15:19:24 Completed successfully
15:19:24
15:19:24 Done. PASS=4 WARN=0 ERROR=0 SKIP=0 TOTAL=4
```

Closes: #892
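The two timings quoted in the commit message (46s for `dbt build`, 2min15s for `dbt run` + `dbt test`) can be turned into a rough per-model saving. The sketch below is only arithmetic on those two figures. The 500-model extrapolation is a hypothetical that assumes every model behaves like the measured one, and it describes cumulative dbt command time, not DAG wall-clock time, since Airflow runs tasks in parallel.

```python
# Timings quoted in the commit message (one customer's data, one model plus its tests).
BUILD_SECONDS = 46  # a single `dbt build`
RUN_THEN_TEST_SECONDS = 2 * 60 + 15  # `dbt run` + `dbt test` executed individually


def seconds_saved(separate: int, combined: int) -> int:
    """Absolute time saved per model by issuing one command instead of two."""
    return separate - combined


def fraction_saved(separate: int, combined: int) -> float:
    """Share of the separate run + test time that is no longer spent."""
    return (separate - combined) / separate


saved = seconds_saved(RUN_THEN_TEST_SECONDS, BUILD_SECONDS)
share = fraction_saved(RUN_THEN_TEST_SECONDS, BUILD_SECONDS)

# Hypothetical extrapolation (an assumption, not a measurement):
# 500 models, each with the same timings as the measured one.
HYPOTHETICAL_MODEL_COUNT = 500
total_hours_saved = saved * HYPOTHETICAL_MODEL_COUNT / 3600

print(f"saved per model: {saved}s ({share:.0%} of the separate run + test time)")
print(f"cumulative saving across {HYPOTHETICAL_MODEL_COUNT} models: {total_hours_saved:.1f}h")
```

Real projects will differ: the quoted numbers come from a single customer, and the overhead per command depends on project size, adapter, and whether partial parsing is in effect.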
Showing 8 changed files with 153 additions and 13 deletions.
@@ -0,0 +1,47 @@
```python
"""
An example Airflow DAG that illustrates using the dbt build to run both models/seeds/sources and their respective tests.
"""

import os
from datetime import datetime
from pathlib import Path

from cosmos import DbtDag, ProfileConfig, ProjectConfig, RenderConfig
from cosmos.constants import TestBehavior
from cosmos.profiles import PostgresUserPasswordProfileMapping

DEFAULT_DBT_ROOT_PATH = Path(__file__).parent / "dbt"
DBT_ROOT_PATH = Path(os.getenv("DBT_ROOT_PATH", DEFAULT_DBT_ROOT_PATH))

profile_config = ProfileConfig(
    profile_name="default",
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="example_conn",
        profile_args={"schema": "public"},
        disable_event_tracking=True,
    ),
)

# [START build_example]
example_cosmos_dbt_build = DbtDag(
    # dbt/cosmos-specific parameters
    project_config=ProjectConfig(
        DBT_ROOT_PATH / "jaffle_shop",
    ),
    render_config=RenderConfig(
        test_behavior=TestBehavior.BUILD,
    ),
    profile_config=profile_config,
    operator_args={
        "install_deps": True,  # install any necessary dependencies before running any dbt command
        "full_refresh": True,  # used only in dbt commands that support this flag
    },
    # normal dag parameters
    schedule_interval="@daily",
    start_date=datetime(2023, 1, 1),
    catchup=False,
    dag_id="example_cosmos_dbt_build",
    default_args={"retries": 2},
)
# [END build_example]
```
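To make the difference between the two test behaviours concrete, here is a minimal sketch of which dbt argument lists each one implies for a single resource. It is not Cosmos code: `SketchTestBehavior` and `dbt_invocations` are hypothetical names defined locally for illustration. The `build` arguments mirror the `['build', '--models', 'customers', '--full-refresh', ...]` line in the log output above. The `run` and `test` arguments for the two-task case are an assumption about what separate invocations would look like.

```python
from enum import Enum
from typing import List


class SketchTestBehavior(Enum):
    """Local stand-in for the two behaviours discussed here; not the Cosmos enum."""

    AFTER_EACH = "after_each"
    BUILD = "build"


def dbt_invocations(
    resource: str,
    behavior: SketchTestBehavior,
    full_refresh: bool = False,
) -> List[List[str]]:
    """Return one dbt argument list per command needed for `resource`."""
    if behavior is SketchTestBehavior.BUILD:
        # One command runs the resource and its tests together.
        commands = [["build", "--models", resource]]
    else:
        # Assumed shape of the two-task behaviour: run first, then test.
        commands = [
            ["run", "--models", resource],
            ["test", "--models", resource],
        ]

    if full_refresh:
        # `dbt test` has no --full-refresh flag, so only run/build receive it.
        for command in commands:
            if command[0] in ("run", "build"):
                command.append("--full-refresh")
    return commands


for behavior in SketchTestBehavior:
    print(behavior.name, dbt_invocations("customers", behavior, full_refresh=True))
```

Each inner list corresponds to one dbt invocation, so the length of the result is the number of times the per-command overhead described in the commit message is paid for that resource.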