Add performance integration tests #827

Merged 6 commits on Feb 15, 2024
Changes from 3 commits
59 changes: 53 additions & 6 deletions .github/workflows/test.yml
Expand Up @@ -2,7 +2,7 @@ name: test

on:
push: # Run on pushes to the default branch
branches: [main]
branches: [main, performance-int-tests] # TODO: remove me before merging
pull_request_target: # Also run on pull requests originated from forks
branches: [main]

Expand All @@ -11,10 +11,8 @@ concurrency:
cancel-in-progress: true

jobs:

Authorize:
environment:
${{ github.event_name == 'pull_request_target' &&
environment: ${{ github.event_name == 'pull_request_target' &&
github.event.pull_request.head.repo.full_name != github.repository &&
'external' || 'internal' }}
runs-on: ubuntu-latest
Expand All @@ -30,8 +28,8 @@ jobs:

- uses: actions/setup-python@v3
with:
python-version: '3.9'
architecture: 'x64'
python-version: "3.9"
architecture: "x64"

- run: pip3 install hatch
- run: hatch run tests.py3.9-2.7:type-check
Expand Down Expand Up @@ -294,6 +292,55 @@ jobs:
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH

Run-Performance-Tests:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.11"]
airflow-version: ["2.7"]
num-models: [1, 10, 50, 100]

steps:
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha || github.ref }}
- uses: actions/cache@v3
with:
path: |
~/.cache/pip
.nox
key: perf-test-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.airflow-version }}-${{ hashFiles('pyproject.toml') }}-${{ hashFiles('cosmos/__init__.py') }}

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}

- name: Install packages and dependencies
run: |
python -m pip install hatch
hatch -e tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }} run pip freeze

- name: Run performance tests against Airflow ${{ matrix.airflow-version }} and Python ${{ matrix.python-version }}
id: run-performance-tests
run: |
hatch run tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }}:test-performance-setup
hatch run tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }}:test-performance

# read the performance results and write them to the job summary for this step
# format: NUM_MODELS={num_models}\nTIME={end - start}\n
cat /tmp/performance_results.txt > $GITHUB_STEP_SUMMARY
env:
AIRFLOW_HOME: /home/runner/work/astronomer-cosmos/astronomer-cosmos/
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres
AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT: 90.0
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH
MODEL_COUNT: ${{ matrix.num-models }}

env:
AIRFLOW_HOME: /home/runner/work/astronomer-cosmos/astronomer-cosmos/
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH

Code-Coverage:
if: github.event.action != 'labeled'
Expand Down
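The comment in the performance step describes the results file as `NUM_MODELS={num_models}\nTIME={end - start}\n`. The code that writes `/tmp/performance_results.txt` is not shown in this view, so the following is only a minimal sketch of that layout; the helper names `format_results` and `parse_results` are hypothetical and not part of Cosmos.

```python
from __future__ import annotations


def format_results(num_models: int, elapsed_seconds: float) -> str:
    """Render results as two KEY=VALUE lines, matching the workflow comment."""
    return f"NUM_MODELS={num_models}\nTIME={elapsed_seconds}\n"


def parse_results(text: str) -> dict[str, str]:
    """Split each KEY=VALUE line into a dictionary entry, skipping blank lines."""
    results: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # partition splits on the first "=" only, so values may contain "="
        key, _, value = line.partition("=")
        results[key.strip()] = value.strip()
    return results


if __name__ == "__main__":
    print(parse_results(format_results(10, 1.5)))
```

Values are kept as strings here so the parser makes no assumption about how the elapsed time is formatted.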
4 changes: 4 additions & 0 deletions dev/dags/dbt/perf/.gitignore
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
3 changes: 3 additions & 0 deletions dev/dags/dbt/perf/README.md
@@ -0,0 +1,3 @@
dbt project for running performance tests.

The `models` directory gets populated by an integration test defined in `tests/perf`.
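The test in `tests/perf` that populates `models` is not part of this view. As a rough sketch of what populating the directory could look like, assuming a simple chain of models where each one selects from the previous (the function name `generate_models` and the `model_<i>.sql` naming are hypothetical):

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def generate_models(models_dir: Path, num_models: int) -> list[Path]:
    """Write num_models dbt model files, each depending on the one before it."""
    models_dir.mkdir(parents=True, exist_ok=True)
    paths: list[Path] = []
    for i in range(num_models):
        if i == 0:
            # The first model has no upstream dependency.
            sql = "select 1 as id\n"
        else:
            # Doubled braces in the f-string emit a literal dbt {{ ref(...) }} call.
            sql = f"select * from {{{{ ref('model_{i - 1}') }}}}\n"
        path = models_dir / f"model_{i}.sql"
        path.write_text(sql)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for created in generate_models(Path(tmp) / "models", 3):
            print(created.name, "->", created.read_text().strip())
```

A linear chain is only one possible shape; the real test may generate independent models or a different dependency graph.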
Empty file.
17 changes: 17 additions & 0 deletions dev/dags/dbt/perf/dbt_project.yml
@@ -0,0 +1,17 @@
# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: "perf"
version: "1.0.0"
config-version: 2

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"
Empty file.
11 changes: 11 additions & 0 deletions dev/dags/dbt/perf/profiles.yml
@@ -0,0 +1,11 @@
simple:
target: dev
outputs:
dev:
type: sqlite
threads: 1
database: "database"
schema: "main"
schemas_and_paths:
main: "{{ env_var('DBT_SQLITE_PATH') }}/imdb.db"
schema_directory: "{{ env_var('DBT_SQLITE_PATH') }}"
Empty file.
Empty file.
Empty file.
36 changes: 36 additions & 0 deletions dev/dags/performance_dag.py
@@ -0,0 +1,36 @@
"""
A DAG that uses Cosmos to render a dbt project for performance testing.
"""

import airflow
from datetime import datetime
import os
from pathlib import Path

from cosmos import DbtDag, ProjectConfig, ProfileConfig, RenderConfig

DEFAULT_DBT_ROOT_PATH = Path(__file__).parent / "dbt"
DBT_ROOT_PATH = Path(os.getenv("DBT_ROOT_PATH", DEFAULT_DBT_ROOT_PATH))
DBT_SQLITE_PATH = str(DEFAULT_DBT_ROOT_PATH / "data")

profile_config = ProfileConfig(
profile_name="simple",
target_name="dev",
profiles_yml_filepath=(DBT_ROOT_PATH / "simple/profiles.yml"),
)

cosmos_perf_dag = DbtDag(
project_config=ProjectConfig(
DBT_ROOT_PATH / "perf",
env_vars={"DBT_SQLITE_PATH": DBT_SQLITE_PATH},
),
profile_config=profile_config,
render_config=RenderConfig(
dbt_deps=False,
),
# normal dag parameters
schedule_interval=None,
start_date=datetime(2024, 1, 1),
catchup=False,
dag_id="performance_dag",
)
122 changes: 28 additions & 94 deletions pyproject.toml
Expand Up @@ -9,16 +9,8 @@ description = "Orchestrate your dbt projects in Airflow"
readme = "README.rst"
license = "Apache-2.0"
requires-python = ">=3.8"
authors = [
{ name = "Astronomer", email = "[email protected]" },
]
keywords = [
"airflow",
"apache-airflow",
"astronomer",
"dags",
"dbt",
]
authors = [{ name = "Astronomer", email = "[email protected]" }]
keywords = ["airflow", "apache-airflow", "astronomer", "dags", "dbt"]
classifiers = [
"Development Status :: 3 - Alpha",
"Environment :: Web Environment",
Expand Down Expand Up @@ -56,48 +48,23 @@ dbt-all = [
"dbt-spark",
"dbt-vertica",
]
dbt-athena = [
"dbt-athena-community",
"apache-airflow-providers-amazon>=8.0.0",
]
dbt-bigquery = [
"dbt-bigquery",
]
dbt-databricks = [
"dbt-databricks",
]
dbt-exasol = [
"dbt-exasol",
]
dbt-postgres = [
"dbt-postgres",
]
dbt-redshift = [
"dbt-redshift",
]
dbt-snowflake = [
"dbt-snowflake",
]
dbt-spark = [
"dbt-spark",
]
dbt-vertica = [
"dbt-vertica<=1.5.4",
]
openlineage = [
"openlineage-integration-common",
"openlineage-airflow",
]
all = [
"astronomer-cosmos[dbt-all]",
"astronomer-cosmos[openlineage]"
]
docs =[
dbt-athena = ["dbt-athena-community", "apache-airflow-providers-amazon>=8.0.0"]
dbt-bigquery = ["dbt-bigquery"]
dbt-databricks = ["dbt-databricks"]
dbt-exasol = ["dbt-exasol"]
dbt-postgres = ["dbt-postgres"]
dbt-redshift = ["dbt-redshift"]
dbt-snowflake = ["dbt-snowflake"]
dbt-spark = ["dbt-spark"]
dbt-vertica = ["dbt-vertica<=1.5.4"]
openlineage = ["openlineage-integration-common", "openlineage-airflow"]
all = ["astronomer-cosmos[dbt-all]", "astronomer-cosmos[openlineage]"]
docs = [
"sphinx",
"pydata-sphinx-theme",
"sphinx-autobuild",
"sphinx-autoapi",
"apache-airflow-providers-cncf-kubernetes>=5.1.1"
"apache-airflow-providers-cncf-kubernetes>=5.1.1",
]
tests = [
"packaging",
Expand Down Expand Up @@ -137,9 +104,7 @@ Documentation = "https://astronomer.github.io/astronomer-cosmos"
path = "cosmos/__init__.py"

[tool.hatch.build.targets.sdist]
include = [
"/cosmos",
]
include = ["/cosmos"]

[tool.hatch.build.targets.wheel]
packages = ["cosmos"]
Expand Down Expand Up @@ -175,51 +140,20 @@ matrix.airflow.dependencies = [
[tool.hatch.envs.tests.scripts]
freeze = "pip freeze"
type-check = "mypy cosmos"
test = 'pytest -vv --durations=0 . -m "not integration" --ignore=tests/test_example_dags.py --ignore=tests/test_example_dags_no_connections.py'
test-cov = """pytest -vv --cov=cosmos --cov-report=term-missing --cov-report=xml --durations=0 -m "not integration" --ignore=tests/test_example_dags.py --ignore=tests/test_example_dags_no_connections.py"""
# we install using the following workaround to overcome installation conflicts, such as:
# apache-airflow 2.3.0 and dbt-core [0.13.0 - 1.5.2] and jinja2>=3.0.0 because these package versions have conflicting dependencies
test-integration-setup = """pip uninstall -y dbt-postgres dbt-databricks dbt-vertica; \
rm -rf airflow.*; \
airflow db init; \
pip install 'dbt-core' 'dbt-databricks' 'dbt-postgres' 'dbt-vertica' 'openlineage-airflow'"""
test-integration = """rm -rf dbt/jaffle_shop/dbt_packages;
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv)'"""
test-integration-expensive = """pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
-k 'example_cosmos_python_models or example_virtualenv'"""
test-integration-sqlite-setup = """pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \
rm -rf airflow.*; \
airflow db init; \
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4' """
test-integration-sqlite = """
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
-k 'example_cosmos_sources or sqlite'"""
test = 'sh scripts/test/unit.sh'
test-cov = 'sh scripts/test/unit-cov.sh'
test-integration-setup = 'sh scripts/test/integration-setup.sh'
test-integration = 'sh scripts/test/integration.sh'
test-integration-expensive = 'sh scripts/test/integration-expensive.sh'
test-integration-sqlite-setup = 'sh scripts/test/integration-sqlite-setup.sh'
test-integration-sqlite = 'sh scripts/test/integration-sqlite.sh'
test-performance-setup = 'sh scripts/test/performance-setup.sh'
test-performance = 'sh scripts/test/performance.sh'

[tool.pytest.ini_options]
filterwarnings = [
"ignore::DeprecationWarning",
]
filterwarnings = ["ignore::DeprecationWarning"]
minversion = "6.0"
markers = [
"integration",
"sqlite"
]
markers = ["integration", "sqlite", "perf"]

######################################
# DOCS
Expand All @@ -233,7 +167,7 @@ dependencies = [
"sphinx-autobuild",
"sphinx-autoapi",
"openlineage-airflow",
"apache-airflow-providers-cncf-kubernetes>=5.1.1"
"apache-airflow-providers-cncf-kubernetes>=5.1.1",
]

[tool.hatch.envs.docs.scripts]
Expand Down
8 changes: 8 additions & 0 deletions scripts/test/integration-expensive.sh
@@ -0,0 +1,8 @@
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
--ignore=tests/perf \
-k 'example_cosmos_python_models or example_virtualenv'
6 changes: 6 additions & 0 deletions scripts/test/integration-setup.sh
@@ -0,0 +1,6 @@
# we install using the following workaround to overcome installation conflicts, such as:
# apache-airflow 2.3.0 and dbt-core [0.13.0 - 1.5.2] and jinja2>=3.0.0 because these package versions have conflicting dependencies
pip uninstall -y dbt-postgres dbt-databricks dbt-vertica; \
rm -rf airflow.*; \
airflow db init; \
pip install 'dbt-core' 'dbt-databricks' 'dbt-postgres' 'dbt-vertica' 'openlineage-airflow'
4 changes: 4 additions & 0 deletions scripts/test/integration-sqlite-setup.sh
@@ -0,0 +1,4 @@
pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \
rm -rf airflow.*; \
airflow db init; \
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4'
8 changes: 8 additions & 0 deletions scripts/test/integration-sqlite.sh
@@ -0,0 +1,8 @@
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
--ignore=tests/perf \
-k 'example_cosmos_sources or sqlite'
9 changes: 9 additions & 0 deletions scripts/test/integration.sh
@@ -0,0 +1,9 @@
rm -rf dbt/jaffle_shop/dbt_packages;
pytest -vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m integration \
--ignore=tests/perf \
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv)'
4 changes: 4 additions & 0 deletions scripts/test/performance-setup.sh
@@ -0,0 +1,4 @@
pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \
rm -rf airflow.*; \
airflow db init; \
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4'
5 changes: 5 additions & 0 deletions scripts/test/performance.sh
@@ -0,0 +1,5 @@
pytest -vv \
-s \
-m 'perf' \
--ignore=tests/test_example_dags.py \
--ignore=tests/test_example_dags_no_connections.py
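This script selects tests marked `perf`, and the workflow passes the matrix value through the `MODEL_COUNT` environment variable and reports `TIME={end - start}`. The test body itself is not shown here, so the following is only a sketch of how a test might read the count and time a workload; `model_count_from_env` and `time_call` are hypothetical helpers, and the default of 1 is an assumption.

```python
import os
import time
from typing import Callable


def model_count_from_env(default: int = 1) -> int:
    """Read MODEL_COUNT from the environment, falling back to an assumed default."""
    return int(os.environ.get("MODEL_COUNT", default))


def time_call(workload: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to workload."""
    start = time.perf_counter()
    workload()
    end = time.perf_counter()
    return end - start


if __name__ == "__main__":
    count = model_count_from_env()
    # Stand-in workload; a real test would exercise the DAG here instead.
    elapsed = time_call(lambda: sum(range(count * 1000)))
    print(f"NUM_MODELS={count}\nTIME={elapsed}")
```

`time.perf_counter` is used because it is a monotonic clock intended for measuring short intervals.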
10 changes: 10 additions & 0 deletions scripts/test/unit-cov.sh
@@ -0,0 +1,10 @@
pytest \
-vv \
--cov=cosmos \
--cov-report=term-missing \
--cov-report=xml \
--durations=0 \
-m "not (integration or perf)" \
--ignore=tests/perf \
--ignore=tests/test_example_dags.py \
--ignore=tests/test_example_dags_no_connections.py
7 changes: 7 additions & 0 deletions scripts/test/unit.sh
@@ -0,0 +1,7 @@
pytest \
-vv \
--durations=0 \
-m "not (integration or perf)" \
--ignore=tests/perf \
--ignore=tests/test_example_dags.py \
--ignore=tests/test_example_dags_no_connections.py