Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved support for slim CI & defer #17

Open
noel opened this issue Jan 18, 2022 · 0 comments
Open

Improved support for slim CI & defer #17

noel opened this issue Jan 18, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

@noel
Copy link

noel commented Jan 18, 2022

I am using dbt-snow-mask as part of a CI/CD pipeline.
We use dbt build with state:modifed and defer flags. We also create a new database per pull request.
We want masking to be applied in the test databases as well.
We also use custom schemas

This introduces a few issues:

  1. There may be models with masking rules that are not in the test db
  2. There may be models in schemas that do not exist in the test db

The script also does the deployment into production and this also introduces some issues:

  1. Masking policies need to be created automatically
  2. Masking policies must be recreated on future runs in case the policy is redefined
  3. Since recreating a policy requires un-assigning a policy and during a deploy not all models will run, the policies must be reassigned to all models outside of a run

Right now I solved via some custom macros, but I think this can be simplified by creating policies in one schema, e.g. public and having a run operation for models to re-apply masking policy. It would also be good to do sources and models from one run operation.

These are the steps we run

    {{ log("Creating schemas for masking policies", true) }}
    {% do create_missing_schemas_with_masking_policy() %}

    {{ log("Unapplying masking policies", true) }}
    {% do snow_mask_reapply_policies('unapply') %}

    {{ log("Create masking policies", true) }}
    {% do dbt_snow_mask.create_masking_policy(resource_type='sources') %}
    {% do dbt_snow_mask.create_masking_policy(resource_type='models') %}

    {{ log("Reapply masking policies", true) }}
    {% do snow_mask_reapply_policies('apply') %}

and this is the macro for removing/adding the policies without a dbt run.

{# This macro reapplies masking policies #}
{#
    To run: 
    dbt run-operation snow_mask_reapply_policies
#}

{% macro snow_mask_reapply_policies(operation_type="apply") %}

    {% set schema_list = [] %}
    {% set schemas =  list_schemas() %}
    {% for schema in schemas %}
        {% do schema_list.append(schema['name']) %}
    {% endfor %}

    {% for node in graph.nodes.values() -%}

        {% set database = node.database | string %}
        {% set schema   = node.schema | string | upper %}
        {% set name   = node.name | string %}
        {% set identifier = (node.identifier | default(name, True)) | string %}

        {% set unique_id = node.unique_id | string %}
        {% set resource_type = node.resource_type | string %}
        {% set materialization = node.config.materialized | string %}

        {% set meta_key = 'masking_policy' %}

        {# Only run if there is a schema in the db for this relation #}
        {% set schema_exists = schema in schema_list %}

        {# If other materializations are added that create views or tables, #}
        {# this list will need to update #}
        {% if (schema_exists) and (materialization in ['table', 'view', 'incremental']) %}

            {# override materialization_type for incremental to table #}
            {% if materialization == 'incremental' %}
                {% set materialization = 'table' %}
            {% endif %}
            
            {% set meta_columns = dbt_snow_mask.get_meta_objects(unique_id, meta_key, resource_type) %}

            {% set masking_policy_list_sql %}
                show masking policies in {{database}}.{{schema}};
                select $3||'.'||$4||'.'||$2 as masking_policy from table(result_scan(last_query_id()));
            {% endset %}

            {%- for meta_tuple in meta_columns if meta_columns | length > 0 %}
                {% set column   = meta_tuple[0] %}
                {% set masking_policy_name  = meta_tuple[1] %}

                {% if masking_policy_name is not none %}

                    {% set masking_policy_list = dbt_utils.get_query_results_as_dict(masking_policy_list_sql) %}

                    {% set timestamp = modules.datetime.datetime.now().strftime("%H:%M:%S") ~ " | " %}

                    {% set policies_in_database = masking_policy_list['MASKING_POLICY'] | list %}
                    {% set full_masking_policy_name = database|upper ~ '.' ~ schema|upper ~ '.' ~ masking_policy_name|upper %}
                    {% if full_masking_policy_name not in policies_in_database %}
                        {{ log(timestamp ~ "Missing Masking policy defined for "~ unique_id ~" column: " ~ column,true) }}
                        {{ log(timestamp ~ "Policy Name: " ~ full_masking_policy_name, true) }}
                        {# Force an exit when masking policy is missing #}
                        {{ 0/0 }}
                    {% endif %} 

                    {% for masking_policy_in_db in masking_policy_list['MASKING_POLICY'] %}
            
                        {%- set relation = adapter.get_relation(
                            database=database,
                            schema=schema,
                            identifier=identifier) -%}

                        {% set relation_exists = relation is not none  %}

                        {% if (relation_exists) and (full_masking_policy_name == masking_policy_in_db) %}
                            {{ log(timestamp ~ operation_type ~ "ing masking policy to model : " ~ full_masking_policy_name ~ " on " ~ database ~ '.' ~ schema ~ '.' ~ identifier ~ '.' ~ column, info=True) }}
                            {% set query %}
                                {% if operation_type == "apply" %}
                                    alter {{materialization}}  {{database}}.{{schema}}.{{identifier}} modify column  {{column}} set masking policy  {{database}}.{{schema}}.{{masking_policy_name}}
                                {% elif operation_type == "unapply" %}
                                    alter {{materialization}}  {{database}}.{{schema}}.{{identifier}} modify column  {{column}} unset masking policy
                                {% endif %}
                            {% endset %}

                            {% do run_query(query) %}
                        {% elif (full_masking_policy_name == masking_policy_in_db) %}
                            {{ log(timestamp ~ "Skipping non-existant relation " ~ database ~ '.' ~ schema ~ '.' ~ identifier ~ ' column: ' ~ column, info=True) }}
                        {% endif %}
                    {% endfor %}
                {% endif %}

            {% endfor %} #}
        {% endif%}

    {% endfor %}

{% endmacro %} 
@entechlog entechlog added the enhancement New feature or request label Jan 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants