This repository has been archived by the owner on May 17, 2024. It is now read-only.
I am running diff_tables through a Python script and materializing all rows to a table in my database. This works well for identifying updated columns and rows, but deletes are not being materialized.
Below is the code I'm using. I wanted to check whether an ID that I expected to find in the materialized table (but which isn't there) would show up in the output that diff_tables yields to the Python script, and it did. I would expect anything that appears in the diff_tables output within my script to also be written to the table that data_diff uses for materialization. From what I can tell, deletes are not written during materialization, which throws a wrench in the pipeline I'm currently working on.
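
    import data_diff

    # source_table, target_table, columns, key_columns, table_name and
    # SNOWFLAKE_CONN_INFO are defined elsewhere in the script.
    try:
        for d in data_diff.diff_tables(
            source_table,
            target_table,
            extra_columns=columns,
            key_columns=key_columns,
            materialize_to_table=f"NORSE_DIFF.{SNOWFLAKE_CONN_INFO['schema']}.{table_name}",
            materialize_all_rows=True,
        ):
            # d[1] is the row and d[1][0] its first column (the ID).
            # Print the ID that is missing from the materialized table.
            if d[1][0] == "c91e4af2-4585-5cbb-924b-cbeb12b7919e":
                print(d[1][0])
    except Exception as e:
        print(e)
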
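To narrow down which rows are missing from the materialized table, it may help to classify the tuples that diff_tables yields before comparing them with the table contents. The sketch below is only an illustration under stated assumptions: it assumes each yielded item has the shape (sign, row), that "-" marks a row found only in the first table and "+" a row found only in the second, and that the key is the first column of the row (as `d[1][0]` in the script above suggests). The helper name `classify_diff` and the sample data are hypothetical and not part of data_diff.

```python
from collections import defaultdict


def classify_diff(diff_rows, key_index=0):
    """Group (sign, row) tuples by key and label each key.

    Assumption: "-" means the row came from the first (source) table and
    "+" means it came from the second (target) table.
    """
    signs_by_key = defaultdict(set)
    for sign, row in diff_rows:
        signs_by_key[row[key_index]].add(sign)

    result = {"source_only": [], "target_only": [], "changed": []}
    for key, signs in signs_by_key.items():
        if signs == {"-"}:
            # Present in the source but absent from the target (a delete).
            result["source_only"].append(key)
        elif signs == {"+"}:
            # Present in the target but absent from the source (an insert).
            result["target_only"].append(key)
        else:
            # Both signs seen for the same key: the row differs between tables.
            result["changed"].append(key)
    return result


# Hypothetical sample standing in for the output of diff_tables.
sample = [
    ("-", ("id-1", "old value")),
    ("+", ("id-1", "new value")),
    ("-", ("id-2", "only in source")),
    ("+", ("id-3", "only in target")),
]
summary = classify_diff(sample)
print(summary)
```

Comparing the keys in `source_only` against the materialized table would show whether it is specifically the one-sided rows that are being dropped.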
@devcshort can you explain, at a high level, how to materialize data-diff results to a Redshift table using the open-source version, so they can be compared against the Redshift database itself? I intend to do the same using dbt and Redshift with a local dbt Core setup.
I'm sorry for the delay in following up on this. Thank you for raising this issue and for looking into potential solutions!
We made a hard decision to sunset the data-diff package and won't provide further development or support.
If that's of interest, over the past few months, we have rewritten the diffing engine in Datafold Cloud and solved many issues that existed in this package's diffing algorithm.
I'm currently using [email protected] on macOS (Apple Silicon).
This is running within a Dagster environment as well.