In my current setup, there are some steps between ingesting data and producing the final data product table.
For example, I don't treat intermediate tables as data products. Instead, I created data contracts for them for reference, and my "data products" refer to one or more tables that directly deliver value to the customer. My structure aligns with the medallion architecture, as in the following example:
Data Product: Sales Silver
Input Port: Upstream Database
Output Ports: Ingestion, Bronze, Silver
All the output ports listed above are represented as data contracts, following this logical sequence:
Upstream Database → Ingestion → Bronze → Silver
It doesn’t make sense to create data products from the Ingestion and Bronze stages because those tables are not customer-facing.
What I wish was possible:
Define intermediate processes within a data product: Configure the data contracts involved in the data product, specifying the processing sequence.
Simplify output ports: By only setting the final table in the chain (e.g., "Silver") as the data product's output port, intermediate stages would remain part of the internal process rather than appearing as standalone products.
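The requested behaviour could be sketched roughly as follows. This is only an illustration of the idea, not an existing Data Mesh Manager feature or schema: the `DataProduct` class and its `contract_chain`, `output_ports`, and `internal_stages` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical model of the wish above; not the Data Mesh Manager schema."""

    name: str
    input_port: str
    # Data contracts in processing order; the last one is the customer-facing table.
    contract_chain: list[str] = field(default_factory=list)

    @property
    def output_ports(self) -> list[str]:
        # Only the final stage of the chain is exposed as an output port.
        return self.contract_chain[-1:]

    @property
    def internal_stages(self) -> list[str]:
        # Everything before the final stage stays internal to the data product.
        return self.contract_chain[:-1]

    def lineage(self) -> str:
        # Render the processing sequence, starting from the input port.
        return " → ".join([self.input_port, *self.contract_chain])


sales_silver = DataProduct(
    name="Sales Silver",
    input_port="Upstream Database",
    contract_chain=["Ingestion", "Bronze", "Silver"],
)

print(sales_silver.lineage())
print("Output ports:", sales_silver.output_ports)
print("Internal stages:", sales_silver.internal_stages)
```

With this shape, Ingestion and Bronze keep their data contracts and their place in the sequence, but only Silver is listed as an output port.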
maikelpenz changed the title from "Chain Data Product output ports" to "Chain Data Contracts within single Data Product" on Nov 20, 2024
Consider raw and bronze tables internal details of the data product, and do not define them as output ports. You can use assets (https://api.datamesh-manager.com/swagger/index.html#/Assets), a new feature, to assign these tables/views to a data product.
If you want a data contract for your source data, define a proxy data product (Sales Raw / Sales Bronze) that is internal to the team. We have a feature in the backlog to define the visibility of data products.
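As a rough sketch of how this suggestion might look for the Sales example, the snippet below models it in plain Python. It does not call the Data Mesh Manager API. The `Asset` and `DataProduct` classes, the `internal_to_team` flag (standing in for the visibility feature that is still in the backlog), and the `customer_facing_ports` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A table or view assigned to a data product (hypothetical shape)."""

    name: str


@dataclass
class DataProduct:
    """Hypothetical model; not the Data Mesh Manager schema."""

    name: str
    output_ports: list[str]
    assets: list[Asset] = field(default_factory=list)
    # Placeholder for the data product visibility feature, which is only in the backlog.
    internal_to_team: bool = False


def customer_facing_ports(products: list[DataProduct]) -> list[str]:
    # Collect output ports of products that are not marked as team-internal.
    return [
        port
        for product in products
        if not product.internal_to_team
        for port in product.output_ports
    ]


# Intermediate tables become assets of the product instead of output ports.
sales_silver = DataProduct(
    name="Sales Silver",
    output_ports=["Silver"],
    assets=[Asset("Ingestion"), Asset("Bronze")],
)

# Optional proxy data product that carries the data contract for the source data.
sales_bronze = DataProduct(
    name="Sales Bronze",
    output_ports=["Bronze"],
    internal_to_team=True,
)

print(customer_facing_ports([sales_silver, sales_bronze]))
```

In this arrangement, only the Silver output port is visible to consumers, while the Bronze contract still exists on the team-internal proxy product.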