Two stored procedures and a Python tool have been developed to handle change management tasks for the projects:
- edw_ops.ops.clone_db_objects
- edw_ops.ops.clone_db_objects_adv
- csv_to_json.ipynb
The aim is to be able to migrate the following Snowflake DB object types from a source environment to a target environment:
- Tables
- Views
- Stages
- Functions
- File Formats
- Stored Procedures
- Streams
- Tasks
The procedures leverage the cloning capability provided by Snowflake, which allows writable zero-copy clones to be created within the system.
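As a minimal illustration of this underlying mechanism, the sketch below clones a single table via the Snowflake Python connector; the connection parameters and object names are placeholders, not part of the actual procedures.

    # Minimal sketch of Snowflake's zero-copy cloning (placeholder names).
    # Requires: pip install snowflake-connector-python
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",  # placeholder credentials
        user="my_user",
        password="my_password",
    )
    try:
        # CREATE ... CLONE creates a writable clone that shares the source's
        # storage; data is physically copied only when either side changes it.
        conn.cursor().execute(
            "CREATE OR REPLACE TABLE TARGET_DB.STG.ACCOUNT CLONE SOURCE_DB.STG.ACCOUNT"
        )
    finally:
        conn.close()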
The above 8 DB object types can be logically divided into two groups:
- The ones for which the metadata is readily available in the INFORMATION_SCHEMA: Tables, Views, Procedures, Functions and Stages
- The others for which it is not: File Formats, Streams and Tasks
For the former, the idea is to pick up the bits and pieces of information available in the INFORMATION_SCHEMA and assemble them into a functional DDL that captures all the relevant details of the source object.
For the latter, we make use of Snowflake's GET_DDL() or SHOW to retrieve the metadata of a specific DB object and generate the DDL from it. Both paths are sketched below.
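A rough sketch of the two metadata paths, assuming the Snowflake Python connector; schema and object names are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="my_password"
    )
    cur = conn.cursor()

    # Path 1: metadata available in INFORMATION_SCHEMA (tables, views,
    # procedures, functions, stages) -- assemble the DDL from its columns.
    cur.execute(
        "SELECT TABLE_SCHEMA, TABLE_NAME, VIEW_DEFINITION "
        "FROM SOURCE_DB.INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA = 'STG'"
    )
    for schema, name, definition in cur.fetchall():
        print(f"-- DDL for {schema}.{name} assembled from: {str(definition)[:60]}...")

    # Path 2: no INFORMATION_SCHEMA coverage (file formats, streams, tasks) --
    # let Snowflake generate the full DDL directly.
    cur.execute("SELECT GET_DDL('TASK', 'SOURCE_DB.STG.MY_TASK')")
    print(cur.fetchone()[0])

    # SHOW is the alternative metadata source for these object types:
    cur.execute("SHOW FILE FORMATS IN SCHEMA SOURCE_DB.STG")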
NOTE: These procedures cannot perform cross-schema object migration; they assume that objects are migrated between the corresponding schemas in the source and target. They also do not create new schemas in the target; those are assumed to already exist.
The first procedure, edw_ops.ops.clone_db_objects, has been developed to perform bulk object migrations from:
- Source DB to target DB: moves all 8 DB object types within all four standard schemata, viz. STG, INT, SEC, PRS.
- A schema in the source DB to the corresponding schema in the target DB: moves all 8 DB object types existing within the specified schema.
- A set of DB objects following a project-specific naming convention: moves all 8 DB object types having the specified prefix from the source DB to the target DB.
- A specific object type from source to target: moves all DB objects belonging to the specified type from the source DB to the target DB, irrespective of the schema they belong to.
It takes the following arguments (an example call follows the list):
- SOURCE_DB - Source Database name
- TARGET_DB - Target Database name
- OBJ_PATTERN - Project name (or any text pattern in general) which is matched as prefix against the DB object names.
- SCHEMA_FILTER - A string specifying the schema to be moved (if empty or 'ALL', all the schemas in the source DB will be moved).
- OBJECT_TYPE - A string specifying the DB object type to be moved, viz. ALL, TABLE, VIEW, FF, STAGES, SP, FUNCTION, STREAMS, TASK. These are the only valid argument values.
- FIRST_RUN - A Boolean specifying whether the current migration is the first run, in which case all generated DDLs are CREATE ... IF NOT EXISTS statements.
- MOVE_TABLE_STRUCTURE_ONLY - A Boolean specifying whether only the table structures should be migrated, without references to the source data. If set to 0, the tables will simply be cloned as zero-copy clones.
- PRESERVE_TARGET_TABLE_DATA - If the above argument is 1, this Boolean specifies whether the data already existing in the target DB should be preserved. If set, a backup table is created to store the original target table data before the new structure is migrated from source to target.
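A hypothetical invocation of clone_db_objects; the argument order follows the list above and is an assumption, as are the database names:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="my_password"
    )
    # Argument order assumed from the list above -- verify against the
    # procedure's actual signature before use.
    conn.cursor().execute("""
        CALL edw_ops.ops.clone_db_objects(
            'DEV_EDW',   -- SOURCE_DB (placeholder)
            'QA_EDW',    -- TARGET_DB (placeholder)
            'PRJX',      -- OBJ_PATTERN: prefix matched against object names
            'ALL',       -- SCHEMA_FILTER: all schemas in the source DB
            'TABLE',     -- OBJECT_TYPE: tables only
            TRUE,        -- FIRST_RUN: DDLs become CREATE ... IF NOT EXISTS
            FALSE,       -- MOVE_TABLE_STRUCTURE_ONLY: clone structure and data
            FALSE        -- PRESERVE_TARGET_TABLE_DATA
        )
    """)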
The second procedure, edw_ops.ops.clone_db_objects_adv, takes the following arguments:
- SOURCE_DB - Source Database name
- TARGET_DB - Target Database name
- OBJ_PATTERN - Project name (or any text pattern in general) which is matched as prefix against the DB object names.
- ADV_FILTER – JSON string specifying the schema name and the list of objects to be moved from that schema.
This procedure has been developed to give the user the flexibility to move specific DB objects rather than bulk-migrating an entire DB or schema.
This is achieved by providing a JSON string containing the list of objects to be moved, categorized by object type as JSON keys (an example invocation follows the JSON).
For example:
    {
        "STG": {
            "SCHEMA": "STG",
            "TABLES": "ACCOUNT_ACCOUNT_NUMBER,BAK_STG_Y5PPLZA0",
            "VIEWS": "COMPANY_CODE_SV",
            "FF": "",
            "STAGES": "",
            "SP": "",
            "FUNCTIONS": "",
            "STREAMS": "",
            "TASKS": ""
        }
    }
csv_to_json.ipynb is a simple Python tool to convert a CSV file into the JSON string above. It takes a CSV file name as input and generates the JSON string; a sketch of the conversion follows the header list below.
The headers of the CSV file need to be as follows:

    SCHEMA  TABLES  VIEWS  FF  STAGES  SP  FUNCTIONS  STREAMS  TASKS
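A minimal sketch of what the notebook does, assuming a comma-delimited CSV with one row per schema and the header row above; the function and file names are hypothetical:

    import csv
    import json

    def csv_to_adv_filter(csv_path: str) -> str:
        """Turn the CSV described above into an ADV_FILTER JSON string."""
        result = {}
        with open(csv_path, newline="") as f:
            # DictReader keys each row by the header names (SCHEMA, TABLES, ...),
            # matching the per-schema structure of the ADV_FILTER example.
            for row in csv.DictReader(f):
                result[row["SCHEMA"]] = row
        return json.dumps(result, indent=2)

    print(csv_to_adv_filter("objects.csv"))  # hypothetical file name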