Commit

Added required notebook (#97)
* Added required notebook

* Updated read me

* Removed the second trigger

---------

Co-authored-by: souravg-db <souravg-db>
souravg-db authored Jan 8, 2024
1 parent 87b0fd9 commit dfc9c3f
Showing 3 changed files with 88 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -22,6 +22,7 @@ Operations are applied concurrently across multiple tables
* [GDPR right of access: extract user data from all tables at once](docs/GDPR_RoA.md)
* [GDPR right of erasure: delete user data from all tables at once](docs/GDPR_RoE.md)
* [Search in any column](docs/Search.md)
* Update Owner of Data Objects ([example notebook](examples/update_owner_of_data_objects.py))
* **Semantic classification**
* [Semantic classification of columns by semantic class](docs/Semantic_classification.md): email, phone number, IP address, etc.
* [Select data based on semantic classes](docs/Select_by_class.md)
10 changes: 10 additions & 0 deletions examples/scan_with_user_specified_data_source_formats.py
@@ -4,6 +4,16 @@

# COMMAND ----------

# MAGIC %md
# MAGIC ### Install discoverx lib

# COMMAND ----------

# %pip install dbl-discoverx
# dbutils.library.restartPython()

# COMMAND ----------

# MAGIC %md
# MAGIC ### Declare Variables

77 changes: 77 additions & 0 deletions examples/update_owner_of_data_objects.py
@@ -0,0 +1,77 @@
# Databricks notebook source
# MAGIC %md
# MAGIC # Update Owner of Data Objects

# COMMAND ----------

# MAGIC %md
# MAGIC ### Install discoverx lib

# COMMAND ----------

# %pip install dbl-discoverx
# dbutils.library.restartPython()

# COMMAND ----------

# MAGIC %md
# MAGIC ### Declare Variables

# COMMAND ----------

dbutils.widgets.text("catalogs", "*", "Catalogs")
dbutils.widgets.text("schemas", "*", "Schemas")
dbutils.widgets.text("tables", "*", "Tables")
dbutils.widgets.text("owner","[email protected]","owner")
dbutils.widgets.dropdown("if_update_catalog_owner", "YES", ["YES","NO"])
dbutils.widgets.dropdown("if_update_schema_owner", "YES", ["YES","NO"])

# COMMAND ----------

catalogs = dbutils.widgets.get("catalogs")
schemas = dbutils.widgets.get("schemas")
tables = dbutils.widgets.get("tables")
owner = dbutils.widgets.get("owner")
if_update_catalog_owner = dbutils.widgets.get("if_update_catalog_owner")
if_update_schema_owner = dbutils.widgets.get("if_update_schema_owner")
from_table_statement = ".".join([catalogs, schemas, tables])

# COMMAND ----------

# MAGIC %md
# MAGIC ### Initialize discoverx

# COMMAND ----------

from discoverx import DX

dx = DX()

# COMMAND ----------

# MAGIC %md
# MAGIC ### Update Owner of data objects to user specified value

# COMMAND ----------

def update_owner(table_info):
    catalog_owner_alter_sql = f""" ALTER CATALOG `{table_info.catalog}` SET OWNER TO `{owner}`"""
    schema_owner_alter_sql = f""" ALTER SCHEMA `{table_info.catalog}`.`{table_info.schema}` SET OWNER TO `{owner}`"""
    table_owner_alter_sql = f""" ALTER TABLE `{table_info.catalog}`.`{table_info.schema}`.`{table_info.table}` SET OWNER TO `{owner}`"""
    try:
        if if_update_catalog_owner == 'YES':
            print(f"Executing {catalog_owner_alter_sql}")
            spark.sql(catalog_owner_alter_sql)

        if if_update_schema_owner == 'YES':
            print(f"Executing {schema_owner_alter_sql}")
            spark.sql(schema_owner_alter_sql)

        print(f"Executing {table_owner_alter_sql}")
        spark.sql(table_owner_alter_sql)
    except Exception as exception:
        print(f" Exception occurred while updating owner: {exception}")

# COMMAND ----------

dx.from_tables(from_table_statement).map(update_owner)
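
Note on the notebook above: `update_owner` builds and executes its SQL in one step, so the statements cannot be inspected without a Spark session. The sketch below is not part of this commit. It shows one possible way to separate statement construction from execution so the generated SQL can be previewed first. `TableInfo`, `quote_identifier`, and `build_owner_statements` are hypothetical names introduced here for illustration, and the `data-owners` principal is a made-up example. Doubling embedded backticks is assumed to be the correct escape for delimited identifiers.

```python
from collections import namedtuple

# Hypothetical stand-in for the object passed to the mapped function;
# only the three attributes the notebook reads are modelled here.
TableInfo = namedtuple("TableInfo", ["catalog", "schema", "table"])


def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backtick (assumed escape rule)."""
    return "`" + name.replace("`", "``") + "`"


def build_owner_statements(table_info, owner, update_catalog=True, update_schema=True):
    """Return the ownership statements for one table, in the order the notebook runs them."""
    catalog = quote_identifier(table_info.catalog)
    schema = quote_identifier(table_info.schema)
    table = quote_identifier(table_info.table)
    principal = quote_identifier(owner)

    statements = []
    if update_catalog:
        statements.append(f"ALTER CATALOG {catalog} SET OWNER TO {principal}")
    if update_schema:
        statements.append(f"ALTER SCHEMA {catalog}.{schema} SET OWNER TO {principal}")
    # The table statement is always emitted, matching the notebook's behaviour.
    statements.append(f"ALTER TABLE {catalog}.{schema}.{table} SET OWNER TO {principal}")
    return statements


# Dry run: print the statements instead of passing them to spark.sql.
statements = build_owner_statements(
    TableInfo("main", "sales", "orders"), "data-owners", update_catalog=False
)
for statement in statements:
    print(statement)
```

In the notebook, each returned statement could then be passed to `spark.sql` inside the existing `try` block, which would keep the current behaviour while making the SQL easy to review beforehand.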
