
[Spark] Support external DSV2 catalog in RESTORE command #2033

Closed

Conversation

@gengliangwang (Contributor) commented Sep 8, 2023

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

How was this patch tested?

  1. A new end-to-end test
  2. A parser test case (see the sketch below)
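
Neither test appears in this excerpt; purely as an illustration of the parser test's likely shape (the suite, the parser entry point, and the assertion are all assumptions, not code from this PR):

```scala
// Hypothetical sketch of a parser test; nothing here is taken from the PR.
test("RESTORE accepts a catalog-qualified, multi-part identifier") {
  val plan = spark.sessionState.sqlParser.parsePlan(
    "RESTORE TABLE my_catalog.default.tbl TO VERSION AS OF 1")
  // A real test would match on the expected RESTORE logical plan.
  assert(plan != null)
}
```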

Does this PR introduce any user-facing changes?

Yes, users can run the RESTORE command on tables in their external DSV2 catalogs.
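
For illustration, a session after this change might look like the sketch below; the catalog name `my_catalog`, the plugin class, and the table name are assumptions, not part of this PR — only the RESTORE syntax is standard Delta SQL:

```scala
// Hypothetical usage sketch: RESTORE against a table in an external DSV2 catalog.
// The catalog name and plugin class are made up for illustration.
spark.conf.set("spark.sql.catalog.my_catalog", "com.example.MyDeltaCatalog")

// With this change, RESTORE resolves fully qualified DSV2 identifiers
// instead of only tables in the session catalog.
spark.sql("RESTORE TABLE my_catalog.default.tbl TO VERSION AS OF 0")
```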

@gengliangwang gengliangwang changed the title [SPARK] Support external DSV2 catalog in RESTORE command [Spark] Support external DSV2 catalog in RESTORE command Sep 8, 2023
override def createTable(
    ident: Identifier,
    schema: StructType,
    partitions: Array[Transform],
    properties: java.util.Map[String, String]): Table = {
  val tablePath = getTablePath(ident.name())
  // Create an empty Delta table on the tablePath
  spark.range(0).write.format("delta").save(tablePath.toString)
Collaborator

Shouldn't this use the passed-in schema? Otherwise it takes a DDL later to update it with info we already had...
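
A minimal sketch of what the reviewer is suggesting, assuming `schema` is the `StructType` passed into `createTable` and `tablePath` is in scope as above:

```scala
// Sketch: create the empty Delta table with the passed-in schema rather than
// the fixed "id: long" schema that spark.range(0) produces.
import org.apache.spark.sql.Row

spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
  .write.format("delta").save(tablePath.toString)
```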

Collaborator

Alternatively, we can just create the DTV2 and call it good -- Delta knows how to handle the new/empty/missing directory case, tho it won't let you read such tables. Which comes back to the first comment -- if the table needs to be readable after this, it needs the correct schema, no?

Contributor Author

> Shouldn't this use the passed-in schema?

Since it is a dummy catalog, I intentionally fixed the schema as `id: long` (the schema that spark.range(0) produces).

> Alternatively, we can just create the DTV2

Could you show me some details of how to use it?

Collaborator

DeltaTableV2(spark, tablePath.toString) should suffice? See DeltaTableV2.scala
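
Concretely, the alternative might look like the sketch below; it assumes `tablePath` can be converted to a Hadoop `Path` and relies on the `DeltaTableV2` case class defaults as found in DeltaTableV2.scala:

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.delta.catalog.DeltaTableV2

// Sketch: skip the empty write and return a Delta table handle directly.
// Delta tolerates a new/empty directory here, although the table is not
// readable until a schema has been committed.
DeltaTableV2(spark, new Path(tablePath.toString))
```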

@gengliangwang (Contributor Author)

FYI, I am closing this and continuing my work in #2036.
