From c4090952813aa9ea872f60b23fe96a652368e08e Mon Sep 17 00:00:00 2001 From: Jacky Hu Date: Wed, 16 Oct 2024 15:20:41 -0700 Subject: [PATCH] [PP-2196] Update API doc for OAuthM2M Support (#73) --- ApiSpecifications.md | 8 +- OnboardingDoc.md | 13 +- api-doc/Models/Auth.md | 2 +- api-doc/Models/ConnectRequest.md | 1 + api-doc/Models/PartnerConfig.md | 1 + openapi/partner-connect-2.0.yaml | 10 +- .../example/formatters/JsonFormatters.scala | 131 +++++++++++++++++- .../client/tests/JsonFormattersTest.scala | 83 +++++++++++ 8 files changed, 230 insertions(+), 19 deletions(-) create mode 100644 src/test/scala/com/databricks/partnerconnect/client/tests/JsonFormattersTest.scala diff --git a/ApiSpecifications.md b/ApiSpecifications.md index e1767a8..22244c2 100644 --- a/ApiSpecifications.md +++ b/ApiSpecifications.md @@ -68,7 +68,7 @@ The Connect API is used to sign-in or sign-up a user with a partner with Databri The order of events when connecting Databricks to a partner is as follows: 1. The user clicks the Partner tile. -2. The user confirms the Databricks resources that will be provisioned for the connection (e.g. the Service Principal, the PAT, the SQL Warehouse). +2. The user confirms the Databricks resources that will be provisioned for the connection (e.g. the Service Principal, the PAT or the service principal OAuth secret, the SQL Warehouse). 3. The user clicks Connect. 1. Databricks calls the partner's **Connect API** with all of the Databricks data that the partner needs. 2. The partner provisions any accounts and resources needed. (e.g. persisting the Databricks workspace\_id, provisioning a Databricks output node). @@ -138,9 +138,6 @@ POST : [example, can be customized] "is_connection_established" : true|false "auth": { [Only present if is_connection_established is false] "personal_access_token": "dapi..." - // or - "oauth_token": ..., [optional, reserved for future use] - "oauth_scope": ... 
[optional, reserved for future use] } } "hostname": "organization.cloud.databricks.com", @@ -162,7 +159,8 @@ POST : [example, can be customized] "is_sql_endpoint" : true|false, [optional: same value as is_sql_warehouse] "is_sql_warehouse": true|false, [optional: set if cluster_id is set. Determines whether cluster_id refers to Interactive Cluster or SQL Warehouse] "data_source_connector": "Oracle", [optional, unused and reserved for future use: for data connector tools, the name of the data source that the user should be referred to in their tool] - "service_principal_id": "a2a25a05-3d59-4515-a73b-b8bc5ab79e31" [optional, the UUID (username) of the service principal identity] + "service_principal_id": "a2a25a05-3d59-4515-a73b-b8bc5ab79e31", [optional, the UUID (username) of the service principal identity] + "service_principal_oauth_secret": "dose..." [optional, the OAuth secret of the service principal identity, it will be passed only when partner config includes OAuth M2M auth option] } ``` diff --git a/OnboardingDoc.md b/OnboardingDoc.md index 0c95eee..7197c13 100644 --- a/OnboardingDoc.md +++ b/OnboardingDoc.md @@ -15,7 +15,7 @@ Partner Connect is a destination inside a Databricks workspace that allows Datab We made Partner Connect for 2 reasons: -1. We want to give our customers access to the value provided by the best data products in the market. Partner Connect removes the complexity from connecting products to Databricks by automatically configuring resources such as SQL Warehouses, clusters, PAT tokens, service principals, and connection files. It can also initiate a free trial of partner products. +1. We want to give our customers access to the value provided by the best data products in the market. Partner Connect removes the complexity from connecting products to Databricks by automatically configuring resources such as SQL Warehouses, clusters, PAT tokens, service principals, OAuth secrets and connection files. 
It can also initiate a free trial of partner products. 2. We want to help our partners build their businesses and incentivize them to create the best possible product experience for Databricks customers. For more on this topic, see [this blog post](https://databricks.com/blog/2021/11/18/build-your-business-on-databricks-with-partner-connect.html). ### Sample marketing materials and user experience demo @@ -31,6 +31,9 @@ The following phrases will help you understand the Databricks product and this d - **Persona Switcher:** The component on the upper left of the UI that allows the user to choose the active Databricks product. This controls which features are available in the UI, and not all users have access to all 3 options. Partner Connect is available to all 3 personas. - **Personal Access Token (PAT):** A token that a partner product can use to authenticate with Databricks - **Service Principal:** An account that a partner product can use when calling Databricks APIs. Service Principals have access controls associated with them. +- **OAuth M2M:** An authentication method in which a partner product authenticates with Databricks as a service principal. It is also known as 2-legged OAuth and the OAuth Client Credentials Flow. The partner product uses the service principal UUID (client_id) and OAuth secret (client_secret) to authenticate with Databricks. +- **Service Principal OAuth Secret:** The service principal's secret, which a partner product uses along with the service principal UUID to authenticate with Databricks. + ![](img/persona.png) @@ -83,10 +86,10 @@ While there's some customization available, most partners have one of the follow | Integration | Description | |------------- | -------------| -| Read Partner | This is used by partners that purely need to read (select) data from the Lakehouse. In Partner Connect, the user selects which data to grant access to your product. Databricks provides the partner a SQL Warehouse and PAT with permissions to query that data. 
This is often used by **Business Intelligence and Data Quality partners**. -| Write Partner | This is used by partners that purely need to write (ingest) data into the Lakehouse. In Partner Connect, the user selects which catalog to grant write access to your product. Databricks provides the partner a SQL Warehouse and PAT with permissions to create schemas and tables in that catalog. This is often used by **Ingestion partners**. -| Read-Write Partner | This is used by partners that both need to read from and write to the Lakehouse. In Partner Connect, the user selects which catalog to grant write access and which schemas to grant read access for your product. Databricks provides the partner a SQL Warehouse and PAT with permissions to create schemas and tables in that catalog, as well as query the selected data. This is often used by **Data Preparation partners**. -| Notebook Partner | This is used by partners that want to demonstrate their integration with Databricks using a Databricks Notebook. Databricks provides the partner an Interactive Cluster and PAT with permissions. The partner can use the PAT to publish a Databricks Notebook and configure the Interactive Cluster. +| Read Partner | This is used by partners that purely need to read (select) data from the Lakehouse. In Partner Connect, the user selects which data to grant access to your product. Databricks provides the partner a SQL Warehouse and either a PAT or a service principal OAuth secret, with permissions to query that data. This is often used by **Business Intelligence and Data Quality partners**. +| Write Partner | This is used by partners that purely need to write (ingest) data into the Lakehouse. In Partner Connect, the user selects which catalog to grant write access to your product. Databricks provides the partner a SQL Warehouse and either a PAT or a service principal OAuth secret, with permissions to create schemas and tables in that catalog. 
This is often used by **Ingestion partners**. +| Read-Write Partner | This is used by partners that both need to read from and write to the Lakehouse. In Partner Connect, the user selects which catalog to grant write access and which schemas to grant read access for your product. Databricks provides the partner a SQL Warehouse and either a PAT or a service principal OAuth secret, with permissions to create schemas and tables in that catalog, as well as query the selected data. This is often used by **Data Preparation partners**. +| Notebook Partner | This is used by partners that want to demonstrate their integration with Databricks using a Databricks Notebook. Databricks provides the partner an Interactive Cluster and either a PAT or a service principal OAuth secret, with permissions. The partner can use the PAT or the service principal OAuth secret to publish a Databricks Notebook and configure the Interactive Cluster. | Desktop application Partner | This is used by partners that have a Desktop application (as opposed to a SaaS offering). In this integration, the user selects either an Interactive Cluster or SQL Warehouse and downloads a connection file to the partner product. This is often used by **Partners with Desktop applications**.

N.B. For this type of integration, there is no need for the partner to implement the SaaS APIs mentioned elsewhere throughout this documentation. ## Changelog diff --git a/api-doc/Models/Auth.md b/api-doc/Models/Auth.md index 9b4c6be..3e7e151 100644 --- a/api-doc/Models/Auth.md +++ b/api-doc/Models/Auth.md @@ -3,7 +3,7 @@ | Name | Type | Description | Notes | |------------ | ------------- | ------------- | -------------| -| **personal\_access\_token** | **String** | Personal Access Token created for the Service Principal or the user | [default to null] | +| **personal\_access\_token** | **String** | Personal Access Token created for the Service Principal or the user. Note: it will be null if the auth_options in PartnerConfig is not null and does not contain the value AUTH_PAT. | [default to null] | | **oauth\_token** | **String** | Oauth token. For future use. | [optional] [default to null] | | **oauth\_scope** | **String** | Oauth scope. For future use. | [optional] [default to null] | diff --git a/api-doc/Models/ConnectRequest.md b/api-doc/Models/ConnectRequest.md index 6ae1bbf..d971af9 100644 --- a/api-doc/Models/ConnectRequest.md +++ b/api-doc/Models/ConnectRequest.md @@ -24,6 +24,7 @@ | **is\_sql\_warehouse** | **Boolean** | Determines whether cluster_id refers to Interactive Cluster or SQL warehouse. | [optional] [default to null] | | **data\_source\_connector** | **String** | For data connector tools, the name of the data source that the user should be referred to in their tool. Unused today. | [optional] [default to null] | | **service\_principal\_id** | **String** | The UUID (username) of the service principal identity that a partner product can use to call Databricks APIs. Note the format is different from the databricks_user_id field in user_info. 
If empty, no service principal was created | [optional] [default to null] | +| **service\_principal\_oauth\_secret** | **String** | The OAuth secret of the service principal identity that a partner product can use to call Databricks APIs (see [OAuth M2M](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html)). It will be set only when the `auth_options` in the [PartnerConfig](PartnerConfig.md) contains the value `AUTH_OAUTH_M2M`. | [optional] [default to null] | | **connection\_scope** | **String** | The scope of users that can use this connection. Workspace means all users in the same workspace. User means only the user creating it. | [optional] [default to null] | [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md) diff --git a/api-doc/Models/PartnerConfig.md b/api-doc/Models/PartnerConfig.md index 4a15c0e..e744487 100644 --- a/api-doc/Models/PartnerConfig.md +++ b/api-doc/Models/PartnerConfig.md @@ -23,6 +23,7 @@ | **require\_manual\_signup** | **Boolean** | True if the partner requires a manual signup after connect api is called. When set to true, connect api call with is_connection_established (sign in) is expected to return 404 account_not_found or connection_not_found until the user completes the manual signup step. | [optional] [default to null] | | **trial\_type** | **String** | Enum describing the type of trials the partner support. Partners can chose to support trial account expiration at the individual user or account level. If trial level is user, expiring one user connection should not expire another user in the same account. | [optional] [default to null] | | **supports\_demo** | **Boolean** | True if partner supports the demo flag in the connect api call. | [optional] [default to null] | +| **auth\_options** | **List** | The available authentication methods that a partner can use to authenticate with Databricks. 
If it is not specified, `AUTH_PAT` will be used. The allowed options are `AUTH_PAT` and `AUTH_OAUTH_M2M`.
| [optional] [default to null] | | **test\_workspace\_detail** | [**PartnerConfig_test_workspace_detail**](PartnerConfig_test_workspace_detail.md) | | [optional] [default to null] | [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md) diff --git a/openapi/partner-connect-2.0.yaml b/openapi/partner-connect-2.0.yaml index c7efb1d..2ddb4af 100644 --- a/openapi/partner-connect-2.0.yaml +++ b/openapi/partner-connect-2.0.yaml @@ -333,7 +333,9 @@ components: properties: personal_access_token: type: string - description: Personal Access Token created for the Service Principal or the user + description: | + Personal Access Token created for the Service Principal or the users. + It will be null if the auth_options in PartnerConfig is not null and does not contain the value AUTH_PAT. example: "token" oauth_token: type: string @@ -480,6 +482,12 @@ components: type: string description: The UUID (username) of the service principal identity that a partner product can use to call Databricks APIs. Note the format is different from the databricks_user_id field in user_info. If empty, no service principal was created example: "a2a25a05-3d59-4515-a73b-b8bc5ab79e31" + service_principal_oauth_secret: + type: string + description: | + The secret of the service principal identity that a partner product can use to call Databricks APIs. + It will be set only when the auth_options in PartnerConfig contains the value AUTH_OAUTH_M2M. + example: "secret" connection_scope: type: string description: The scope of users that can use this connection. Workspace means all users in the same workspace. User means only the user creating it. 
diff --git a/src/main/scala/com/databricks/partnerconnect/example/formatters/JsonFormatters.scala b/src/main/scala/com/databricks/partnerconnect/example/formatters/JsonFormatters.scala index 2f97d0d..629edd4 100644 --- a/src/main/scala/com/databricks/partnerconnect/example/formatters/JsonFormatters.scala +++ b/src/main/scala/com/databricks/partnerconnect/example/formatters/JsonFormatters.scala @@ -20,7 +20,7 @@ import org.openapitools.client.model.PartnerConfigEnums.{ import org.openapitools.client.model.ResourceToProvisionEnums.ResourceType import org.openapitools.client.model.TestResultEnums.Status import org.openapitools.client.model.{Connector, _} -import spray.json.{DefaultJsonProtocol, RootJsonFormat} +import spray.json._ object JsonFormatters extends DefaultJsonProtocol { // Order of declaration matters. Enums need to be defined first otherwise ProductFormats.scala throws NPE. @@ -75,12 +75,129 @@ object JsonFormatters extends DefaultJsonProtocol { implicit val errorResponse: RootJsonFormat[ErrorResponse] = jsonFormat3( ErrorResponse ) - implicit val connectRequest: RootJsonFormat[ConnectRequest] = jsonFormat22( - ConnectRequest - ) - implicit val connectionInfo: RootJsonFormat[ConnectionInfo] = jsonFormat1( - ConnectionInfo - ) + + // spray.json jsonFormat cannot parse more than 22 fields, custom format is needed + implicit object ConnectionRequestJsonFormat + extends RootJsonFormat[ConnectRequest] { + private def OptionJsString(value: Option[String]) = + value.map(JsString(_)).getOrElse(JsNull) + private def OptionJsBoolean(value: Option[Boolean]) = + value.map(JsBoolean(_)).getOrElse(JsNull) + + private def getString(fields: Map[String, JsValue], name: String): String = + fields.get(name) match { + case Some(JsString(value)) => value + case _ => throw DeserializationException(s"$name should be string") + } + + private def getNumber( + fields: Map[String, JsValue], + name: String + ): BigDecimal = + fields.get(name) match { + case Some(JsNumber(value)) => 
value + case _ => throw DeserializationException(s"$name should be number") + } + + private def getBool(fields: Map[String, JsValue], name: String): Boolean = + fields.get(name) match { + case Some(JsBoolean(value)) => value + case _ => throw DeserializationException(s"$name should be boolean") + } + + private def getOptionString( + fields: Map[String, JsValue], + name: String + ): Option[String] = + fields.get(name) match { + case Some(JsString(value)) => Some(value) + case Some(JsNull) | None => None + case _ => throw DeserializationException(s"$name should be string") + } + + private def getOptionBoolean( + fields: Map[String, JsValue], + name: String + ): Option[Boolean] = + fields.get(name) match { + case Some(JsBoolean(value)) => Some(value) + case Some(JsNull) | None => None + case _ => throw DeserializationException(s"$name should be boolean") + } + + def write(request: ConnectRequest): JsValue = JsObject( + "user_info" -> request.user_info.toJson, + "connection_id" -> OptionJsString(request.connection_id), + "hostname" -> JsString(request.hostname), + "port" -> JsNumber(request.port), + "workspace_url" -> JsString(request.workspace_url), + "http_path" -> OptionJsString(request.http_path), + "jdbc_url" -> OptionJsString(request.jdbc_url), + "databricks_jdbc_url" -> OptionJsString(request.databricks_jdbc_url), + "workspace_id" -> JsNumber(request.workspace_id), + "demo" -> JsBoolean(request.demo), + "cloud_provider" -> request.cloud_provider.toJson, + "cloud_provider_region" -> OptionJsString(request.cloud_provider_region), + "is_free_trial" -> JsBoolean(request.is_free_trial), + "destination_location" -> OptionJsString(request.destination_location), + "catalog_name" -> OptionJsString(request.catalog_name), + "database_name" -> OptionJsString(request.database_name), + "cluster_id" -> OptionJsString(request.cluster_id), + "is_sql_endpoint" -> OptionJsBoolean(request.is_sql_endpoint), + "is_sql_warehouse" -> OptionJsBoolean(request.is_sql_warehouse), + 
"data_source_connector" -> OptionJsString(request.data_source_connector), + "service_principal_id" -> OptionJsString(request.service_principal_id), + "service_principal_oauth_secret" -> OptionJsString( + request.service_principal_oauth_secret + ), + "connection_scope" -> request.connection_scope + .map(_.toJson) + .getOrElse(JsNull) + ) + + implicit val connectRequest: RootJsonFormat[ConnectRequest] = + ConnectionRequestJsonFormat + + def read(value: JsValue): ConnectRequest = { + val fields = value.asJsObject.fields + val scoptOpt = fields.get("connection_scope") match { + case Some(JsNull) => None + case Some(v) => Some(v.convertTo[ConnectRequestEnums.ConnectionScope]) + case None => None + } + + ConnectRequest( + user_info = fields("user_info").convertTo[UserInfo], + connection_id = getOptionString(fields, "connection_id"), + hostname = getString(fields, "hostname"), + port = getNumber(fields, "port").toInt, + workspace_url = getString(fields, "workspace_url"), + http_path = getOptionString(fields, "http_path"), + jdbc_url = getOptionString(fields, "jdbc_url"), + databricks_jdbc_url = getOptionString(fields, "databricks_jdbc_url"), + workspace_id = getNumber(fields, "workspace_id").toLong, + demo = getBool(fields, "demo"), + cloud_provider = + fields("cloud_provider").convertTo[ConnectRequestEnums.CloudProvider], + cloud_provider_region = + getOptionString(fields, "cloud_provider_region"), + is_free_trial = getBool(fields, "is_free_trial"), + destination_location = getOptionString(fields, "destination_location"), + catalog_name = getOptionString(fields, "catalog_name"), + database_name = getOptionString(fields, "database_name"), + cluster_id = getOptionString(fields, "cluster_id"), + is_sql_endpoint = getOptionBoolean(fields, "is_sql_endpoint"), + is_sql_warehouse = getOptionBoolean(fields, "is_sql_warehouse"), + data_source_connector = + getOptionString(fields, "data_source_connector"), + service_principal_id = getOptionString(fields, "service_principal_id"), + 
service_principal_oauth_secret = + getOptionString(fields, "service_principal_oauth_secret"), + connection_scope = scoptOpt + ) + } + } + implicit val deleteConnectionRequest : RootJsonFormat[DeleteConnectionRequest] = jsonFormat3( DeleteConnectionRequest diff --git a/src/test/scala/com/databricks/partnerconnect/client/tests/JsonFormattersTest.scala b/src/test/scala/com/databricks/partnerconnect/client/tests/JsonFormattersTest.scala new file mode 100644 index 0000000..1867414 --- /dev/null +++ b/src/test/scala/com/databricks/partnerconnect/client/tests/JsonFormattersTest.scala @@ -0,0 +1,83 @@ +package com.databricks.partnerconnect.client.tests + +import com.databricks.partnerconnect.example.formatters.JsonFormatters._ +import org.openapitools.client.model.{ + ConnectRequest, + ConnectRequestEnums, + UserInfo +} +import spray.json._ + +class JsonFormattersTest extends PartnerTestBase { + val testUserInfo = UserInfo( + email = "test@mail.com", + first_name = "test-first-name", + last_name = "test-last-name", + databricks_user_id = 5845867166711048519L, + databricks_organization_id = 4645065419173783088L, + is_connection_established = false + ) + + val testRequestFull = ConnectRequest( + user_info = testUserInfo, + connection_id = Some("test-connection-id"), + hostname = "test-hostname", + port = 443, + workspace_url = "https://test-workspace-url", + http_path = Some("test-http-path"), + jdbc_url = Some("jdbc://test-jdcc-url"), + databricks_jdbc_url = Some("jdbc://test-databricks-jdbc-url"), + workspace_id = 1L, + demo = true, + cloud_provider = ConnectRequestEnums.CloudProvider.Aws, + cloud_provider_region = Some("test-cloud-provider-region"), + is_free_trial = true, + destination_location = Some("test-destination-location"), + catalog_name = Some("test-catalog-name"), + database_name = Some("test-database-name"), + cluster_id = Some("test-cluster-id"), + is_sql_endpoint = Some(true), + is_sql_warehouse = Some(true), + data_source_connector = 
Some("test-data-source-connector"), + service_principal_id = Some("test-service-principal-id"), + service_principal_oauth_secret = + Some("test-service-principal-oauth-secret"), + connection_scope = Some(ConnectRequestEnums.ConnectionScope.Workspace) + ) + + val connectionRequestJson = + """{"catalog_name":"test-catalog-name","cloud_provider":"aws","cloud_provider_region":"test-cloud-provider-region","cluster_id":"test-cluster-id","connection_id":"test-connection-id","connection_scope":"workspace","data_source_connector":"test-data-source-connector","database_name":"test-database-name","databricks_jdbc_url":"jdbc://test-databricks-jdbc-url","demo":true,"destination_location":"test-destination-location","hostname":"test-hostname","http_path":"test-http-path","is_free_trial":true,"is_sql_endpoint":true,"is_sql_warehouse":true,"jdbc_url":"jdbc://test-jdcc-url","port":443,"service_principal_id":"test-service-principal-id","service_principal_oauth_secret":"test-service-principal-oauth-secret","user_info":{"databricks_organization_id":4645065419173783088,"databricks_user_id":5845867166711048519,"email":"test@mail.com","first_name":"test-first-name","is_connection_established":false,"last_name":"test-last-name"},"workspace_id":1,"workspace_url":"https://test-workspace-url"}""" + + test( + "serialize and deserialize ConnectRequest: All the fields are provided" + ) { + val jsonStr = testRequestFull.toJson.toString + assert(jsonStr === connectionRequestJson) + val actualRequest = jsonStr.parseJson.convertTo[ConnectRequest] + assert(actualRequest == testRequestFull) + } + + test( + "serialize and deserialize ConnectRequest: All the option fields are None" + ) { + val expectedRequest = testRequestFull.copy( + connection_id = None, + http_path = None, + jdbc_url = None, + databricks_jdbc_url = None, + cloud_provider_region = None, + destination_location = None, + catalog_name = None, + database_name = None, + cluster_id = None, + is_sql_endpoint = None, + is_sql_warehouse = 
None, + service_principal_id = None, + service_principal_oauth_secret = None, + connection_scope = None + ) + val jsonStr = expectedRequest.toJson.toString + val actualRequest = jsonStr.parseJson.convertTo[ConnectRequest] + assert(actualRequest == expectedRequest) + } +}
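
The glossary entry for OAuth M2M in this patch says a partner product authenticates with the service principal UUID (client_id) and OAuth secret (client_secret). The following is a minimal sketch of how a partner might turn the `hostname`, `service_principal_id`, and `service_principal_oauth_secret` fields of a `ConnectRequest` into a client-credentials token request. It is not part of this patch: `OAuthM2MSketch` and `tokenRequest` are hypothetical names, and the `/oidc/v1/token` path and `all-apis` scope are assumptions taken from the Databricks OAuth M2M documentation linked in `ConnectRequest.md`, so they should be verified against that page. The sketch only builds the request and sends nothing over the network.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.nio.charset.StandardCharsets
import java.util.Base64

object OAuthM2MSketch {
  // Hypothetical helper: builds (but does not send) a client-credentials
  // token request for the workspace identified by `hostname`.
  // Assumption: the workspace token endpoint is https://<hostname>/oidc/v1/token.
  def tokenRequest(
      hostname: String,
      servicePrincipalId: String,
      oauthSecret: String
  ): HttpRequest = {
    // HTTP Basic credentials: client_id is the service principal UUID,
    // client_secret is the service principal OAuth secret.
    val credentials = s"$servicePrincipalId:$oauthSecret"
    val basic = Base64.getEncoder.encodeToString(
      credentials.getBytes(StandardCharsets.UTF_8)
    )

    HttpRequest
      .newBuilder(URI.create(s"https://$hostname/oidc/v1/token"))
      .header("Authorization", s"Basic $basic")
      .header("Content-Type", "application/x-www-form-urlencoded")
      // Assumption: "all-apis" is the scope used for workspace-level access.
      .POST(
        HttpRequest.BodyPublishers.ofString(
          "grant_type=client_credentials&scope=all-apis"
        )
      )
      .build()
  }
}
```

A partner would typically send this request with `java.net.http.HttpClient`, read the access token from the JSON response, and pass it as a bearer token on subsequent Databricks API calls. Because `service_principal_oauth_secret` is optional, the value should be unwrapped from its `Option` only when the partner config includes `AUTH_OAUTH_M2M`.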