feat!: improve compatibility with databricks sql client #252

Merged
dbkegley merged 2 commits into main from kegs-update-connect-examples
Aug 7, 2024

Collaborator

@dbkegley dbkegley commented Aug 6, 2024

For reviewers: The important changes relevant for this PR's functionality are in src/posit/connect/external/databricks.py

The new credentials_strategy takes a fallback option that is used automatically during local development, so content can run both locally and on Connect without any code changes.

import streamlit as st
from databricks import sql
from databricks.sdk.core import ApiClient, Config, databricks_cli
from databricks.sdk.service.iam import CurrentUserAPI
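from posit.connect.external.databricks import PositCredentialsStrategy

# Placeholder values (assumptions, not part of the original snippet):
# replace them with your own workspace hostname and SQL warehouse HTTP path.
DATABRICKS_HOST = "<your-workspace-hostname>"
DATABRICKS_HOST_URL = f"https://{DATABRICKS_HOST}"
SQL_HTTP_PATH = "<your-sql-warehouse-http-path>"
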
session_token = st.context.headers.get("Posit-Connect-User-Session-Token")
posit_strategy = PositCredentialsStrategy(
    local_strategy=databricks_cli,
    user_session_token=session_token)
cfg = Config(
    host=DATABRICKS_HOST_URL,
    # uses Posit's custom credential_strategy if running on Connect,
    # otherwise falls back to the strategy defined by local_strategy
    credentials_strategy=posit_strategy)

databricks_user = CurrentUserAPI(ApiClient(cfg)).me()
st.write(f"Hello, {databricks_user.display_name}!")

with sql.connect(
    server_hostname=DATABRICKS_HOST,
    http_path=SQL_HTTP_PATH,
    # https://github.com/databricks/databricks-sql-python/issues/148#issuecomment-2271561365
    credentials_provider=posit_strategy.sql_credentials_provider(cfg)
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 10;")
        result = cursor.fetchall()
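
The fallback behavior described above can be illustrated with a minimal sketch. This is not the implementation in src/posit/connect/external/databricks.py: the class name FallbackStrategySketch, the HeaderFactory alias, and the placeholder header returned on Connect are all hypothetical, and how the real strategy obtains credentials on Connect is not shown in this thread. The sketch only shows the selection logic: use the local strategy unless RSTUDIO_PRODUCT is "CONNECT".

```python
import os
from typing import Callable, Dict, Optional

# Hypothetical alias: a header factory returns HTTP authorization headers.
HeaderFactory = Callable[[], Dict[str, str]]


class FallbackStrategySketch:
    """Illustrative stand-in for a credentials strategy with a local fallback."""

    def __init__(
        self,
        local_strategy: HeaderFactory,
        user_session_token: Optional[str] = None,
    ) -> None:
        self._local_strategy = local_strategy
        self._user_session_token = user_session_token

    def _on_connect(self) -> bool:
        # Same environment check that appears in the PR's is_local() helper.
        return os.getenv("RSTUDIO_PRODUCT") == "CONNECT"

    def _connect_headers(self) -> Dict[str, str]:
        # Placeholder only: a real strategy would obtain an access token
        # for the viewer here instead of echoing the session token.
        return {"Authorization": f"Bearer placeholder-for-{self._user_session_token}"}

    def __call__(self) -> Dict[str, str]:
        # On Connect, use the Connect-specific path; otherwise fall back.
        if self._on_connect():
            return self._connect_headers()
        return self._local_strategy()
```

Because the choice is made at call time from the environment, the same content code can run unchanged in both places, which is the point of the local_strategy option.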

@dbkegley dbkegley requested a review from tdstein as a code owner August 6, 2024 18:48

github-actions bot commented Aug 6, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
| ----- | ------- | -------- | --------- | ------ |
| 1235  | 1199    | 97%      | 0%        | 🟢     |

New Files

No new covered files...

Modified Files

| File | Coverage | Status |
| ---- | -------- | ------ |
| src/posit/connect/external/databricks.py | 88% | 🟢 |
| TOTAL | 88% | 🟢 |

updated for commit: 710c6b1 by action🐍

@dbkegley dbkegley force-pushed the kegs-update-connect-examples branch from d2c2475 to d9ef6f8 Compare August 6, 2024 18:54
@tdstein tdstein changed the title Update custom Databricks credentials providers and examples feat!: improve compatibility with databricks sql client Aug 6, 2024
@tdstein
Collaborator

tdstein commented Aug 6, 2024

You can run make lint locally to debug the linting issues.

Collaborator

@tdstein tdstein left a comment

Changes look reasonable. Is it worth adding any unit tests along with this change?

Comment on lines 29 to 30
# Use this environment variable to determine if the
# client SDK was initialized from a piece of content running on a Connect server.
Collaborator

Could you move this to a docstring?
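
One way the suggestion could look, as a sketch only: the wording of the docstring is an assumption based on the inline comment quoted above, and the comparison is written with != rather than "not ... ==", which is equivalent.

```python
import os


def is_local() -> bool:
    """Return True when the SDK is not running as content on a Connect server.

    The RSTUDIO_PRODUCT environment variable is used to determine whether
    the client SDK was initialized from a piece of content running on a
    Connect server. Any value other than "CONNECT", including an unset
    variable, is treated as local.
    """
    return os.getenv("RSTUDIO_PRODUCT") != "CONNECT"
```

With the explanation in the docstring, it also shows up in help(is_local) and in generated API documentation.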

src/posit/connect/external/databricks.py
# client SDK was initialized from a piece of content running on a Connect server.
def is_local() -> bool:
    return not os.getenv("RSTUDIO_PRODUCT") == "CONNECT"
class PositOAuthIntegrationCredentialsStrategy(CredentialsStrategy):
Collaborator

I know this is superficial, but do we have options for making this class name shorter? As a user, giant names like this scare me and make me assume that the code is super complex.

Would PositCredentialsStrategy work?

Collaborator

Same here! That or ConnectCredentialsStrategy.

Collaborator

Just another option in the merry-go-round of word permutations - because this is already under the posit/connect directory, would it be more informative to call it an OAuthCredentialsStrategy? Then we could have other credentials strategies that also work for posit products / connect but we don't run into naming collisions.

Collaborator Author

I opted to go with PositCredentialsStrategy to avoid a naming conflict with the Databricks SDK's OAuthCredentialsStrategy. I think this is a good balance of brevity and descriptiveness, since we are still under the posit/connect namespace.

https://github.com/databricks/databricks-sdk-py/blob/7d22b4d3727478f0f5dbeb34b7f6fc17a03e31b7/databricks/sdk/credentials_provider.py#L57

@dbkegley dbkegley force-pushed the kegs-update-connect-examples branch 2 times, most recently from a2db2fb to 6c0b775 Compare August 6, 2024 20:23
@dbkegley dbkegley requested a review from tdstein August 6, 2024 20:23
@dbkegley dbkegley force-pushed the kegs-update-connect-examples branch from 6c0b775 to bf3c946 Compare August 6, 2024 20:30
assert cp() == {"Authorization": "Bearer static-pat-token"}

# posit_strategy is used when the content is running on Connect
os.environ["RSTUDIO_PRODUCT"] = "CONNECT"
Collaborator

It's possible to mock the environment variables using unittest.mock.patch:

@patch.dict("os.environ", {"CONNECT_API_KEY": "foobar"})

This avoids modifying the developer's environment entirely.
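
Applied to the RSTUDIO_PRODUCT check from this PR, the idea might look like the sketch below. The is_local helper is re-declared so the example is self-contained, and the test function names are hypothetical rather than taken from the PR's test suite.

```python
import os
from unittest.mock import patch


def is_local() -> bool:
    # Re-declared for illustration; mirrors the check quoted in this PR.
    return os.getenv("RSTUDIO_PRODUCT") != "CONNECT"


@patch.dict("os.environ", {"RSTUDIO_PRODUCT": "CONNECT"})
def test_is_local_on_connect():
    # The variable is set only for the duration of this test.
    assert not is_local()


@patch.dict("os.environ", {}, clear=True)
def test_is_local_without_connect():
    # clear=True empties os.environ for the duration of this test.
    assert is_local()
```

patch.dict restores the original contents of os.environ when each test returns, so nothing leaks into other tests or into the developer's shell session.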

@dbkegley dbkegley force-pushed the kegs-update-connect-examples branch from bf3c946 to 710c6b1 Compare August 7, 2024 16:18
Collaborator

@tdstein tdstein left a comment

Everything looks good to me. Thanks for the additional documentation, tests, and explanations!

@dbkegley dbkegley merged commit 37bc1dc into main Aug 7, 2024
30 checks passed
@dbkegley dbkegley deleted the kegs-update-connect-examples branch August 7, 2024 16:35
@tdstein tdstein mentioned this pull request Aug 8, 2024