Hey folks, I have a Spark application that reads from a source bucket and writes into a target bucket. I'm running into issues when setting the keyfile for the second operation as a Hadoop configuration: in theory the keyfile should be overridden, but in practice the application always uses the first keyfile. I've tried unsetting and clearing the Hadoop configs, but for whatever reason the connector keeps using the first credentials file. Here is a code snippet of what I'm trying to accomplish:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Multiple GCS Service Accounts") \
    .getOrCreate()

spark.conf.set("spark.hadoop.fs.gs.auth.service.account", "/path/to/first/keyfile.json")

# Perform Spark operations using the first key file

# Switch to a different key file
spark.conf.set("spark.hadoop.fs.gs.auth.service.account", "/path/to/second/keyfile.json")

# Perform Spark operations using the second key file

spark.stop()
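For reference, the workaround I've been experimenting with (not sure it's the intended approach) is to set the keyfile on the Hadoop configuration directly and to disable Hadoop's FileSystem cache for the gs scheme, so the second keyfile is actually picked up when a new FileSystem instance is created. The google.cloud.auth.service.account.json.keyfile property is what I believe the connector reads, and the bucket names and paths are just placeholders, so treat this as a sketch:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Multiple GCS Service Accounts") \
    .getOrCreate()

hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()

# Disable the FileSystem cache for the gs scheme so a cached FileSystem
# instance does not keep using the first credentials.
hadoop_conf.set("fs.gs.impl.disable.cache", "true")

# Read from the source bucket with the first service account
hadoop_conf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/first/keyfile.json")
df = spark.read.parquet("gs://source-bucket/input/").cache()
df.count()  # materialize now; Spark reads lazily, so do this before switching credentials

# Switch to the second service account and write to the target bucket
hadoop_conf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/second/keyfile.json")
df.write.parquet("gs://target-bucket/output/")

spark.stop()

Even then, juggling a single global keyfile property feels fragile (cached data can be evicted and re-read with the wrong credentials), which is why per-bucket credentials would be much nicer.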
For the Hadoop AWS and Hadoop Azure connectors there are multiple ways to set credentials per bucket, and I would like to have the same in the GCS connector. For example:
// Note the $bucket variable: credentials can be set per bucket
spark.sparkContext.hadoopConfiguration.set(s"fs.s3a.bucket.$bucket.access.key", accessKey)
spark.sparkContext.hadoopConfiguration.set(s"fs.s3a.bucket.$bucket.secret.key", secretKey)
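A GCS equivalent could look something like this. To be clear, the fs.gs.bucket.<bucket>... property names below don't exist today as far as I know; they are only meant to illustrate the shape of the API I'm asking for:

hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()

# Hypothetical per-bucket keyfile properties, mirroring the s3a per-bucket pattern above
hadoop_conf.set("fs.gs.bucket.source-bucket.auth.service.account.json.keyfile", "/path/to/first/keyfile.json")
hadoop_conf.set("fs.gs.bucket.target-bucket.auth.service.account.json.keyfile", "/path/to/second/keyfile.json")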