Cluster - Client library API migration changes #177

Open
wants to merge 35 commits into
base: main
Changes from 17 commits
14a7916
list and get cluster temporary changes
Jeyaprakash-NK Apr 15, 2024
75e1576
Merge branch 'main' of https://github.com/Shubha-accenture/dataproc-j…
Jeyaprakash-NK Apr 15, 2024
c8f9643
Merge branch 'main' of https://github.com/Shubha-accenture/dataproc-j…
Jeyaprakash-NK Apr 16, 2024
654b621
cluster service BE code change
Jeyaprakash-NK Apr 16, 2024
9659c02
list cluster client library temp changes
Jeyaprakash-NK Apr 17, 2024
3708794
list and get cluster api status changes
Jeyaprakash-NK Aug 8, 2024
25e25c1
stop cluster BE and FE temp changes
Jeyaprakash-NK Aug 12, 2024
cdec0d7
controller rename changes
Jeyaprakash-NK Aug 12, 2024
2a37f2c
latest pull from main and conflicts resolved
Jeyaprakash-NK Aug 12, 2024
d539d90
start cluster and BE temp changes
Jeyaprakash-NK Aug 12, 2024
1d60038
Service file rename changes
Jeyaprakash-NK Aug 12, 2024
611fbac
Merge branch 'main' of https://github.com/Shubha-accenture/dataproc-j…
Jeyaprakash-NK Aug 14, 2024
cd84677
delete cluster and auth access token fix
Jeyaprakash-NK Aug 19, 2024
2b7b95b
await changes in start, stop, delete
Jeyaprakash-NK Aug 19, 2024
9b71492
delete cluster empty handled
Jeyaprakash-NK Aug 19, 2024
312902a
added new dependency "google-cloud-dataproc"
Jeyaprakash-NK Aug 20, 2024
68c21db
code cleanup
Jeyaprakash-NK Aug 22, 2024
4d70971
Code review comments fix
Jeyaprakash-NK Sep 5, 2024
65d004f
pull from main and conflicts resolved
Jeyaprakash-NK Sep 6, 2024
2eb3d98
package conflicts resolved in pyproject
Jeyaprakash-NK Sep 6, 2024
cb818d0
Code review feedback changes BE and FE
Jeyaprakash-NK Sep 6, 2024
8a6072a
line space removed
Jeyaprakash-NK Sep 6, 2024
c5ae698
api endpoint changes in client library
Jeyaprakash-NK Sep 6, 2024
6daa124
prettier changes
Jeyaprakash-NK Sep 6, 2024
7f64795
runtime list code review fix
Jeyaprakash-NK Sep 9, 2024
50f2d2d
Merge branch 'main' of https://github.com/Shubha-accenture/dataproc-j…
Jeyaprakash-NK Sep 11, 2024
a734f1e
changed all package versions >= instead ~=
Jeyaprakash-NK Sep 12, 2024
ff65fda
code format fix python
Jeyaprakash-NK Sep 12, 2024
7580728
pyproject version change
Jeyaprakash-NK Sep 13, 2024
36dfd9e
pyproject change '>=' from '~='
Jeyaprakash-NK Sep 13, 2024
a3edf3d
client library test for list cluster
Jeyaprakash-NK Sep 16, 2024
d8e3567
aioHttp pyproject version change
Jeyaprakash-NK Sep 16, 2024
f87ce93
list cluster revert changes test
Jeyaprakash-NK Sep 16, 2024
f656955
test changes for list cluster
Jeyaprakash-NK Sep 16, 2024
218ecde
list cluster test file changes
Jeyaprakash-NK Sep 23, 2024
95 changes: 95 additions & 0 deletions dataproc_jupyter_plugin/controllers/cluster.py
@@ -0,0 +1,95 @@
# Copyright 2023 Google LLC
Contributor

We already have a file in the controllers directory for Dataproc called dataproc.

Move all of the methods from this file into that one and then delete this entire file.

#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


import json
from jupyter_server.base.handlers import APIHandler
import tornado
from dataproc_jupyter_plugin import credentials
from dataproc_jupyter_plugin.services import cluster


class ClusterListPageController(APIHandler):
Contributor

Completely remove this controller; it duplicates ClusterListController.

    @tornado.web.authenticated
    async def get(self):
        try:
            page_token = self.get_argument("pageToken")
            page_size = self.get_argument("pageSize")
            client = cluster.Client(
                await credentials.get_cached(), self.log
            )
            cluster_list = await client.list_clusters(page_size, page_token)
            self.finish(json.dumps(cluster_list))
        except Exception as e:
            self.log.exception(f"Error fetching cluster list")
            self.finish({"error": str(e)})


class ClusterDetailController(APIHandler):
    @tornado.web.authenticated
    async def get(self):
        try:
            cluster_selected = self.get_argument("clusterSelected")
Contributor

Change all instances of clusterSelected to just cluster.

            client = cluster.Client(
                await credentials.get_cached(), self.log
            )
            get_cluster = await client.get_cluster_detail(cluster_selected)
            self.finish(json.dumps(get_cluster))
        except Exception as e:
            self.log.exception(f"Error fetching get cluster")
            self.finish({"error": str(e)})


class StopClusterController(APIHandler):
    @tornado.web.authenticated
    async def post(self):
        try:
            cluster_selected = self.get_argument("clusterSelected")
            client = cluster.Client(
                await credentials.get_cached(), self.log
            )
            stop_cluster = await client.stop_cluster(cluster_selected)
            self.finish(json.dumps(stop_cluster))
        except Exception as e:
            self.log.exception(f"Error fetching stop cluster")
            self.finish({"error": str(e)})


class StartClusterController(APIHandler):
    @tornado.web.authenticated
    async def post(self):
        try:
            cluster_selected = self.get_argument("clusterSelected")
            client = cluster.Client(
                await credentials.get_cached(), self.log
            )
            start_cluster = await client.start_cluster(cluster_selected)
            self.finish(json.dumps(start_cluster))
        except Exception as e:
            self.log.exception(f"Error fetching start cluster")
            self.finish({"error": str(e)})

class DeleteClusterController(APIHandler):
    @tornado.web.authenticated
    async def delete(self):
        try:
            cluster_selected = self.get_argument("clusterSelected")
            client = cluster.Client(
                await credentials.get_cached(), self.log
            )
            delete_cluster = await client.delete_cluster(cluster_selected)
            self.finish(json.dumps(delete_cluster))
        except Exception as e:
            self.log.exception(f"Error deleting cluster")
            self.finish({"error": str(e)})
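
Editorial note: the controllers in this file all follow the same shape (read an argument, build a client, await one call, serialize the result, convert any exception into an `{"error": ...}` payload). As a rough sketch only, and not something this PR contains, that shape could be factored into one helper. `run_cluster_action` and `FakeClient` are hypothetical names invented for illustration, and the parameter is called `cluster` to follow the rename requested in review.

```python
import asyncio
import json
import logging


async def run_cluster_action(action, log, description):
    """Await one cluster operation and return a JSON string.

    `action` is a zero-argument callable returning an awaitable. Any
    exception is logged and turned into an {"error": ...} payload, which
    mirrors what the controllers in this diff do by hand.
    """
    try:
        result = await action()
        return json.dumps(result)
    except Exception as e:
        log.exception("Error %s", description)
        return json.dumps({"error": str(e)})


class FakeClient:
    """Hypothetical stand-in for the service client (illustration only)."""

    async def stop_cluster(self, cluster):
        if not cluster:
            raise ValueError("cluster name is required")
        return {"clusterName": cluster, "status": "STOPPING"}


async def demo():
    log = logging.getLogger("cluster-demo")
    client = FakeClient()
    ok = await run_cluster_action(
        lambda: client.stop_cluster("my-cluster"), log, "stopping cluster"
    )
    failed = await run_cluster_action(
        lambda: client.stop_cluster(""), log, "stopping cluster"
    )
    return ok, failed


ok, failed = asyncio.run(demo())
```

With a helper along these lines, each handler body would shrink to reading its argument and passing one lambda, which may also make it easier to fold the handlers into the existing dataproc controller file.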
6 changes: 6 additions & 0 deletions dataproc_jupyter_plugin/handlers.py
@@ -34,6 +34,7 @@
from dataproc_jupyter_plugin.controllers import (
    airflow,
    bigquery,
    cluster,
    composer,
    dataproc,
    executor,
@@ -193,6 +194,11 @@ def full_path(name):
"dagRunTask": airflow.DagRunTaskController,
"dagRunTaskLogs": airflow.DagRunTaskLogsController,
"clusterList": dataproc.ClusterListController,
"clusterListPage": cluster.ClusterListPageController,
Contributor

Drop this line entirely; there's no justification for having two different endpoints that make the exact same API call.

"clusterDetail": cluster.ClusterDetailController,
"stopCluster": cluster.StopClusterController,
"startCluster": cluster.StartClusterController,
"deleteCluster": cluster.DeleteClusterController,
"runtimeList": dataproc.RuntimeController,
"createJobScheduler": executor.ExecutorController,
"dagList": airflow.DagListController,
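
Editorial note: one way a single list endpoint could cover both the paged and unpaged cases is to treat `pageSize` and `pageToken` as optional. The sketch below is an assumption about how that might look, not code from this PR; `parse_paging` and `DEFAULT_PAGE_SIZE` are hypothetical. In a Tornado handler the raw values could come from `self.get_argument("pageSize", None)`, which returns the supplied default when the argument is absent.

```python
# Hypothetical default; the real plugin may prefer a different value.
DEFAULT_PAGE_SIZE = 50


def parse_paging(arguments):
    """Return (page_size, page_token) from a mapping of query arguments.

    Missing or empty values fall back to defaults, so callers that do not
    page at all can share the same endpoint as callers that do.
    """
    raw_size = arguments.get("pageSize")
    if raw_size in (None, ""):
        page_size = DEFAULT_PAGE_SIZE
    else:
        page_size = int(raw_size)
    page_token = arguments.get("pageToken") or ""
    return page_size, page_token
```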
171 changes: 171 additions & 0 deletions dataproc_jupyter_plugin/services/cluster.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
# Copyright 2023 Google LLC
Contributor

We already have a file in the services directory for Dataproc called dataproc.

Move all of the methods from this file into that one and then delete this entire file.

#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from google.cloud import dataproc_v1 as dataproc
import proto
import json
import google.oauth2.credentials as oauth2
from google.protobuf.empty_pb2 import Empty

class Client:
    def __init__(self, credentials, log):
        self.log = log
        if not (
            ("access_token" in credentials)
            and ("project_id" in credentials)
            and ("region_id" in credentials)
        ):
            self.log.exception("Missing required credentials")
            raise ValueError("Missing required credentials")
        self._access_token = credentials["access_token"]
        self.project_id = credentials["project_id"]
        self.region_id = credentials["region_id"]

    async def list_clusters(self, page_size, page_token):
        try:
            # Create a client
            client = dataproc.ClusterControllerAsyncClient(
                client_options={
                    "api_endpoint": f"us-central1-dataproc.googleapis.com:443"
                },
                credentials=oauth2.Credentials(self._access_token),
            )

            # Initialize request argument(s)
            request = dataproc.ListClustersRequest(
                project_id=self.project_id,
                page_size=int(page_size),
                page_token=page_token,
                region=self.region_id,
            )

            # Make the request
            page_result = await client.list_clusters(request=request)
            clusters_list = []

            # Handle the response
            async for response in page_result:
                clusters_list.append(json.loads(proto.Message.to_json(response)))

            return clusters_list
        except Exception as e:
            self.log.exception(f"Error fetching cluster list")
            return {"error": str(e)}

    async def get_cluster_detail(self, cluster_selected):
        try:
            # Create a client
            client = dataproc.ClusterControllerAsyncClient(
                client_options={
                    "api_endpoint": f"us-central1-dataproc.googleapis.com:443"
                },
                credentials=oauth2.Credentials(self._access_token),
            )

            # Initialize request argument(s)
            request = dataproc.GetClusterRequest(
                project_id=self.project_id,
                region=self.region_id,
                cluster_name=cluster_selected,
            )

            # Make the request
            response = await client.get_cluster(request=request)

            # Handle the response
            return json.loads(proto.Message.to_json(response))
        except Exception as e:
            self.log.exception(f"Error fetching cluster detail")
            return {"error": str(e)}

    async def stop_cluster(self, cluster_selected):
        try:
            # Create a client
            client = dataproc.ClusterControllerAsyncClient(
                client_options={
                    "api_endpoint": f"us-central1-dataproc.googleapis.com:443"
                },
                credentials=oauth2.Credentials(self._access_token),
            )

            # Initialize request argument(s)
            request = dataproc.StopClusterRequest(
                project_id=self.project_id,
                region=self.region_id,
                cluster_name=cluster_selected,
            )

            operation = await client.stop_cluster(request=request)

            response = await operation.result()
            # Handle the response
            return json.loads(proto.Message.to_json(response))
        except Exception as e:
            self.log.exception(f"Error fetching stop cluster")
            return {"error": str(e)}

    async def start_cluster(self, cluster_selected):
        try:
            # Create a client
            client = dataproc.ClusterControllerAsyncClient(
                client_options={
                    "api_endpoint": f"us-central1-dataproc.googleapis.com:443"
                },
                credentials=oauth2.Credentials(self._access_token),
            )

            # Initialize request argument(s)
            request = dataproc.StartClusterRequest(
                project_id=self.project_id,
                region=self.region_id,
                cluster_name=cluster_selected,
            )

            operation = await client.start_cluster(request=request)

            response = await operation.result()
            # Handle the response
            return json.loads(proto.Message.to_json(response))
        except Exception as e:
            self.log.exception(f"Error fetching start cluster")
            return {"error": str(e)}

    async def delete_cluster(self, cluster_selected):
        try:
            # Create a client
            client = dataproc.ClusterControllerAsyncClient(
                client_options={
                    "api_endpoint": f"us-central1-dataproc.googleapis.com:443"
                },
                credentials=oauth2.Credentials(self._access_token),
            )

            # Initialize request argument(s)
            request = dataproc.DeleteClusterRequest(
                project_id=self.project_id,
                region=self.region_id,
                cluster_name=cluster_selected,
            )

            operation = await client.delete_cluster(request=request)

            response = await operation.result()
            # Handle the response
            if isinstance(response, Empty):
                return "Deleted successfully"
            else:
                return json.loads(proto.Message.to_json(response))
        except Exception as e:
            self.log.exception(f"Error deleting cluster")
            return {"error": str(e)}
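
Editorial note: every method in this snapshot builds its client against a hardcoded `us-central1` endpoint while sending `self.region_id` in the request, so the two can disagree. A minimal sketch of deriving the endpoint from the configured region follows. It assumes regional endpoints follow the same `<region>-dataproc.googleapis.com:443` pattern as the hardcoded value; `dataproc_client_options` is a hypothetical helper, not part of this PR or of the client library.

```python
def dataproc_client_options(region_id):
    """Build a client_options dict that targets the given region's endpoint.

    Assumption: the endpoint pattern matches the hardcoded us-central1
    value used throughout this file.
    """
    if not region_id:
        raise ValueError("region_id is required")
    return {"api_endpoint": f"{region_id}-dataproc.googleapis.com:443"}


# Example: the value each method could pass as client_options=...
options = dataproc_client_options("europe-west1")
```

Building these options once in `__init__` would also remove the five near-identical client construction blocks.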
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -29,7 +29,8 @@ dependencies = [
"pendulum>=3.0.0",
"pydantic~=1.10.0",
Contributor

Why are we pinning the minor versions in these packages?

I.e., why "~=" instead of ">="?

"bigframes~=0.22.0",
- "aiohttp~=3.9.5"
+ "aiohttp~=3.9.5",
+ "google-cloud-dataproc~=5.10.2"
]
dynamic = ["version", "description", "authors", "urls", "keywords"]
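
Editorial note on the `~=` versus `>=` question: under PEP 440, a compatible-release specifier such as `~=3.9.5` is equivalent to `>=3.9.5, ==3.9.*`, so it rejects `3.10.0`, whereas `>=3.9.5` accepts it. The sketch below illustrates that difference for plain numeric versions only. It is a simplification written for this note (the function names are hypothetical), not a replacement for a real version parser.

```python
def _release(version):
    """Split a purely numeric version like '3.9.5' into a list of ints."""
    return [int(part) for part in version.split(".")]


def _pad(parts, width):
    """Right-pad with zeros so '1.10' compares equal to '1.10.0'."""
    return tuple(parts + [0] * (width - len(parts)))


def satisfies_minimum(candidate, minimum):
    """True if candidate satisfies '>=minimum'."""
    c, m = _release(candidate), _release(minimum)
    width = max(len(c), len(m))
    return _pad(c, width) >= _pad(m, width)


def satisfies_compatible(candidate, base):
    """True if candidate satisfies '~=base' (numeric versions only)."""
    b = _release(base)
    if len(b) < 2:
        raise ValueError("~= requires at least two release segments")
    c = _release(candidate)
    width = max(len(c), len(b))
    prefix = tuple(b[:-1])  # every segment except the last must match
    same_series = _pad(c, width)[: len(prefix)] == prefix
    return satisfies_minimum(candidate, base) and same_series
```

For example, `satisfies_compatible("3.10.0", "3.9.5")` is false while `satisfies_minimum("3.10.0", "3.9.5")` is true, which is the practical difference the reviewer is asking about.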
