Running the README.md example raises FileNotFoundError: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/namespace' #2436

Closed
Yumeka999 opened this issue Sep 30, 2024 · 14 comments · Fixed by #2438

Comments

@Yumeka999

What happened?

When I use this code to test Katib (the README.md example):

import kubeflow.katib as katib

# Step 1. Create an objective function.
def objective(parameters):
    # Import required packages.
    import time
    time.sleep(5)
    # Calculate objective function.
    result = 4 * int(parameters["a"]) - float(parameters["b"]) ** 2
    # Katib parses metrics in this format: <metric-name>=<metric-value>.
    print(f"result={result}")

# Step 2. Create HyperParameter search space.
parameters = {
    "a": katib.search.int(min=10, max=20),
    "b": katib.search.double(min=0.1, max=0.2)
}

# Step 3. Create Katib Experiment.
katib_client = katib.KatibClient()
name = "tune-experiment"
katib_client.tune(
    name=name,
    objective=objective,
    parameters=parameters,
    objective_metric_name="result",
    max_trial_count=12
)

# Step 4. Get the best HyperParameters.
print(katib_client.get_optimal_hyperparameters(name))

What did you expect to happen?

I expected the script to run. Instead, running

python run_katib.py

gives this error:

Traceback (most recent call last):
  File "run_katib.py", line 1, in <module>
    import kubeflow.katib as katib
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/__init__.py", line 73, in <module>
    from kubeflow.katib.api.katib_client import KatibClient
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/api/katib_client.py", line 30, in <module>
    class KatibClient(object):
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/api/katib_client.py", line 36, in KatibClient
    namespace: str = utils.get_default_target_namespace(),
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/utils/utils.py", line 37, in get_default_target_namespace
    return get_current_k8s_namespace()
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/utils/utils.py", line 30, in get_current_k8s_namespace
    with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/namespace'
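
The traceback shows the failure happening on `import kubeflow.katib`, before any client is created. One plausible reading is that the namespace default is computed as a default argument of the client constructor, and Python evaluates default arguments once, when the function is defined, which here means at import time. The sketch below illustrates only that language behaviour; `lookup_namespace`, `Client`, and `LazyClient` are hypothetical stand-ins, not the SDK's code.

```python
calls = []


def lookup_namespace():
    # Hypothetical stand-in for a helper that reads the namespace from disk.
    calls.append("lookup")
    return "default"


class Client:
    # The default value is evaluated while the class body runs,
    # not each time Client() is called.
    def __init__(self, namespace: str = lookup_namespace()):
        self.namespace = namespace


calls_at_definition = len(calls)

Client()
Client()
calls_after_two_instances = len(calls)


class LazyClient:
    # Deferring the lookup to call time keeps a plain import free of
    # side effects; the helper only runs when no namespace is passed.
    def __init__(self, namespace=None):
        self.namespace = namespace if namespace is not None else lookup_namespace()


calls_before_lazy = len(calls)
lazy = LazyClient(namespace="kubeflow")
calls_after_lazy = len(calls)
```

With the eager form, any exception raised by the helper surfaces as an import error, which matches the shape of the traceback above.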

Environment

Kubernetes version:

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.12", GitCommit:"ef70d260f3d036fc22b30538576bbf6b36329995", GitTreeState:"clean", BuildDate:"2023-03-15T13:37:18Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.12", GitCommit:"ef70d260f3d036fc22b30538576bbf6b36329995", GitTreeState:"clean", BuildDate:"2023-03-15T13:30:13Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

Katib controller version:

harbor.xnunion.com/kubeflow/kubeflowkatib/katib-controller:v0.17.0

Katib Python SDK version:

Name: kubeflow-katib
Version: 0.17.0
Summary: Katib Python SDK for APIVersion v1beta1
Home-page: https://github.com/kubeflow/katib/tree/master/sdk/python/v1beta1
Author: Kubeflow Authors
Author-email: [email protected]
License: Apache License Version 2.0
Location: /root/.local/lib/python3.8/site-packages
Requires: certifi, grpcio, kubernetes, protobuf, setuptools, six, urllib3
Required-by:

Impacted by this bug?

Give it a 👍 We prioritize the issues with most 👍

@andreyvelich
Member

Thank you for creating this @Yumeka999!
Where did you run the SDK from?
Also, can you check this directory?

ls -la /var/run/secrets/kubernetes.io/

/area sdk
/remove-label lifecycle/needs-triage

@Yumeka999
Author

I ran the Python script on a physical machine that has Kubernetes installed.

ls -la /var/run/secrets/kubernetes.io/

drwxr-xr-x 3 root root 60 Sep  4 14:36 .
drwxr-xr-x 3 root root 60 Sep  4 14:36 ..
drwxr-xr-x 2 root root 40 Oct  1 20:25 serviceaccount

and ls -la /var/run/secrets/kubernetes.io/serviceaccount

drwxr-xr-x 2 root root 40 Oct  1 20:25 .
drwxr-xr-x 3 root root 60 Sep  4 14:36 ..

@Yumeka999
Author

> Thank you for creating this @Yumeka999! Where did you run the SDK from? Also, can you check this directory?
>
> ls -la /var/run/secrets/kubernetes.io/
>
> /area sdk /remove-label lifecycle/needs-triage

Here is the result.

@andreyvelich
Member

andreyvelich commented Oct 1, 2024

Usually, this file indicates the namespace where your Pod's container runs: /var/run/secrets/kubernetes.io/serviceaccount/namespace
But since you are running this script from a local machine, this directory should not exist.

Do you know how the /var/run/secrets/kubernetes.io/ directory was created?

@Yumeka999
Author

> Usually, this file indicates the namespace where your Pod's container runs: /var/run/secrets/kubernetes.io/serviceaccount/namespace But since you are running this script from a local machine, this directory should not exist.
>
> Do you know how the /var/run/secrets/kubernetes.io/ directory was created?

The directory /var/run/secrets/kubernetes.io/ already existed, and I don't know how it was created.

Can the Python script in the Quickstart be run on a local machine, or should it be run in a Pod?

@tenzen-y
Member

tenzen-y commented Oct 8, 2024

I guess that the root cause is

def is_running_in_k8s():
    return os.path.isdir("/var/run/secrets/kubernetes.io/")

The Katib SDK decides that it is running in a Pod based on the existence of "/var/run/secrets/kubernetes.io/", but your local machine (not a Pod) also has that directory.
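
To make the failure mode concrete, here is a minimal sketch of a more defensive lookup that keys off the namespace file itself rather than the parent directory, and falls back when the file is missing. This is an illustration under stated assumptions, not the SDK's implementation or the fix that was merged: the function names, the `sa_dir` parameter, and the `"default"` fallback are all hypothetical.

```python
import os
import tempfile

# Conventional in-Pod service account mount path (taken from the traceback).
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def read_in_cluster_namespace(sa_dir: str = SA_DIR):
    """Return the namespace stored in the mount, or None if it is unavailable."""
    path = os.path.join(sa_dir, "namespace")
    try:
        with open(path, "r") as f:
            value = f.read().strip()
    except OSError:
        # Missing file, unreadable file, or a directory in its place.
        return None
    return value or None


def default_target_namespace(sa_dir: str = SA_DIR, fallback: str = "default") -> str:
    """Prefer the in-cluster namespace; otherwise use the fallback."""
    return read_in_cluster_namespace(sa_dir) or fallback


# Temporary directories stand in for the real mount path.
with tempfile.TemporaryDirectory() as empty_dir:
    # Mirrors the reporter's machine: the directory exists, the file does not.
    ns_empty = default_target_namespace(empty_dir)

with tempfile.TemporaryDirectory() as pod_dir:
    with open(os.path.join(pod_dir, "namespace"), "w") as f:
        f.write("kubeflow\n")
    ns_pod = default_target_namespace(pod_dir)
```

Checking the file rather than the directory means an empty leftover directory, like the one listed earlier in this thread, no longer looks like a Pod environment.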

@andreyvelich
Member

That's right, and you can execute the above code both from a local machine and from a Pod.
We just use a different mechanism to get the current namespace in each case.

@Yumeka999
Author

> I guess that the root cause is
>
> def is_running_in_k8s():
>     return os.path.isdir("/var/run/secrets/kubernetes.io/")
>
> The Katib SDK decides that it is running in a Pod based on the existence of "/var/run/secrets/kubernetes.io/", but your local machine (not a Pod) also has that directory.

I will try deleting the directory "/var/run/secrets/kubernetes.io/" and running the Python code again.

@Yumeka999
Author

When I delete the directory "/var/run/secrets/kubernetes.io/" and run the Python code again, there is a new exception:

Traceback (most recent call last):
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/api/katib_client.py", line 91, in create_experiment
    self.custom_api.create_namespaced_custom_object(
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/api/custom_objects_api.py", line 225, in create_namespaced_custom_object
    return self.create_namespaced_custom_object_with_http_info(group, version, namespace, plural, body, **kwargs)  # noqa: E501
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/api/custom_objects_api.py", line 344, in create_namespaced_custom_object_with_http_info
    return self.api_client.call_api(
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 275, in POST
    return self.request("POST", url,
  File "/root/.local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 234, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Audit-Id': '8f8ede55-980b-4ea8-9a17-4a7f1a1e377c', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '7e96f0b0-f9b8-4ece-89db-ae85dc2e5bb9', 'X-Kubernetes-Pf-Prioritylevel-Uid': '97abe02e-d257-421c-89b7-0fba6242fd4f', 'Date': 'Wed, 09 Oct 2024 03:29:23 GMT', 'Content-Length': '335'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"admission webhook \"validator.experiment.katib.kubeflow.org\" denied the request: Cannot create the Experiment \"tune-experiment\" in namespace \"default\": the namespace lacks label \"katib.kubeflow.org/metrics-collector-injection: enabled\"","code":400}



During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_katib.py", line 22, in <module>
    katib_client.tune(
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/api/katib_client.py", line 314, in tune
    self.create_experiment(experiment, namespace)
  File "/root/.local/lib/python3.8/site-packages/kubeflow/katib/api/katib_client.py", line 103, in create_experiment
    raise RuntimeError(
RuntimeError: Failed to create Katib Experiment: default/tune-experiment
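
The final `RuntimeError` hides the useful part of this failure; the reason sits in the JSON body of the chained `ApiException`. A small sketch of pulling that message out is shown below. The body string is copied from the traceback above. The Kubernetes Python client keeps the raw response on the exception's `body` attribute, so the same parsing could be applied to `e.body` in an `except ApiException as e` block.

```python
import json

# Response body copied from the ApiException in the traceback above.
body = (
    r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    r'"message":"admission webhook \"validator.experiment.katib.kubeflow.org\" '
    r'denied the request: Cannot create the Experiment \"tune-experiment\" '
    r'in namespace \"default\": the namespace lacks label '
    r'\"katib.kubeflow.org/metrics-collector-injection: enabled\"","code":400}'
)

# The API server returns a Status object; "message" carries the reason.
status = json.loads(body)
message = status["message"]

print(f"HTTP {status['code']}: {message}")
```

Read this way, the 400 is a rejection by Katib's admission webhook because of a missing namespace label, not a problem with the Experiment spec itself.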

@Yumeka999
Author

My environment now:

Kubernetes version:

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.12", GitCommit:"ef70d260f3d036fc22b30538576bbf6b36329995", GitTreeState:"clean", BuildDate:"2023-03-15T13:37:18Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.12", GitCommit:"ef70d260f3d036fc22b30538576bbf6b36329995", GitTreeState:"clean", BuildDate:"2023-03-15T13:30:13Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

Katib controller version:

harbor.xnunion.com/kubeflow/kubeflowkatib/katib-controller:v0.15.0

Katib Python SDK version:

Name: kubeflow-katib
Version: 0.15.0
Summary: Katib Python SDK for APIVersion v1beta1
Home-page: https://github.com/kubeflow/katib/tree/master/sdk/python/v1beta1
Author: Kubeflow Authors
Author-email: [email protected]
License: Apache License Version 2.0
Location: /root/.local/lib/python3.8/site-packages
Requires: certifi, grpcio, kubernetes, protobuf, setuptools, six, urllib3
Required-by:

Kubernetes Python SDK version:

Name: kubernetes
Version: 23.6.0
Summary: Kubernetes python client
Home-page: https://github.com/kubernetes-client/python
Author: Kubernetes
Author-email:
License: Apache License Version 2.0
Location: /root/.local/lib/python3.8/site-packages
Requires: certifi, google-auth, python-dateutil, pyyaml, requests, requests-oauthlib, setuptools, six, urllib3, websocket-client
Required-by: kubeflow-katib

@andreyvelich
Member

> RuntimeError: Failed to create Katib Experiment: default/tune-experiment

@Yumeka999 Please use the kubeflow namespace in your Katib Client as described in this getting started example: https://www.kubeflow.org/docs/components/katib/getting-started/#getting-started-with-katib-python-sdk.
The namespace where you create Katib Experiments must have this label: katib.kubeflow.org/metrics-collector-injection: enabled.
I will update the README instructions.
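
As a rough illustration of the label requirement, the sketch below checks whether a namespace's labels would satisfy the webhook before an Experiment is submitted. The helper names are hypothetical. `fetch_namespace_labels` assumes the `kubernetes` Python client is installed and a kubeconfig is available, and it is not called here, so the block runs without a cluster.

```python
# Label key and value taken from the webhook error message in this thread.
REQUIRED_LABEL = "katib.kubeflow.org/metrics-collector-injection"
REQUIRED_VALUE = "enabled"


def namespace_accepts_experiments(labels) -> bool:
    """Return True if the labels include the metrics collector injection label."""
    # Namespace labels can be None when a namespace has no labels at all.
    return (labels or {}).get(REQUIRED_LABEL) == REQUIRED_VALUE


def fetch_namespace_labels(name: str):
    """Read a namespace's labels from the cluster (requires cluster access)."""
    # Imported lazily so the pure check above works without the client installed.
    from kubernetes import client, config

    config.load_kube_config()
    return client.CoreV1Api().read_namespace(name).metadata.labels


# Example label sets standing in for real namespaces.
unlabeled_ok = namespace_accepts_experiments({"kubernetes.io/metadata.name": "default"})
labeled_ok = namespace_accepts_experiments({REQUIRED_LABEL: REQUIRED_VALUE})
```

On a real cluster, `namespace_accepts_experiments(fetch_namespace_labels("kubeflow"))` would show whether the `kubeflow` namespace is ready for Experiments.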

@Yumeka999
Author

> RuntimeError: Failed to create Katib Experiment: default/tune-experiment
>
> @Yumeka999 Please use the kubeflow namespace in your Katib Client as described in this getting started example: https://www.kubeflow.org/docs/components/katib/getting-started/#getting-started-with-katib-python-sdk. The namespace where you create Katib Experiments must have this label: katib.kubeflow.org/metrics-collector-injection: enabled. I will update the README instructions.

That error is from kubeflow-katib==0.15.0. I found that there is no namespace parameter in the __init__() method of class KatibClient.

@andreyvelich
Member

> That error is from kubeflow-katib==0.15.0. I found that there is no namespace parameter in the __init__() method of class KatibClient.

@Yumeka999 Could you please try Katib 0.17 with this example: https://www.kubeflow.org/docs/components/katib/getting-started/#getting-started-with-katib-python-sdk

@Yumeka999
Author

> That error is from kubeflow-katib==0.15.0. I found that there is no namespace parameter in the __init__() method of class KatibClient.
>
> @Yumeka999 Could you please try Katib 0.17 with this example: https://www.kubeflow.org/docs/components/katib/getting-started/#getting-started-with-katib-python-sdk

OK, thank you.
