generic device does not support quantity #1217

Open
ChenYi015 opened this issue Nov 26, 2024 · 0 comments

What happened?

When I submit a training job with a generic device:

arena submit tfjob \
    ... \
    --device kubernetes.io/batch-cpu=1000 \
    --device kubernetes.io/batch-memory=1024 \
    ...

arena fails to submit the job and reports the following error:

ERRO[0000] failed to validate command args: Invalid device value kubernetes.io/batch-cpu=1k should be a number, refer to amd.com/gpu=1.

What did you expect to happen?

The generic device value should accept a Kubernetes resource quantity (such as 1k), not only an integer.
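
To illustrate the expected behavior, here is a minimal sketch of a validator that accepts quantity-style values. It is not arena's actual code: `parseDevice` and `quantityPattern` are hypothetical names, and the regular expression only approximates the Kubernetes quantity grammar using the standard library. A real fix would more likely call `resource.ParseQuantity` from `k8s.io/apimachinery/pkg/api/resource`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// quantityPattern is a simplified approximation of the Kubernetes quantity
// grammar: an optionally signed decimal number followed by an optional
// binary suffix (Ki, Mi, ...), decimal suffix (m, k, M, ...), or exponent.
// Assumption: this is for illustration only and may not match every edge case.
var quantityPattern = regexp.MustCompile(
	`^[+-]?(\d+(\.\d*)?|\.\d+)(Ki|Mi|Gi|Ti|Pi|Ei|[numkMGTPE]|[eE][+-]?\d+)?$`)

// parseDevice (hypothetical helper) splits a "name=value" argument and
// checks that the value looks like a quantity rather than only an integer.
func parseDevice(arg string) (string, string, error) {
	name, value, ok := strings.Cut(arg, "=")
	if !ok || name == "" {
		return "", "", fmt.Errorf("invalid device %q: expected name=quantity", arg)
	}
	if !quantityPattern.MatchString(value) {
		return "", "", fmt.Errorf("invalid device %q: %q is not a quantity", arg, value)
	}
	return name, value, nil
}

func main() {
	args := []string{
		"kubernetes.io/batch-cpu=1k",
		"kubernetes.io/batch-memory=1024",
		"amd.com/gpu=one", // deliberately invalid value
	}
	for _, arg := range args {
		name, value, err := parseDevice(arg)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %s = %s\n", name, value)
	}
}
```

With a check along these lines, plain integers such as 1024 would still pass, so existing usage like amd.com/gpu=1 would keep working.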

Environment

Arena version:

$ arena version
arena: v0.12.0+6c24c79
  BuildDate: 2024-11-11T06:12:45Z
  GitCommit: 6c24c7950d9a959e2577d2c7c278a7c3e0611ec7
  GitTreeState: clean
  GoVersion: go1.23.2
  Compiler: gc
  Platform: darwin/arm64

Impacted by this bug?

Give it a 👍. We prioritize the issues with the most 👍.
