When I submit a training job with a generic device:

arena submit tfjob \
    ... \
    --device kubernetes.io/batch-cpu=1000 \
    --device kubernetes.io/batch-memory=1024 \
    ...

arena fails to submit the job with the following error:
ERRO[0000] failed to validate command args: Invalid device value kubernetes.io/batch-cpu=1k should be a number, refer to amd.com/gpu=1.
The generic device value should accept a Kubernetes quantity (for example, 1k) rather than only an integer.
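
A minimal sketch of what quantity-aware validation could look like, using only the Go standard library. The names `parseDevice` and `quantityPattern` are hypothetical and are not taken from the Arena code base. The regular expression only approximates the Kubernetes quantity grammar; an actual fix would more likely call `resource.ParseQuantity` from `k8s.io/apimachinery/pkg/api/resource` instead.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// quantityPattern is a simplified approximation of the Kubernetes quantity
// grammar: an optional sign, a decimal number, and an optional binary suffix
// (Ki, Mi, ...), decimal suffix (n, u, m, k, M, ...), or decimal exponent.
// Assumption: this is illustrative only, not the grammar Arena uses.
var quantityPattern = regexp.MustCompile(
	`^[+-]?(\d+(\.\d*)?|\.\d+)(Ki|Mi|Gi|Ti|Pi|Ei|[numkMGTPE]|[eE][+-]?\d+)?$`)

// parseDevice (hypothetical helper) splits a "name=value" argument and checks
// that the value looks like a quantity rather than requiring a plain integer.
func parseDevice(arg string) (string, string, error) {
	name, value, found := strings.Cut(arg, "=")
	if !found || name == "" {
		return "", "", fmt.Errorf("invalid device %q, expected name=quantity", arg)
	}
	if !quantityPattern.MatchString(value) {
		return "", "", fmt.Errorf("invalid device value %q, expected a quantity such as 1, 1k, or 1Gi", arg)
	}
	return name, value, nil
}

func main() {
	args := []string{
		"kubernetes.io/batch-cpu=1k",
		"kubernetes.io/batch-memory=1024",
		"amd.com/gpu=abc",
	}
	for _, arg := range args {
		name, value, err := parseDevice(arg)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %s -> %s\n", name, value)
	}
}
```

With a check along these lines, plain integers such as 1024 would still be accepted, so existing `--device` usage would keep working.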
Arena version:
$ arena version
arena: v0.12.0+6c24c79
BuildDate: 2024-11-11T06:12:45Z
GitCommit: 6c24c7950d9a959e2577d2c7c278a7c3e0611ec7
GitTreeState: clean
GoVersion: go1.23.2
Compiler: gc
Platform: darwin/arm64
Give it a 👍. We prioritize the issues with the most 👍.