Under some circumstances the Slurm epilog fails to clean up processes because of how it parses nvidia-smi pmon output
#1316 is a proposed solution.
This issue is stale because it has been open for 60 days with no activity. Please update the issue or it will be closed in 7 days.
From /var/log/slurm/prolog-epilog:
<13>Sep 10 15:12:33 slurm-epilog: Killing residual GPU process Idx ...
/etc/slurm/epilog.d/50-exclusive-gpu: line 12: kill: Idx: arguments must be process or job IDs
Regular output is parsed correctly, but if the output contains one more comment line before the process list, the epilog picks up a non-PID token (here, Idx) and passes it to kill:
root@hpc-hostname:~# nvidia-smi pmon -c 1
# gpu pid type sm mem enc dec command
# Idx # C/G % % % % name
0 - - - - - - -
1 - - - - - - -
2 - - - - - - -
3 - - - - - - -
4 - - - - - - -
5 - - - - - - -
6 - - - - - - -
7 - - - - - - -
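
One way to make the parsing tolerant of extra header lines is to filter on content instead of skipping a fixed number of lines. The following is only a minimal sketch: the issue does not show the contents of 50-exclusive-gpu, `gpu_pids` is a hypothetical helper name, and this is not necessarily what #1316 does. It assumes the PID is the second whitespace-separated column, as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original epilog script): reads
# `nvidia-smi pmon -c 1` output on stdin and prints only real PIDs.
gpu_pids() {
    # Skip every comment line (first field starts with '#'), however many
    # there are, and keep column 2 only when it is purely numeric.
    # Idle GPUs report '-' in the pid column and are therefore dropped.
    awk '$1 !~ /^#/ && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Possible usage inside an epilog (assumption, adapt to the real script):
#   for pid in $(nvidia-smi pmon -c 1 | gpu_pids); do
#       kill -9 "$pid"
#   done
```

With this filter the header token Idx never reaches kill, and output that lists only idle GPUs yields no PIDs at all.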