Track and display pids in spfs runtime info #1059

Open
dcookspi opened this issue Jun 25, 2024 · 1 comment
Labels
agenda item Items to be brought up at the next dev meeting SPI AOI Area of interest for SPI SPI-0.41

Comments

@dcookspi
Collaborator

Is your feature request related to a problem? Please describe.

Sometimes we see a lot of stopping runtimes on a host when using spfs runtime list, particularly on renderfarm hosts that launch many subprocesses inside each /spfs runtime.

spfs-monitor is running for those runtimes, but the owner process and the original /spfs mount/spawn process are not running (according to spfs runtime list). However, some other processes are still running in, or using, that /spfs mount.

Unless you have root permissions and wade through the /proc filesystem looking for the mounts, you can't tell which processes are still keeping that /spfs alive.

Describe the solution you'd like
Have spfs runtime list, or spfs runtime info, show which PID(s) are still using a runtime's /spfs, so we can see which processes spfs-monitor and /spfs are waiting on and what is keeping the mount active.
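
For illustration only, here is a rough Python sketch of the kind of /proc scan this would replace. It assumes each runtime's /spfs lives in its own Linux mount namespace and that processes can be matched by comparing the /proc/<pid>/ns/mnt symlink. The function names and the proc_root parameter are hypothetical and are not part of spfs.

```python
import os


def mount_namespace_of(pid, proc_root="/proc"):
    """Return the mount-namespace id of a pid (e.g. 'mnt:[4026531840]'), or None.

    None is returned when the process has exited or the link is unreadable,
    which is what an unprivileged user sees for other users' processes.
    """
    try:
        return os.readlink(os.path.join(proc_root, str(pid), "ns", "mnt"))
    except OSError:
        return None


def pids_sharing_namespace(target_ns, proc_root="/proc"):
    """List the pids whose mount namespace matches target_ns (hypothetical helper)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        if mount_namespace_of(entry, proc_root) == target_ns:
            pids.append(int(entry))
    return sorted(pids)
```

For example, pids_sharing_namespace(mount_namespace_of(os.getpid())) lists the visible processes that share the caller's mount namespace. Reading another user's ns/mnt link needs elevated privileges, which is why having the runtime record this information would be more convenient.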

Additional context
This probably involves keeping more pid info in the runtime and having spfs-monitor update it periodically as processes die/start. That needs some discussion and sanity checking.

@dcookspi dcookspi added agenda item Items to be brought up at the next dev meeting SPI AOI Area of interest for SPI SPI-0.41 labels Jun 25, 2024
@rydrman
Collaborator

rydrman commented Sep 4, 2024

From the meeting today:

  • adding at least one PID to the runtime to represent why the runtime is still alive
  • would be great if we could try at least a little to ensure it's a useful one - maybe prefer numbers that still exist across time? Not critical but nice to have.
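
One possible reading of "prefer numbers that still exist across time", sketched in Python: remember on which scan each PID was first seen and report the one that has survived the longest. PidTracker and its methods are hypothetical names for illustration, not spfs-monitor code, and the sketch ignores PID reuse between scans.

```python
class PidTracker:
    """Track live pids across periodic scans (hypothetical sketch)."""

    def __init__(self):
        self.first_seen = {}  # pid -> index of the scan that first saw it
        self.scan_count = 0

    def update(self, live_pids):
        """Record one scan: forget pids that are gone, remember new ones."""
        self.scan_count += 1
        live = set(live_pids)
        for pid in list(self.first_seen):
            if pid not in live:
                del self.first_seen[pid]
        for pid in live:
            self.first_seen.setdefault(pid, self.scan_count)

    def representative(self):
        """Return the longest-surviving pid, or None if nothing is alive.

        Ties are broken by the lowest pid so the answer is stable.
        """
        if not self.first_seen:
            return None
        return min(self.first_seen, key=lambda pid: (self.first_seen[pid], pid))
```

A monitor loop could call update() with each scan's PID list and store representative() in the runtime, so the displayed PID only changes when the previous one has actually exited.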
