Setting munge key does not restart the munged service #15

Closed
jedel1043 opened this issue Jun 7, 2024 · 3 comments
Labels
wontfix This will not be worked on

Comments

@jedel1043
Contributor

Related to #14.

Running snap set slurm munge.key=<KEY> does not automatically restart the munged service. The user has to manually run snap restart slurm.munged in order for munged to pick up the new key.

@NucciTheBoss
Member

Hmm... so this one is intentional. Keeping this key synchronized across the cluster is crucial to Slurm's functionality. If your munge key gets out of sync during a refresh, your entire cluster will collapse. Slurm also does not emit an error exit code if the keys do not match; the Slurm daemons will still be marked as active even though they cannot communicate with each other. My concern here is with being able to do controlled refreshes of the key. This is the typical flow I've seen for refreshing the munge key in a Slurm cluster:

  1. Update the munge key on the main Slurm controller (slurmctld)
  2. Propagate the key out to the other Slurm daemons (slurmd, slurmdbd, slurmrestd)
  3. Restart munged cluster wide
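The three steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the host names, SSH access to every node, and the `run` and `rotate_munge_key` helper names are hypothetical, and only the two `snap` commands come from this issue.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, host names, and SSH access are assumptions.
# The two snap commands are the ones quoted in this issue.

# Print the command when DRY_RUN=1, otherwise execute it.
# Note: dry-run output includes the key, so do not log it anywhere shared.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: rotate_munge_key KEY CONTROLLER [NODE...]
rotate_munge_key() {
  local key="$1" controller="$2" host
  shift 2

  # Step 1: update the key on the main controller (slurmctld).
  run ssh "$controller" sudo snap set slurm "munge.key=${key}" || return 1

  # Step 2: propagate the key to the remaining nodes.
  # Stop before any restart if a node could not be updated.
  for host in "$@"; do
    run ssh "$host" sudo snap set slurm "munge.key=${key}" || return 1
  done

  # Step 3: restart munged cluster wide, only after every key is in place.
  for host in "$controller" "$@"; do
    run ssh "$host" sudo snap restart slurm.munged || return 1
  done
}
```

Running `DRY_RUN=1 rotate_munge_key <KEY> controller-0 node-0 node-1` prints the commands without executing them, which makes it easy to review the order before touching the cluster.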

I think having the user explicitly restart munged once all the keys are in position is better than restarting automatically in the configure hook, since we can't guarantee how the user will go about setting the new key if they're just using the snap.
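Since mismatched keys do not produce an error, one way to confirm the keys are "in position" before restarting is to compare a digest of the key on every node. The sketch below assumes SSH access, that `snap get slurm munge.key` returns the configured key on each node, and that `sha256sum` is available; `key_digest` and `keys_in_sync` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and SSH access are assumptions.

# key_digest HOST: print "HOST DIGEST" for the munge key configured on HOST.
# Fails instead of hashing an empty string when the key cannot be read.
key_digest() {
  local host="$1" key
  key="$(ssh "$host" sudo snap get slurm munge.key)" || return 1
  printf '%s %s\n' "$host" "$(printf '%s' "$key" | sha256sum | cut -d' ' -f1)"
}

# keys_in_sync: read "HOST DIGEST" lines on stdin and succeed only if
# exactly one distinct digest was reported.
keys_in_sync() {
  local distinct
  distinct="$(awk 'NF >= 2 { print $2 }' | sort -u | wc -l | tr -d ' ')"
  [ "$distinct" -eq 1 ]
}
```

For example, `for h in controller-0 node-0; do key_digest "$h" || exit 1; done | keys_in_sync` would succeed only when both nodes report the same key, at which point restarting munged is safe.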

What if we included visual feedback in the shell that indicates the munged service needs to be restarted after setting a new key? There's already a message sent to the hooks log in $SNAP_COMMON:

$ snap set slurm munge.key=<key>
INFO: service `slurm.munged` must be restarted for latest key to take effect

This way we make it clear to the user that they need to restart munged for their latest changes to take effect, and give them more control over the refresh. Also, there's less chance of us eating their cluster unintentionally. Note that we can set our own refresh policy within the Slurm charms, so it's relatively inexpensive for us to set the new key and restart the service when we're ready from charm code.
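Until the snap prints such a reminder itself, the same effect could be approximated with a thin wrapper around `snap set`. This is a hypothetical helper, not something the snap ships; the `set_munge_key` name and the `SNAP_CMD` override (used only to make the function testable without snapd) are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: set_munge_key and SNAP_CMD are hypothetical, not part of the snap.

# set_munge_key KEY: set the key, then remind the user to restart munged.
# The service is deliberately not restarted here.
set_munge_key() {
  local key="$1"
  ${SNAP_CMD:-snap} set slurm "munge.key=${key}" || return 1
  echo 'INFO: service `slurm.munged` must be restarted for latest key to take effect'
}
```

The wrapper leaves the restart to the user, so a cluster-wide rollout can still be sequenced by hand with `snap restart slurm.munged`.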

@NucciTheBoss
Member

Also, if we go ahead with the enhancement proposed in #14, I will likely remove the option to configure the munge key using snap set ... and snap get ... since it could introduce coherency issues.

@NucciTheBoss
Member

#14 was implemented, so I'm going to close this issue. You can still set the munge key via the snap configuration options, but we are going to remove this in the future when we fix #19 and refactor slurmhelpers.

NucciTheBoss added the wontfix label Jul 12, 2024
NucciTheBoss closed this as not planned Jul 12, 2024