Jobs are sometimes cancelled during their execution. Currently, this only stops the process running on the host machine; it does not prevent the device from continuing to execute the current experiment and pulses, because the device receives no message at all (the host process simply halts without telling the device anything).
So, we should handle signals that arrive during experiments, to ensure that the cancellation is properly propagated to the devices themselves before the process exits.
Implementation proposal
In Python, system signals can be handled with the standard library (the signal module).
We can install a signal handler for this, though I wonder whether we should make it part of the library or ask the user to invoke it explicitly (since it will affect the global state of execution).
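A minimal sketch of such a handler, using only the standard library. The FakePlatform class and its cancel() method are hypothetical stand-ins for illustration (only .disconnect() is confirmed to exist on the real interface), and install_handlers/restore_handlers are illustrative names, not library functions:

```python
import signal
import sys


class FakePlatform:
    """Hypothetical stand-in for a platform object, not the library's class."""

    def __init__(self):
        self.calls = []

    def cancel(self):
        # Hypothetical method: would tell the device to stop the experiment.
        self.calls.append("cancel")

    def disconnect(self):
        self.calls.append("disconnect")


def install_handlers(platform, signums=(signal.SIGINT, signal.SIGTERM)):
    """Register a handler that stops the device before the process exits.

    Must be called from the main thread. Returns the previously installed
    handlers so that the caller can restore them later.
    """

    def handler(signum, frame):
        try:
            platform.cancel()
        finally:
            # Disconnect even if the cancellation itself fails.
            platform.disconnect()
        # 128 + signal number is the conventional shell exit status.
        sys.exit(128 + signum)

    return {s: signal.signal(s, handler) for s in signums}


def restore_handlers(previous):
    """Put back the handlers returned by install_handlers."""
    for signum, old in previous.items():
        if old is not None:  # None means the old handler was not set from Python
            signal.signal(signum, old)
```

Returning the previous handlers keeps the global side effect reversible, which matters for the open question of whether the library or the user should install them.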
While .disconnect() is already part of the Platform and Instrument interfaces, there is no equivalent action for cancelling a job. So, we should add that as well.
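One possible shape for that addition, as a sketch only. InstrumentSketch, PlatformSketch and the cancel() method are hypothetical names chosen for illustration; the real Platform and Instrument classes may be structured differently:

```python
from abc import ABC, abstractmethod


class InstrumentSketch(ABC):
    """Illustrative interface; not the library's actual Instrument."""

    @abstractmethod
    def disconnect(self):
        """Close the connection to the device."""

    @abstractmethod
    def cancel(self):
        """Hypothetical: abort the experiment currently running on the device."""


class PlatformSketch:
    """Illustrative platform that forwards cancellation to its instruments."""

    def __init__(self, instruments):
        self.instruments = list(instruments)

    def cancel(self):
        errors = []
        for instrument in self.instruments:
            try:
                instrument.cancel()
            except Exception as exc:
                # Keep going, so one failing device does not leave others running.
                errors.append(exc)
        if errors:
            raise errors[0]

    def disconnect(self):
        for instrument in self.instruments:
            instrument.disconnect()
```

Attempting cancellation on every instrument before reporting a failure is a design choice here, made so that a single unresponsive device does not block stopping the rest.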
Fun facts
Signals sent by some common events:
scancel: SIGTERM (15)
scancel -s N: SIG* (N)
CTRL+C: SIGINT (2)
Warning
On srun: CTRL+C on Linux is consistently turned into a SIGINT, but with srun, the srun command itself sits in the way of the CTRL+C, so the signal is sometimes received by the Slurm process on the client rather than propagated to the process in the queue. The best advice for these cases is to avoid using srun to dispatch jobs to the devices; if you did use it, do not stop it with CTRL+C, but use scancel here as well (which will send a proper SIGTERM).