By default, when you remove a job from the HTCondor queue, it is also immediately removed from the condor_q job list. We have seen that in this situation BLAH/CREAM does not update the job status, so the job remains in its last registered status for two months (the BLAH default purge interval). With a high job cancellation rate the DB can grow indefinitely, slowing down queries and eventually leaving the CREAM CE unresponsive.
Here at the INFN-Bari T2 HTCondor cluster, the JOB_IS_FINISHED_INTERVAL knob is set to its default of 0,
because we can access the job information with the condor_history command.
But from the BLAH/CREAM point of view, it is mandatory to have access to the removed job through the condor_q command. For this reason, the "leave_in_queue" knob in the file "src/scripts/condor_submit.sh" should cover removed jobs too.
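
As a rough illustration of the request, a submit-description expression along the following lines might keep removed jobs visible to condor_q for a while. This is only a sketch: it is not the line currently generated by "src/scripts/condor_submit.sh", the 1800-second retention window is an assumed value, and the use of EnteredCurrentStatus as the reference time is a choice made here for illustration.

```
# Hypothetical sketch, not the actual expression from condor_submit.sh.
# JobStatus 3 = Removed, 4 = Completed.
# Keep removed or completed jobs in the queue for an assumed 30 minutes
# after they entered that state, so BLAH can still see them via condor_q.
leave_in_queue = (JobStatus == 3 || JobStatus == 4) && ((time() - EnteredCurrentStatus) < 1800)
```

With an expression of this kind, a job removed with condor_rm would be expected to stay listed in the removed state until the window expires, giving BLAH time to register the final status.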