*** Discussion title: Computing Tools
Thanks Tamas,
well, if you type 'crab status --long' things become clearer [1]:
all probe jobs failed because they ran out of memory.
I do not think we can venture into improving the automatic splitting
algorithm to better estimate the needed memory. Some work on that
side was already done originally, and it has been supported in an
'as is' way since its developer left.
But it is indeed a long-standing shortcoming that the plain
'crab status' output in this case does not say "all probe jobs failed"
but just "refer to this FAQ for ....".
I will add it to the (alas long) list of minor improvements to do.
Stefano
[1]
Status on the scheduler: FAILED
The job splitting of this task is 'Automatic', please refer to this FAQ for a description of the jobs status summary:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/CRAB3FAQ#What_is_the_Automatic_splitting
Probe stage log: https://cmsweb.cern.ch:8443/scheddmon/0144/cms578/211220_225709:tvami_crab_Analysis_SingleMuon_UL2017CEra_wProbQ_newMethod_v1/AutomaticSplitting_Log0.txt
More details of Automatic Splitting process for this task (including possible failures) are in the Dagman Log files in:
https://cmsweb.cern.ch:8443/scheddmon/0144/cms578/211220_225709:tvami_crab_Analysis_SingleMuon_UL2017CEra_wProbQ_newMethod_v1/AutomaticSplitting/
Probe jobs status: no output 100.0% (5/5)
No publication information available yet
Error Summary: (use crab status --verboseErrors for details about the errors)
5 jobs failed with exit code 50660
Have a look at https://twiki.cern.ch/twiki/bin/viewauth/CMSPublic/JobExitCodes for a description of the exit codes.
Extended Job Status Table:
Job  State      Most Recent Site  Runtime  Mem (MB)  CPU %  Retries  Restarts  Waste    Exit Code
0-1  no output  T2_DE_DESY        0:15:11  2122      13     0        0         0:02:46  50660
0-2  no output  T2_DE_DESY        0:15:12  2118      17     0        0         0:02:51  50660
0-3  no output  T2_DE_DESY        0:15:11  2155      12     0        0         0:02:55  50660
0-4  no output  T2_DE_DESY        0:10:11  2009      7      0        0         0:02:54  50660
0-5  no output  T2_DE_DESY        0:15:12  2142      21     0        0         0:02:52  50660
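
The improvement to the plain 'crab status' output mentioned above could look roughly like the sketch below. It is only an illustration, not CRAB code: the function name, the shape of the job records (dicts with 'id' and 'exit_code') and the rule "probe jobs have ids starting with '0-' and a non-zero exit code means failure" are all assumptions based on the table above.

```python
def summarize_probe_jobs(jobs):
    """Return a one-line summary of the probe stage.

    `jobs` is a list of dicts with hypothetical keys 'id' and 'exit_code',
    mirroring the columns of the extended job status table.
    """
    # Assumption: probe jobs are the ones whose id starts with "0-".
    probes = [job for job in jobs if job["id"].startswith("0-")]
    if not probes:
        return "no probe jobs found"

    # Assumption: any non-zero exit code counts as a failure.
    failed = [job for job in probes if job["exit_code"] != 0]
    if len(failed) == len(probes):
        codes = sorted({job["exit_code"] for job in failed})
        return "all probe jobs failed (exit codes: %s)" % ", ".join(map(str, codes))
    return "%d/%d probe jobs failed" % (len(failed), len(probes))


# The five probe jobs from the table above.
probe_jobs = [{"id": "0-%d" % n, "exit_code": 50660} for n in range(1, 6)]
print(summarize_probe_jobs(probe_jobs))
```

With such a line in the short status output, the user would not need '--long' to see why the task ended up FAILED.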
On 21/12/2021 01:00, Tamas Vami wrote:
>
> *** Discussion title: Computing Tools
>
> Hi Stefano,
>
> My automatic splitting failed on the scheduler, here is the HELP link
> https://cmsweb.cern.ch/crabserver/ui/task/211220_225709%3Atvami_crab_Analysis_SingleMuon_UL2017CEra_wProbQ_newMethod_v1
> and here is the Probe stage log:
> https://cmsweb.cern.ch:8443/scheddmon/0144/cms578/211220_225709:tvami_crab_Analysis_SingleMuon_UL2017CEra_wProbQ_newMethod_v1/AutomaticSplitting_Log0.txt
>
> The error it reports is "ZeroDivisionError: division by zero".
> The Status on the CRAB server still says "SUBMITTED".
>
> It seems to be working fine if I change to a lumi based splitting.
>
> Can you please have a look at this error?
> Cheers,
> Tamas
>
> -------------------------------------------------------------
> Visit this CMS message (to reply or unsubscribe) at:
> https://hypernews.cern.ch/HyperNews/CMS/get/computing-tools/6264.html
>
-------------------------------------------------------------
Visit this CMS message (to reply or unsubscribe) at:
https://hypernews.cern.ch/HyperNews/CMS/get/computing-tools/6264/1.html
From https://hypernews.cern.ch/HyperNews/CMS/get/computing-tools/6264/1.html
Also, predag should avoid raising a division-by-zero exception in its log when the
number of processed events is 0.
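
A possible shape for that guard, as a minimal sketch only: the names below (`NoProbeEventsError`, `estimate_seconds_per_event`) and the input format are hypothetical and are not taken from the actual predag code. The idea is to check the event count before dividing and fail with a message that explains the real problem.

```python
class NoProbeEventsError(RuntimeError):
    """Raised when the probe jobs processed no events at all (hypothetical)."""


def estimate_seconds_per_event(probe_results):
    """Estimate the processing time per event from the probe jobs.

    `probe_results` is a list of (processed_events, runtime_seconds)
    tuples; this input shape is an assumption made for the example.
    """
    total_events = sum(events for events, _ in probe_results)
    total_runtime = sum(runtime for _, runtime in probe_results)

    # Guard: without this check the division below raises
    # "ZeroDivisionError: division by zero" when every probe job failed.
    if total_events == 0:
        raise NoProbeEventsError(
            "all probe jobs processed 0 events; cannot estimate time per event"
        )
    return total_runtime / total_events
```

Raising a dedicated exception keeps the failure explicit while replacing the bare ZeroDivisionError with a message that points at the failed probe jobs.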