Automatically handle failed DBS migrations #8244
side task
This happened again: https://cms-talk.web.cern.ch/t/crab-job-pubilcation-failed/39892
Better fix the code.
take this change to
dump of the most recent error in human format (i.e. pprint)
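(The dump itself is not reproduced in this scrape. A minimal sketch of how such a human-readable dump can be produced with pprint, assuming the server reply is a JSON string carried by the client exception:)

```python
# Hypothetical sketch only: pretty-print a DBS server error so it is
# readable by humans. The assumption that the exception text is a JSON
# payload is mine, not taken from the thread.
import json
from pprint import pprint

def dump_dbs_error(exc):
    """Dump a DBS error in human format (i.e. pprint)."""
    try:
        pprint(json.loads(str(exc)))
    except ValueError:
        # not JSON: fall back to the raw message
        pprint(str(exc))
```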
Notice [1]: https://github.com/dmwm/dbs2go/blob/28e02bde209af797af7d59d4f7e3baba25a98605/dbs/errors.go#L64
Thanks Dario. As much as I hate to parse messages, I'd rather not ask for a change in semantics where the API returns success and reports the existing migrationId, as the old server was doing. I tried to "protect us against changes" via
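(The actual snippet referred to by "via" above is not preserved in this scrape. Purely as an illustration of the "protect us against changes" idea, one way to keep the message parsing confined to a single place could look like the sketch below; the error wording matched by the regex is an assumption, not the real dbs2go message.)

```python
# Illustrative sketch, not the author's code. The wording matched here is
# an assumed form of the "migration already exists" reply; keeping the
# parsing in one small function limits the damage if the server changes it.
import re

def existingMigrationId(errorText):
    """Return the migrationId quoted in an (assumed) 'already exists'
    DBS error message, or None if the text does not match."""
    match = re.search(r'migration.*already exists.*?(\d+)',
                      errorText, re.IGNORECASE | re.DOTALL)
    return int(match.group(1)) if match else None
```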
All of this is now in my branch: https://github.com/belforte/CRABServer/tree/deal-with-failed-migrations-8244
My test task got 50 files correctly published, so at least the code is not badly broken by the changes to unify DBS access in the common PublisherDbsUtils.
I put my branch on preprod and ran a couple of Jenkins ST tests. In the meantime I am declaring the changes to TaskPublish.py (the FTS one) tested enough to make a PR. I still need to find a way to test migrations, but I do not know how to quickly find a dataset which nobody has used yet!
Closed via #8378. I could not test failed migrations, but at worst they will still be broken and I will debug when I have an example.
see #7469 (comment)
I did this [1] in a python shell in the Publisher, following the example in /afs/cern.ch/user/b/belforte/WORK/DBS/migDbg.py [2].
NOTE: need to change from cmsweb to cmsweb-prod. The migration server does not run on the "for users" cmsweb cluster [1].
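As a reference point, a minimal sketch of pointing the DBS3 client at the production migration endpoint (the instance path is my assumption, following the usual DBSMigrate layout):

```python
# Sketch: talk to the migration server on cmsweb-prod rather than the
# user-facing cmsweb cluster. The DBSMigrate instance path is assumed.
from dbs.apis.dbsClient import DbsApi

migUrl = 'https://cmsweb-prod.cern.ch/dbs/prod/global/DBSMigrate'
migApi = DbsApi(url=migUrl)
```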
I replayed the migration request that failed in the TaskPublish script, then deleted the existing migrationId and submitted again. Eventually the new migration failed with status 4, which means "block already at destination", which is just fine. So I manually ran TaskPublish for that task and everything went OK [2].
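The shell session [1] itself is not included in the thread. Below is a hedged sketch of the sequence described above (remove the failed request, resubmit, poll until a terminal status), using the DBS3 client migration calls and the migApi object from the previous sketch; the reply structure and the meaning of status codes other than 4 are assumptions on my part.

```python
# Sketch of the replay described above, not the actual session [1].
# block_name, old_id and source_url are placeholders for the task's values.
import time

def replayMigration(migApi, block_name, old_id, source_url):
    """Delete a failed migration request, resubmit it and poll its status."""
    migApi.removeMigration({'migration_rqst_id': old_id})
    reply = migApi.submitMigration({'migration_url': source_url,
                                    'migration_input': block_name})
    # reply layout assumed to match the usual DBS3 client output
    newId = reply['migration_details']['migration_request_id']
    while True:
        status = migApi.statusMigration(migration_rqst_id=newId)[0]['migration_status']
        # 4 = "block already at destination" (quoted above) is fine;
        # 2/3/9 assumed to mean completed / failed / terminally failed
        if status in (2, 3, 4, 9):
            return status
        time.sleep(30)
```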
cat migDbg.py