Data Locking in WMCore
Datasets and files that are in use by active workflows need to be excluded from the automatic data cleanup mechanisms in Dynamic Data Management (DDM/Dynamo). This "data locking" feature was previously provided by Unified. Now the functionality is provided by WMCore.
- Data locking is done at the dataset level. Locking at the block level is not implemented.
- Data locking is global. Site-specific locks are not implemented.
- Datasets associated with workflows in the following states need to be locked:
['assignment-approved', 'assigned', 'staging', 'staged', 'failed',
'acquired', 'running-open', 'running-closed', 'force-complete', 'completed', 'closed-out']
- Any parent datasets of datasets used by workflows with the property:
"IncludeParents": True
- Transient output and unmerged LFNs. These are obtained from the workflow property:
OutputModulesLFNBases
- "Ad hoc" locks. A set of manually configurable things to lock. This was provided by "adhoc_lock.json" in Unified and should now be configured manually by DDM.
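The state-based rule above can be sketched as a simple filter. This is only an illustration and not the WMCore implementation: the record layout (dicts with "status" and "datasets" keys) and the function name are assumptions made for the example. Only the list of states comes from this page.

```python
# Workflow states whose datasets must stay locked (list copied from this page).
LOCKED_STATES = frozenset([
    "assignment-approved", "assigned", "staging", "staged", "failed",
    "acquired", "running-open", "running-closed", "force-complete",
    "completed", "closed-out",
])


def datasets_to_lock(workflows):
    """Return the set of dataset names used by workflows in a locked state.

    `workflows` is an iterable of dicts with the hypothetical keys
    "status" (str) and "datasets" (list of dataset names).
    """
    locked = set()
    for workflow in workflows:
        if workflow.get("status") in LOCKED_STATES:
            locked.update(workflow.get("datasets", []))
    return locked


# Made-up example records: only the first workflow is in a locked state.
example = [
    {"status": "running-open", "datasets": ["/A/B/RAW"]},
    {"status": "announced", "datasets": ["/C/D/AOD"]},
]
locked_example = datasets_to_lock(example)
```

Because locking is global and per dataset, the result is a flat set of dataset names with no site or block information.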
A combination of the WMStats API globallocks and the ReqMgr2 API parentlocks is used to determine the set of global datasets that are in use and should not be removed. The WMStats protectedlfns API provides a list of transient output datasets and "unmerged" base directories. API details are at https://github.com/dmwm/WMCore/wiki/wmstatsserver-api and https://github.com/dmwm/WMCore/wiki/ReqMgr2-apis.
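As a rough sketch of how a consumer might combine the three results, the function below merges lists that have already been fetched. It assumes each API result has been reduced to a flat list of strings; the actual response formats are described in the API pages linked above, and the function and key names here are hypothetical.

```python
def build_lock_lists(global_locks, parent_locks, protected_lfns):
    """Combine already-fetched API results into de-duplicated, sorted lists.

    All three arguments are assumed to be iterables of strings:
    dataset names for the first two, LFN base paths for the third.
    """
    # Datasets in use plus their required parents form the global lock set.
    datasets = sorted(set(global_locks) | set(parent_locks))
    # Transient outputs and unmerged areas are kept separate from datasets.
    lfn_bases = sorted(set(protected_lfns))
    return {"datasets": datasets, "protected_lfns": lfn_bases}


# Made-up inputs for illustration only.
combined = build_lock_lists(
    ["/A/B/RAW", "/C/D/AOD"],
    ["/A/B/RAW", "/P/Q/RAW"],
    ["/store/unmerged/x"],
)
```

Taking the union means a dataset stays locked as long as it appears in either source.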
DBS must be queried to discover parent datasets.
These APIs will likely go away after the transition to Rucio. New requests could include a property listing any required parent datasets. This would eliminate the DBS dependency and the potential issues arising from failed DBS queries. Requests already in the system would not have this property, and some could remain active for about 6 months.
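One way to handle both kinds of request during such a transition is sketched below. Everything here is an assumption for illustration: "ParentDatasets" is a hypothetical name for the proposed property, the "InputDataset" key is assumed to hold the input dataset name, and `dbs_lookup` stands in for whatever DBS query is actually used.

```python
def resolve_parents(request, dbs_lookup):
    """Return (parents, resolved_ok) for one request dict.

    `dbs_lookup` is any callable mapping a dataset name to a list of
    parent dataset names; it may raise if the query fails.
    """
    if not request.get("IncludeParents"):
        return [], True
    if "ParentDatasets" in request:
        # New-style request: parents are carried by the request itself.
        return list(request["ParentDatasets"]), True
    try:
        # Legacy request: fall back to a DBS query.
        return list(dbs_lookup(request["InputDataset"])), True
    except Exception:
        # Query failed: report it so the caller can retry later instead of
        # treating the parents as unlocked.
        return [], False
```

Returning a success flag alongside the list lets the caller distinguish "no parents" from "lookup failed", which matters because a failed query should not cause a parent dataset to be released.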